Printing books from HTML and CSS: metrics, formatters and results

Di Marco, Antonio (2019) Printing books from HTML and CSS: metrics, formatters and results. [Laurea thesis], Università di Bologna, Degree Programme in Computer Science [L-DM509]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed strictly for personal purposes of study, research and teaching; any direct or indirect commercial use is expressly prohibited. All other rights to the material are reserved.



In this thesis we analyzed whether CSS can serve as a solid successor to XSL-FO for producing books in PDF format. We assembled three test suites: a set of purpose-built documents, book excerpts from the publisher “Il Mulino”, and complete books from the public domain. We also devised a set of metrics to evaluate the quality of the generated PDFs. We tested the most widely used PDF formatters to get an overview of the current state of market support for CSS Paged Media. While the book excerpts and the short purpose-built documents could easily be checked by hand, for the longer books we relied on software we developed to scan the pages automatically and check for errors. The output of most commercial PDF formatters met expectations. Given enough time, we hope the gap between the open source and commercial offerings will narrow enough for them to be considered comparable. Overall, we were able to show that CSS Paged Media supports all the requirements and output capabilities of XSL-FO, either directly or, in a few cases, with manual intervention on the HTML/CSS code. Specifically, XSL-FO is designed around page sequences, a concept that CSS lacks because it works on individual pages instead. This difference in paradigm made it impossible to insert conditional spaces and blank pages at predefined positions. Thus, the overall process cannot be fully automated yet.

Document type
Degree thesis (Laurea)
Thesis author
Di Marco, Antonio
Thesis supervisor
Degree programme
Degree programme regulations
Keywords
Thesis defence date
17 July 2019
