The conceptual potential of XML lies in the strict separation of content, logical structure, and document layout. However, this advantage also means that, without additional formatting rules, the representation of XML documents is not defined. The missing rules can be specified as XSL style sheets. Software that conforms to XSL (XSLT and XSL-FO processors) can then be used to render XML documents with different formatting and on various media.
The XSL recommendation is organized into two main parts. As its title suggests, the section XSL Transformations (XSLT) defines rules for the transformation of XML documents, namely for restructuring, re-sorting, and renaming XML elements. Formatting of XML documents on the basis of so-called Formatting Objects (FO) is defined in the main section of XSL.
In agreement with the two XSL sections, the publishing workflow of an XML document is likewise organized into two main processes. Figure 1 illustrates that a transformation is carried out first, in which a new XML document is produced. The logical structure of this new XML document may be valid against a Document Type Definition (DTD). The case most often found in practice is the HTML or XHTML DTD, i.e. the transformation of the original XML document into an HTML or XHTML document.
In order to publish a PDF document, the original XML document must first be transformed into an FO document, i.e. a well-formed XML document whose elements and attributes come from the XSL-FO namespace. The FO elements and attributes describe the desired formatting of the output. ASCII or binary formats with complex encoding, such as PDF, can then be generated by means of an XSL-FO processor.
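The first process can be illustrated with a minimal sketch. It uses Python's standard library as a hand-written stand-in for the XSLT step; in a real workflow an XSLT style sheet would express the same mapping. The source elements article, title, and para, the page size, and the font sizes are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the qualified name of an element in the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


# Hypothetical source document; the element names are illustrative only.
SOURCE = "<article><title>XSL-FO</title><para>First paragraph.</para></article>"


def to_fo(source_xml):
    """Map each source element to an fo:block inside a one-page-master document."""
    src = ET.fromstring(source_xml)
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4", "page-height": "297mm", "page-width": "210mm"})
    ET.SubElement(page, fo("region-body"), {"margin": "20mm"})
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    for child in src:
        # Assumed layout rule: titles are set larger than paragraphs.
        size = "18pt" if child.tag == "title" else "10pt"
        block = ET.SubElement(flow, fo("block"), {"font-size": size})
        block.text = child.text
    return root


fo_document = ET.tostring(to_fo(SOURCE), encoding="unicode")
```

The resulting string is a well-formed FO document that an XSL-FO processor could take as input for the second process.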
An FO document describes the formatting of an XML document on the basis of an area model. Within the model, the representation area (for example, an A4 page) is divided into block and inline areas. Block areas are intended for paragraphs, illustrations, etc.; inline areas are primarily used to place characters and words.
Layout characteristics, such as margin, font-size, font-family, etc., can be assigned to all areas. The complete set consists of all the characteristics that are already defined by the W3C recommendation for Cascading Style Sheets (CSS), plus additions that address the extended field of XML applications. An important supplement to CSS is the introduction of typical areas for page-oriented formatting. For this purpose, XSL provides areas and regions for headers and footers as well as for the layout of columns. Another addition is the orientation towards international application. XSL-FO therefore supports writing in any direction, for example, from right to left as in Hebrew and from top to bottom as in Chinese.
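The page-oriented additions can be sketched as follows: a page master with a body region, header and footer regions, a column layout, and a writing direction. The sketch builds the FO elements with Python's standard library; the helper name page_master and all dimensions are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the qualified name of an element in the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def page_master(name, columns=1, writing_mode="lr-tb"):
    """Build an fo:simple-page-master with header, footer, and column layout.

    writing_mode takes XSL-FO values such as "lr-tb" (left to right),
    "rl-tb" (right to left, e.g. Hebrew), or "tb-rl" (top to bottom).
    """
    master = ET.Element(fo("simple-page-master"), {
        "master-name": name,
        "page-height": "297mm",
        "page-width": "210mm",
        "margin": "15mm",
        "writing-mode": writing_mode,
    })
    # The body region carries the column layout.
    ET.SubElement(master, fo("region-body"), {
        "margin-top": "20mm",
        "margin-bottom": "20mm",
        "column-count": str(columns),
        "column-gap": "8mm",
    })
    # region-before and region-after hold header and footer content.
    ET.SubElement(master, fo("region-before"), {"extent": "15mm"})
    ET.SubElement(master, fo("region-after"), {"extent": "15mm"})
    return master


hebrew_page = page_master("hebrew-two-column", columns=2, writing_mode="rl-tb")
```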
With regard to the use of XSL in high-end printing, qualified handling of color is important. The predefined colors known from CSS can be assigned to the different areas of an XSL-FO document, or explicit RGB values can be set by means of the rgb() function. Since printers do not work with RGB color definitions but need CMYK values for printout, ICC color profiles can also be integrated into an XSL-FO document. Color space transformations, which are necessary for color accuracy in print, can be performed on the basis of these profiles. In the XSL-FO document, color values in the color space of an ICC profile are referenced with the rgb-icc() function.
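A minimal sketch of the color handling described here: an ICC profile is declared once with fo:color-profile and then referenced from a color value. XSL 1.0 names the ICC color function rgb-icc(); its first three arguments are an RGB fallback, followed by the profile name and the profile's own color components. The profile name, the file path, and the component values below are assumptions.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the qualified name of an element in the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def rgb_icc(fallback_rgb, profile, components):
    """Format an rgb-icc() value: RGB fallback, profile name, ICC components."""
    r, g, b = fallback_rgb
    values = ",".join(str(c) for c in components)
    return f"rgb-icc({r},{g},{b},{profile},{values})"


# Declare the profile once; name and path are hypothetical.
declarations = ET.Element(fo("declarations"))
ET.SubElement(declarations, fo("color-profile"), {
    "color-profile-name": "print-cmyk",
    "src": "url(profiles/print-cmyk.icc)",
})

# A block whose text color is given as CMYK with a red RGB fallback,
# next to a predefined CSS color name for the background.
headline = ET.Element(fo("block"), {
    "color": rgb_icc((255, 0, 0), "print-cmyk", (0, 1, 1, 0)),
    "background-color": "silver",
})
```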
Several DTP products, each of which has different specific advantages, are used to produce publications. On the one hand, there are professional WYSIWYG-based DTP products, which are primarily suitable for the creative, manual layout process. On the other hand, established layout systems with rule-based controls are available, which are predominantly used for large, structured documents. So the question is: Where do XSL-FO-based workflows fit in?
Stephen Deach [Deach 2002] sees a possible application of XSL-FO primarily in batch processes, in which structured documents are formatted on the fly. He identifies "financial-planning guides", "owner and maintenance manuals", and "legal agreements and contracts" as specific examples of contemporary XSL-FO application. Conventional layout systems might already generally support such processes; in the following it shall be demonstrated, however, that the XML philosophy opens up yet more efficient publishing concepts.
In contrast to the controlling technologies of conventional rule-based layout systems, XSLT offers a standardized, easy-to-use, and widespread programming language, which allows not only formatting but also structural transformations. Conceptual and technical separation of transformation and formatting supports a correspondingly modularized production flow. In addition, it is a sound basis for publishing projects with different user groups and output systems.
The strict separation of content and layout is reflected directly in separate production processes. Core competences of all service providers engaged can thereby be used in the publishing workflow to the greatest advantage possible. The customer or publisher can devote him- or herself to the content of a publishing contract, a design agency takes care of the layout, the print service provider does the printing, and a web service provider publishes the content online.
Apart from presenting possibilities for optimized product quality, the depicted division of responsibilities also offers a potential for new, efficient workflows. As is customary in many data-based applications today, the customer does not need to worry about layout every time data is updated but rather reuses layout parameters that are separate from any content.
As described above, conceptual and technical separation of production processes can generally also be achieved with proprietary approaches. The application of XML/XSL, however, has two advantages. Firstly, it supports more flexibility in the integration of backend systems, such as merchandise management, document management, or other systems, and thereby ensures, as any standardization does, medium- and long-term investment security. Secondly, the use of standardized DTDs or XML schemas provides an XML-specific advantage: The exchange of documents with standardized structure, which, for example, comply with one of the widely used DTDs for book-like publications (e.g. DocBook, ISO 12083, or TEI), enables the use of XSL style sheets for publishing orders from different customers instead of for one specific order only. Individual adjustment of customer layout, which is generally necessary, remains feasible and can be realized by gradually changing the overall style sheet. Technically, the include mechanism for XSLT style sheets provides a possible way to do so. Customer-independent transformation rules of the DTD can be administered independently of customer-specific ones.
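One possible arrangement of the include mechanism is sketched below. A customer-specific style sheet holds only layout values (here as a named attribute set) and pulls in the shared, customer-independent rules with xsl:include. The file name docbook-common.xsl, the attribute set name title-style, and the font values are hypothetical; the Python wrapper merely checks that the style sheet is well-formed and lists what it includes.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical customer-specific style sheet. It contains layout values only;
# the shared rules for the DTD live in the included file.
CUSTOMER_XSL = """\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Customer-independent rules for the DTD (assumed file name). -->
  <xsl:include href="docbook-common.xsl"/>
  <!-- Customer layout, referenced by name from the shared rules. -->
  <xsl:attribute-set name="title-style">
    <xsl:attribute name="font-family">Helvetica</xsl:attribute>
    <xsl:attribute name="font-size">18pt</xsl:attribute>
  </xsl:attribute-set>
</xsl:stylesheet>
"""


def included_files(stylesheet_text):
    """Return the href of every top-level xsl:include in a style sheet."""
    root = ET.fromstring(stylesheet_text)
    return [el.get("href") for el in root.findall(f"{{{XSL_NS}}}include")]


shared_rules = included_files(CUSTOMER_XSL)
```

In this arrangement a second customer would need only a second small file with different attribute values, while the included rules stay untouched.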
Independent of the workflow organization, a number of areas of application suitable for XSL-FO can be specified. These shall now be characterized briefly:
Structured printed matter
Structured print products have content that is characterized by a clear, logical structure and a rather functional and informative character (content-driven applications). These products can have high page counts, such as product catalogues, technical manuals, and loose-leaf collections, but they can also be brochures, labels, classified advertisements, etc.
Database publishing / Personalization / Individualization
XML has already proved to be an easy-to-use and efficient exchange format for databases and is therefore also suitable for application in publishing processes with integrated backend systems. Typical applications are the integration of merchandise management data or personalization data, which are of special interest in the publishing of individualized product proposals or business reports, for example.
Cross-media publishing
The media neutrality of XML and the XSL workflow directly support the publication of content in different media. Data management that is truly free of redundancy, however, can be achieved for text and vector graphics only. The device dependency of halftone images, which is a result of image resolution, continues to exist even if the image is embedded in an XML document.
Web printing
In his Seybold article, Ken Holman [Holman 2002] draws attention to the fact that XSL-FO could help to improve the often inadequate quality that results when longer HTML pages are printed out. What he refers to is a special cross-media application: The online reader who wishes to print out an HTML document does not simply use the print function of the browser. Instead, he or she generates a document in PDF or another print-specific format. This document can be generated from the original XML data by means of XSL-FO, either on the web server or on the user's client.
Adaptive documents
Personalization and individualization of documents on the basis of intelligent automatic processes is the subject of contemporary research in the field of conventional and electronic publishing. In this context, XML is not simply a format for structuring documents but also one that contributes to the necessary semantic description of reusable document units. Among other things, the semantic data can be described in XML-coded metadata. The metadata can also be transferred directly into XSL-FO format (as values of role attributes). An FO processor is thereby given direct access to the semantics of the layout objects. Specific examples of application are teaching scripts which are compiled automatically from reusable teaching modules, a newsletter composed for a specific context, or an individualized travel guide, etc.
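The transfer of metadata into role attributes can be sketched as follows. Reusable units carry XML-coded metadata; a selection step compiles the units for one topic and copies each unit's kind into the role attribute of the generated fo:block. The unit markup, the attribute names topic and kind, and the function name compile_script are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the qualified name of an element in the XSL-FO namespace."""
    return f"{{{FO_NS}}}{tag}"


# Hypothetical pool of reusable teaching modules with metadata attributes.
UNITS = """<units>
  <unit topic="geometry" kind="definition">A triangle has three sides.</unit>
  <unit topic="geometry" kind="exercise">Draw a right triangle.</unit>
  <unit topic="algebra" kind="definition">A variable names a value.</unit>
</units>"""


def compile_script(units_xml, topic):
    """Collect all units on one topic into an fo:flow, keeping their semantics."""
    flow = ET.Element(fo("flow"), {"flow-name": "xsl-region-body"})
    for unit in ET.fromstring(units_xml).findall("unit"):
        if unit.get("topic") != topic:
            continue
        # The unit's semantic type travels along as the role attribute.
        block = ET.SubElement(flow, fo("block"), {"role": unit.get("kind")})
        block.text = unit.text
    return flow


geometry_script = compile_script(UNITS, "geometry")
```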
Exchange of functional charts
The integration of charts, such as pie or bar charts, is often useful in business printed matter. Illustrations of that kind have a strong functional character and are often generated on the basis of spreadsheets. XSL-FO supports the integration of external formats, such as vector graphics, and especially the integration of XML data, such as Scalable Vector Graphics (SVG/SVGP). XSL-FO therefore enables the use of vector graphics without the necessity to apply separate technologies for text and graphic data. Furthermore, XSL-FO documents can be utilized to exchange vector graphics within formatted documents without loss of data structure or semantics.
XSL-FO therefore enables the exchange of visual data that, on the one hand, contains all layout information necessary for printout on paper or display on screen and, on the other hand, can also be processed further in backend systems. Possible applications are, for example, business documents with visual data in table form, such as a quarterly report that contains sales figures by type of branch, the evaluation of a product with extensive numerical series, or stock charts with daily rates, etc.
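How a chart can travel inside an FO document may be sketched as follows: a bar chart is generated as SVG from a list of figures and wrapped in fo:instream-foreign-object, so that text and graphics remain XML throughout. The function name bar_chart, the dimensions, the fill color, and the sample figures are assumptions; the sketch expects a non-empty list of positive values.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("fo", FO_NS)
ET.register_namespace("svg", SVG_NS)


def bar_chart(values, bar_width=20, gap=5, height=100):
    """Return an fo:instream-foreign-object holding an SVG bar chart."""
    top = max(values)
    width = len(values) * (bar_width + gap)
    holder = ET.Element(f"{{{FO_NS}}}instream-foreign-object")
    svg = ET.SubElement(holder, f"{{{SVG_NS}}}svg", {
        "width": str(width), "height": str(height)})
    for i, value in enumerate(values):
        # Scale each bar so that the largest value fills the full height.
        bar_height = round(height * value / top)
        ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
            "x": str(i * (bar_width + gap)),
            "y": str(height - bar_height),
            "width": str(bar_width),
            "height": str(bar_height),
            "fill": "navy",
        })
    return holder


# Hypothetical sales figures for three quarters.
chart = bar_chart([120, 90, 150])
```

The returned element can be placed inside any fo:block of the surrounding document.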
PDF has developed into the standard format for exchanging and processing layout data in prepress. From a pragmatic point of view on the production flow, it is primarily the integrity and straightforward use of the format that caused PDF to spread widely. In addition, important characteristics, such as platform neutrality, compressibility, object orientation, etc., have reinforced the use of PDF. Typically, PDF is generated as the result of the layout process, i.e. the final stage of the creative design process. Depending on the particular print product, several steps, such as color management, trapping, and imposition, are subsequently carried out in the technically oriented prepress. It is predominantly for this technical workflow that PDF has established itself as a data exchange format. Hence, one of the tasks within an XML publishing workflow with paper output is the generation of a PDF document from XML. Various strategies, which can be differentiated in terms of technical framework and business model, can be employed to master this task.
From a technical point of view, there are two basic alternatives: standardized or proprietary conversion. Standardized conversion requires both an XSLT and an XSL-FO processor, which operate on the basis of an XSL style sheet. Appropriate processors are available either as commercial or as public domain products. Nowadays, many standard software products (office packages, DTP tools, database systems, etc.) also contain such processors as embedded modules. Proprietary conversion into PDF is mainly possible in some rule-based layout software into which XML can be imported and then, by means of an internal formatting concept, output in various formats.
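The standardized conversion can be sketched as a two-step command pipeline. The text does not name specific processors; the sketch assumes xsltproc as the XSLT processor and Apache FOP as the XSL-FO processor, and it only assembles the command lines without running them. The file names are hypothetical.

```python
from pathlib import Path


def conversion_commands(xml_file, stylesheet, pdf_file):
    """Return the two commands of a standardized XML-to-PDF conversion.

    Step 1 (XSLT processor) turns the XML source into an FO document,
    step 2 (XSL-FO processor) renders the FO document as PDF.
    """
    fo_file = str(Path(pdf_file).with_suffix(".fo"))
    transform = ["xsltproc", "--output", fo_file, stylesheet, xml_file]
    render = ["fop", "-fo", fo_file, "-pdf", pdf_file]
    return [transform, render]


# Hypothetical file names for a catalogue job.
commands = conversion_commands("catalogue.xml", "catalogue.xsl", "catalogue.pdf")
```

Where both tools are installed, each command list could be passed to subprocess.run(command, check=True) in turn.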
With regard to the respective business model, the following workflows are of interest for a print service provider:
The customer delivers PDF
Conversion of the XML document prior to delivery to the print service provider is the most plausible case. The customer provides the PDF document and the print service provider does not face any changes in comparison with conventional workflows. On this basic level, every print service provider therefore also supports XML publishing workflows.
The customer delivers XML content and XSL style sheets
Conversion is performed by the print service provider in this case. The customer, however, remains responsible for generating the style sheet. This procedure is especially interesting if the print service provider offers specific services which are based on the quality or on the characteristics that the employed FO processor supports.
Special processing of bar codes or the embedding of special character sets can be offered, for example. The quality of FO processors primarily differs in the support of different XML namespaces, which can be embedded as instream foreign objects in XSL-FO. A print service provider can therefore offer first-rate support of XML vocabularies, such as XForms, SVG, MathML, etc., as an additional service.
The customer delivers XML content without XSL style sheet
In addition to the services in the previous case, the print service provider also produces the XSL style sheet when XML is delivered without it. Both the drafting of a style sheet, which is generally done by specialized designers, and the application of style sheets from an available XSL database have to be viewed as services in this context.
In practice, this procedure is interesting when the customer uses a generally known DTD (Open E-Book (OEB), DocBook, etc.). The print service provider can offer various XSL style sheets for the respective DTD. Ideally, the customer can test and choose the style sheets online via the Internet.
The customer delivers XSL-FO content
In practice, there will also be cases in which a customer who publishes content across media does part of the publishing process him- or herself. The customer can, for example, independently carry out the transformation of XML documents, the related publication of HTML documents, and also the transformation into XSL-FO. The formatting by means of an FO processor, however, is left to the print service provider, who is specialized in FO and PDF. As in the above case, the print service provider offers specific services on the basis of an extended XSL-FO processor.
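The four workflows differ only in how far the customer has already carried the document through the pipeline. This can be summarized in a small sketch; the stage labels and the function name provider_stages are assumptions introduced for illustration.

```python
# Pipeline stages in production order.
STAGES = ["write style sheet", "XSLT transformation", "FO formatting"]

# Number of stages the customer has already completed, per delivery type.
DONE_BY_CUSTOMER = {
    "XML": 0,        # content only, no style sheet
    "XML + XSL": 1,  # content and style sheet
    "XSL-FO": 2,     # already transformed into an FO document
    "PDF": 3,        # fully converted, conventional workflow
}


def provider_stages(delivery):
    """Return the stages left for the print service provider."""
    return STAGES[DONE_BY_CUSTOMER[delivery]:]


remaining_for_fo = provider_stages("XSL-FO")
```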
By now, XML and XSLT have proved themselves in many online publishing applications. Many of the HTML pages that can be accessed today are already generated by XSLT processors without the user noticing it. The spread of XSL-FO and the realization of XSL-FO workflows are still in their early stages. While generation of PDF documents is already being applied for electronic use, production of premium printed matter on the basis of XSL-FO is still an exception.
The possibilities for XSL-FO business models in prepress presented above and the wide range of products demonstrate, however, that there is also great potential for XSL-FO in the printing industry. Moreover, we can count on XML/XSL gaining in importance as an exchange format between customer and print service provider as FO processors are further optimized.
In addition, future spread of XSL-FO technologies certainly depends on the extent to which future XSL-FO extensions also serve "layout intensive" applications and to which degree support of printout onto paper in various XML standards is advanced. The SVG Printing Requirements (SVGP) are an important and interesting initiative in this context.
Considering that XSL-FO was adopted as a recommendation only in October 2001, it is too early to make any far-reaching predictions. An observable trend is, however, that many DTP vendors are taking up XSL-FO not only in their marketing but also in their development departments.
[Deach 2002] Stephen Deach: What Is XSL-FO and When Should I Use It? The Seybold Report, Analyzing Publishing Technologies, Vol. 2, No. 17, 2002.
[Holman 2002] Ken Holman: XSL-FO for Web Services, end-to-end book publishing. The Seybold Report, Analyzing Publishing Technologies, Vol. 2, No. 18, 2002.
[XSL] XSL - Extensible Stylesheet Language, W3C recommendation and information: http://www.w3.org/Style/XSL/
[SVG] SVG - Scalable Vector Graphics, W3C recommendation and information: http://www.w3.org/Graphics/SVG/Overview.htm8