XML Europe 2004

Using XSL-FO 1.1 for Business-Type Documents

Abstract

In addition to the powerful features available now, the upcoming XSL-FO 1.1 will bring several new ones. In the world of business-type documents, marketing material and forms, there is currently a need for end-of-page subtotals, multiple flows and easier page number citation, all of which will be possible with XSL-FO 1.1.

This presentation will cover the features of XSL-FO that are needed for this type of document. Formatting objects and properties of both XSL-FO 1.0 and 1.1 will be covered, as well as how to combine them to create a good-looking business-type document, because these types of documents need to have a perfect layout.


Table of Contents

1. Introduction
2. What are business-type documents?
2.1. XSL-FO 1.0 and business documents
2.2. OASIS UBL
3. Absolute positioning
4. Page masters
5. Tables
6. Table headers/footers
7. Lists
8. Conditions
9. Page numbers
10. Multiple flows
11. Other non-XSL-FO features
11.1. Bar codes & charts
11.2. Print specific properties
11.3. Envelope machine steering codes
12. XSL-FO Implementations
Biography

1. Introduction

Disclaimer: nothing that is said here about XSL-FO 1.1 is final yet. The XSL Working Group has published a working draft, but this is a work in progress. However, the XSL-FO 1.1 functionality discussed here is important to me because I work with business documents every day and I need these features to be able to create these documents.

Comments on the working draft are VERY welcome.

2. What are business-type documents?

For me, a book-type document is a document like a book. It has sections, tables, images, a table of contents, a back-of-the-book index. But you don't know until you look at the data in which order these things come.

The XML data will determine the order in which things appear in the document.

This uses the XSLT push strategy: the data is pushed to the XSLT stylesheet and the data determines which things will appear. It typically uses <xsl:template match="elems"> and <xsl:apply-templates>.
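As a minimal sketch of the push strategy described above, the stylesheet below only declares what to do with each element and lets the input decide the order. The element names section, title and para are hypothetical and stand in for whatever the book-type vocabulary actually uses.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Each rule fires wherever its element happens to occur in the input. -->
  <xsl:template match="section">
    <fo:block space-before="12pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="title">
    <fo:block font-weight="bold">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <xsl:template match="para">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

Only the formatting-object fragments are shown; a real stylesheet would also emit fo:root, a layout-master-set and a page-sequence around them.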

On the other hand, a business-type document is a document that is not a book-type document. Business-type documents are invoices, purchase orders, picking lists and even marketing material and forms.

Typically the structure of a business-type document, and also of its data, is fixed. Generally you know in advance which element must appear where. For example, the document always has a header with a company logo and the company name. It has an address at a fixed position, and then contains a paragraph. Then there is a table with 4 columns that can run over a dynamic number of pages. The table is followed by a paragraph and a signature. And there is a footer.

The document will decide which XML data occurs at which position in the document. It uses the XSLT pull mechanism: the stylesheet determines what data is retrieved from the XML. It uses a lot of <xsl:for-each> and <xsl:value-of>, and possibly <xsl:template name="tname"> and <xsl:call-template name="tname">.
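The pull mechanism can be sketched as follows. The input vocabulary (invoice, customer/name, lines/line, description, amount) is hypothetical, and the sketch assumes that every invoice has at least one line so that the table body is never empty.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- One template lays out the whole document and fetches data where needed.
       A real stylesheet would wrap this in fo:root, fo:page-sequence and fo:flow. -->
  <xsl:template match="/invoice">
    <fo:block font-weight="bold">
      <xsl:value-of select="customer/name"/>
    </fo:block>

    <fo:table table-layout="fixed" width="100%">
      <fo:table-column column-width="proportional-column-width(3)"/>
      <fo:table-column column-width="proportional-column-width(1)"/>
      <fo:table-body>
        <!-- The stylesheet, not the data, decides that lines become rows. -->
        <xsl:for-each select="lines/line">
          <fo:table-row>
            <fo:table-cell>
              <fo:block><xsl:value-of select="description"/></fo:block>
            </fo:table-cell>
            <fo:table-cell>
              <fo:block text-align="end"><xsl:value-of select="amount"/></fo:block>
            </fo:table-cell>
          </fo:table-row>
        </xsl:for-each>
      </fo:table-body>
    </fo:table>
  </xsl:template>

</xsl:stylesheet>
```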

2.1. XSL-FO 1.0 and business documents

XSL-FO 1.0 already has a lot of capabilities that allow you to create business-type documents.

2.2. OASIS UBL

OASIS has developed a standard library of schemas for XML business documents. Ken Holman has developed XSLT+XSL-FO stylesheets that can render UBL XML instances to XSL-FO and thus to PDF and other formats. They use XSL-FO 1.0, and they also show the power that XSL-FO 1.0 has right now.

Currently it can be found at http://www.cranesoftwrights.com/resources/ublss/

3. Absolute positioning

When an object is placed on a page, it can be positioned absolutely or relatively. Most objects are positioned relatively, which means that if the preceding objects grow, the relatively positioned objects will shift (in most languages they shift down).

In business-type documents, a typical example of something that is positioned relatively is a table that contains a dynamic number of rows, together with a paragraph following that table that says something like 'Thank you for paying your bills on time...'.

You might also want to use absolutely positioned objects, for example when creating a document that will be printed and inserted in an envelope with a window. Through this window you will see the address. The address must be at that specific location in the document, even if, for example, the company name that comes before the address is too long and wraps onto 2 lines.

Only fo:block-container can be placed absolutely, and this can be done by setting the absolute-position property to 'absolute' or 'fixed'. The value 'fixed' means that the object has a position relative to the page. The value 'absolute' means that the object has a position relative to the containing reference-area, typically another fo:block-container. This containing reference-area does not need to be positioned absolutely, which means you can position an object on a specific absolute location relative to another object that flows in the page. (This is a point where XSL-FO and CSS differ.)
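The two values can be illustrated with a short fragment. It is a sketch only: the coordinates, sizes and address text are invented for illustration and would have to be matched to the actual envelope window.

```xml
<fo:flow flow-name="xsl-region-body"
         xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- 'fixed': positioned relative to the page, so the address stays
       behind the envelope window whatever precedes it in the flow. -->
  <fo:block-container absolute-position="fixed"
                      left="25mm" top="50mm"
                      width="85mm" height="35mm">
    <fo:block>Example Customer Ltd.</fo:block>
    <fo:block>1 Example Street</fo:block>
    <fo:block>1000 Example City</fo:block>
  </fo:block-container>

  <!-- This container flows normally with the rest of the content... -->
  <fo:block-container>
    <fo:block>Order summary</fo:block>

    <!-- ...and 'absolute' positions the stamp relative to it. -->
    <fo:block-container absolute-position="absolute"
                        top="0mm" right="0mm"
                        width="30mm" height="10mm">
      <fo:block text-align="center">PAID</fo:block>
    </fo:block-container>
  </fo:block-container>

</fo:flow>
```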

Especially with absolute positioning, it can become a stressful job to get the values for all the sizes and positions right. It is extremely helpful to have a WYSIWYG design tool available to generate these values for you.

4. Page masters

In business documents, pages are generated in a certain sequence, typically:

  1. The odd pages (front side of page) contain the contents

  2. Even pages (back side of page) contain the conditions of sale

This can be achieved by making an fo:conditional-page-master-reference for the even pages that has no body, or a body so small that it can't contain any content, so that everything automatically wraps to the next (odd) page.

There is also another page layout that is very common in business-type documents (and that can co-exist in one document with the sequence above):

  1. The first page of a document contains a big header and a small footer

  2. The last page of a document contains a small header and a big footer

  3. Pages in between contain a small header and a small footer

  4. This also implies that if there is only one page, it has a big header and a big footer.

The first three requirements can already be met in XSL-FO 1.0 with an fo:conditional-page-master-reference using page-position="first", page-position="last" or page-position="rest". The fourth can be met with an fo:conditional-page-master-reference since XSL-FO 1.1, using page-position="only".
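The four cases above could be wired together roughly as follows. This is a sketch: the master names, region names and dimensions are made up for illustration, and page-position="only" follows XSL-FO 1.1, so it needs a formatter that already supports it.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Big header, small footer -->
  <fo:simple-page-master master-name="first-page" page-height="297mm" page-width="210mm">
    <fo:region-body margin-top="70mm" margin-bottom="20mm"/>
    <fo:region-before region-name="header-big" extent="60mm"/>
    <fo:region-after region-name="footer-small" extent="15mm"/>
  </fo:simple-page-master>

  <!-- Small header, big footer -->
  <fo:simple-page-master master-name="last-page" page-height="297mm" page-width="210mm">
    <fo:region-body margin-top="25mm" margin-bottom="60mm"/>
    <fo:region-before region-name="header-small" extent="20mm"/>
    <fo:region-after region-name="footer-big" extent="50mm"/>
  </fo:simple-page-master>

  <!-- Small header, small footer -->
  <fo:simple-page-master master-name="middle-page" page-height="297mm" page-width="210mm">
    <fo:region-body margin-top="25mm" margin-bottom="20mm"/>
    <fo:region-before region-name="header-small" extent="20mm"/>
    <fo:region-after region-name="footer-small" extent="15mm"/>
  </fo:simple-page-master>

  <!-- Big header, big footer: a document of exactly one page -->
  <fo:simple-page-master master-name="only-page" page-height="297mm" page-width="210mm">
    <fo:region-body margin-top="70mm" margin-bottom="60mm"/>
    <fo:region-before region-name="header-big" extent="60mm"/>
    <fo:region-after region-name="footer-big" extent="50mm"/>
  </fo:simple-page-master>

  <fo:page-sequence-master master-name="document">
    <fo:repeatable-page-master-alternatives>
      <!-- Alternatives are tried in order, so the most specific one comes first. -->
      <fo:conditional-page-master-reference master-reference="only-page"   page-position="only"/>
      <fo:conditional-page-master-reference master-reference="first-page"  page-position="first"/>
      <fo:conditional-page-master-reference master-reference="last-page"   page-position="last"/>
      <fo:conditional-page-master-reference master-reference="middle-page" page-position="rest"/>
    </fo:repeatable-page-master-alternatives>
  </fo:page-sequence-master>

</fo:layout-master-set>
```

A page-sequence would then use master-reference="document" and supply one fo:static-content for each of the four region names.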

5. Tables

XSL-FO supports very complex tables. You can create tables with an arbitrary number of rows and columns, specify borders and cell spanning, and nest tables.

However, you should try to avoid using tables for things that are not tabular. With some formatters you need to abuse tables because they don't support certain properties, and there is no other way to get the positions right.

An important feature of XSL-FO is border-collapse in tables. Border-collapse makes sure that two adjacent cells that both specify a border on their shared side end up drawing only one border. Without it, all borders inside a table appear twice as thick as the borders on its outer edge.
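A small fragment shows the effect described above. The cell contents and border widths are illustrative only.

```xml
<fo:table xmlns:fo="http://www.w3.org/1999/XSL/Format"
          table-layout="fixed" width="100%"
          border-collapse="collapse">
  <fo:table-column column-width="proportional-column-width(3)"/>
  <fo:table-column column-width="proportional-column-width(1)"/>
  <fo:table-body>
    <fo:table-row>
      <!-- Both cells ask for a border on their shared edge;
           with 'collapse' only one 0.5pt line is drawn there. -->
      <fo:table-cell border="0.5pt solid black" padding="2pt">
        <fo:block>Item A</fo:block>
      </fo:table-cell>
      <fo:table-cell border="0.5pt solid black" padding="2pt">
        <fo:block text-align="end">100.00</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>
```

With border-collapse="separate" the same cells would each draw their own border, which gives the doubled inner lines mentioned above.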

6. Table headers/footers

With XSL-FO 1.0 it was already possible to create table headers and footers that are repeated whenever a table is split over multiple pages. This can be done with the fo:table-header and fo:table-footer elements.

A typical requirement for business-type documents is that they need to have subtotals at the bottom of a table, on every page if that table is split. This was not possible with XSL-FO 1.0 because you didn't know at the time you generated the XSL-FO where exactly page breaks would fall.

In XSL-FO 1.0, markers allowed you to do something that is very close to that. Markers allow you to associate some content with an object, and you can then retrieve, for example, the marker that is tied to the last object on the page. You could only retrieve markers from within static-content. For example, for subtotals in the footer of the page (and not the table footer), you can attach a marker to every table row that contains the subtotal up to that row, and retrieve that marker in the footer of the page.
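The XSL-FO 1.0 approach could look roughly like this. It is a sketch: the marker class name "subtotal", the master name "document" and the amounts are invented, and the running totals are assumed to have been computed by the XSLT that generated the FO. The marker is placed in a table cell, where markers are allowed as initial children.

```xml
<fo:page-sequence master-reference="document"
                  xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Page footer: shows the marker of the last row that starts on this page. -->
  <fo:static-content flow-name="xsl-region-after">
    <fo:block text-align="end">
      Subtotal:
      <fo:retrieve-marker retrieve-class-name="subtotal"
                          retrieve-position="last-starting-within-page"
                          retrieve-boundary="page-sequence"/>
    </fo:block>
  </fo:static-content>

  <fo:flow flow-name="xsl-region-body">
    <fo:table table-layout="fixed" width="100%">
      <fo:table-column column-width="proportional-column-width(3)"/>
      <fo:table-column column-width="proportional-column-width(1)"/>
      <fo:table-body>
        <fo:table-row>
          <fo:table-cell>
            <!-- Running total up to and including this row -->
            <fo:marker marker-class-name="subtotal">100.00</fo:marker>
            <fo:block>Item A</fo:block>
          </fo:table-cell>
          <fo:table-cell><fo:block text-align="end">100.00</fo:block></fo:table-cell>
        </fo:table-row>
        <fo:table-row>
          <fo:table-cell>
            <fo:marker marker-class-name="subtotal">350.00</fo:marker>
            <fo:block>Item B</fo:block>
          </fo:table-cell>
          <fo:table-cell><fo:block text-align="end">250.00</fo:block></fo:table-cell>
        </fo:table-row>
      </fo:table-body>
    </fo:table>
  </fo:flow>

</fo:page-sequence>
```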

Since XSL-FO 1.1 it's possible to retrieve markers into the table header or footer. This enables you to create tables that contain subtotals, or tables that contain 'Continued on next page' captions.
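In XSL-FO 1.1 terms this might be written as below. The element and property names (fo:retrieve-table-marker, retrieve-position-within-table, retrieve-boundary-within-table) are those of the XSL 1.1 specification as it was eventually published; since the working draft is not final, they should be treated as an assumption and checked against the draft a given formatter implements. The marker class name and amounts are again invented.

```xml
<fo:table xmlns:fo="http://www.w3.org/1999/XSL/Format"
          table-layout="fixed" width="100%">
  <fo:table-column column-width="proportional-column-width(3)"/>
  <fo:table-column column-width="proportional-column-width(1)"/>

  <!-- The footer is repeated on every page the table is split over. -->
  <fo:table-footer>
    <fo:table-row>
      <fo:table-cell><fo:block>Subtotal</fo:block></fo:table-cell>
      <fo:table-cell>
        <fo:block text-align="end">
          <!-- Picks up the marker of the last row that ends on this page. -->
          <fo:retrieve-table-marker retrieve-class-name="subtotal"
                                    retrieve-position-within-table="last-ending"
                                    retrieve-boundary-within-table="page"/>
        </fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-footer>

  <fo:table-body>
    <fo:table-row>
      <fo:table-cell>
        <fo:marker marker-class-name="subtotal">100.00</fo:marker>
        <fo:block>Item A</fo:block>
      </fo:table-cell>
      <fo:table-cell><fo:block text-align="end">100.00</fo:block></fo:table-cell>
    </fo:table-row>
    <fo:table-row>
      <fo:table-cell>
        <fo:marker marker-class-name="subtotal">350.00</fo:marker>
        <fo:block>Item B</fo:block>
      </fo:table-cell>
      <fo:table-cell><fo:block text-align="end">250.00</fo:block></fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>
```

A 'Continued on next page' caption works the same way: every row except the last carries a marker with the caption text, and the last row carries an empty marker.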

As with all things, you can 'misuse' this concept even if you're not actually creating content in the form of a table. Just create a table with 1 row and 1 column, and without borders, and that way you can simulate having 'Continued...' captions on every object.

7. Lists

Lists are also supported by XSL-FO. It is a very basic list mechanism that resembles a table with 2 columns (one column for the list-item-label and one column for the list-item-body). A big difference from a table, however, is that you specify the width of the list labels (the first column) differently from just specifying the width of that column.
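The way the label width is specified can be seen in a short fragment. The distances and the item text are illustrative.

```xml
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
               provisional-distance-between-starts="10mm"
               provisional-label-separation="2mm">
  <fo:list-item>
    <!-- label-end() and body-start() are derived from the two
         provisional-* properties on the list-block, so the 'column'
         widths are set once for the whole list. -->
    <fo:list-item-label end-indent="label-end()">
      <fo:block>1.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Payment is due within 30 days.</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block>2.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Goods remain our property until paid in full.</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
```

In a stylesheet the label text would typically come from xsl:number for a numbered list, or be a fixed bullet character.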

Using XSLT you can create numbered lists, or just bulleted lists.

In the working group, we are looking at whether we could add a property that influences the placement of the list-item-label, so that it can be placed at the left side or the right side of the list-item-body (with the possible values 'inside' and 'outside', so that for recto-verso pages the position will alternate depending on whether it is on an even or an odd page).

8. Conditions

XSL-FO itself doesn't support conditions explicitly, for example to say 'make text red if amount is negative'. However, this is deliberately not available, because XSLT (or whatever other mechanism creates the XSL-FO) can evaluate these conditions.
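The 'red if negative' example could be handled in XSLT along these lines. The element name amount is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="amount">
    <fo:block text-align="end">
      <!-- The condition is evaluated by XSLT; the resulting FO
           simply carries color="red" or no color at all. -->
      <xsl:if test="number(.) &lt; 0">
        <xsl:attribute name="color">red</xsl:attribute>
      </xsl:if>
      <xsl:value-of select="."/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```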

9. Page numbers

In XSL-FO 1.0, it is possible to make references to other objects in the document. This can be used to create links to the object, or to retrieve the page number the object starts on, using the fo:page-number-citation.

In XSL-FO 1.1 it will be possible to retrieve the page number the object ends on as well, using fo:page-number-citation-last.

However, adding this extra formatting object was not enough. In some environments, you are creating the XSL-FO as one big stream, and you don't know when you are generating the last object in the document. You need to know the last object in the document when you want to know the total number of pages, for text such as 'Page 1 of 10'. You need to know that it is the last object at the moment you are generating it, to be able to set the property id="last-in-the-document" on it.

For example, you create an fo:page-sequence for every record in a database. In some cases, you don't know that you are processing the last record of the database until after you have already processed it. By then it is too late, because you cannot return to the previously created object.

To solve this problem, XSL-FO 1.1 will allow the id property to be set on fo:root as well. It will behave as if the id property is inherited by all the objects below it, such that you can retrieve the page number of the last object in the document with fo:page-number-citation-last.
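Put together, a 'Page X of Y' footer could be sketched as follows. The id values and page dimensions are invented, and the id on fo:root and fo:page-number-citation-last depend on XSL-FO 1.1 support in the formatter.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" id="whole-document">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm">
      <fo:region-body margin-top="20mm" margin-bottom="25mm"
                      margin-left="20mm" margin-right="20mm"/>
      <fo:region-after extent="15mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="page">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">
        <!-- No need to know which object is generated last:
             the citation refers to the id on fo:root. -->
        Page <fo:page-number/> of <fo:page-number-citation-last ref-id="whole-document"/>
      </fo:block>
    </fo:static-content>

    <fo:flow flow-name="xsl-region-body">
      <fo:block id="terms">Terms and conditions</fo:block>
      <!-- XSL-FO 1.0: the page on which the referenced object starts -->
      <fo:block>See the terms on page <fo:page-number-citation ref-id="terms"/>.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```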

XSL-FO 1.1 also adds the capability to create back-of-the-book indexes, with page range collapsing etc. However, this is not a general requirement for business-type documents.

10. Multiple flows

In XSL-FO 1.0, a page-sequence can only have one flow, and that flow is mapped to one region (mostly region-body).

In XSL-FO 1.1, it is possible to have several flows and multiple region-bodies. The content of one flow can be placed into one or multiple regions, and flows can be directed to output their content one after another in one or multiple regions as well. How the content is flowed is defined by a flow-map, at document level.
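A flow-map for two flows and two body regions might look like this. The element names follow the XSL 1.1 specification as eventually published and may differ in the working draft; the flow names, region names and dimensions are invented for illustration.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="two-regions"
                           page-height="297mm" page-width="210mm">
      <!-- Two body regions side by side, each with its own name -->
      <fo:region-body region-name="left"
                      margin-top="20mm" margin-bottom="20mm"
                      margin-left="20mm" margin-right="110mm"/>
      <fo:region-body region-name="right"
                      margin-top="20mm" margin-bottom="20mm"
                      margin-left="110mm" margin-right="20mm"/>
    </fo:simple-page-master>

    <!-- The flow-map says which flow goes into which region. -->
    <fo:flow-map flow-map-name="articles">
      <fo:flow-assignment>
        <fo:flow-source-list>
          <fo:flow-name-specifier flow-name-reference="article-a"/>
        </fo:flow-source-list>
        <fo:flow-target-list>
          <fo:region-name-specifier region-name-reference="left"/>
        </fo:flow-target-list>
      </fo:flow-assignment>
      <fo:flow-assignment>
        <fo:flow-source-list>
          <fo:flow-name-specifier flow-name-reference="article-b"/>
        </fo:flow-source-list>
        <fo:flow-target-list>
          <fo:region-name-specifier region-name-reference="right"/>
        </fo:flow-target-list>
      </fo:flow-assignment>
    </fo:flow-map>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="two-regions" flow-map-reference="articles">
    <fo:flow flow-name="article-a">
      <fo:block>First article...</fo:block>
    </fo:flow>
    <fo:flow flow-name="article-b">
      <fo:block>Second article...</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```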

A typical use case for this is a magazine, where you have several articles starting on the first page, and every article is continued on a later page.

This is generally not needed for invoices, purchase orders etc. However, marketing material can use this to accomplish much more complex page layouts. You can even use this to simulate text wrapping around images etc., by creating several regions: one that contains the image, and others that fill up the space around the image with rectangles. (Note that this only works with rectangular areas.)

Another use case can be to create a document that has multiple unequal columns. Just create a region-body for each column, and link them together.
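Linking the columns together could be expressed with a single flow assignment that lists several target regions. This is a sketch using the element names of the published XSL 1.1 specification; the names "main", "wide-column" and "narrow-column" are hypothetical and assume two region-body elements with those region names on the page master.

```xml
<fo:flow-map xmlns:fo="http://www.w3.org/1999/XSL/Format"
             flow-map-name="unequal-columns">
  <fo:flow-assignment>
    <fo:flow-source-list>
      <fo:flow-name-specifier flow-name-reference="main"/>
    </fo:flow-source-list>
    <!-- One flow, two targets: content fills the wide column first
         and then continues in the narrow one. -->
    <fo:flow-target-list>
      <fo:region-name-specifier region-name-reference="wide-column"/>
      <fo:region-name-specifier region-name-reference="narrow-column"/>
    </fo:flow-target-list>
  </fo:flow-assignment>
</fo:flow-map>
```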

11. Other non-XSL-FO features

Apart from the features that are built in to XSL-FO, there are of course other features that are not a part of XSL-FO, but can be used in combination with XSL-FO.

Some things are extensions to XSL-FO and could possibly some day end up in XSL-FO X.Y. Other things will not be a part of XSL-FO, and don't need to be a part of XSL-FO.

11.1. Bar codes & charts

Bar codes and charts already work with XSL-FO today, through the use of fo:instream-foreign-object. Inside that formatting object you can place, for example, an SVG document that represents the barcode or the chart.
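Embedding SVG in this way looks roughly as follows. The bars are arbitrary rectangles chosen for illustration and do not encode anything in a real barcode symbology.

```xml
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:instream-foreign-object content-width="40mm" content-height="15mm">
    <!-- The foreign object lives in its own namespace (here SVG). -->
    <svg xmlns="http://www.w3.org/2000/svg"
         width="40mm" height="15mm" viewBox="0 0 40 15">
      <rect x="0"  y="0" width="1" height="15"/>
      <rect x="2"  y="0" width="2" height="15"/>
      <rect x="5"  y="0" width="1" height="15"/>
      <rect x="8"  y="0" width="3" height="15"/>
      <rect x="12" y="0" width="1" height="15"/>
    </svg>
  </fo:instream-foreign-object>
</fo:block>
```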

Apart from SVG, it is also possible to include an XML document in a separate namespace that represents the barcode or the chart specifically. In that case, the XSL-FO formatter should recognize that XML namespace and convert it properly to the output format. For some output formats (like PDF) this will be converted into vector graphics. For other output formats that have native support for barcodes (like AFP for high-volume printers, or ZPL for label printers), it is possible to use the data stream's native format for these barcodes.

The same applies for charts, although I've not yet seen a data stream that has native support for charts, and so this chart XML will always be converted into vector graphics.

There are some implementations that use XSLT to generate the SVG code that represents a barcode. However, I don't think this is the best solution, because some barcodes involve very complicated computational algorithms, and XSLT is not meant to be a real programming language. If this has to be done in XSLT, I would prefer an XSLT extension, so that the computation can be done in a conventional programming language.

11.2. Print specific properties

If documents need to be printed on paper, you might need some settings such as duplex/simplex or input and output paper trays.

Some output streams, such as AFP, have even more concepts, like AFP overlays that define the 'background' of a page, but in a very specific manner. These things are not included in XSL-FO, and I don't expect them to be, because they are so specific to a certain application that they're hard to standardize. However, implementations can support XSL-FO extensions to accomplish this.

11.3. Envelope machine steering codes

In environments where printed letters are sent to customers, there is a need to put the letters in envelopes using machines. These machines need to recognize when they need to start a new envelope, and in most cases they also need a global page count that they use as a checksum to verify that no pages are lost or skipped.

It is not possible to create these types of documents with standard XSL-FO. Our company is in the process of helping our customers with XSL-FO extensions to create these envelope machine steering codes, and once we think we have covered enough cases to know all the requirements, we will work through the XSL working group to try to standardize these extensions.

12. XSL-FO Implementations

Apart from the XSL-FO features and functions, you of course need software that allows you to create the XSL-FO stylesheets and that can process these stylesheets to create the final output format.

I think it is very important to have a WYSIWYG designer available that can help you design and position complex objects like nested tables and absolutely positioned block-containers, and that allows you to preview the content interactively.

The formatter of course needs to be able to process these documents, and create the output format you want. If you create stylesheets that have to be used in a high-volume environment, you want to be able to create a 50,000 page document without problems. You probably also want to create multiple data streams like PDF to send files electronically, or AFP (Advanced Function Print) to send documents to a high-speed printer.

You have to look at the features you need, like border collapse, absolute positioning, etc. Also look out for upcoming XSL-FO 1.1 features that will be implemented once XSL-FO 1.1 has become a Candidate Recommendation (and make sure to review the Working Drafts and send your remarks to the xsl-editors list).

Biography

Klaas Bals studied Computer Science at the University of Antwerp, Belgium. He is a member of the XSL working group. He is working for Inventive Designers, where he's active in research and development.

At Inventive Designers, he is responsible for new technologies, including Inventive Designers' product called 'Scriptura', a document design and generation solution for XSL-FO, including the WYSIWYG designer.