XML Europe 2003

Solution for Multilingual Literature

Abstract

First, the considerable difficulties of formatting multilingual documents by computer are listed. Our aim is to find practical solutions to these difficulties as quickly as possible.

How to prepare data for multilingual formatting

1. Selection of character encoding

2. Selection of computer

3. How to enter data into the computer

4. Data expression using XML

5. Selection of editor software

How to format multilingual documents

1. Selection of formatting engine

2. Which fonts?

3. Specifying the layout with XSL-FO

How to print and output to PDF

The difference between PDF for production-quality document printing and PDF for the Web.

How to make a table of contents, and how to handle sort order across multiple languages for an index.

Second, the necessary background knowledge is explained.

Character and language

Computers must handle the characters used to write each language. Knowledge of the respective local character code sets is necessary. The principal languages are explained with a table.

Unicode

Unicode provides the capacity to encode every character of all known languages around the world.
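The contrast between local character code sets and Unicode can be illustrated with a minimal Python sketch. The sample words and the choice of legacy codecs (`shift_jis`, `tis_620`, `iso8859_6`) are assumptions made for illustration and do not come from the paper. The sketch checks which sample languages each encoding can represent.

```python
# Sample words in three scripts (illustrative choices, not from the paper).
SAMPLES = {
    "Japanese": "日本語",
    "Thai": "ภาษาไทย",
    "Arabic": "العربية",
}

# Legacy, language-specific encodings available in Python's codec registry.
LOCAL_CODECS = {
    "Japanese": "shift_jis",
    "Thai": "tis_620",
    "Arabic": "iso8859_6",
}


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def coverage_table() -> dict:
    """Map each codec name to the set of sample languages it can encode."""
    codecs = list(LOCAL_CODECS.values()) + ["utf_8"]
    return {
        codec: {lang for lang, text in SAMPLES.items() if can_encode(text, codec)}
        for codec in codecs
    }


if __name__ == "__main__":
    for codec, langs in coverage_table().items():
        print(f"{codec:10s} -> {sorted(langs)}")
```

With these samples, each local code set covers only its own language, while the Unicode encoding (UTF-8 here) covers all three in a single text stream.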

OS and character codes inside applications

This section explains which OS should be selected for multilingual formatting.

Font

When processing languages by computer, font technology is considered the most important infrastructure after the character code. The practical font families and the principal characters that each one covers are explained with a table.

PDF technology

PDF technology is another factor that promotes multilingual formatting. It is a medium that converts paper into a digital version.

XML and XSL Technology

XML

XML is the optimum method of expressing the contents of multilingual documents. In XML, a document file can be divided into many article files. Graphics are independent of the main document and are linked to it as external files. Using this mechanism, when creating a document, one can store the text portions in different languages in separate files. Finally, all these parts are integrated to form a complete document.
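The integration step described above might look like the following sketch, which uses Python's standard `xml.etree.ElementTree` module. The element names (`book`, `article`, `title`, `para`) and the sample texts are hypothetical, and in-memory strings stand in for the separate per-language files so that the example is self-contained. Each part is tagged with the standard `xml:lang` attribute before it is appended to the combined document.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG_ATTR = f"{{{XML_NS}}}lang"

# In practice each part would be a separate file; strings keep this runnable.
# Element names and texts are hypothetical.
PARTS = {
    "en": "<article><title>Introduction</title><para>Hello</para></article>",
    "ja": "<article><title>はじめに</title><para>こんにちは</para></article>",
    "ar": "<article><title>مقدمة</title><para>مرحبا</para></article>",
}


def assemble(parts: dict) -> ET.Element:
    """Combine per-language article fragments into one <book> element."""
    book = ET.Element("book")
    for lang, source in parts.items():
        article = ET.fromstring(source)
        article.set(LANG_ATTR, lang)  # serialized as xml:lang="..."
        book.append(article)
    return book


if __name__ == "__main__":
    print(ET.tostring(assemble(PARTS), encoding="unicode"))
```

Keeping the language on each article lets a later stylesheet select fonts or hyphenation rules per language, although how a given formatter uses `xml:lang` is outside the scope of this sketch.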

XSL

XSL is a specification designed for formatting XML on paged media. XSL has been designed to be compatible with the world's languages in the following respects:

Specifying font

Mixture of Japanese and Chinese

Mixture of multiple languages

Internationalization functions of XSL

Writing-mode can be specified. Unicode BIDI is adopted as fo:bidi-override in XSL, which is described later in the section 'Formatting example using Arabic document'.
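As a rough illustration of these two features, the sketch below builds a small XSL-FO fragment with Python's `xml.etree.ElementTree`. It uses the XSL properties `writing-mode`, `direction` and `unicode-bidi` and the `fo:bidi-override` formatting object. The helper names, the sample text and the idea of forcing a part number to run left to right are assumptions for illustration, and this is not necessarily how the paper's Arabic example is written.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag: str) -> str:
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def rtl_block(text: str, forced_ltr: str) -> ET.Element:
    """Build a right-to-left block whose tail run is forced left-to-right."""
    # writing-mode="rl-tb": inline direction right to left, blocks top to bottom.
    container = ET.Element(fo("block-container"), {"writing-mode": "rl-tb"})
    block = ET.SubElement(container, fo("block"))
    block.text = text
    # fo:bidi-override overrides the direction the Unicode bidi algorithm
    # would otherwise assign to the enclosed characters.
    override = ET.SubElement(
        block,
        fo("bidi-override"),
        {"direction": "ltr", "unicode-bidi": "bidi-override"},
    )
    override.text = forced_ltr
    return container


if __name__ == "__main__":
    fragment = rtl_block("رقم القطعة: ", "ABC-123")
    print(ET.tostring(fragment, encoding="unicode"))
```

The fragment would still need to sit inside a full `fo:root` with a page master and flow before a formatter could process it.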

XSL Formatter and Multilingual Formatting

The following is a brief explanation of how XSL Formatter deals with multilingual typesetting issues.

Glyph substitution

Line break position

Japanese Punctuation

Hyphenation

Justification and word spacing

Beautiful typesetting

Formatting example using Thai document

Formatting example using Arabic document

Formatting example of Multilingual Mixture document

Conclusion

There seems to be no product among current typesetting software that can process all the main languages of the world with a single version or edition. Our objective remains to improve XSL Formatter to the point where it achieves high-quality output suitable for publishing in all of the world's languages. We would appreciate any advice from experts.

Keywords


The full paper was not available at the time the proceedings were created. Please check the conference web site, http://www.xmleurope.com, to find an updated version of this paper.

Biography