Abstract
The era of the personal computer brought with it a proliferation of proprietary binary file formats. With the rise of the Internet came a need to break down the barriers between machines and between applications, resulting in the current trend toward open, Web-accessible publication formats, including XML grammars. There is therefore a pervasive need for software that converts binary file formats to XML.
Just as XML is at the root of this conversion problem, so is it a basis for a solution. By taking advantage of the XML meta-language at every stage of the data conversion process, one can maximize code re-use.
The author will present research aimed at facilitating and partially automating the process of creating binary-to-XML file format translators. The process consists of the following stages: (i) file format analysis, (ii) creation of a parser, (iii) mapping analysis, (iv) creation of a mapper, (v) serialization of the target format. Here is how XML processing software is leveraged throughout:
First, we define an XML grammar for binary file format schemas. Then we write special-purpose parser-generator software that reads in instances of such schemas. Next, we write a rapid file format analysis tool that allows the user to discover a file format's schema in an iterative process: at each step the user suggests a single schema modification, a new parser is generated, a series of test files is parsed, and the results are dumped as XML for inspection.
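The schema-driven approach described above can be sketched as follows. This is a minimal illustration, not the author's grammar or tool: the element names (format, field, repeat), the type names, and the function build_parser are all assumptions made for the example, and the sketch interprets the schema at run time instead of emitting parser source code.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical schema grammar: <field> reads one value, <repeat> reads a
# group of fields as many times as an earlier integer field says.
SCHEMA_XML = """
<format name="demo" byteorder="little">
  <field name="magic" type="bytes" length="4"/>
  <field name="version" type="uint16"/>
  <field name="count" type="uint16"/>
  <repeat name="points" count="count">
    <field name="x" type="int32"/>
    <field name="y" type="int32"/>
  </repeat>
</format>
"""

# Mapping from the (assumed) schema type names to struct format codes.
STRUCT_CODES = {"uint8": "B", "uint16": "H", "int32": "i", "uint32": "I"}


def build_parser(schema_xml):
    """Return a parse(data) function driven by the given schema instance."""
    schema = ET.fromstring(schema_xml)
    prefix = "<" if schema.get("byteorder", "little") == "little" else ">"

    def read_nodes(nodes, data, offset, parent, values):
        for node in nodes:
            name = node.get("name")
            if node.tag == "field":
                kind = node.get("type")
                if kind == "bytes":
                    size = int(node.get("length"))
                    raw = data[offset:offset + size]
                    if len(raw) != size:
                        raise ValueError("unexpected end of data in " + name)
                    value = raw.decode("ascii", errors="replace")
                else:
                    fmt = prefix + STRUCT_CODES[kind]
                    size = struct.calcsize(fmt)
                    (value,) = struct.unpack_from(fmt, data, offset)
                    values[name] = value  # remember integers for later counts
                ET.SubElement(parent, name).text = str(value)
                offset += size
            elif node.tag == "repeat":
                group = ET.SubElement(parent, name)
                for _ in range(values[node.get("count")]):
                    item = ET.SubElement(group, "item")
                    offset = read_nodes(list(node), data, offset, item, dict(values))
        return offset

    def parse(data):
        root = ET.Element(schema.get("name"))
        read_nodes(list(schema), data, 0, root, {})
        return root

    return parse


# Build a small test file in memory and dump the parse result as XML.
sample = b"DEMO" + struct.pack("<HH", 1, 2) + struct.pack("<iiii", 10, 20, -3, 4)
tree = build_parser(SCHEMA_XML)(sample)
print(ET.tostring(tree, encoding="unicode"))
```

In the iterative workflow, each schema modification would amount to editing SCHEMA_XML, rebuilding the parser, and re-inspecting the XML dump for the test files.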
The outcome of applying the rapid file format analysis tool to any particular binary format is a parser that handles that format. The outcome of applying this parser to any particular test file is an in-memory tree representation of that file, serializable as XML. The mapping stage therefore reduces to an XML-to-XML transformation. In simple cases this mapping can be done efficiently enough with XSLT; in more realistic cases the XSLT needs to be compiled for use with more traditional languages (e.g. compiled to Java bytecode using XSLTC).
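To illustrate why the mapping stage becomes a tree-to-tree transformation, here is a small sketch. It uses plain Python with ElementTree in place of XSLT, because the Python standard library has no XSLT processor; the source and target element names (demo, points, item, drawing, point) and the function map_tree are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed output of the parsing stage: a generic tree mirroring the file layout.
SOURCE_XML = (
    "<demo><version>1</version><points>"
    "<item><x>10</x><y>20</y></item>"
    "<item><x>-3</x><y>4</y></item>"
    "</points></demo>"
)


def map_tree(source):
    """Map the parser's tree onto a (hypothetical) target vocabulary."""
    target = ET.Element("drawing", {"version": source.findtext("version", default="")})
    for item in source.iterfind("points/item"):
        # Child elements in the source become attributes in the target.
        ET.SubElement(target, "point", {"x": item.findtext("x"), "y": item.findtext("y")})
    return target


result = map_tree(ET.fromstring(SOURCE_XML))
print(ET.tostring(result, encoding="unicode"))
```

The same rules could be written as XSLT templates matching points/item; the point of the sketch is only that no knowledge of the binary layout is needed at this stage.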
Because the source and target of the mapping share a common meta-language, it is possible to create a rapid mapping analysis tool that generates the mapper software. Again, the tool is used in an iterative process that culminates in the final mapping software.
As long as the target format is an open standard, one can take advantage of existing software to style, view, search, augment, edit or otherwise process that format. An important special case is the use of SVG (Scalable Vector Graphics) as a target. By translating from a binary format to SVG, one effectively has a viewer for that binary format without having to write any rendering software. The rendering stage is accomplished by standard browser or browser plug-in functionality. With the help of CSS, the view can be customized for user, purpose or device. With the help of SMIL Animation, the view can be dynamic. With the help of JavaScript and the DOM, the view can be interactive, with rich navigation, search or redlining functionality.
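As a rough sketch of the SVG special case, the following turns a parsed tree of coordinates into an SVG document that a browser can render directly. The input element names, the class name "outline", and the function to_svg are assumptions for the example; only the SVG namespace, the svg and polyline elements, and their attributes come from the SVG standard.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Assumed output of the parsing stage for a file containing three points.
SOURCE_XML = (
    "<demo><points>"
    "<item><x>10</x><y>20</y></item>"
    "<item><x>30</x><y>40</y></item>"
    "<item><x>50</x><y>10</y></item>"
    "</points></demo>"
)


def to_svg(source, width=100, height=100):
    """Build an SVG tree with one polyline through all parsed points."""
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(
        "{%s}svg" % SVG_NS,
        {"width": str(width), "height": str(height),
         "viewBox": "0 0 %d %d" % (width, height)},
    )
    coords = " ".join(
        "%s,%s" % (item.findtext("x"), item.findtext("y"))
        for item in source.iterfind("points/item")
    )
    # The class attribute gives a stylesheet a hook for restyling the view.
    ET.SubElement(
        svg, "{%s}polyline" % SVG_NS,
        {"points": coords, "class": "outline", "fill": "none", "stroke": "black"},
    )
    return svg


svg = to_svg(ET.fromstring(SOURCE_XML))
print(ET.tostring(svg, encoding="unicode"))
```

Saving the printed output to a .svg file and opening it in a browser displays the shape without any custom rendering code.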
Keywords