XML 2003

Rich Interactive Publications with XSLT and little else

Abstract

This paper introduces the Infonyte Reader, a configurable XML Browser for querying and presenting XML. It supports a simple XML vocabulary for configuring customized user interfaces and deploys an embedded native XML database to perform queries with XPath and transformations with XSLT. Thereby, interactive publications with rich search and navigation capabilities can be realized for arbitrarily structured and voluminous XML sources using only XML, XPath, and XSLT.

Table of Contents

1. Introduction
2. iReader Configuration
3. Architecture
4. Conclusions
Bibliography
Glossary
Biography

1. Introduction

The major added value of electronic publications as opposed to print publications is support for querying, for better navigation, and for customized presentation. The prevalent approaches to realize this added value are as follows: Either the material to be published is converted to some proprietary format, such as online help formats, or it is completely transformed to HTML or PDF, or it is fed into a database as a basis for a custom application.

The first approach limits search, navigation, and presentation to the capabilities anticipated by the proprietary format and the underlying implementation; more advanced capabilities require significant custom development. In addition, such formats often mix data-, storage-, and user-interface semantics, which further complicates production and maintenance.

The second approach restricts navigation to generated links, and supports, if at all, only unqualified text search. Moreover, the generation of multiple views on the same piece of information often results in severe data bloat. For example, the entire Java API documentation, which consumes about 10 MB in XML, consumes well over 100 MB in HTML due to the additional overviews generated for packages, class hierarchies, method summaries etc.

The third approach is the most flexible, and can in principle support comfortable navigation and search on large data volumes. However, it typically requires a fairly demanding technology mix consisting at least of a database, a webserver, and extensive application development.

This paper presents the Infonyte Reader (iReader for short) as a middle-ground approach that combines the strengths of these approaches while avoiding their deficiencies. Rather than being transformed, the source material is stored in an embedded XML database in its original structure. On this basis, interactive electronic publications are realized as follows. A simple XML vocabulary is used to configure a customized user interface, which consists of components for user input, viewers for several mime types, cursors, and tree navigation. The user interface components use XPath for querying the database and dynamically evaluated, parameterized XSLT stylesheets for more elaborate transformations and customized presentations. Thereby the flexibility and scalability of a database are combined with the simplicity and openness of XSLT transformations to realize functionally rich, self-contained electronic publications.

The remainder of this paper is organized as follows. In Section 2, “iReader Configuration”, we will introduce by way of example the XML vocabulary used for configuring iReader applications. In Section 3, “Architecture”, we will describe the underlying architecture. Finally, in Section 4, “Conclusions”, we will summarize and give an outlook on future work.

2. iReader Configuration

The user interface of an iReader application is configured by means of a small XSLT-like vocabulary consisting of about 25 elements. The configuration consists of two main constituents, views and actions.

Views specify the user interface components for navigation, search, and presentation and their layout. The most important kind of a view is the doc-view, which is used for input forms and for displaying query results. Depending on its mime-type, a doc-view can present html, pdf, svg, and xml. The cursor-view provides a cursor, which is used to split large query results into smaller chunks, and an interaction element, which is used to step over these chunks. The tree-view provides a configurable tree-explorer over XML data (see below). In addition to these content views, the components split-view and tab-view provide a means to organize multiple views. Compared to other XML vocabularies for describing user interfaces such as XUL [XUL], the iReader vocabulary is deliberately simpler and relies more on html forms for realizing simple user dialogues.

  <ir:split-view orientation="horizontal" divider-size="290">
    <ir:split-view orientation="vertical" divider-size="500">
      <ir:tab-view>
        <ir:tree-view name="package-view" label="Packages" border="on"/>
        <ir:tree-view name="class-hierarchy-view" label="Class Hierarchy" 
                            border="on"/>
      </ir:tab-view>
      <ir:split-view orientation="vertical" divider-size="30">
        <ir:cursor-view name="result-cursor" label="Result-Cursor"/>
        <ir:doc-view name="result-list" label="Result-List">
          <ir:input mime-type="text/html" href="documents/empty.html"/>
        </ir:doc-view>
      </ir:split-view>
    </ir:split-view>
    <ir:tab-view>
      <ir:doc-view name="doc-view" label="Documentation">
        <ir:input mime-type="text/html" href="documents/empty.html"/>
      </ir:doc-view>
      <ir:doc-view name="q1" label="Class Queries">
        <ir:input mime-type="text/html" href="documents/cform.html"/>
      </ir:doc-view>
      <ir:doc-view name="q2" label="Method Queries">
        <ir:input mime-type="text/html" href="documents/mform.html"/>
      </ir:doc-view>
      <ir:doc-view name="q3" label="Field Queries">
        <ir:input mime-type="text/html" href="documents/fform.html"/>
      </ir:doc-view>
    </ir:tab-view>
  </ir:split-view>

The example view specification above describes the user interface for the Java API doc browser displayed in Figure 1. The top split-view divides the interface into an overview part on the left-hand side and a query and display part on the right-hand side. The overview part consists of two tree-views for exploring the package hierarchy and the class hierarchy, and a cursor-view with an associated doc-view (result-list) for displaying query results. Three query forms for Java classes, methods, and fields are realized by means of doc-views, which are filled with static html forms. The display part is also realized as a doc-view, which is initialized with an empty document.
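To make the nesting of views concrete, the following Python sketch parses a shortened version of the view specification and prints its hierarchy. It is an illustration only and not part of the iReader: the namespace URI bound to the `ir` prefix is a placeholder (the paper does not state it), and the helper `outline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the real URI for the "ir" prefix is not given in the paper.
IR_NS = "urn:example:ireader"

# Shortened copy of the view specification (inputs and some doc-views omitted).
CONFIG = f"""
<ir:split-view xmlns:ir="{IR_NS}" orientation="horizontal" divider-size="290">
  <ir:split-view orientation="vertical" divider-size="500">
    <ir:tab-view>
      <ir:tree-view name="package-view" label="Packages" border="on"/>
      <ir:tree-view name="class-hierarchy-view" label="Class Hierarchy" border="on"/>
    </ir:tab-view>
    <ir:split-view orientation="vertical" divider-size="30">
      <ir:cursor-view name="result-cursor" label="Result-Cursor"/>
      <ir:doc-view name="result-list" label="Result-List"/>
    </ir:split-view>
  </ir:split-view>
  <ir:tab-view>
    <ir:doc-view name="doc-view" label="Documentation"/>
    <ir:doc-view name="q1" label="Class Queries"/>
  </ir:tab-view>
</ir:split-view>
"""


def outline(element, depth=0):
    """Return one indented line per view element, depth first."""
    kind = element.tag.split("}")[1]          # strip the "{namespace}" part
    name = element.get("name", "")
    lines = ["  " * depth + f"{kind} {name}".strip()]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CONFIG)
view_outline = outline(root)
print("\n".join(view_outline))

# Names of all doc-views, in document order.
doc_views = [e.get("name") for e in root.iter(f"{{{IR_NS}}}doc-view")]
print(doc_views)
```

The outline shows the two tree-views grouped under one tab-view and the cursor-view paired with the result-list doc-view, both inside the left half of the top split-view.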

Figure 1.  Java API Doc Browser

Actions specify the behaviour of views. For doc-views they are called by intercepting the http requests generated by the built-in browser plug-ins; for tree-views they are called through a built-in event handling mechanism. The syntax of actions is similar to the syntax of named templates in XSLT: an action has a name, a number of parameters, and a body. The body consists of a sequence of statements for variable assignment, simple XSLT-like control statements (for, case), and statements for updating and creating views.

  <ir:action name="search-classes">
    <ir:param name="package-search-string"/>
    <ir:param name="class-search-string"/>
    <ir:update-view>
      <ir:cursor-view name="result-cursor">
        <ir:cursor name="list" size="10" select="
            $classes[
              ($package-search-string='*' or 
               contains(parent::package/name, $package-search-string))
              and
              ((name and $class-search-string='*') or
               contains(name,$class-search-string))
            ]
        "/>
        <ir:update-view>
          <ir:doc-view name="result-list">
            <ir:call-xslt href="stylesheets/classlist.xsl" 
                              mime-type="text/html">
              <ir:input select="$javadoc"/>
              <ir:with-param name="list" select="$list"/>
            </ir:call-xslt>
          </ir:doc-view>
        </ir:update-view>
      </ir:cursor-view>
    </ir:update-view>
  </ir:action>

The example above implements the action for querying classes. The action gets two parameters, package-search-string and class-search-string, from the html form in cform.html (recall the definition of the doc-view q1). These parameters are used by an XPath query, which returns all classes with a matching package name and class name. The result of the XPath query is bound to the cursor variable $list, which is used as input parameter for the XSLT stylesheet classlist.xsl. The result of applying the stylesheet to the first 10 nodes in $list is used to update the doc-view result-list. Stepping forward in the cursor-view updates the doc-view with the next 10 hits.
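The logic of this action can be mimicked outside the iReader. The Python sketch below restates the XPath predicate as ordinary code and splits the hits into chunks the way a flat cursor would. It assumes a tiny made-up stand-in for the Java API XML; only the element names `package`, `class`, and `name` are taken from the XPath expression, and the functions `search_classes` and `chunks` are hypothetical helpers, not iReader APIs.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; only the element names follow the XPath in the action.
JAVADOC = ET.fromstring("""
<javadoc>
  <package><name>java.util</name>
    <class><name>HashMap</name></class>
    <class><name>TreeMap</name></class>
    <class><name>ArrayList</name></class>
  </package>
  <package><name>java.io</name>
    <class><name>File</name></class>
  </package>
</javadoc>
""")


def search_classes(doc, package_search, class_search):
    """Mirror the predicate of the search-classes action in plain Python."""
    hits = []
    for package in doc.iter("package"):
        package_name = package.findtext("name") or ""
        # ($package-search-string='*' or contains(parent::package/name, ...))
        if not (package_search == "*" or package_search in package_name):
            continue
        for cls in package.findall("class"):
            name = cls.findtext("name")
            # ((name and $class-search-string='*') or contains(name, ...))
            if (name is not None and class_search == "*") or class_search in (name or ""):
                hits.append(name or "")
    return hits


def chunks(items, size):
    """Yield successive slices of at most `size` items, like a flat cursor."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


hits = search_classes(JAVADOC, "java.util", "Map")
pages = list(chunks(hits, 10))   # cursor size 10, as in the action
print(pages)
```

In the iReader each chunk would be passed to classlist.xsl as the `list` parameter; here the chunks are simply collected in a list.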

While the cursor-view realizes flat cursors, tree-views can also realize nested cursors by superimposing a virtual tree on the underlying XML data. Such trees are also configured with a small XSLT-like vocabulary: A configuration consists of a number of templates with an optional match pattern and up to three subelements: The obligatory label element specifies an XPath expression for the label of a node, the expand element specifies the (virtual) children of a node, and the show element specifies the action or sequence of statements to be executed when a node is double clicked.

  <ir:template match="class[@type='interface']">
    <ir:label select="concat('(I) ',qname)"/>
    <ir:expand select=
        "//class[implements/type/@qname=current()/qname or 
                       extends/type[1][@qname=current()/qname]]"/>
    <ir:show>
      <ir:call-action name="show-class">
        <ir:with-param name="class" select="qname"/>
      </ir:call-action>
    </ir:show>
  </ir:template>
  <ir:template match="class[@type='class' or @type='exception']">
    <ir:label select="concat('(C) ',qname)"/>
    <ir:expand select="(//class[extends/type[1]/@qname=current()/qname]
        | $extern-types[
              following-sibling::type[1]/@qname=current()/qname])[1]"/>
    <ir:show>
      <ir:call-action name="show-class">
        <ir:with-param name="class" select="qname"/>
      </ir:call-action>
    </ir:show>
  </ir:template>

The above example shows two (out of five) templates for the tree-view used to explore the class hierarchy. The first template matches interfaces; the second matches classes and exceptions. The children calculated by the XPath expression in the expand element comprise all classes that implement or extend the current class.
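The idea of superimposing a virtual tree through match, label, and expand rules can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class records are invented, the `$extern-types` part of the second template is left out, and the tree is expanded eagerly, whereas a tree-view would expand a node only when the user opens it.

```python
# Hypothetical class records; "extends" and "implements" hold qualified names.
CLASSES = [
    {"qname": "java.util.Collection", "type": "interface",
     "extends": None, "implements": []},
    {"qname": "java.util.List", "type": "interface",
     "extends": "java.util.Collection", "implements": []},
    {"qname": "java.util.AbstractList", "type": "class",
     "extends": "java.lang.Object", "implements": ["java.util.List"]},
    {"qname": "java.util.ArrayList", "type": "class",
     "extends": "java.util.AbstractList", "implements": ["java.util.List"]},
]

# One entry per template: a match test, a label rule, and an expand rule.
TEMPLATES = [
    {
        "match": lambda c: c["type"] == "interface",
        "label": lambda c: "(I) " + c["qname"],
        "expand": lambda c: [d for d in CLASSES
                             if c["qname"] in d["implements"]
                             or d["extends"] == c["qname"]],
    },
    {
        "match": lambda c: c["type"] in ("class", "exception"),
        "label": lambda c: "(C) " + c["qname"],
        "expand": lambda c: [d for d in CLASSES if d["extends"] == c["qname"]],
    },
]


def template_for(node):
    """Return the first template whose match test accepts the node."""
    return next(t for t in TEMPLATES if t["match"](node))


def render(node, depth=0, seen=()):
    """Expand the virtual tree below `node` into indented label lines."""
    template = template_for(node)
    lines = ["  " * depth + template["label"](node)]
    if node["qname"] in seen:            # guard against cyclic data
        return lines
    for child in template["expand"](node):
        lines.extend(render(child, depth + 1, seen + (node["qname"],)))
    return lines


tree = render(CLASSES[0])
print("\n".join(tree))
```

Note that `java.util.ArrayList` appears twice in the output, once below `AbstractList` and once directly below `List`: the tree is virtual, so the same underlying node can show up wherever an expand rule selects it.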

Altogether, the Java API doc configuration consists of 13 views and 12 actions (250 lines of XML code), three XSLT stylesheets, and two tree-view configurations. The biggest stylesheet (1500 lines of code) was already available for the production of statically generated HTML and needed only minor modifications to substitute the static links with action calls. Thus the entire application could be developed within 2 person days. Because the application operates on the original XML sources and generates overviews for packages, class hierarchies, method summaries etc. only on demand, the overall size required is about 10 MB as opposed to the over 100 MB required for the statically generated HTML version.

Figure 2 and Figure 3 show a more elaborate iReader application. This application combines two thesauri, WordNet and its German counterpart GermaNet, a German/English dictionary, and the "semcor" version of the Brown text corpus, which is semantically marked up with concepts from the WordNet thesaurus. Together with auxiliary information for merging the dictionary with the thesauri, the underlying XML sources, which are stored in several document collections, comprise about 400 MB (600 MB in serialized form). A more detailed account of the techniques used for producing, enriching, and combining these XML sources is given in [FK03].

Figure 2.  Thesaurus Browser

The application consists of two main parts. The thesaurus browser in Figure 2 supports simple (prefix) search for English and German words, rather elaborate means for exploring the semantic neighborhood of concepts in the thesaurus, for switching between the English and the German thesaurus (where possible), and for retrieving texts that contain a given concept. Computing the semantic neighborhoods, which can comprise up to about 300 concepts out of over 100 000 concepts, requires two extensions to XSLT. An XPath extension function supports indexed access to concepts to quickly traverse references, and an XSLT extension element for simple updates (as a generalization of updatable variables) allows for efficient realization of transitive closures with fixpoint semantics, which is needed, because some of the semantic relationships, such as antonyms, are cyclic. With these extensions, the computation of a semantic neighborhood requires a few milliseconds on an average PC.
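Why cyclic relationships call for fixpoint semantics can be seen in a small Python sketch. It is not the XSLT extension itself: the dictionary stands in for indexed access to concepts, the growing set stands in for the updatable variable, and the concept data and the function `neighborhood` are invented for illustration.

```python
# Hypothetical concept index: concept id -> ids of directly related concepts.
# "hot" and "cold" reference each other, as antonyms do, so the data is cyclic.
RELATED = {
    "hot": ["cold", "warm"],
    "cold": ["hot", "cool"],
    "warm": ["tepid"],
    "cool": [],
    "tepid": [],
    "dry": ["wet"],
    "wet": ["dry"],
}


def neighborhood(start, related):
    """Return all concepts reachable from `start`, iterating to a fixpoint."""
    reached = {start}
    frontier = {start}
    while frontier:                       # fixpoint: stop when nothing new appears
        candidates = set()
        for concept in frontier:
            candidates.update(related.get(concept, ()))   # indexed lookup
        frontier = candidates - reached   # only follow concepts not seen before
        reached |= frontier
    return reached


print(sorted(neighborhood("hot", RELATED)))
```

A naive recursive traversal would loop forever between `hot` and `cold`; the iteration above terminates because each round only follows concepts that have not been reached yet.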

Figure 3.  Corpus Analysis

The corpus analysis part in Figure 3 utilizes the sense tagging of the Brown corpus to identify so-called lexical chains, which consist of semantically related words, for example, dictionary, dictionary_entry, entry, and glossary. In the input form, users can select the part of speech and the kinds of semantic relationships to be considered for establishing lexical chains, and choose between several views on the computed lexical chains; the tabular view in Figure 3 uses a column for each chain and a row for each sentence. The underlying stylesheet comprises about 1500 lines of code and also uses the extensions for indices and updates. A more detailed account of this part of the application is given in [TF03].
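The layout of the tabular view can be illustrated with a deliberately simplified Python sketch. It assumes that a chain is just a connected component of a given relatedness relation, which is much cruder than the analysis described in [TF03]; the sentences, the related pairs, and both helper functions are invented for illustration.

```python
def build_chains(words, related_pairs):
    """Group words into chains: connected components of the relatedness pairs."""
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path halving
            w = parent[w]
        return w

    for a, b in related_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    chains = {}
    for w in words:
        chains.setdefault(find(w), []).append(w)
    # Keep only chains with at least two words, in order of first appearance.
    return [c for c in chains.values() if len(c) > 1]


def chain_table(sentences, chains):
    """One row per sentence, one column per chain; a cell lists the chain words found."""
    return [[[w for w in sentence if w in chain] for chain in chains]
            for sentence in sentences]


# Invented sample data.
SENTENCES = [
    ["dictionary", "entry", "car"],
    ["glossary", "automobile"],
    ["dictionary_entry", "car"],
]
RELATED_PAIRS = [
    ("dictionary", "dictionary_entry"),
    ("dictionary_entry", "entry"),
    ("entry", "glossary"),
    ("car", "automobile"),
]

words = list(dict.fromkeys(w for s in SENTENCES for w in s))   # unique, in order
chains = build_chains(words, RELATED_PAIRS)
table = chain_table(SENTENCES, chains)
for row in table:
    print(row)
```

Each printed row corresponds to one sentence and each cell to one chain, which is the same arrangement as the tabular view described above.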

Even though this application is considerably more elaborate than the Java API doc browser, the iReader-specific portions (views, actions, and tree-view configurations) are not much more complex: 15 actions and about 30 views. Including the production of XML versions of all sources, the application was developed in about 30 person days. The complexity is mainly due to the more elaborate XSLT stylesheets, some of which essentially deploy XSLT as a general-purpose programming language. However, with the introduced extensions, the complexity has been largely manageable. One of the most sorely missed features has been user-defined functions for XPath, especially for realizing rather complex queries across several data sources in a reusable, modular way. Both XSLT 2.0 (<xsl:function>) [K03] and XQuery 1.0 [BC03] will support user-defined functions for XPath, whereby these difficulties can be overcome.

Several other applications for technical documentation, product catalogues, bibliographic databases, and financial reporting have been realized with the iReader. Especially when the XML sources and the main stylesheets for presentation were already available, these applications were developed in a few person days by specifying the user interface and adapting the stylesheets to support parameterized search and navigation.

3. Architecture

Figure 4 displays the layered architecture of the iReader. The presentation manager on the top consists of built-in components for managing split-views, tab-views, tree-views, and cursor-views, and plugged-in components for the several mime types supported by doc-views. Viewers for other mime types and additional special-purpose components, e.g., for browsing through indices or for more sophisticated input, can be plugged in via a small Java API. The iReader Core layer consists of several components for parsing and representing an iReader configuration, for handling events generated by the presentation components, and for executing actions. Events and variable states are logged to support undos. The two layers on the bottom provide for scalable XPath and XSLT processing on top of the native XML database system Infonyte DB. For small XML sources (< 2 MB) and simple queries and transformations, which can be reasonably performed in main memory, these two layers can in principle be substituted with any other XSLT processor. Larger data volumes and more elaborate queries, however, require the indexed access and persistent processing provided by an XML database system.

Figure 4.  iReader Architecture

In this architecture, a query is processed as follows: A view component generates an event with an action name and an arbitrary number of parameters in the form of key-value pairs. This event is dispatched by the iReader interpreter to trigger the action with the given name. The action body is executed by evaluating some stylesheets on the database and using their result to update some target view components.
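The processing steps can be traced in a toy Python dispatcher. It is a sketch under stated assumptions rather than the iReader's Java implementation: the class `Interpreter`, the layout of the request URL, and the handler are all hypothetical, and a plain string stands in for the result of evaluating a stylesheet on the database. The history list hints at how logged states can support undo.

```python
from urllib.parse import urlsplit, parse_qsl


class Interpreter:
    """Toy dispatcher: maps action names to handlers and updates named views."""

    def __init__(self):
        self.actions = {}     # action name -> callable(params) -> {view name: content}
        self.views = {}       # view name -> current content
        self.history = []     # earlier view states, logged to support undo

    def register(self, name, handler):
        self.actions[name] = handler

    def dispatch(self, name, params):
        """Run the named action and apply its result to the target views."""
        updates = self.actions[name](params)
        self.history.append(dict(self.views))
        self.views.update(updates)

    def dispatch_url(self, url):
        """Turn an intercepted request URL into an event (hypothetical URL layout)."""
        parts = urlsplit(url)
        name = parts.path.rsplit("/", 1)[-1]          # last path segment = action name
        self.dispatch(name, dict(parse_qsl(parts.query)))

    def undo(self):
        """Restore the view state logged before the most recent action."""
        if self.history:
            self.views = self.history.pop()


def search_classes(params):
    # Stand-in for "evaluate a stylesheet on the database".
    text = (f"classes matching {params['class-search-string']} "
            f"in {params['package-search-string']}")
    return {"result-list": text}


interpreter = Interpreter()
interpreter.register("search-classes", search_classes)
interpreter.dispatch_url(
    "http://localhost/search-classes"
    "?package-search-string=java.util&class-search-string=Map")
after_action = dict(interpreter.views)

interpreter.undo()
after_undo = dict(interpreter.views)
print(after_action, after_undo)
```

The event carries nothing but an action name and key-value pairs, so the view component that raised it does not need to know which views the action will update.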

The entire architecture is realized in Java and is accordingly platform independent. Together with the embedded database system and the persistent XPath and XSLT processor, the core implementation comprises about 2 MB of executable code and thus can reasonably be packaged with application data.

4. Conclusions

This paper has introduced the Infonyte Reader, an XML browser to rapidly produce interactive publications with rich search and navigation capabilities. An evaluation version is available for download [IR03].

Currently, we are working on a web-enabled version of the Infonyte Reader by generating an application for Apache Cocoon [AC] extended with support for scalable XSLT processing based on Infonyte DB. Moreover, we are designing extensions for supporting simple updates along the lines of [CF02] such that iReader applications can also be used for simple data maintenance. Finally, we are working on a compilation mode that compiles iReader applications into self-contained archive formats and thereby protects the XML sources, stylesheets, and application configurations.

Bibliography

[AC] The Apache Cocoon Project. http://cocoon.apache.org/

[BC03] Boag, S., Chamberlin, D., Fernandez, M., Florescu, D., Robie, J., and Simeon, J. (eds.): XQuery 1.0: An XML Query Language. W3C Working Draft, August 2003.

[CF02] Chamberlin, D., Florescu, D., Lehti, P., Melton, J., Robie, J., Rys, M., Simeon, J. (eds.): Updates for XQuery. W3C Working Draft 15, October 2002.

[IR03] Infonyte Reader. Available for download at http://www.infonyte.com

[K03] Kay, M. (ed.): XSL Transformations (XSLT) Version 2.0. W3C Working Draft, May 2003.

[FK03] Fankhauser, P., Klement, T.: XML for Data Warehousing - Chances and Challenges. Proc. of the 5th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2003, Prague, CR, Sept. 3-5, 2003. Springer Lecture Notes in Computer Science 2737, pp. 1-3.

[TF03] Teich, E., Fankhauser, P.: WordNet and Lexical Cohesion Analysis. Submitted for publication.

[XUL] Goodger, B., Hickson, I., Hyatt, D., Waterson, C.: XML User Interface Language (XUL) 1.0. 2001, available at http://www.mozilla.org/projects/xul/xul.html.

Glossary

IPSI

the Fraunhofer Integrated Publication and Information Systems Institute

Biography

Peter Fankhauser is co-founder and director of research of Infonyte GmbH, a spin-off of the Fraunhofer Integrated Publication and Information Systems Institute (IPSI) in Darmstadt (Germany), devoted to building scalable storage and processing solutions for XML. In addition, he is affiliated with IPSI as a senior researcher. He has been the technical leader in the development of tools for unified access to information retrieval systems, document structure recognition, and the integration of heterogeneous databases, and has lectured on databases and XML at the universities of Darmstadt, Duisburg (Germany), and Linz, Vienna (Austria). He is a member of the W3C XML Query working group, and co-editor of the W3C XML-Query Requirements, XML-Query Use Cases, and XML-Query Formal Semantics working drafts. He holds a PhD in computer science from the Technical University of Vienna (Austria).

Christoph Fitzner is a student in computer science at the Technical University of Darmstadt. He has been involved in several projects on scalable XML processing and is the chief architect of the Infonyte Reader presented in this paper.