XML 2003

Practical Applications of XQuery

Abstract

XQuery is the soon-to-be standard for querying XML data sources. It provides a host of features that allow precise and expressive definition of queries over XML documents and over data sources that can be viewed through equivalent XML models. Apart from being an effective query language, it can also be used for XML transformations. It is particularly useful where transformations serve more than presentation and carry other requirements, such as strict preservation of types (for example, transformations of business documents between systems that use incompatible formats). Given these applications, XQuery is likely to become an important technology for data and application integration. This session will introduce the audience to the XQuery language and discuss some real-world applications of XQuery in data and application integration, with demonstrations of such applications.


Table of Contents

1. Introduction
2. An XML Query language
3. Data Integration
4. Data Aggregation
5. Application Integration
6. Summary
Bibliography
Biography

1. Introduction

The XQuery specification will soon be a W3C candidate recommendation. Implementations with varying levels of conformance have been available for a while now and will hopefully demonstrate conformance and interoperability in the near future. It is time for developers to take a closer look at XQuery and see how they can exploit it in their applications. This paper attempts to give developers some food for thought as they investigate how to apply XQuery in their applications.

2. An XML Query language

XML data sources have dealt with the absence of a standardized query mechanism either by supporting non-standard query languages or by co-opting XPath for generic querying. Neither of these approaches is satisfactory: the former leads to vendor lock-in and the latter to reduced functionality. XQuery fills this gap by being the first XML query language to have significant XML query functionality and widespread vendor acceptance. With XML data sources such as XML databases (native or non-native) and XML content management systems, XQuery can be used by reporting applications to extract content from the data stores. The presentation at XML2003 will include a short tutorial on the XQuery query language. The W3C XML Query working group page [W3C XML Query] contains references to resources that provide a good introduction to the XQuery query language.

3. Data Integration

The semantics of the XML Query language are defined over an abstract data model. All XQuery expressions operate on instances of the data model and return instances of the data model. There is also little restriction on how implementations produce these data model instances. For example, a data model instance could be constructed from non-XML data sources, including relational databases, data in binary format, and log files. This provides a unique opportunity to create XQuery implementations that can query non-XML data sources. So, at least in theory, XQuery can be used to query any data source that is amenable to being represented in the XQuery/XPath data model.

One classification of data sources is that they are either queriable (relational databases, for instance) or non-queriable (log files, free-standing documents, web-accessible documents, etc.). XQuery-based data integration solutions take different approaches depending on the nature of the data source being integrated. If the source is queriable, data integration solutions typically take a query re-writing approach, in which an XQuery expression is translated into one or more queries based on the capabilities of the target data source(s), and the results are made available as XQuery data model instances. On the other hand, when the data source offers no query mechanism, a scheme is needed for converting the data from that source into an XQuery data model instance. Using these mechanisms, an XQuery engine can expose multiple data sources to applications through a consistent data model and query interface.
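To make the non-queriable case more concrete, the following is a minimal sketch in Python of a conversion scheme that turns a log file into a tree that can then be queried. It is only an illustration under stated assumptions: the log format, the `log`/`entry` element names, and the `log_to_tree` helper are hypothetical, and Python's ElementTree with its limited XPath subset merely stands in for an XQuery data model instance and query engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical log format, assumed for illustration:
# "<date> <time> <LEVEL> <free-text message>"
SAMPLE_LOG = """\
2003-12-08 10:15:02 INFO server started
2003-12-08 10:15:09 ERROR connection refused
2003-12-08 10:16:41 INFO request served
"""


def log_to_tree(text):
    """Convert each non-empty log line into an <entry> element under a <log> root."""
    root = ET.Element("log")
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split into at most four fields so the message keeps its spaces.
        date, time_of_day, level, message = line.split(" ", 3)
        entry = ET.SubElement(
            root, "entry", {"date": date, "time": time_of_day, "level": level}
        )
        entry.text = message
    return root


tree = log_to_tree(SAMPLE_LOG)

# Once the log is a tree, it can be queried like any other XML source.
errors = [e.text for e in tree.findall("./entry[@level='ERROR']")]
print(errors)
```

The point of the sketch is that the conversion is done once, by the integration layer; the application only ever sees a tree and a path-based query.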

The illustration below depicts the scenario of an XQuery query over a couple of relational database instances:

When the target data source is queriable, the data integration solution can re-write the query in the data source's native language and present the results as though they came from an XML data source.

Figure 1. XQuery over relational database
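The query re-writing approach for queriable sources can be sketched in Python as follows. This is a deliberately small illustration, not a description of any particular engine: the `customers` table, its rows, the `SCHEMA` whitelist, and the `query_as_xml` helper are all hypothetical, and the "re-writing" handles only a single equality selection rather than parsing real XQuery.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table and rows, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "Boston"), (2, "Globex", "Denver"), (3, "Initech", "Boston")],
)

# Table and column names cannot be bound as SQL parameters,
# so they are validated against this whitelist instead.
SCHEMA = {"customers": ("id", "name", "city")}


def query_as_xml(table, column, value):
    """Answer a selection such as
         for $r in customers/row where $r/city = "Boston" return $r
    by re-writing it as SQL and wrapping each result row as an element."""
    if table not in SCHEMA or column not in SCHEMA[table]:
        raise ValueError("unknown table or column")
    columns = SCHEMA[table]
    # ORDER BY 1 sorts on the first selected column to keep results stable.
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {column} = ? ORDER BY 1"
    root = ET.Element(table)
    for row in conn.execute(sql, (value,)):
        row_el = ET.SubElement(root, "row")
        for col_name, col_value in zip(columns, row):
            ET.SubElement(row_el, col_name).text = str(col_value)
    return root


result = query_as_xml("customers", "city", "Boston")
names = [r.findtext("name") for r in result.findall("row")]
print(names)
```

The filtering is pushed down to the database, so only matching rows are ever turned into elements; the caller still receives a tree, just as it would from an XML source.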

4. Data Aggregation

Data integration lets users of XQuery implementations view multiple data sources through the same data model, and it also enables a related capability: data aggregation. Information from disparate sources can be aggregated with XQuery as the aggregation mechanism. This eases the workload on developers by hiding the proprietary interfaces and query languages of individual data sources, allowing them to concentrate on higher-level issues.

The example that follows illustrates data aggregation using XQuery. In this scenario, we search a news archive for news items that match certain search criteria and, for each company mentioned in a matching article, retrieve the stock quote and public information about that company from web services. Data aggregated in this fashion can be made available to a portal user.

Retrieve information from multiple sources (news archives, stock quote web services, a public search engine), aggregate that information, and present it to a portal user.

Figure 2. Data aggregation using XQuery
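The shape of this aggregation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the archive contents, the element names, and the `get_quote` and `get_profile` functions, which are simple in-memory stand-ins for the stock quote and company information web services.

```python
import xml.etree.ElementTree as ET

# Hypothetical news archive; element names and contents are invented.
NEWS_ARCHIVE = ET.fromstring("""
<archive>
  <item><title>Chipmaker posts record profit</title><company>ACME</company></item>
  <item><title>Retailer opens new stores</title><company>GLOBEX</company></item>
  <item><title>Profit warning at retailer</title>
        <company>GLOBEX</company><company>INITECH</company></item>
</archive>
""")

# In-memory stand-ins for the two web services.
QUOTES = {"ACME": "12.50", "GLOBEX": "48.10", "INITECH": "7.25"}
PROFILES = {"ACME": "Semiconductors", "GLOBEX": "Retail", "INITECH": "Software"}


def get_quote(symbol):
    """Stand-in for a call to a stock quote web service."""
    return QUOTES[symbol]


def get_profile(symbol):
    """Stand-in for a call to a public company information service."""
    return PROFILES[symbol]


def aggregate(keyword):
    """Find matching news items and attach quote and profile data per company."""
    results = ET.Element("results")
    for item in NEWS_ARCHIVE.findall("item"):
        title = item.findtext("title")
        if keyword.lower() not in title.lower():
            continue
        story = ET.SubElement(results, "story", {"title": title})
        for company in item.findall("company"):
            symbol = company.text
            ET.SubElement(
                story,
                "company",
                {
                    "symbol": symbol,
                    "quote": get_quote(symbol),
                    "sector": get_profile(symbol),
                },
            )
    return results


portal_view = aggregate("profit")
print(ET.tostring(portal_view, encoding="unicode"))
```

In an XQuery engine the same nesting would be written as a single query, with the outer loop over news items and the inner loop over companies; the sketch only shows how the three sources end up combined in one result tree.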

5. Application Integration

Applications in a domain can speak multiple vocabularies: some XML-based, some not; some proprietary, some standard, and some quasi-standard. If these applications need to talk to each other, it is necessary to translate between these vocabularies. XQuery fits the bill, as it has many of the same transformation capabilities as XSLT and can take input from potentially disparate data sources. XQuery has very flexible mechanisms for constructing result trees (including computed constructors), which allow for easy structural transformations in a (sometimes) more concise syntax than XSLT. A simple example illustrates how transformations could be performed using XQuery. Consider a transformation for which the input is the document shown below:

          <?xml version="1.0" ?>
          <PO id="1002213">
              <item id="123" qty="10"/>
              <item id="329" qty="22"/>
              <item id="144" qty="15"/>
          </PO>

Here is a sample query that transforms this document in a simple way, by adding a little more structure and using different element types:

          (: Bind the root PO element of the input document :)
          let $po := doc("po.xml")/PO
          return
          <PurchaseOrder>
              <Order>{$po/@id}</Order>
              <LineItems>{
                  for $Item in $po/item
                  return
                      <Item>
                          <Id>{string($Item/@id)}</Id>
                          <Qty>{string($Item/@qty)}</Qty>
                      </Item>}
              </LineItems>
          </PurchaseOrder>

The result of the transformation would be as follows:

        <PurchaseOrder>
            <Order id="1002213"/>
            <LineItems>
                <Item>
                    <Id>123</Id>
                    <Qty>10</Qty>
                </Item>
                <Item>
                    <Id>329</Id>
                    <Qty>22</Qty>
                </Item>
                <Item>
                    <Id>144</Id>
                    <Qty>15</Qty>
                </Item>
            </LineItems>
        </PurchaseOrder>
        

A real-life requirement for transformation between vocabularies would be quite a bit more complicated than this example, but the transformation capabilities of XQuery should be adequate to address most such requirements.

6. Summary

XQuery is a query language for XML, but that is only part of the story. XQuery is defined over an abstract data model, instances of which can be constructed from multiple types of information sources. This gives XQuery engines the unique ability to act as a tool for integration and aggregation of information from those sources. In addition, XQuery's transformation capabilities can be applied to application integration.

Bibliography

[W3C XML Query] The W3C XML Query Working Group page (http://www.w3.org/XML/Query) contains links to the specifications and other resources for learning XQuery.

Biography

Srinivas Pandrangi is a software architect for Ipedo Inc, a leading vendor of XML data management solutions. At Ipedo, Srinivas leads the design and development of some of the key components of the Ipedo XML Information Hub including the XQuery engine, data integration and security tools. Srinivas also represents Ipedo in standards organizations such as W3C and IETF. Prior to Ipedo, Srinivas was with Critical Path, where he worked on PKI (Public Key Infrastructure) implementations and related standards.