Abstract
The XML Information Set (XML Infoset) defines an abstract data set and provides a consistent set of definitions for referring to the information in a well-formed XML document. Several application programming interfaces (APIs), such as the Document Object Model (DOM) and the Simple API for XML (SAX), provide a way to access and/or query the XML Infoset. The W3C XML Schema Recommendation defines the post-schema-validation infoset (PSVI), which extends the XML Infoset with augmentations that result from validation and/or assessment. For example, after XML Schema validation and/or assessment, the element information item defined in the XML Infoset has additional properties such as type definition and validity.
As XML Schema is widely adopted by the XML community and the industry, and as other W3C specifications build on the XML Schema PSVI (for example, the XQuery 1.0 and XPath 2.0 Data Model and DOM Level 3 Core), an API to access and query schema components and other augmentations of the PSVI becomes crucial.
IBM has developed such an API and contributed it to the Apache Xerces2 project. The API defines a way to access XML Schema components, such as complex types, simple types, and element declarations. It also defines interfaces to access the PSVI, either from a document instance when the DOM is used or in a streaming fashion when SAX is used.
The two main interfaces for accessing PSVI information are ElementPSVI and AttributePSVI, which expose augmentations such as [validity], [validation attempted], [type definition], and [schema default]. In an implementation that exposes element and attribute information items via in-memory structures, the ElementPSVI interface is implemented by the objects that represent element information items, and the AttributePSVI interface is implemented by the objects that represent attribute information items. With the W3C DOM, this means that ElementPSVI is implemented by the objects that implement the dom.Element interface and AttributePSVI is implemented by the objects that implement the dom.Attr interface.
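The ElementPSVI and AttributePSVI interfaces ship with Xerces2 rather than with the JDK, so the following is only a rough sketch of the same idea using a standard stand-in: the DOM Level 3 `Element.getSchemaTypeInfo()` method, which exposes just the name of the [type definition] and none of the other augmentations such as [validity]. It is not the Xerces API itself. The schema, the instance document, and the class and method names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DomTypeInfoSketch {
    // Hypothetical schema and instance, made up for this example only.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order' type='OrderType'/>"
      + "<xs:complexType name='OrderType'><xs:sequence>"
      + "<xs:element name='quantity' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:schema>";
    static final String XML = "<order><quantity>3</quantity></order>";

    // Parses XML with schema validation switched on and returns the name of
    // the schema type attached to the first element called elementName.
    static String typeNameOf(String elementName) throws Exception {
        Schema schema = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // schema validation needs namespaces
        factory.setSchema(schema);       // validate while building the DOM

        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        Element element =
            (Element) doc.getElementsByTagName(elementName).item(0);

        // DOM Level 3 TypeInfo: the type-definition slice of the PSVI.
        return element.getSchemaTypeInfo().getTypeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("order    -> " + typeNameOf("order"));
        System.out.println("quantity -> " + typeNameOf("quantity"));
    }
}
```

With Xerces2 on the classpath, the element node would instead be cast to ElementPSVI to reach the full set of augmentations; the standard `TypeInfo` view is narrower by design.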
To expose the PSVI via an API that provides a streaming document infoset, such as SAX or Xerces Native Interface (XNI), we defined the PSVIProvider interface to be implemented by the parser. The PSVIProvider interface essentially provides application developers with a way to retrieve the PSVI information from the parser using methods such as getElementPSVI().
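For readers without Xerces2 on the classpath, the pull-from-the-parser pattern behind PSVIProvider can be approximated with the standard JAXP `ValidatorHandler` and its `TypeInfoProvider`. This is a stand-in, not the PSVIProvider interface: it reports type information only, and only while a start-element or end-element callback is in progress. The schema, the instance document, and the class and method names below are assumptions made for the sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.TypeInfoProvider;
import javax.xml.validation.ValidatorHandler;
import org.w3c.dom.TypeInfo;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class StreamingTypeInfoSketch {
    // Hypothetical schema and instance, made up for this example only.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order' type='OrderType'/>"
      + "<xs:complexType name='OrderType'><xs:sequence>"
      + "<xs:element name='quantity' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:schema>";
    static final String XML = "<order><quantity>3</quantity></order>";

    // Streams XML through a validating handler and records, for each start
    // tag, "localName=typeName" as reported by the TypeInfoProvider.
    static List<String> collectTypeNames() throws Exception {
        Schema schema = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));

        ValidatorHandler validator = schema.newValidatorHandler();
        TypeInfoProvider provider = validator.getTypeInfoProvider();
        List<String> seen = new ArrayList<>();

        // The application handler sits downstream of the validator and asks
        // the provider about the element currently being reported.
        validator.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                TypeInfo type = provider.getElementTypeInfo();
                seen.add(localName + "="
                    + (type == null ? null : type.getTypeName()));
            }
        });

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(validator); // parser -> validator -> handler
        reader.parse(new InputSource(new StringReader(XML)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collectTypeNames());
    }
}
```

The design point carried over from the text is that the handler does not receive schema information as callback arguments; it queries a provider object that is only valid for the event being processed.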
This session will give an overview of this API, including the requirements it fulfills. Some use cases will also be presented; for example, Xalan (the Apache XSLT processor) uses this API to access the type information defined by the XQuery 1.0 and XPath 2.0 Data Model.
We will also examine how to use this API along with the DOM, and how some new DOM Level 3 functionality can be used to recompute the PSVI information in the DOM dynamically.