Abstract
This paper will provide an overview of the XML Schema API [XML Schema API] which defines interfaces to query the post-schema validation infoset (PSVI) including the XML Schema components.
When an XML document is validated against an XML Schema [XML Schema Part 1], the original XML Infoset [XML Information Set] is augmented. The infoset is extended with additional properties on certain information items: items that describe schema components (such as simple and complex types and element declarations), validity status, default values, and error codes, among others. The augmented infoset is called the post-schema-validation infoset (PSVI).
While developers can access an instance document's XML Infoset from the World Wide Web Consortium (W3C) Document Object Model (DOM) API [Document Object Model] and partially from SAX [SAX], there is no standard API for accessing the PSVI. Many developers have expressed that such an API is needed to be able to query XML Schema components. This API would help developers write applications such as:
Editors that allow editing of an XML instance document based on a schema
Tools that examine and compare schemas
Implementations of W3C specifications such as XSLT 2.0 and WSDL
Applications in which the XML data needs to be mapped to actual types (integers, floats, objects, etc.) as opposed to their XML textual form
This paper provides an overview of the XML Schema API [XML Schema API], a read-only API that enables users to query the PSVI. In particular, by using the API, users can examine the XML Schema components (such as complex types, simple types, and element declarations) and the post-schema-validation infoset items. In addition, the API defines a set of interfaces that provide access to the PSVI from DOM or SAX, and an interface for loading XML Schema documents.
This paper begins with the description of the XML Infoset and post-schema-validation infoset information items. Then it introduces the XML Schema API starting from the description of the set of interfaces for accessing the post-schema-validation information items from DOM and SAX. The paper continues to describe the set of interfaces that represent the XML Schema components. Finally, it describes how XML Schema documents can be loaded independently of an instance.
An understanding of the PSVI requires a basic understanding of the XML Infoset. The XML Information Set, or infoset, is an abstract description of the information present in an XML document. In other words, the infoset defines a common vocabulary for the information in an XML document. For simplicity, the information is described as a tree; however there is no requirement that implementations provide the document infoset using a tree structure. For example, the document infoset could be provided using a streaming model (e.g. SAX) where events received by an application contain some properties of the XML Infoset.
The infoset is defined for a well-formed document that also conforms to the Namespaces in XML specification [Namespace in XML]. A document’s infoset consists of a number of information items, each of which describes some part of the XML document. Each information item has a number of associated properties, which are denoted in square brackets.
The infoset for a document will contain a single document information item, as well as several other information items. The properties of a document information item provide information about the document as a whole, such as:
The version and character encoding of the document, denoted by the [version] and [character encoding scheme] properties.
The element information item for the document element, provided in the [document element] property.
An ordered list of child information items, including the information item for the document element itself, as well as information items for processing instructions and comments appearing outside of the document element; these are included in the [children] property.
Information items for the notations and unparsed entities declared in the document’s DTD; the [notations] and [unparsed entities] properties provide this information.
There is an element information item for each element in the XML document. We have already seen that one such element information item is the [document element]. Each element information item has properties describing information about the element. Some examples are:
The [namespace name] and [local name] properties provide the namespace name and local part of the element name.
The [children] property contains an ordered list of children for the element, including element, processing instruction, character, comment, and unexpanded entity reference information items.
The [attributes] property contains an unordered list of attribute information items, each corresponding to one of the attributes for this element. This includes attributes included directly in the document, and those defaulted in by the DTD. Attributes corresponding to namespace declarations are included in the [namespace attributes] property.
The [parent] property points to the information item that contains this element.
Similarly, attribute information items provide information about attributes in the document’s infoset. As with element information items, the properties for an attribute contain information about the namespace and local name of the attribute, in addition to information specific to attributes, such as:
The [normalized value] of the attribute.
The [attribute type], which is the DTD-declared type for the attribute.
The element information item that owns this attribute, as provided in the [owner element] property.
As we have seen already, one of the possible [children] of an element information item is a character information item. A character information item is provided for each data character that appears in an XML document, and has the following properties:
[character code], which is the ISO 10646 code for the character.
[element content whitespace], which indicates whether the whitespace is within element content.
[parent], which is the owning element information item.
Other information items include: processing instruction information items, unexpanded entity reference information items, comment information items, unparsed entity information items, notation information items, namespace information items and the document type declaration information item. See the “XML Information Set (Second Edition)” [XML Information Set] for more details about these items.
Consider the following XML instance document:
<p:person xmlns:p="http://www.example.com/people">
<firstName>Sue</firstName>
<lastName>Smith</lastName>
<dateofbirth>1960-10-23</dateofbirth>
</p:person>
Figure 1. The Person document, person.xml
The following shows some of the information items and selected properties for this example.
| Information item | Property name | Property value |
|---|---|---|
| Document info item | [version] | "1.0" |
| | [character encoding scheme] | "UTF-8" |
| | [document element] | Element info item [1] |
| | [children] | Element info item [1] |
| | ... | ... |
| Element info item [1] | [namespace name] | "http://www.example.com/people" |
| | [local name] | "person" |
| | [namespace attributes] | Attribute info item [1] |
| | [children] | Element info item [2], Element info item [3], Element info item [4], and some character info items for whitespace (not shown here) |
| | ... | ... |
| Attribute info item [1] | [namespace name] | "http://www.w3.org/2000/xmlns/" |
| | [local name] | "p" |
| | [prefix] | "xmlns" |
| | [normalized value] | "http://www.example.com/people" |
| | [attribute type] | CDATA |
| | [owner element] | Element info item [1] |
| | ... | ... |
| Element info item [2] | [namespace name] | No value |
| | [local name] | "firstName" |
| | [children] | Character info items [1], [2] and [3] |
| | ... | ... |
| Element info item [3] | [namespace name] | No value |
| | [local name] | "lastName" |
| | ... | ... |
| Character info item [1] | [character code] | "S" |
| Character info item [2] | [character code] | "u" |
| Character info item [3] | [character code] | "e" |
| ... | ... | ... |
Table 1.
The infoset associated with an XML document may be retrieved using the W3C Document Object Model API [Document Object Model]. The most recent DOM Level 3 Recommendation has added several new methods for the XML Infoset information items ([Discover key features of DOM Level 3 Core]) missing from DOM Level 2. For example, you can retrieve an [attribute type] property on the attribute information item using the schemaTypeInfo attribute of the Attr interface.
To retrieve the properties of the document information item, use the Document interface, as shown in the figure below:
Document doc;
...
// get the document element
Element documentElement = doc.getDocumentElement();
// get the character encoding scheme
String characterEncoding = doc.getXmlEncoding();
// get the version of the document
String version = doc.getXmlVersion();
...
Figure 2. Accessing the XML Infoset properties via DOM
It is also possible to access the information for an element using the Element interface defined in DOM, and the data in an attribute information item using the Attr interface. For more information on these and other interfaces that provide access to the infoset in DOM, see Appendix C of the DOM Level 3 Core specification [Document Object Model], which provides the mappings between the XML Infoset model and the DOM.
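As a minimal sketch of this mapping, the following Java program reads a few infoset properties of the Figure 1 document through the standard DOM interfaces. It assumes only the JAXP parser bundled with the JDK; the class and method names (InfosetViaDom, parsePerson) are illustrative and are not part of any API discussed in this paper.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class InfosetViaDom {
    // The instance document from Figure 1, inlined so the sketch is self-contained.
    static final String PERSON_XML =
        "<p:person xmlns:p=\"http://www.example.com/people\">"
        + "<firstName>Sue</firstName>"
        + "<lastName>Smith</lastName>"
        + "<dateofbirth>1960-10-23</dateofbirth>"
        + "</p:person>";

    // Parses the document with namespace support and returns the [document element].
    static Element parsePerson() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(PERSON_XML)))
                      .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element person = parsePerson();
        // [namespace name] and [local name] of the element information item
        System.out.println(person.getNamespaceURI());
        System.out.println(person.getLocalName());
        // The xmlns:p declaration, a member of [namespace attributes]
        Attr decl = person.getAttributeNodeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "p");
        System.out.println(decl.getPrefix() + ":" + decl.getLocalName() + " = " + decl.getValue());
        // First child element: its [namespace name] has no value, which DOM reports as null
        Element firstName = (Element) person.getFirstChild();
        System.out.println(firstName.getLocalName() + " in namespace " + firstName.getNamespaceURI());
    }
}
```

Note that DOM exposes namespace declarations through the ordinary attribute list, whereas the infoset separates them into [namespace attributes].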
An XML Schema consists of components, such as type definitions and element declarations, which can be used to assess the validity of the element and attribute information items covered in Section 2. In the process of verifying that element and attribute information items conform to the schema, a processor obtains access to additional information about such elements and attributes. The additional information is used to augment the original infoset of the document; this augmented infoset is called the post-schema-validation infoset, or PSVI.
During the assessment of an element information item, it is possible for a completely new attribute information item to be added to the infoset for the element. This is the case when an attribute default value is provided in the schema and no attribute exists in the infoset for the element.
For both element and attribute information items that are assessed, the PSVI contains properties that describe the outcome of the assessment. These properties are as follows:
[validation context]: names the ancestor element information item at which assessment began
[validity]: describes the outcome of validity assessment (valid, invalid, or not known)
[validation attempted]: indicates whether full, partial or no validation was attempted
This information is useful because it reports both the validity of an attribute or element and the extent of the validation that was performed.
If an element or attribute information item is assessed as invalid, it is also possible to determine the error, or errors, which caused the invalid outcome. The [schema error code] property provides a list of error codes.
If an element or attribute information item is assessed as valid, then the PSVI may contain, at processor option, the element or attribute declaration used for assessment. In the case of element information items, this is the [element declaration] property; for attributes, it is the [attribute declaration] property. If the declaration contains a [value constraint] (i.e. a fixed or default value), then the PSVI will include the [schema default] property, which is the canonical form for the value.
In addition, the PSVI will contain the property [schema normalized value], which is the normalized value of the information item (as validated); for element information items, this property will only appear if the element’s type is simple. The [schema specified] property indicates whether the [schema normalized value] was derived from a default value in the schema or not.
The PSVI also provides information about the type definition used for validation of the element or attribute information item; two alternative sets of properties may be included in the PSVI, depending on whether the schema processor is light-weight or not. The first set (which is the more informative) consists of the following properties:
[type definition]: for attributes, this is the type definition referred to by its attribute declaration; for elements, this is the type definition against which the element was assessed.
[member type definition]: if [type definition] is a union type, then this property provides the actual member type definition that validated the item.
Element information items have additional properties in the PSVI. If identity constraints apply to a particular element, then the PSVI may contain, at processor option, an identity constraint table containing information about each eligible identity constraint. In addition, if an element information item contains an attribute information item that is valid with respect to a NOTATION, then additional PSVI properties providing information about the NOTATION declaration are included in that element’s PSVI. See “XML Schema Part 1: Structures” [XML Schema Part 1] for more details.
Finally, the element information item at which assessment began (i.e. the validation root) has additional properties that provide information that is global to the schema assessment. For example, the PSVI may be augmented, at processor option, with an ID/IDREF table, providing ID/IDREF binding information. There are additional properties that provide information about the resulting schema. For example, for each target namespace in the schema, the following information is included:
[schema namespace] (the namespace)
[schema components] (the schema components in the namespace)
[schema documents] (a possibly empty set of schema document information items)
Therefore, it is possible for an application to obtain complete information about all of the components in the schema. This is useful for applications such as query processors that must be able to access information about type hierarchy.
Consider the following schema, which can be used to validate the person.xml described in Figure 1:
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:p="http://www.example.com/people"
targetNamespace="http://www.example.com/people">
<complexType name="personType">
<sequence>
<element name="firstName" type="string"/>
<element name="lastName" type="string"/>
<element name="dateofbirth" type="date"/>
</sequence>
</complexType>
<element name="person" type="p:personType"/>
</schema>
Figure 3. The Person schema, person.xsd
The following shows a portion of the augmentation to the “firstName” element information item (using the example from Figure 1).
| PSVI Property name | PSVI Property value |
|---|---|
| [validity] | "valid" |
| [validation context] | Element info item [1] (i.e. "person") |
| [validation attempted] | "full" |
| [element declaration] | An item isomorphic to the element declaration component for "firstName" |
| [type definition] | An item isomorphic to the XML Schema type "string" |
| [schema normalized value] | "Sue" |
| ... | ... |
Table 2.
To understand the design, it is important to understand the requirements and goals of this API.
The basic requirement of the XML Schema API was to enable users to query the PSVI (i.e. to examine the XML Schema components and query the post-schema-validation infoset items). To satisfy this requirement, the API needs to cover all of the XML Schema components and be isomorphic to the schema components to the greatest possible extent. While this requirement has constrained the design, it also facilitates learning the API for someone who is familiar with the definition of the PSVI.
Designers of the XML Schema API also wanted to encourage multiple implementations. Therefore, the design could not impose any concrete implementation, but needed to allow one to implement the API in a layer that sits on top of applications with different class hierarchies. In this case, the layer provides the necessary glue between the API and the existing internal structure that can be different from the view one gets of it through the API. To satisfy this requirement, we use interfaces rather than classes. In general, interfaces provide a simplified view on what could be a complex internal representation of the data and hide the actual implementation.
Two further requirements that shaped the XML Schema API are language and platform independence. The API is independent of the programming language being used and independent of the actual implementation and the system that it runs on.
These requirements have also affected the design in several ways. For instance, as with the DOM API, this API could not rely on the Java™ instanceof operator to query the run-time type of an object. Instead, the base interface of the schema component model, XSObject, defines a method that reports the type of the component.
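The idea of querying a type code rather than using instanceof can be sketched as follows. This is an illustration only: the SchemaObject interface, the Component class, and the numeric constants are hypothetical stand-ins, not the definitions from the XML Schema API (which places its constants in the XSConstants interface).

```java
final class TypeCodeSketch {
    // Hypothetical type codes for this sketch only; not the values used by the API.
    static final short ELEMENT_DECLARATION = 1;
    static final short TYPE_DEFINITION = 2;

    // Stand-in for the base interface: every component reports its own kind.
    interface SchemaObject {
        short getType();
        String getName();
    }

    // Simple immutable component used to exercise the dispatch below.
    static final class Component implements SchemaObject {
        private final short type;
        private final String name;

        Component(short type, String name) {
            this.type = type;
            this.name = name;
        }

        public short getType() { return type; }
        public String getName() { return name; }
    }

    // Dispatches on the numeric type code instead of a language-specific type test.
    static String describe(SchemaObject object) {
        switch (object.getType()) {
            case ELEMENT_DECLARATION:
                return "element declaration " + object.getName();
            case TYPE_DEFINITION:
                return "type definition " + object.getName();
            default:
                return "unknown component " + object.getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Component(ELEMENT_DECLARATION, "person")));
        System.out.println(describe(new Component(TYPE_DEFINITION, "personType")));
    }
}
```

Because the dispatch relies only on a short value, the same pattern carries over to language bindings that have no run-time type test.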
Another example of a trade-off, given this constraint, is that the API could not use any of the available libraries for containers and lists. The API defines a specialized collections framework to retrieve and query objects of the types defined in the specification. For example, the StringList interface is an immutable collection of string objects.
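A read-only, index-based list of this kind might look like the sketch below. The method names (getLength, item, contains) follow the DOM-style convention and are an assumption here; consult the XML Schema API specification for the actual StringList signatures. The array-backed class is purely illustrative.

```java
import java.util.Arrays;

final class StringListSketch {
    // Stand-in for an immutable string collection defined without java.util types.
    interface StringList {
        int getLength();
        String item(int index);        // null when the index is out of range
        boolean contains(String item);
    }

    // Array-backed implementation; the defensive copy keeps the list immutable.
    static final class ArrayStringList implements StringList {
        private final String[] items;

        ArrayStringList(String... items) {
            this.items = items.clone();
        }

        public int getLength() { return items.length; }

        public String item(int index) {
            return (index < 0 || index >= items.length) ? null : items[index];
        }

        public boolean contains(String item) {
            return Arrays.asList(items).contains(item);
        }
    }

    public static void main(String[] args) {
        StringList states = new ArrayStringList("NY", "CA", "IL");
        for (int i = 0; i < states.getLength(); i++) {
            System.out.println(states.item(i));
        }
    }
}
```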
Despite the constraints these requirements have placed on the design, the design allowed this API to be implemented in both the C++ and Java Apache Xerces parsers.
The XML Schema API defines the ElementPSVI and AttributePSVI interfaces that allow users to access the PSVI augmentations for element and attribute information items in tree models, such as DOM, and streaming models, such as SAX.
The ElementPSVI and AttributePSVI represent the PSVI items for element and attribute information items respectively. The common properties of the element and attribute post-schema-validation infoset items, such as [schema default], [schema error code], [schema normalized value], [schema specified], and [type definition], are available on the ItemPSVI interface. Both the ElementPSVI and AttributePSVI interfaces extend from the ItemPSVI interface.
In implementations that represent the XML document infoset via some memory structures (e.g. DOM) and support the XML Schema API, the objects that represent element information items should also implement ElementPSVI. Objects that represent attribute information items should also implement AttributePSVI. For example, in a DOM implementation that supports this API, the objects that implement dom.Element also implement ElementPSVI. The objects that implement dom.Attr also implement AttributePSVI.
In implementations that provide a streaming XML document infoset and support the XML Schema API, a parser object should also implement the PSVIProvider interface. The PSVIProvider interface defines methods to retrieve the PSVI augmentations for elements (ElementPSVI) and for attributes (AttributePSVI). These methods must be called within the scope of the document handler's startElement and endElement methods.
Let us take a closer look at how you can retrieve PSVI from DOM and SAX.
In general, there are two types of applications working with the DOM that would want to retrieve PSVI: applications that build the DOM tree from scratch in memory and applications that parse an existing XML document and modify it in memory.
Developers of the first type of application first need to retrieve an implementation that supports the XML Schema API. It is better to avoid writing implementation-dependent code. Instead, use the DOM Level 3 Bootstrapping [Document Object Model] mechanism to retrieve a DOM implementation that supports the XML Schema API, as shown in the figure below:
System.setProperty(DOMImplementationRegistry.PROPERTY,
"org.apache.xerces.dom.DOMXSImplementationSourceImpl");
DOMImplementationRegistry registry =
DOMImplementationRegistry.newInstance();
DOMImplementation impl =
(DOMImplementation) registry.getDOMImplementation("psvi 1.0");
Figure 4. Getting an implementation that supports the XML Schema API
Using the DOMImplementation object, developers can start building the DOM tree in memory.
It is important to note that in the Xerces Java DOM implementation [Xerces2 XML Java Parser], the PSVI information will not be added or modified as you modify the tree in memory. To get the updated PSVI information, developers have to validate the DOM tree in memory using the DOM Level 3 normalizeDocument method.
The following figure shows how developers of the second type of application can use Xerces’ JAXP [JAXP] implementation to retrieve the PSVI:
// dbf is a JAXP DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// all of the following features must be set:
dbf.setNamespaceAware(true);
dbf.setValidating(true);
dbf.setAttribute("http://apache.org/xml/features/validation/schema",
    Boolean.TRUE);
// you also must specify the Xerces PSVI DOM implementation
// "org.apache.xerces.dom.PSVIDocumentImpl"
dbf.setAttribute("http://apache.org/xml/properties/dom/document-class-name",
    "org.apache.xerces.dom.PSVIDocumentImpl");
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse("person.xml");
Element documentElement = doc.getDocumentElement();
if (documentElement.isSupported("psvi", "1.0")) {
    ElementPSVI psviElem = (ElementPSVI) documentElement;
    XSModel model = psviElem.getSchemaInformation();
    XSElementDeclaration decl = psviElem.getElementDeclaration();
}
Figure 5. Using JAXP to retrieve XML Schema API implementation
This example is taken from the Xerces FAQ [Xerces2 XML Java Parser].
To retrieve the PSVI from SAX, developers need to retrieve an implementation of org.xml.sax.XMLReader that supports the XML Schema API and then need to cast the XMLReader to the PSVIProvider. Use the instance of the PSVIProvider to retrieve PSVI augmentations as shown in the figure below:
public class MyDocumentHandler extends DefaultHandler {
    // Obtained by casting the XMLReader to PSVIProvider
    private final PSVIProvider psviProvider;

    public MyDocumentHandler(PSVIProvider psviProvider) {
        this.psviProvider = psviProvider;
    }

    public void startElement(String uri, String localName, String qname,
                             Attributes attributes) throws SAXException {
        // retrieve partial PSVI information
        ElementPSVI elemPSVI = psviProvider.getElementPSVI();
        // retrieve PSVI information for attributes
        AttributePSVI attrPSVI;
        for (int i = 0; i < attributes.getLength(); i++) {
            attrPSVI = psviProvider.getAttributePSVI(i);
        }
    }

    public void endElement(String uri, String localName, String qname)
            throws SAXException {
        // retrieve all PSVI augmentations for current element
        ElementPSVI elemPSVI = psviProvider.getElementPSVI();
    }
}
Figure 6. Using PSVIProvider
It is important to note that, because of the nature of a streaming implementation, not every PSVI property for an element information item is available in the scope of the startElement method. For example, [schema normalized value] may not be available because the startElement event could be sent to the application as soon as the parser processes the start tag, at which point the content of the element might not yet have been parsed. See the XML Schema API for more information on which properties are available in the scope of the startElement event and which are available in the scope of the endElement event.
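The same calling discipline can be observed with the JDK alone. The sketch below does not use the XML Schema API; it uses the TypeInfoProvider of javax.xml.validation, a rough and much less detailed analog that exposes only the type of the current element, and that may likewise be queried only from within startElement and endElement. The class name, the inlined documents, and the output format are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.TypeInfoProvider;
import javax.xml.validation.ValidatorHandler;
import org.w3c.dom.TypeInfo;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class StreamingTypeInfo {
    // person.xsd (Figure 3) and person.xml (Figure 1), inlined for self-containment.
    static final String XSD =
        "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:p='http://www.example.com/people'"
        + " targetNamespace='http://www.example.com/people'>"
        + "<complexType name='personType'><sequence>"
        + "<element name='firstName' type='string'/>"
        + "<element name='lastName' type='string'/>"
        + "<element name='dateofbirth' type='date'/>"
        + "</sequence></complexType>"
        + "<element name='person' type='p:personType'/>"
        + "</schema>";
    static final String XML =
        "<p:person xmlns:p='http://www.example.com/people'>"
        + "<firstName>Sue</firstName><lastName>Smith</lastName>"
        + "<dateofbirth>1960-10-23</dateofbirth></p:person>";

    // Returns "localName:typeName" for each element, as seen during startElement.
    static List<String> elementTypes() {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler validator = schema.newValidatorHandler();
            TypeInfoProvider types = validator.getTypeInfoProvider();
            List<String> seen = new ArrayList<>();
            validator.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes attributes) {
                    // Only legal inside startElement/endElement callbacks.
                    TypeInfo type = types.getElementTypeInfo();
                    seen.add(localName + ":" + (type == null ? "?" : type.getTypeName()));
                }
            });
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setContentHandler(validator);   // parser -> validator -> our handler
            reader.parse(new InputSource(new StringReader(XML)));
            return seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        elementTypes().forEach(System.out::println);
    }
}
```

The type is already known at the start tag, which is why it can be reported in startElement; a value that depends on the element content would have to wait for endElement.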
The interfaces defined are isomorphic to the XML Schema components. Most of the interfaces defined in the API map to the schema component they represent, as shown in the following table:
| Interface | Schema component |
|---|---|
| XSModel | Schema |
| XSAttributeDeclaration | Attribute declaration |
| XSElementDeclaration | Element declaration |
| XSComplexTypeDefinition | Complex type definition |
| XSSimpleTypeDefinition | Simple type definition |
| XSFacet | Facet |
| XSMultiValueFacet | Facet: pattern and enumeration |
| XSParticle | Particle |
| XSModelGroup | Model group |
| XSModelGroupDefinition | Model group definition |
| XSAttributeUse | Attribute use |
| XSWildcard | Wildcard |
| XSIDCDefinition | Identity Constraint definition |
| XSNotationDeclaration | Notation declaration |
| XSAnnotation | Annotation |
Table 3.
Most of the XML Schema components have several common properties, such as [name] and [target namespace]. Therefore, most of the interfaces that represent schema components inherit from the XSObject interface that is a base object for the XML Schema component model. The only exception is the schema (XSModel) component, which is basically an abstract collection of all other components, such as type definitions, element and attribute declarations, etc.
The XML Schema specification has two kinds of type definition: simple and complex. However, in many places the specification refers to a “type definition” that could be either a complex or a simple type definition. For example, the element declaration component has a [type definition] property that is either a simple or a complex type definition. Therefore, this API defines the XSTypeDefinition interface, and both the XSComplexTypeDefinition and XSSimpleTypeDefinition interfaces derive from this common interface.
The XSTypeDefinition interface provides access to the common properties for simple and complex types, e.g. [base type definition].
In general, for each property defined for an XML Schema component there is a corresponding method (attribute) available.
The following table shows the mapping between the complex type definition component and the XSComplexTypeDefinition interface.
| Schema Component: Complex Type Definition | The XSComplexTypeDefinition Interface |
|---|---|
| {name} | public String getName(); |
| {target namespace} | public String getNamespace(); |
| {base type definition} | public XSTypeDefinition getBaseType(); |
| {derivation method} | public short getDerivationMethod(); |
| {final} | public short getFinal(); |
| {abstract} | public boolean getAbstract(); |
| {attribute uses} | public XSObjectList getAttributeUses(); |
| {attribute wildcard} | public XSWildcard getAttributeWildcard(); |
| {content type} | public short getContentType(); public XSParticle getParticle(); public XSSimpleTypeDefinition getSimpleType(); |
| {prohibited substitutions} | public short getProhibitedSubstitutions(); |
| {annotations} | public XSObjectList getAnnotations(); |
Table 4.
While the specification is defined using IDL [OMG IDL], for simplicity, the examples in this paper are given in the Java language.
However, properties and methods (attributes) do not always correspond one-to-one. For example, the XML Schema specification defines the schema component as having the following properties.
{type definitions}
{attribute declarations}
{element declarations}
{attribute group definitions}
{model group definitions}
…
Figure 7. Schema Component: Schema
However, the XSModel interface defines one method that enables users to retrieve the above properties by specifying the type of the component to retrieve, e.g. ATTRIBUTE_DECLARATION, ELEMENT_DECLARATION (all of the constants are defined in the XSConstants interface).
Simple types can be constrained by specifying one or more facets. Most of the facets are designed to specify one constraining value, e.g. the “length” facet constrains the number of units of length a value of a particular type may have. This kind of facet is represented by the XSFacet interface.
<simpleType name='employeeId'>
<restriction base='string'>
<length value='6'/>
</restriction>
</simpleType>
Figure 9. Using facets
In contrast, the enumeration facet can have multiple constraining values.
<simpleType name='state'>
<restriction base='string'>
<enumeration value='NY'/>
<enumeration value='CA'/>
<enumeration value='IL'/>
</restriction>
</simpleType>
Figure 10. Using enumeration facet
The resulting schema simple type component will contain only one enumeration facet with the multiple values specified in the example above. This facet is represented using the XSMultiValueFacet interface. Note that for simplicity of implementation, the pattern facet is also represented using the XSMultiValueFacet even though the XML Schema specification defines that the pattern facet has only one value.
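The effect of the two facets shown above can be checked with the validator that ships with the JDK. This sketch does not use the XML Schema API; it only confirms how the length and enumeration facets constrain values. The wrapper elements id and st, the xs prefix, and the class name are assumptions introduced so that the types can be exercised by an instance document.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class FacetValidation {
    // The employeeId and state types from Figures 9 and 10, plus two
    // hypothetical global elements (id, st) that use them.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:simpleType name='employeeId'><xs:restriction base='xs:string'>"
        + "<xs:length value='6'/></xs:restriction></xs:simpleType>"
        + "<xs:simpleType name='state'><xs:restriction base='xs:string'>"
        + "<xs:enumeration value='NY'/><xs:enumeration value='CA'/>"
        + "<xs:enumeration value='IL'/></xs:restriction></xs:simpleType>"
        + "<xs:element name='id' type='employeeId'/>"
        + "<xs:element name='st' type='state'/>"
        + "</xs:schema>";

    // Returns true when the instance is schema-valid, false on a validation error.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // the default error handler throws on the first error
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<id>ABC123</id>")); // six characters
        System.out.println(isValid("<id>ABC</id>"));    // violates the length facet
        System.out.println(isValid("<st>NY</st>"));     // listed enumeration value
        System.out.println(isValid("<st>TX</st>"));     // not in the enumeration
    }
}
```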
In some cases the API provides more than one way to retrieve information. For example, one could use the “light-weight” methods to retrieve information about the facets defined on a simple type definition, as shown in the next figure.
public short getDefinedFacets();
public boolean isDefinedFacet(short facetName);
public String getLexicalFacetValue(short facetName);
public StringList getLexicalEnumeration();
public StringList getLexicalPattern();
Figure 11. Using the "light-weight" methods to retrieve facets
Alternatively, one could use the “heavy-weight” methods to retrieve the facets defined on this type as objects.
The particle schema component defines a “term” property to be one of a model group, a wildcard, or an element declaration. To represent a “term” type, the API defines an XSTerm interface and specifies that objects implementing XSElementDeclaration, XSModelGroup, and XSWildcard interfaces also implement the XSTerm interface.
The API also provides a set of convenience methods. These are methods that do not correspond to a property on an XML Schema component, but do provide useful information or make common operations much simpler to perform.
For example, the XSModel interface provides a method that allows users to retrieve a list of all namespaces that belong to this XSModel (remember that a schema can be composed of several schema documents, each of which could define components in different target namespaces). You can also retrieve all components in a given namespace or query for global components using name and namespace.
The XSTypeDefinition interface provides a couple of methods that allow users to check if the current type is derived from another type.
public boolean derivedFromType(XSTypeDefinition ancestorType,
short derivationMethod);
public boolean derivedFrom(String namespace,
String name,
short derivationMethod);
Figure 13. The derivedFrom methods
Finding out if one type is derived from another could be helpful in several applications. For example, if an application tries to assist the user in editing an XML instance document based on a specified schema, finding possible derived types for a given type helps in determining possible values for the xsi:type attribute.
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:p="http://www.example.com/people"
targetNamespace="http://www.example.com/people">
<include schemaLocation="person.xsd"/>
<complexType name="employee">
<complexContent>
<extension base="p:personType">
<sequence>
<element name="employeeId" type="ID"/>
<element name="salary" type="integer"/>
</sequence>
</extension>
</complexContent>
</complexType>
<element name="person" type="p:personType"/>
</schema>
Figure 14. The Employee schema, employee.xsd
Note that the “employee” complex type definition is derived from “personType” (see Figure 3) using the derivation method of extension. While the schema defines the “person” element to be of type “personType”, it is possible to add in the instance document an xsi:type attribute that would specify the “person” element type as the “employee” complex type.
<p:person xmlns:p="http://www.example.com/people"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.com/people employee.xsd"
xsi:type="p:employee">…</p:person>
Figure 15. The employee.xml
Assuming you already have the XSModel that represents the “employee.xsd” document (Figure 14), here is how you check for type derivation:
// getTypeDefinition takes the type name first, then the namespace
XSTypeDefinition typeEmployee =
    model.getTypeDefinition("employee",
        "http://www.example.com/people");
XSTypeDefinition typePerson =
    model.getTypeDefinition("personType",
        "http://www.example.com/people");
if (typeEmployee.derivedFromType(typePerson,
        (short) (XSConstants.DERIVATION_EXTENSION
            | XSConstants.DERIVATION_RESTRICTION))) {
    // add xsi:type on the instance document
    …
}
Figure 16. Checking type derivation
A schema document can be loaded into memory independently of an XML instance document using the XSLoader interface. As mentioned earlier, application developers should use the DOM Level 3 Bootstrapping mechanism to retrieve the XSImplementation, which provides a method to create an XSLoader, and then use the XSLoader to load the schema, as shown in the following figure:
// set the system property that names the DOM implementation source to search
System.setProperty(DOMImplementationRegistry.PROPERTY,
    "org.apache.xerces.dom.DOMXSImplementationSourceImpl");
// get an instance of DOMImplementationRegistry
DOMImplementationRegistry registry =
    DOMImplementationRegistry.newInstance();
// retrieve an implementation that supports XSImplementation
// by specifying the "XS-Loader" feature
XSImplementation impl =
    (XSImplementation) registry.getDOMImplementation("XS-Loader");
// create XSLoader
XSLoader schemaLoader = impl.createXSLoader(null);
XSModel model = schemaLoader.loadURI("employee.xsd");
Figure 17. Loading schema documents
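One detail of the bootstrapping step is worth guarding against: getDOMImplementation returns null when no registered implementation source supports the requested feature, and casting null succeeds silently, so the failure only surfaces later as a NullPointerException. The sketch below wraps the lookup in a hypothetical helper named require. It requests the "LS" feature only because that feature is normally available from the DOM implementation bundled with the JDK; requesting "XS-Loader" in the same way assumes that an implementation such as Xerces2 is on the class path.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

class BootstrapSketch {
    // Looks up an implementation for the given feature string and fails
    // early with a clear message when none is registered.
    static DOMImplementation require(String features) {
        DOMImplementationRegistry registry;
        try {
            registry = DOMImplementationRegistry.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create DOM registry", e);
        }
        DOMImplementation impl = registry.getDOMImplementation(features);
        if (impl == null) {
            throw new IllegalStateException(
                "No DOM implementation supports: " + features);
        }
        return impl;
    }

    public static void main(String[] args) {
        // "LS" (DOM Level 3 Load and Save) stands in for "XS-Loader" here.
        DOMImplementation impl = require("LS");
        System.out.println(impl.getClass().getName());
    }
}
```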
The XSModel in this case represents a schema that was created from the employee.xsd document (Figure 14), which includes the person.xsd document (Figure 3). Therefore, the resulting XSModel contains schema components from both schema documents. Using this XSModel, the user can query and examine type definitions, element declarations and other schema components.
This paper has described the basic concepts of the XML Infoset and the post-schema-validation infoset and has explained how the two relate to each other. It then showed how to query the PSVI from DOM and SAX using the XML Schema API. Finally, it described how this API maps to the XML Schema components and what additional functionality it provides, such as loading XML Schema documents independently of an instance.
We would like to thank Sandy Gao and Michael Glavassevich for taking the time to provide valuable feedback.
The opinions expressed in this paper are those of the authors, not of the IBM Corporation.
© Copyright IBM Canada Ltd., 2004. All rights reserved.
IBM and developerWorks are registered trademarks of International Business Machines Corporation in the United States, other countries, or both.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Other company, product and service names may be trademarks or service marks of others.
[XML Schema Part 1] World Wide Web Consortium. XML Schema Part 1: Structures. Available at http://www.w3.org/TR/xmlschema-1/.
[XML Schema Part 2] World Wide Web Consortium. XML Schema Part 2: Datatypes. Available at http://www.w3.org/TR/xmlschema-2/.
[XML Schema API] XML Schema API. E. Litani, Author. Available at http://www.w3.org/Submission/.
[XML Information Set] World Wide Web Consortium. XML Information Set. Available at http://www.w3.org/TR/xml-infoset.
[Namespace in XML] World Wide Web Consortium. Namespaces in XML. Available at http://www.w3.org/TR/REC-xml-names/.
[Document Object Model] World Wide Web Consortium. Document Object Model technical reports. Available at http://www.w3.org/DOM/DOMTR.
[OMG IDL] OMG IDL Syntax and Semantics. Available at http://www.omg.org/technology/documents/formal/corba_2.htm.
[SAX] Simple API for XML. Available at http://www.saxproject.org/.
[Xerces2 XML Java Parser] Apache Xerces2 XML Java Parser. Available at http://xml.apache.org/xerces2-j/index.html.
[Discover key features of DOM Level 3 Core] Discover key features of DOM Level 3 Core. A. Le Hors, E. Litani, Editors. Available on developerWorks at http://www-106.ibm.com/developerworks/xml/library/x-keydom2.html.
[JAXP] Java API for XML Processing (JAXP). Available at http://java.sun.com/xml/jaxp/index.jsp/.
[XSLT 2.0] World Wide Web Consortium. XSL Transformations (XSLT) Version 2.0. Available at http://www.w3.org/TR/2003/WD-xslt20-20031112/.