XML 2002

Converting RELAX NG to W3C XML Schema

Abstract

RELAX NG, especially in its compact syntax, provides a schema language for XML that is very easy to learn and use. W3C XML Schema, on the other hand, currently enjoys much more widespread industry support. Automatic conversion of RELAX NG to W3C XML Schema allows users to have the best of both worlds.

RELAX NG is more expressive than W3C XML Schema, so some RELAX NG schemas cannot be translated exactly into W3C XML Schema. However, such schemas can be "approximated" by generating a W3C XML Schema that allows a superset of what the RELAX NG schema allows.
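The idea of a superset approximation can be illustrated with a small model. The sketch below does not run a real validator; it models, in plain Python, a RELAX NG pattern that requires exactly one of two attributes and a W3C XML Schema approximation that declares both attributes as optional. The attribute names href and name are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical attribute names, used only for this illustration.
ATTRS = ("href", "name")

def relaxng_accepts(attrs: frozenset) -> bool:
    # Models the RELAX NG pattern
    #   attribute href { text } | attribute name { text }
    # which is satisfied by exactly one of the two attributes.
    return len(attrs) == 1 and next(iter(attrs)) in ATTRS

def xsd_accepts(attrs: frozenset) -> bool:
    # Models the approximation in which both attributes are
    # declared optional, so any subset of them is allowed.
    return attrs <= set(ATTRS)

def all_attribute_sets():
    # Enumerate every subset of ATTRS, from empty to full.
    for size in range(len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            yield frozenset(combo)

relaxng_ok = {s for s in all_attribute_sets() if relaxng_accepts(s)}
xsd_ok = {s for s in all_attribute_sets() if xsd_accepts(s)}

# Attribute sets the approximation lets through although the
# original pattern rejects them.
extra = xsd_ok - relaxng_ok

print("superset:", relaxng_ok <= xsd_ok)
print("extra sets:", sorted(sorted(s) for s in extra))
```

In this model the approximation accepts everything the original pattern accepts, plus two cases the original rejects: an element with neither attribute and an element with both.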

When generating W3C XML Schema, the goal is not simply to produce a schema that validates the same documents as the original schema. It is also desirable to preserve the way the original RELAX NG schema uses definitions and includes, so that the resulting W3C XML Schema is as human-understandable as possible. Ideally, the resulting schema should resemble something that might be produced by somebody authoring directly in W3C XML Schema.

Some examples of the challenges to be confronted in performing the conversion are:

- Handling multi-namespace documents: RELAX NG allows elements and attributes from multiple namespaces to be freely mixed, whereas W3C XML Schema requires a rigid segmentation of the schema into separate namespaces.

- Wildcards: RELAX NG handles elements whose names are specified by wildcards in a way that is relatively uniform with other elements, whereas W3C XML Schema handles wildcards quite differently.

- Attribute constraints: RELAX NG integrates attributes into content models, allowing very expressive constraints, whereas W3C XML Schema supports only optional or required attributes; such constraints therefore have to be approximated.

- Definitions: RELAX NG provides one kind of top-level definition (using the <define> element), whereas W3C XML Schema provides many kinds of top-level definitions and declarations (element, attribute, group, attributeGroup, complexType, simpleType); the conversion has to intelligently select the appropriate kind to use.
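To make the last challenge concrete, the following sketch inspects each <define> in a small RELAX NG grammar (XML syntax) and guesses which kind of W3C XML Schema component it might map to. The sample grammar, the function names, and the selection rules are all assumptions made for illustration; this is a simplified heuristic, not the algorithm used by any particular converter.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# A small, made-up grammar used only to exercise the heuristic.
SCHEMA = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="common.attrs">
    <optional><attribute name="id"/></optional>
    <optional><attribute name="class"/></optional>
  </define>
  <define name="para">
    <element name="p"><text/></element>
  </define>
  <define name="inline">
    <choice>
      <element name="em"><text/></element>
      <element name="code"><text/></element>
    </choice>
  </define>
  <define name="size">
    <choice><value>small</value><value>large</value></choice>
  </define>
</grammar>
"""

def local(tag: str) -> str:
    # Strip the "{namespace}" prefix that ElementTree adds to tags.
    return tag.split("}", 1)[-1]

def leaf_kinds(node) -> set:
    # Collect the outermost element, attribute, and data-like
    # patterns below node, without descending into elements
    # or attributes themselves.
    kinds = set()
    for child in node:
        name = local(child.tag)
        if name in ("element", "attribute"):
            kinds.add(name)
        elif name in ("data", "value"):
            kinds.add("data")
        else:
            kinds |= leaf_kinds(child)
    return kinds

def choose_xsd_kind(define) -> str:
    # Illustrative rules only; a real converter needs many more cases.
    children = list(define)
    if len(children) == 1 and local(children[0].tag) == "element":
        return "element"
    kinds = leaf_kinds(define)
    if kinds == {"attribute"}:
        return "attributeGroup"
    if kinds == {"data"}:
        return "simpleType"
    if kinds == {"element"}:
        return "group"
    return "complexType"  # rough fallback for mixed content

root = ET.fromstring(SCHEMA)
chosen = {
    d.get("name"): choose_xsd_kind(d)
    for d in root.findall(f"{{{RNG_NS}}}define")
}
for name, kind in chosen.items():
    print(f"{name} -> xs:{kind}")
```

Even this toy version shows why the choice is not mechanical: the same <define> construct lands in four different W3C XML Schema component kinds depending on what its body contains.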

This talk will assess how well RELAX NG can be made to work as a mechanism for creating W3C XML Schemas.

Keywords

Schema