XML 2003

Namespace Routing Language (NRL)

Abstract

The XML Namespaces Recommendation allows an XML document to be composed of elements and attributes from multiple independent namespaces. Each of these namespaces may have its own schema; the schemas for different namespaces may be in different schema languages. The problem then arises of how the schemas can be composed in order to allow validation of the complete document. This document proposes the Namespace Routing Language (NRL) as a solution to this problem. NRL is an evolution of the author's earlier Modular Namespaces (MNS)[mns] language.

A sample implementation of NRL is included in Jing[jing].


Table of Contents

1. Getting started
2. Processing model
3. Specifying the schema
4. Concurrent validation
5. Built-in schemas
6. Namespace wildcards
7. Modes
8. Related namespaces
9. Built-in modes
10. Open schemas
11. Element-name context
12. Attributes
13. Mode inheritance
14. Transparent namespaces
15. Related work
Acknowledgements
Bibliography
Biography

1. Getting started

In its simplest form, an NRL schema consists of a mapping from namespace URIs to schema URIs. An NRL schema is written in XML. Here is an example:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
  </namespace>
</rules>

We will call a schema referenced by an NRL schema a subschema. In the above example, ‘soap-envelope.xsd’ is the subschema for the namespace URI ‘http://schemas.xmlsoap.org/soap/envelope/’ and ‘xhtml.rng’ is the subschema for the namespace URI ‘http://www.w3.org/1999/xhtml’.

The absent namespace can be mapped to a schema by using ‘ns=""’.
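
As a minimal sketch, the following NRL schema maps elements that are in no namespace to a single subschema. The file name ‘local.rng’ is a hypothetical placeholder used only for illustration:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="">
    <validate schema="local.rng"/>
  </namespace>
</rules>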

2. Processing model

NRL validation has two inputs: a document to be validated and an NRL schema. We will call the document to be validated the instance. NRL validation divides the instance into sections, each of which contains elements from a single namespace, and validates each section separately against the subschema for its namespace.

Thus, the following instance:

<env:Envelope xmlns="http://www.w3.org/1999/xhtml"
              xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <html>
      <head>
        <title>Document 1</title>
      </head>
      <body>
        <p>...</p>
      </body>
    </html>
    <html>
      <head>
        <title>Document 2</title>
      </head>
      <body>
        <p>...</p>
      </body>
    </html>
  </env:Body>
</env:Envelope>

would be divided into three sections, one with the envelope namespace

<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body/>
</env:Envelope>

and two with the XHTML namespace:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Document 1</title>
  </head>
  <body>
    <p>...</p>
  </body>
</html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Document 2</title>
  </head>
  <body>
    <p>...</p>
  </body>
</html>

Note that two elements belong to the same section only if they have a common ancestor and all elements on the path to that common ancestor have the same namespace. Thus, if one of the XHTML documents happened to contain an element from the envelope namespace, that element would not be part of the same section as the root ‘env:Envelope’ element.
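
To make this concrete, consider the following instance, which is invented for illustration and is not claimed to be valid against any particular subschema:

<env:Envelope xmlns="http://www.w3.org/1999/xhtml"
              xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <html>
      <head>
        <title>Document 3</title>
      </head>
      <body>
        <env:Envelope>
          <env:Body/>
        </env:Envelope>
      </body>
    </html>
  </env:Body>
</env:Envelope>

This would be divided into three sections: the outer envelope, the XHTML document, and the inner envelope. The inner envelope forms its own section because the path from it to the outer envelope passes through XHTML elements.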

This validation process can be refined in several ways, which are described in the following sections.

3. Specifying the schema

In most cases the schema will be in some namespaced XML vocabulary, and the type of schema can be automatically detected from the namespace URI of the root element. In cases where the schema is not in XML and there is no MIME type information available to determine the type, a ‘schemaType’ attribute can be used to specify the type. The value of this should be a MIME media type. For RELAX NG Compact Syntax[compact], a value of ‘application/x-rnc’ should be used.

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rnc"
              schemaType="application/x-rnc"/>
  </namespace>
</rules>

With many schema languages, there can be different ways to use a particular schema to validate an instance. For example, Schematron[schematron] has the notion of a phase; an instance that is valid with respect to a Schematron schema using one phase may not be valid with respect to the same schema in another phase. NRL allows validation to be controlled by specifying a number of options. For example, to specify that validation with respect to ‘xhtml.sch’ should use the phase named ‘Full’, an option could be specified as follows:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.sch">
      <option name="http://www.thaiopensource.com/validate/phase"
              arg="Full"/>
    </validate>
  </namespace>
</rules>

Options may have arguments. Some options do not need arguments. For example, for Schematron there is a ‘http://www.thaiopensource.com/validate/diagnose’ option. If this option is present, then errors will include Schematron diagnostics; if it is not, then errors will not include diagnostics. With this option, no ‘arg’ attribute is necessary:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.sch">
      <option name="http://www.thaiopensource.com/validate/diagnose"/>
    </validate>
  </namespace>
</rules>

Options are named by URIs. A number of standard options are defined which all start with the URI ‘http://www.thaiopensource.com/validate/’:

http://www.thaiopensource.com/validate/phase

Argument is a string, specifying Schematron phase

http://www.thaiopensource.com/validate/diagnose

No argument. If present, include Schematron diagnostics in error messages

http://www.thaiopensource.com/validate/check-id-idref

No argument. If present, check ID/IDREF in accordance with RELAX NG DTD Compatibility[dtdcompat] specification.

http://www.thaiopensource.com/validate/feasible

No argument. If present, check that the document is feasibly valid. This applies to RELAX NG[relaxng]. A document is feasibly valid if it could be transformed into a valid document by inserting any number of attributes and child elements anywhere in the tree. This is equivalent to transforming the schema by wrapping every ‘data’, ‘list’, ‘element’ and ‘attribute’ element in an ‘optional’ element and then validating against the transformed schema. This option is useful while a document is still under construction.

http://www.thaiopensource.com/validate/schema

Argument is a URI specifying an additional schema to be used for validation. This applies to W3C XML Schema[wxs]. This option may be specified multiple times, once for each additional schema.
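
To illustrate the last of these options, here is a sketch that supplies one additional W3C XML Schema when validating the SOAP section. The file name ‘soap-extensions.xsd’ is a hypothetical placeholder:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd">
      <option name="http://www.thaiopensource.com/validate/schema"
              arg="soap-extensions.xsd"/>
    </validate>
  </namespace>
</rules>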

For convenience, the URI specified by the ‘name’ attribute may be relative; if it is, it will be resolved relative to the NRL namespace URI. The result is that the standard options above can be specified without the ‘http://www.thaiopensource.com/validate/’ prefix. For example,

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.sch">
      <option name="phase"
              arg="Full"/>
    </validate>
  </namespace>
</rules>

Normally, an NRL implementation will make a best-effort attempt to support the specified option and will simply ignore options that it does not understand or cannot support. If it is essential that a particular option is supported, then a ‘mustSupport’ attribute may be added to the ‘option’ element:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.sch">
      <option name="phase"
              arg="Full"
              mustSupport="true"/>
    </validate>
  </namespace>
</rules>

If there is a ‘mustSupport’ attribute and the NRL implementation cannot support the option, it must report an error.

4. Concurrent validation

Multiple ‘validate’ elements can be specified for a single namespace. The effect is to validate against all of the specified schemas.

For example, we might have a Schematron schema for XHTML, which makes various checks that cannot be expressed in a grammar. We want to validate against both the Schematron schema and the RELAX NG schema. The NRL schema would be like this:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
    <validate schema="xhtml.sch"/>
  </namespace>
</rules>

5. Built-in schemas

Instead of a ‘validate’ element, you can use an ‘allow’ element or a ‘reject’ element. These are equivalent respectively to validating with a schema that allows anything or with a schema that allows nothing.

For example, the following would allow SVG without attempting to validate it:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
  </namespace>
  <namespace ns="http://www.w3.org/2000/svg">
    <allow/>
  </namespace>
</rules>

Note that, just as with ‘validate’, ‘allow’ and ‘reject’ apply to a section not to a whole subtree. Thus, in the above example, if the SVG contained an embedded XHTML section, then that XHTML section would be validated against ‘xhtml.rng’.

6. Namespace wildcards

You can use an ‘anyNamespace’ element instead of a ‘namespace’ element. This specifies a rule to be used for an element for which there is no applicable ‘namespace’ rule.

Namespace wildcards are particularly useful in conjunction with ‘allow’ and ‘reject’. The following will validate strictly, rejecting any namespace for which no subschema is specified:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
  </namespace>
  <anyNamespace>
    <reject/>
  </anyNamespace>
</rules>

In contrast, the following will validate laxly, allowing any namespace for which no subschema is specified:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
  </namespace>
  <anyNamespace>
    <allow/>
  </anyNamespace>
</rules>

The default is to validate strictly. Thus, if there is no ‘anyNamespace’ rule, then the following rule will be implied:

<anyNamespace>
  <reject/>
</anyNamespace>

7. Modes

You can apply different rules in different contexts by using modes. For example, you might want to restrict the namespaces allowed for the root element.

The ‘rules’ element for an NRL schema that uses multiple modes does not contain ‘namespace’ and ‘anyNamespace’ elements directly. Rather, it contains ‘mode’ elements that in turn contain ‘namespace’ and ‘anyNamespace’ elements. The ‘validate’ elements can specify a ‘useMode’ attribute to change the mode in which their child sections are processed. The ‘rules’ element must have a ‘startMode’ attribute specifying which mode to use for the root element.

For example, suppose we want to require that the root element come from the ‘http://schemas.xmlsoap.org/soap/envelope/’ namespace:

<rules startMode="soap"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="soap">
    <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
      <validate schema="soap-envelope.xsd"
                useMode="body"/>
    </namespace>
  </mode>
  <mode name="body">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"/>
    </namespace>
  </mode>
</rules>

If a ‘validate’ element does not specify a ‘useMode’ attribute, then the mode remains unchanged. Thus, in the above example, child sections inside an XHTML section will be processed in mode ‘body’, which does not allow the SOAP namespace; so if the XHTML were to contain a SOAP ‘env:Envelope’ element, it would be rejected.

The ‘reject’ and ‘allow’ elements can have a ‘useMode’ attribute as well.
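
For instance, the following sketch allows an SVG root section without validating it, while switching to a mode in which XHTML child sections are validated. The mode names ‘root’ and ‘embedded’ are illustrative only:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/2000/svg">
      <allow useMode="embedded"/>
    </namespace>
  </mode>
  <mode name="embedded">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"/>
    </namespace>
  </mode>
</rules>

Because the ‘embedded’ mode has no rule for SVG, an SVG section nested inside the XHTML would be rejected.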

8. Related namespaces

A single subschema may not handle just a single namespace; it may handle two or more related namespaces. To deal with this possibility, NRL allows the rule for a namespace to specify that elements from that namespace are to be attached to a parent section and be validated together with that parent section.

Suppose we have RELAX NG schemas for XHTML and for SVG. We could use these directly as subschemas in NRL. But we might prefer instead to use RELAX NG mechanisms to combine these into a single RELAX NG schema. This would allow us conveniently to allow SVG elements only to occur in places where XHTML block and inline elements are allowed and to disallow them in places that make no sense (for example, as children of a ‘ul’ element). If we have such a combined schema, we could use it as follows:

<rules startMode="soap"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="soap">
    <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
      <validate schema="soap-envelope.xsd"
                useMode="xhtml"/>
    </namespace>
  </mode>
  <mode name="xhtml">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml+svg.rng"
                useMode="svg"/>
    </namespace>
  </mode>
  <mode name="svg">
    <namespace ns="http://www.w3.org/2000/svg">
      <attach/>
    </namespace>
  </mode>
</rules>

This will cause SVG sections occurring within XHTML to be attached to the parent XHTML section and be validated as part of it.

RDF is another example where ‘attach’ is necessary. RDF can contain elements from arbitrary namespaces.

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"
                useMode="body"/>
    </namespace>
  </mode>
  <mode name="body">
    <namespace ns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <validate schema="rdfxml.rng"
                useMode="rdf"/>
    </namespace>
  </mode>
  <mode name="rdf">
    <anyNamespace>
      <attach/>
    </anyNamespace>
  </mode>
</rules>

We could use the approach of attaching all namespaces as an alternative solution to the XHTML+SVG example. Instead of relying on NRL to reject namespaces other than XHTML and SVG, we can attach sections from all namespaces to the XHTML section, and allow the ‘xhtml+svg.rng’ schema to reject namespaces other than XHTML and SVG.

<rules startMode="soap"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="soap">
    <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
      <validate schema="soap-envelope.xsd"
                useMode="xhtml"/>
    </namespace>
  </mode>
  <mode name="xhtml">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml+svg.rng"
                useMode="attach"/>
    </namespace>
  </mode>
  <mode name="attach">
    <anyNamespace>
      <attach/>
    </anyNamespace>
  </mode>
</rules>

9. Built-in modes

There is a built-in mode named ‘#attach’, which contains just the rule:

<anyNamespace>
  <attach/>
</anyNamespace>

Thus, the last example in the previous section can be simplified to:

<rules startMode="soap"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="soap">
    <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
      <validate schema="soap-envelope.xsd"
                useMode="xhtml"/>
    </namespace>
  </mode>
  <mode name="xhtml">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml+svg.rng"
                useMode="#attach"/>
    </namespace>
  </mode>
</rules>

Suppose you are not interested in the namespace-sectioning capabilities of NRL, but you just want to validate a document concurrently against two schemas. The simplest way is like this:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <anyNamespace>
    <validate schema="xhtml.rng"
              useMode="#attach"/>
    <validate schema="xhtml.sch"
              useMode="#attach"/>
  </anyNamespace>
</rules>

The ‘useMode="#attach"’ ensures that the document will be validated as is, rather than divided into sections.

Similarly, there is a built-in mode named ‘#reject’, which contains just the rule:

<anyNamespace>
  <reject/>
</anyNamespace>

and a built-in mode named ‘#allow’, which contains just the rule:

<anyNamespace>
  <allow/>
</anyNamespace>
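
As an illustrative sketch, a built-in mode can be named in a ‘useMode’ attribute just like a user-defined mode. The following schema validates the SOAP section but accepts whatever occurs inside it without validation:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"
              useMode="#allow"/>
  </namespace>
</rules>

Since the ‘allow’ rule in ‘#allow’ does not change the mode, sections nested more deeply inside the SOAP section are allowed as well.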

10. Open schemas

Up to now, sections validated by one subschema have not participated in the validation of parent sections. Modern schema languages, such as W3C XML Schema and RELAX NG, can use wildcards to allow elements and attributes from any namespace in particular contexts. It is useful to take advantage of this in order to allow one subschema to constrain the contexts in which sections validated by other subschemas can occur. For example, the official schema for ‘http://schemas.xmlsoap.org/soap/envelope/’ uses wildcards to specify precisely where elements from other namespaces are allowed: they are allowed as children of the ‘env:Body’ and ‘env:Header’ elements but not as children of the ‘env:Envelope’ element. Our NRL schema bypasses these constraints because the XHTML sections are not seen by the SOAP validation. We can use ‘attach’ to solve this problem:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
    <validate schema="soap-envelope.xsd"/>
  </namespace>
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
    <attach/>
  </namespace>
</rules>

When an XHTML section occurs inside a SOAP section, the XHTML section will participate in two validations:

  • it will be validated independently against the XHTML schema, and

  • it will be attached to the SOAP section and validated together with the SOAP section against the SOAP schema

11. Element-name context

So far we have seen how to make the processing of an element depend on the namespace URIs of its ancestors. NRL also allows the processing to depend on the element names of its ancestors. For example, suppose we wish to allow RDF to occur only as a child of the ‘head’ element of XHTML. We can do this as follows:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng">
        <context path="head"
                 useMode="rdf"/>
      </validate>
    </namespace>
  </mode>
  <mode name="rdf">
    <namespace ns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <validate schema="rdfxml.rng"
                useMode="#attach"/>
    </namespace>
  </mode>
</rules>

Any element that takes a ‘useMode’ attribute can also have one or more ‘context’ children that override the ‘useMode’ attribute in specific contexts. The ‘path’ attribute specifies a test to be applied to the parent element of the section to be processed. The ‘path’ attribute allows a restricted form of XPath: a list of one or more choices separated by ‘|’, where each choice is a list of one or more unqualified names separated by ‘/’, optionally preceded by ‘/’. It is interpreted like a pattern in XSLT, except that the names are implicitly qualified with the namespace URI of the containing ‘namespace’ element. When more than one path matches, the most specific is chosen. It is an error to have two or more equally specific paths. The path is tested against a single section not the entire document: a path of ‘/foo’ means a ‘foo’ element that is the root of a section; it does not mean a ‘foo’ element that is the root of the document.
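
The following sketch exercises the ‘|’ and ‘/’ syntax described above. It assumes, purely for illustration, that RDF should be allowed as a child of ‘head’ or of a ‘div’ that is itself a child of ‘body’, and nowhere else:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"
                useMode="#reject">
        <context path="head|body/div"
                 useMode="rdf"/>
      </validate>
    </namespace>
  </mode>
  <mode name="rdf">
    <namespace ns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <validate schema="rdfxml.rng"
                useMode="#attach"/>
    </namespace>
  </mode>
</rules>

Here the ‘useMode="#reject"’ on ‘validate’ applies wherever no ‘context’ matches, so child sections elsewhere in the XHTML are rejected.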

12. Attributes

Up to now, we have considered attributes to be inseparably attached to their parent elements. Although the default behaviour is to attach attributes to their parent elements, attributes are in fact considered to be separate sections and can be processed separately. Attributes with the same namespace URI and same parent element are grouped in a single section. Such sections are called attribute sections; sections that contain elements are called element sections.

A ‘namespace’ or ‘anyNamespace’ element can have a ‘match’ attribute, whose value must be a list of one or two of the tokens ‘attributes’ and ‘elements’. If the value includes the token ‘attributes’, the rule matches attribute sections; if it includes the token ‘elements’, the rule matches element sections.
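
As a small sketch, a rule can match both kinds of section by listing both tokens. The choice of the XLink namespace and of ‘allow’ here is only an illustration:

<namespace ns="http://www.w3.org/1999/xlink"
           match="attributes elements">
  <allow/>
</namespace>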

The default behaviour of attaching attributes to their parent elements occurs because the default value of the ‘match’ attribute is ‘elements’ and because all of the built-in modes include a rule:

<anyNamespace match="attributes">
  <attach/>
</anyNamespace>

Most, if not all, XML schema languages do not have any notion of validating a set of attributes; they know only how to validate an XML element. Therefore, before validating an attribute section, NRL transforms it into an XML element by creating a dummy element to hold the attributes. NRL also performs a corresponding transformation on the schema. This is schema-language dependent. For example, in the case of RELAX NG, a schema s is transformed to ‘<element><anyName/> s </element>’.

For example, suppose ‘xmlatts.rng’ contains a schema for the attributes in the ‘xml:’ namespace written in RELAX NG:

<group datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
       xmlns="http://relaxng.org/ns/structure/1.0">
  <optional>
    <attribute name="xml:lang">
      <choice>
        <data type="language"/>
        <value/>
      </choice>
    </attribute>
  </optional>
  <optional>
    <attribute name="xml:base">
      <data type="anyURI"/>
    </attribute>
  </optional>
  <optional>
    <attribute name="xml:space">
      <choice>
        <value>preserve</value>
        <value>default</value>
      </choice>
    </attribute>
  </optional>
</group>

An NRL schema could use this as follows:

<rules xmlns="http://www.thaiopensource.com/validate/nrl">
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="xhtml.rng"/>
  </namespace>
  <namespace ns="http://www.w3.org/XML/1998/namespace"
             match="attributes">
    <validate schema="xmlatts.rng"/>
  </namespace>
</rules>

13. Mode inheritance

One mode can extend another mode. Suppose in our SOAP+XHTML example, we want to allow both SOAP elements and XHTML elements to contain RDF. By putting the rule for RDF in its own mode and extending that mode, we can avoid having to specify the rule for RDF twice:

<rules startMode="soap"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="common">
    <namespace ns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <validate schema="rdfxml.rng"
                useMode="#attach"/>
    </namespace>
  </mode>
  <mode name="soap"
        extends="common">
    <namespace ns="http://schemas.xmlsoap.org/soap/envelope/">
      <validate schema="soap-envelope.xsd"
                useMode="body"/>
    </namespace>
  </mode>
  <mode name="body"
        extends="common">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"/>
    </namespace>
  </mode>
</rules>

It is possible to extend a built-in mode. Thus, a mode that validates laxly can be specified simply by extending ‘#allow’. This works because of how wildcards and inheritance interact. Suppose mode x extends mode y; then when using mode x, the following order will be used to search for a matching rule:

  1. a non-wildcard rule in x

  2. a non-wildcard rule in y

  3. a wildcard rule in x

  4. a wildcard rule in y

The requirement that there is an implicit rule of

<anyNamespace>
  <reject/>
</anyNamespace>

can be restated as a requirement that the default value of the ‘extends’ attribute is ‘#reject’.
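
As a sketch of extending a built-in mode, the following schema validates XHTML sections and allows sections from every other namespace. The mode name ‘root’ is illustrative:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root"
        extends="#allow">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"/>
    </namespace>
  </mode>
</rules>

The XHTML rule is a non-wildcard rule in ‘root’, so it is found first; any other namespace falls through to the wildcard rule inherited from ‘#allow’.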

14. Transparent namespaces

Many schema languages can deal with the kind of extensibility that involves adding child elements or attributes from different namespaces. A more difficult kind of extensibility is where we need to be able to wrap an extension element around an existing non-extension element. This can arise with namespaces describing templating and versioning. Imagine XHTML inside an XSLT stylesheet: in such a document we might have a ‘ul’ element containing an ‘xsl:for-each’ element containing an ‘li’ element, although the schema for XHTML requires ‘li’ elements to occur as direct children of ‘ul’ elements. In such a situation, we need to make the XHTML schema unwrap the ‘xsl:for-each’ element, ignoring its start-tag and end-tag, but not ignoring its content.

Suppose we have a namespace ‘http://www.example.org/edit’ containing elements ‘inserted’ and ‘deleted’, which describe edits that have been made to a document, and suppose we want to use these elements inside an XHTML document. The following NRL schema would allow us still to validate the XHTML document.

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"
                useMode="xhtml"/>
    </namespace>
  </mode>
  <mode name="xhtml">
    <namespace ns="http://www.example.org/edit">
      <unwrap/>
    </namespace>
    <namespace ns="http://www.w3.org/1999/xhtml">
      <attach/>
    </namespace>
  </mode>
</rules>

When ‘unwrap’ is applied to an element section e, it ignores the elements in e and their attributes and just processes the child element sections of e; if processing the child element sections causes a section to try to attach to e, it will instead attach to the parent of e. Thus, in the above schema the section from the edit namespace will be ignored, but child sections will be processed according to rules applicable in the ‘xhtml’ mode. When an edit section has an XHTML child section, then that XHTML child section will be attached to the parent of the edit section (which can only be another XHTML section).
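
For illustration, consider an instance such as the following, whose content is invented for this example:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:edit="http://www.example.org/edit">
  <head>
    <title>Shopping list</title>
  </head>
  <body>
    <ul>
      <li>Milk</li>
      <edit:inserted>
        <li>Bread</li>
      </edit:inserted>
    </ul>
  </body>
</html>

The ‘edit:inserted’ element is ignored, and the ‘li’ it contains is attached to the enclosing XHTML section, so validation against ‘xhtml.rng’ sees a ‘ul’ with two ‘li’ children.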

The above schema does not deal with validating the edit namespace. Let us suppose that ‘inserted’ and ‘deleted’ elements cannot nest. Our schema ‘edit.rnc’ for the edit namespace is just two lines:

default namespace = "http://www.example.org/edit"
element inserted|deleted { empty }

The following NRL schema would allow validation of the edit namespace:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"
                useMode="xhtml"/>
    </namespace>
  </mode>
  <mode name="xhtml"
        extends="noEdit">
    <namespace ns="http://www.example.org/edit">
      <validate schema="edit.rnc"
                schemaType="application/x-rnc"
                useMode="#allow"/>
      <unwrap useMode="noEdit"/>
    </namespace>
  </mode>
  <mode name="noEdit">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <attach/>
    </namespace>
  </mode>
</rules>

The above schema is still not quite right. Suppose a ‘title’ element was both inserted and deleted. With the above NRL schema, XHTML validation would see two ‘title’ elements, which would cause an error. We should instead do XHTML validation twice, once including the content of the ‘inserted’ elements and ignoring the content of the ‘deleted’ elements and once doing the opposite. We only need to validate the edit elements once. The following NRL schema accomplishes this:

<rules startMode="root"
       xmlns="http://www.thaiopensource.com/validate/nrl">
  <mode name="root">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <validate schema="xhtml.rng"
                useMode="new"/>
      <validate schema="xhtml.rng"
                useMode="old"/>
    </namespace>
  </mode>
  <mode name="new"
        extends="noEdit">
    <namespace ns="http://www.example.org/edit">
      <validate schema="edit.rnc"
                schemaType="application/x-rnc"
                useMode="#allow"/>
      <unwrap useMode="noEdit">
        <context path="deleted"
                 useMode="#allow"/>
      </unwrap>
    </namespace>
  </mode>
  <mode name="old"
        extends="noEdit">
    <namespace ns="http://www.example.org/edit">
      <unwrap useMode="noEdit">
        <context path="inserted"
                 useMode="#allow"/>
      </unwrap>
    </namespace>
  </mode>
  <mode name="noEdit">
    <namespace ns="http://www.w3.org/1999/xhtml">
      <attach/>
    </namespace>
  </mode>
</rules>
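
An instance of the kind this schema is designed for might look like the following sketch, whose content is invented for illustration:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:edit="http://www.example.org/edit">
  <head>
    <edit:deleted>
      <title>Old title</title>
    </edit:deleted>
    <edit:inserted>
      <title>New title</title>
    </edit:inserted>
  </head>
  <body>
    <p>...</p>
  </body>
</html>

In mode ‘new’, the content of ‘edit:deleted’ is allowed without being attached, so the first XHTML validation sees only the inserted ‘title’; in mode ‘old’ the roles are reversed, and the second validation sees only the deleted ‘title’.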

15. Related work

The fundamental idea of dividing the instance into sections, each of which contains elements from a single namespace, and then validating each section separately against the schema for its namespace originated in Murata Makoto's RELAX Namespace[relaxns]. ISO/IEC JTC1/SC34 (the ISO subcommittee responsible for Document Description and Processing Languages) is developing ISO/IEC 19757 Document Schema Definition Languages (DSDL) as a multi-part standard. A Committee Draft (CD) of Part 4: Selection of Validation Candidates[N363], which was based on RELAX Namespace, has been approved. Comments on the CD have been resolved[N415]. MNS[mns], the predecessor to NRL, was input to the CD comment resolution process. In response to MNS, Rick Jelliffe produced the Namespace Switchboard[nsswitch], which was also input to the CD comment resolution process. Some of the evolution of NRL from MNS was inspired by the Namespace Switchboard. A Final Committee Draft (FCD) of Part 4 is currently in preparation; NRL will be submitted as input.

At this stage, no guarantees can be made about how NRL will relate to the FCD. In the opinion of this document's author and of the DSDL Part 4 project editor (Murata Makoto), the functionality is likely to be similar, with the following possible exceptions:

However, the syntax may well be different. In particular:

  • Names of elements and attributes may be different.

  • Syntactic sugar for modes may be different. The FCD may not provide Section 13, “Mode inheritance”. The FCD may use nesting to avoid the need to name modes in some cases.

  • The FCD is expected to provide syntactic sugar for an action equivalent to ‘<attach useMode="x"/>’, where x is a built-in mode like ‘#allow’ except that it allows attributes as well as elements. The idea is to allow subschemas to use empty elements as placeholders.

  • The FCD is expected to provide a schema inclusion mechanism (not just using NRL as a subschema).

  • The FCD is expected to allow inline schemas, for example, by allowing ‘validate’ to have a ‘schema’ element containing the schema as an alternative to the ‘schema’ attribute containing the schema's URL.

The group working on DSDL (SC34/WG1) welcomes public discussion of DSDL. Comments on NRL would be useful input to the Part 4 FCD preparation process. See the DSDL web site[dsdl.org] for information on how to make comments.

Acknowledgements

Thanks to Murata Makoto and Rick Jelliffe for helpful comments.

Bibliography

[N363] Committee Draft of Document Schema Definition Languages (DSDL) -- Part 4: Selection of Validation Candidates, http://www.y12.doe.gov/sgml/sc34/document/0363.htm

[N415] Comment Disposition of Committee Draft Ballot of Document Schema Definition Languages (DSDL) -- Part 4: Selection of Validation Candidates, http://www.y12.doe.gov/sgml/sc34/document/0415.htm

[dsdl.org] DSDL Web Site, http://www.dsdl.org

[mns] Modular Namespaces (MNS), http://www.thaiopensource.com/relaxng/mns.html

[relaxcore] RELAX Core, http://www.xml.gr.jp/relax/

[relaxng] RELAX NG, http://relaxng.org

[wxs] W3C XML Schema, http://www.w3.org/TR/xmlschema-1/

Biography

James Clark has been involved with SGML and XML for more than 10 years, both in contributing to standards and in creating open source software. James was technical lead of the XML WG during the creation of the XML 1.0 Recommendation. He was editor of the XPath and XSLT Recommendations. He was the main author of the DSSSL (ISO 10179) standard. Currently, he is chair of the OASIS RELAX NG TC and editor of the RELAX NG specification. The open source software that James has written includes SGML parsers (sgmls and SP), a DSSSL implementation (Jade), XML parsers (expat and XP), an XPath/XSLT processor (XT), a RELAX NG validator (Jing), a schema conversion tool (Trang), and an XML mode for GNU Emacs (nXML mode). Prior to his involvement with SGML and XML, James wrote the GNU groff typesetting system. James read Mathematics and Philosophy at Merton College, Oxford, where he obtained First Class Honours. James lives in Thailand, where he runs the Thai Open Source Software Center.