XML Europe 2002

When Well-Formed is too much and Validity is too little

Abstract

XML well-formedness and validity are properties of documents at distribution time. But when a prose document is being created or edited, it will rarely be valid or even well-formed. This paper looks at three new or recent techniques.

1) Partial ordering is based on finding which tags (start or end) can feasibly appear before or after each other; infeasible markup can be discovered before the document is even well-formed.

2) Weak validation is based on strength-reducing schema (or DTD) particles so that all elements are optional. Again, only infeasible documents will cause validation errors.

3) Schematron's phases provide a managed way to express many different kinds of constraints, allowing documents to be validated first against some criteria and then against others, as suits the document's progress through a markup process.
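
To make the first technique concrete, the following is a minimal Python sketch of a partial-ordering check. It is not the algorithm used in any Topologi product; the toy grammar, the occurrence codes ("1", "?", "*", "+") and the function names are all assumptions made for illustration. The sketch derives which tag may immediately follow which, takes the transitive closure to get a "can appear somewhere later" relation, and then scans a tag stream that need not be well-formed.

```python
# Hypothetical toy grammar (an assumption, not from the paper):
# element -> ordered list of (child, occurrence), occurrence in "1", "?", "*", "+".
GRAMMAR = {
    "doc": [("head", "1"), ("body", "1")],
    "head": [("title", "1")],
    "title": [],
    "body": [("para", "+")],
    "para": [],
}


def immediate_follow(grammar):
    """Map each tag to the set of tags that may come directly after it."""
    follow = {}

    def add(a, b):
        follow.setdefault(a, set()).add(b)

    for parent, children in grammar.items():
        optional = [occ in "?*" for _, occ in children]
        # i == -1 stands for the parent's start tag, i >= 0 for a child's end tag.
        for i in range(-1, len(children)):
            if i == -1:
                src = f"<{parent}>"
            else:
                child, occ = children[i]
                src = f"</{child}>"
                if occ in "*+":
                    add(src, f"<{child}>")  # a repeatable child may start again
            j = i + 1
            while j < len(children):
                add(src, f"<{children[j][0]}>")
                if not optional[j]:
                    break  # a required child blocks everything after it
                j += 1
            else:
                add(src, f"</{parent}>")  # nothing required remains
    return follow


def can_follow(grammar):
    """Transitive closure: tags that may appear anywhere after a given tag."""
    reach = {a: set(bs) for a, bs in immediate_follow(grammar).items()}
    changed = True
    while changed:
        changed = False
        for bs in reach.values():
            extra = set().union(*(reach.get(b, set()) for b in bs)) - bs
            if extra:
                bs |= extra
                changed = True
    return reach


def first_infeasible(tags, grammar=GRAMMAR):
    """Return the first (before, after) pair that can never occur, or None."""
    reach = can_follow(grammar)
    for before, after in zip(tags, tags[1:]):
        if after not in reach.get(before, set()):
            return (before, after)
    return None
```

Under these assumptions, the stream `<doc> <head> <title> <para> </para> </doc>` is not well-formed (several tags are missing) but is still feasible, so `first_infeasible` returns `None`. The stream `<doc> <para> </para> <title>` is reported as `("</para>", "<title>")`, because in this toy grammar a title can never come after a paragraph has ended.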

These can be contrasted with "partial validity", where the children of every element must match, unambiguously, an initial part of that element's content model. The first wave of XML editors required well-formedness and usually provided only partial validity, or even required strict validity.
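
The difference between strict validity, partial validity and weak validation can be sketched in Python as follows. This is an illustration only: the content model representation, the occurrence codes and the function names are assumptions, and the matcher assumes a simple sequence model in which each child name occurs at most once (so matching is deterministic). Weak validation is modelled by rewriting every required particle as optional ("1" becomes "?", "+" becomes "*").

```python
# Hypothetical content model for one element: title, para+, note?
MODEL = [("title", "1"), ("para", "+"), ("note", "?")]


def consume(model, children):
    """Walk children through the model; return (index, count) or None on failure."""
    i, count = 0, 0  # current particle and occurrences of it consumed so far
    for name in children:
        while i < len(model):
            pname, occ = model[i]
            if name == pname and (count == 0 or occ in "*+"):
                count += 1
                break
            if count == 0 and occ in "1+":
                return None  # a required particle would have to be skipped
            i, count = i + 1, 0
        else:
            return None  # ran off the end of the model
    return i, count


def is_valid(model, children):
    """Strict validity: children match the whole content model."""
    state = consume(model, children)
    if state is None:
        return False
    i, count = state
    for k in range(i, len(model)):
        required = model[k][1] in "1+"
        if required and not (k == i and count > 0):
            return False
    return True


def is_partially_valid(model, children):
    """Partial validity: children form a prefix of some valid content."""
    return consume(model, children) is not None


def weaken(model):
    """Strength-reduce the model so that every particle is optional."""
    return [(name, {"1": "?", "+": "*"}.get(occ, occ)) for name, occ in model]


def is_weakly_valid(model, children):
    """Weak validation: strict validity against the weakened model."""
    return is_valid(weaken(model), children)
```

With this sketch, a lone `title` is partially valid but not strictly valid; a lone `para` fails partial validity (the required title was skipped) yet passes weak validation; and `note` followed by `title` fails all three checks, since no completion of the element could put those children in that order.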

Examples of these new kinds of validation may be demonstrated using the Topologi Schematron Validator (a free tool) and the Topologi Markup Editor (a state-of-the-art commercial product for release in 1Q/2002).
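
The idea behind phases (the third technique above) can be imitated in a few lines of Python. This is not Schematron syntax and not how either Topologi tool works; the phase names, the assertions and the `validate` function are hypothetical. The sketch only shows the principle: each phase activates its own set of assertions, so the same document can pass an early phase and fail a later one.

```python
import xml.etree.ElementTree as ET

# Hypothetical phases: each maps to the assertions active at that stage.
PHASES = {
    "drafting": [
        ("document has a title", lambda doc: doc.find("title") is not None),
    ],
    "final": [
        ("document has a title", lambda doc: doc.find("title") is not None),
        ("title is not empty",
         lambda doc: bool((doc.findtext("title") or "").strip())),
        ("at least one para", lambda doc: doc.find("para") is not None),
    ],
}


def validate(xml_text, phase):
    """Return the messages of the assertions that fail in the given phase."""
    doc = ET.fromstring(xml_text)  # requires well-formed input
    return [message for message, test in PHASES[phase] if not test(doc)]
```

For the draft `<doc><title/></doc>`, `validate(draft, "drafting")` returns an empty list, while `validate(draft, "final")` returns `["title is not empty", "at least one para"]`.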

The full paper was not available at the time the proceedings were created. Please check the conference web site, http://www.xmleurope.com, to find an updated version of this paper.

Biography

Rick Jelliffe is the founder and CTO of Topologi, a new company which develops innovative and productive tools for XML deployment. The company has three areas of expertise: productive editing tools for publishers, schemas, and internationalization. It is just bringing to market the "Topologi Markup Editor", a new design for XML editors. He has been a member of the W3C XML Schemas Working Group, the W3C Internationalization Interest Group, the W3C XML Interest Group, and the China/Japan/Korea Document Processing Group, and was the Australian delegate to the ISO Working Group on Document and Document Processing Languages, which developed SGML and DSSSL. He has worked in Australia, Taiwan and Japan, and was the instigator of the "Chinese XML Now!" project at Academia Sinica, Taipei, an unpublicized technical website that receives over 500,000 hits per year. He is the inventor of the Schematron schema language, the author of "The XML & SGML Cookbook", and a frequent contributor to XML forums and to open source code.