Abstract
Many web applications collect and manipulate user input as XML documents, usually by receiving input through forms. It is natural to think of a schema for a document created by the form to be the specification for the application's user interface. This schema defines the structure and type information for the documents moving in and out of the application. Incoming documents may be treated differently by the application depending on whether they conform to a schema.
Because W3C XML Schema is itself an XML vocabulary, XML Schema files can be transformed using XSLT into any number of target formats: HTML, XML, text, etc. Our goal was to transform XML Schemas into XForms user interfaces. By doing so we tie the user experience of the application directly to the application's data model. Changes in the data model are reflected in the user interface via a "push of a button", thus eliminating the need for tedious rewriting of presentation code.
There have been other efforts to construct interfaces from XML Schemas, but most of those we found are limited by the types of schemas they could consume. For instance, many make assumptions that there will be no named global types, or neglect imported types and extension. To be truly generic, a schema processor should be able to identify and process types and elements from any imported schema. It should be able to recognize when a type is extending or restricting another type, and locate, in any namespace, the parent type to process its content model as well. A generic processor also must be able to handle different schema encoding styles, such as Venetian Blind or Garden of Eden.
This paper describes some of the key design and implementation issues we addressed in creating a generic XSD to XForms processor. We then outline how the processor is being used today and how it might be improved in the future.
This presentation will discuss some of the key issues surrounding our implementation of automated methods for transforming a library of XML schemas into a user interface. This project is part of a joint research and software development project between the UC Berkeley Center for Document Engineering (CDE), the E-Berkeley initiative, and other Berkeley academic computing units. Our work is motivated by the CDE's commitment to developing model driven software development tools and techniques.
The efforts described here fulfill this commitment in two ways. First, our processor tightly couples user interfaces with an underlying data model expressed in an XML Schema. This helps ensure data integrity and consistency at all levels of the user application. Second, our processor represents a formal process for encoding user interfaces. The automated nature of the process allows us to develop complex user interfaces without the need to write much custom code. It also allows us to easily propagate changes in the data model up to the user interface level, again with a minimum of coding. We believe that our techniques are a simpler, more developer-friendly approach to creating and maintaining user interfaces. Our interfaces do not require complicated and obscure script code for validating and marshaling form data. We also minimize the danger that the application's data model and user interface go out of sync when one changes and the other does not.
The process described here is just one part of a larger effort at the CDE to develop an entire application framework built on XML and related technologies. For more information about the CDE's entire suite of research, see http://cde.berkeley.edu.
This paper will make heavy use of code examples for explaining our techniques. In the space provided here in the paper, we can only touch on a few key areas of the design. We encourage anyone who is interested to take a look at our code, which is available at http://cde.berkeley.edu/publications/uigen-xml-2003/, and contact the authors with any questions.
We are aware of a number of efforts similar to ours. We began this effort in late 2002, shortly before an article appeared in XML.COM that described transforming XML Schema into a user interface [Gropp 2003]. We applied some of Gropp's techniques for stepping through the XML Schema structure, but found that the approach in his article was limited to a single schema, while our requirements include support for schemas assembled from multiple files and namespaces.
In addition to articles, we experimented with the Chiba project, an open source XForms implementation that supports XForms generation from XSD. But we decided that a more lightweight solution that isn't tied to an entire XForms package better suited our needs.
Our choice of XForms as the target format for our transformation resulted from a number of important considerations. The Gropp article transforms a schema into an XHTML form, which is perhaps sufficient for a simple web form. However, in our case, we wanted to transform the schema into a more powerful and expressive format. We considered a few options.
First, we thought perhaps we could transform the schema into an instance, and then transform the instance into a UI. Members of our team associated with a product of CommerceOne are familiar with this technique. However, an instance, while it will have the structure necessary to build a UI for the application, lacks crucial typing information. There would be no way to construct radio buttons corresponding to the values of xs:enumeration elements if an instance is used to create the UI.
Next we considered two XML based user interface vocabularies: UIML, an effort of OASIS, and XForms, currently a W3C Proposed Recommendation. We found UIML interesting, but perhaps too abstract for our purposes, as it seems designed to describe any user interface running in any context - Web, Java Swing, MFC, etc. [UIML]. So we decided to use XForms as our target format. Since no commercial browser currently supports XForms, we have relied on a server side XForms implementation, Orbeon's OXF, for ultimate presentation to the user [Orbeon].
Choosing XForms as a target format has implications for how the schema to UI transform operates. The stylesheet must make at least two passes through the schema document to construct the XForms. The first pass creates the XForms Model element, which contains a complete instance document based on the schema. The second pass will create the XForms Body and all the UI controls. Both passes use similar template matching sequences and templates, but differ in their mode. See later sections of this paper for more examples of this, as well as a discussion of how the process might be modified in the future to support additional output formats.
Before discussing anything about the schema to XForms transformation process we should first describe the schemas themselves. Our efforts at the CDE on behalf of UC Berkeley began with a data modeling project around a system for approving new courses of instruction. We examined a number of campus systems and developed an enterprise data model for a course that a number of campus systems could share. We encoded this data model as a set of XML Schemas that formed the beginnings of what we named the Berkeley Academic Business Language, or BABL for short [BABL]. BABL itself used the Universal Business Language [UBL], an e-business standards initiative led by OASIS, as the foundation for its type system.
At the time of its design, BABL consisted of a number of schemas: one for a core set of components, another for a core Course type, and others that layered catalog, scheduling, and approval contexts on top of the core Course. Each higher level namespace imported the lower level ones. In addition, all the namespaces depended on the UBL Core Component Types for their primitive data types. The schema that described the document we initially wished to build an interface around relied on five namespaces, including itself, for constructing its type system. This amount of complexity far exceeded that of any example we found during the research stage of our project.
The above graphic shows the process for transforming a library of XML Schemas into an XForms user interface. First, a stylesheet transforms the schema containing the document element of the form's target instance into a complete XForms model and set of controls. Most of this presentation will focus on this first process. Next, a second stylesheet splits the first XForms file into two files: one that contains the xforms:model and another that contains all of the XForms controls. Finally a set of stylesheets, each of which corresponds to a single screen in the application, selects the controls for that screen while styling them using CSS. The screens' controls all point back to the original xforms:model.
Our discussion of the stylesheet which transforms the library schema into the complete XForms will confine itself to a few of the key problems we had to solve in order to implement a generalized solution that could work for any schema. These issues included:
Processing imported types
Schema encoding styles
Handling type extension
Rules for encoding XML Schema elements as XForms controls
The entire transformation is available for viewing at http://cde.berkeley.edu/publications/uigen-xml-2003/
Dealing with types that are imported from other namespaces posed a significant challenge when planning this transform. We investigated several possible solutions. We thought that perhaps, when validating a schema using Java, the entire schema (including imported types) would be loaded into a DOM tree that we could access and transform. We also investigated using the DOM level 3 ASBuilder class, as it contains a representation of a schema.
A fundamental goal of our research, however, is to test the limits of a pure XML solution, which led us to use XSLT. We found that the document() function in XSLT enabled us to load each of the imported schemas into a variable we could access to obtain type information. To match types across namespaces we also needed access to the prefix assigned to reference each imported schema. Rather than try to parse out the "xmlns" attributes in the xs:schema element of the main XSD and match them to xs:import statements and a schema document, we used xs:appinfo inside the xs:import statement to specify the prefix as follows:
<xs:import namespace="urn:ns1"
           schemaLocation="ns1.xsd">
  <xs:annotation>
    <xs:appinfo>
      <Prefix>ns1</Prefix>
    </xs:appinfo>
  </xs:annotation>
</xs:import>
The stylesheet begins processing by matching on each xs:import element and loading each imported schema into a global variable as follows:
<xsl:variable name="imports">
  <xsl:apply-templates select="/xs:schema/xs:import"/>
</xsl:variable>
<xsl:template match="xs:import">
  <xsl:element name="schema">
    <!-- get the Prefix element from the annotation and put its
         value into an attribute -->
    <xsl:attribute name="prefix">
      <xsl:value-of select="xs:annotation/xs:appinfo/Prefix"/>
    </xsl:attribute>
    <xsl:attribute name="urn">
      <xsl:value-of select="@namespace"/>
    </xsl:attribute>
    <xsl:copy-of select="document(@schemaLocation)"/>
  </xsl:element>
</xsl:template>
To access a type in one of the imported schemas, the stylesheet examines the prefix from the type name and locates the appropriate schema like so:
<!-- declaration of an element whose type is located in an imported schema -->
<xs:element name="Foo" type="ns1:FooType"/>
<xsl:template match="xs:element[@name]">
  <!-- ... -->
  <!-- separate the prefix from the name of the type -->
  <xsl:variable name="typeNS" select="substring-before(@type,':')"/>
  <xsl:variable name="typeName" select="substring-after(@type,':')"/>
  <!-- match on the type in the imported schema,
       using the prefix and the $imports variable -->
  <!-- the 'ns1' schema has been loaded into the
       $imports variable -->
  <xsl:apply-templates select="$imports/schema[@prefix=$typeNS]/xs:schema/xs:complexType[@name=$typeName]"/>
  <xsl:apply-templates select="/xs:schema/xs:complexType[@name=$typeName]"/>
  <!-- ... -->
</xsl:template>
Under XSLT 1.0, the $imports variable is a result tree fragment, and the above xsl:apply-templates instructions will not work. There are two solutions to this problem. The first is to change the "version" attribute on the xsl:stylesheet element to 1.1. This works with current versions of Saxon or Xalan, and is the solution we used. The second is to use the exslt:node-set() function to convert the $imports result tree fragment into a node set and make its contents accessible via XPath expressions.
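The prefix-based lookup described above can also be sketched outside XSLT. The following Python fragment is an illustration only, not the authors' stylesheet: the two inlined schemas, the `fetch` stand-in for the XSLT document() function, and the function names are all assumptions made for the example. Unlike the stylesheet, which tries both the imported and the local schema, this sketch picks one of them based on whether the prefix was declared on an import.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schemas, inlined so the sketch is self-contained.
MAIN_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns1="urn:ns1">
  <xs:import namespace="urn:ns1" schemaLocation="ns1.xsd">
    <xs:annotation><xs:appinfo><Prefix>ns1</Prefix></xs:appinfo></xs:annotation>
  </xs:import>
  <xs:element name="Foo" type="ns1:FooType"/>
  <xs:element name="Bar" type="BarType"/>
  <xs:complexType name="BarType"/>
</xs:schema>
"""
NS1_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:ns1">
  <xs:complexType name="FooType"/>
</xs:schema>
"""


def fetch(location):
    """Stand-in for document(@schemaLocation); reads from memory, not disk."""
    return ET.fromstring({"ns1.xsd": NS1_XSD}[location])


def load_imports(main_root, fetch_schema):
    """Map each prefix declared in xs:appinfo to its imported schema root."""
    imports = {}
    for imp in main_root.findall("xs:import", NS):
        prefix = imp.findtext("xs:annotation/xs:appinfo/Prefix", namespaces=NS)
        imports[prefix] = fetch_schema(imp.get("schemaLocation"))
    return imports


def find_complex_type(qname, main_root, imports):
    """Locate a named complex type by the prefix on its qualified name."""
    prefix, _, local = qname.rpartition(":")
    # Fall back to the main schema when the prefix is absent or not imported.
    root = imports.get(prefix, main_root)
    return root.find(f"xs:complexType[@name='{local}']", NS)


main = ET.fromstring(MAIN_XSD)
imports = load_imports(main, fetch)
foo_type = find_complex_type("ns1:FooType", main, imports)
print(foo_type.get("name"))
```

Keeping the prefix in a dictionary key plays the same role as the "prefix" attribute on the schema wrapper elements held in $imports.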
The stylesheet will transform a schema using either the "Venetian Blind" or "Garden of Eden" encoding style. A Venetian Blind style schema has globally scoped types and locally scoped elements, except for its global "document" element(s) that can be the root element of an XML instance. A Garden of Eden style schema has globally scoped elements and types [Maler 2002]. We chose to support both of these encoding styles, as we use schemas with both styles. UBL has chosen to release its schemas according to the "Garden of Eden" style, which is the main reason why we chose to use it too. Supporting both of these styles was not terribly difficult. The transform, as it walks the document tree, only needs one extra step to look for referenced elements as well as named element declarations. If it matches on a referenced element, it will search for the global element declaration the same way we looked for the element's type in the above example:
<xsl:template match="xs:element[@ref]" mode="buildModel">
  <xsl:variable name="elementNSPrefix" select="substring-before(@ref, ':')"/>
  <xsl:variable name="elementName" select="substring-after(@ref,':')"/>
  <!-- first check for the referenced element in an imported schema
       before looking locally -->
  <xsl:apply-templates select="$imports/schema[@prefix=$elementNSPrefix]/xs:schema/xs:element[@name = $elementName]" mode="buildModel"/>
  <xsl:apply-templates select="/xs:schema/xs:element[@name=$elementName]" mode="buildModel"/>
</xsl:template>
Types which extend other types posed yet another challenge. When building a form based on a complex type which extends another type, we don't want to leave out the parent type's contents. If the stylesheet matches on an xs:extension element, it will find and process the extended type before processing the current type. As before, the stylesheet will search both the imported and local schemas for the base type being extended. See the following example:
<xsl:template match="xs:extension" mode="buildModel">
  <!-- separate the prefix and the name of the base type -->
  <xsl:variable name="typeNS" select="substring-before(@base,':')"/>
  <xsl:variable name="typeName" select="substring-after(@base,':')"/>
  <!-- find the base type and process it -->
  <xsl:apply-templates select="$imports/schema[@prefix=$typeNS]/xs:schema/xs:complexType[@name=$typeName]" mode="buildModel"/>
  <xsl:apply-templates select="/xs:schema/xs:complexType[@name=$typeName]" mode="buildModel"/>
  <!-- get the elements added via extension -->
  <xsl:apply-templates select="*" mode="buildModel"/>
</xsl:template>
Handling extension when building controls (see the next section for more detail) presented a challenge, as we had to suppress the normal rules for mapping an xs:complexType element to a control when matching on that type as a result of an extension. In this case, the type's content model should appear as part of the control created for the extending type, without creating a control for the extended type. This was accomplished using stylesheet parameters that indicate whether an extension has occurred.
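The "base type first, then the extension's own content" ordering described above can be illustrated with a short recursive walk. This Python sketch is not the authors' code: the PersonType and StudentType schema and the `content_elements` function are hypothetical, and for brevity it searches a single schema rather than the imported ones as well.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema: StudentType extends PersonType.
SCHEMA = ET.fromstring("""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="StudentType">
    <xs:complexContent>
      <xs:extension base="PersonType">
        <xs:sequence>
          <xs:element name="Major" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
""")


def content_elements(type_name, schema):
    """Return the child element names of a complex type, base type first."""
    ctype = schema.find(f"xs:complexType[@name='{type_name}']", NS)
    if ctype is None:
        return []
    names = []
    body = ctype
    ext = ctype.find("xs:complexContent/xs:extension", NS)
    if ext is not None:
        # Process the extended (parent) type before the extending type.
        base_name = ext.get("base").rpartition(":")[2]
        names += content_elements(base_name, schema)
        body = ext
    names += [e.get("name") for e in body.findall("xs:sequence/xs:element", NS)]
    return names


print(content_elements("StudentType", SCHEMA))
```

Because the parent's elements are merged into the child's list rather than emitted as a separate unit, the result corresponds to one control for the extending type, with no separate control for the extended type.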
An XForms document has two main parts: a model which contains an XML instance which describes the data structure for the form's output, and a set of controls that are bound to the elements and attributes in the model. The binding can be specified by xforms:bind elements or with "ref" attributes on the controls. Each ref attribute contains an XPath expression tying the data collected by the control to a single element or attribute in the instance. This quick example shows how to bind an input control to the "/Person/Name" element in an XForms model:
<xforms:input ref="/Person/Name"/>
The schema to XForms stylesheet must build both an XForms model and the set of controls that bind to the model. To do this, the stylesheet makes two passes through the schema document. Both passes follow identical matching rules through the schema(s), and the stylesheet uses modes to differentiate the output from each pass. The stylesheet first uses the "buildModel" mode, which simply converts the schema into an instance document. The second mode, "buildUI" follows the same matching path and builds controls. Since XPaths are used to identify controls, each template with mode="buildUI" must accept a parameter that holds the current XPath in the instance document that was created by the "buildModel" templates. Each time the stylesheet matches on an xs:element element inside an xs:sequence, it appends the element's name to the current XPath, and recursively processes that element before returning to process the next element in the sequence.
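The two passes and the XPath parameter threaded through the second pass can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the stylesheet itself: the Person schema is hypothetical, it uses anonymous inline types only, and `build_model` and `build_ui` merely borrow the names of the two modes.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema using inline anonymous types to keep the walk short.
SCHEMA = ET.fromstring("""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="City" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
""")


def children(decl):
    """Element declarations nested in this declaration's sequence."""
    return decl.findall("xs:complexType/xs:sequence/xs:element", NS)


def build_model(decl):
    """Pass 1: turn an element declaration into an empty instance element."""
    node = ET.Element(decl.get("name"))
    for child in children(decl):
        node.append(build_model(child))
    return node


def build_ui(decl, parent_path=""):
    """Pass 2: same walk, carrying the current XPath; return leaf ref paths."""
    path = f"{parent_path}/{decl.get('name')}"
    kids = children(decl)
    if not kids:
        return [path]
    refs = []
    for child in kids:
        refs += build_ui(child, path)
    return refs


root_decl = SCHEMA.find("xs:element", NS)
model = build_model(root_decl)
refs = build_ui(root_decl)
print(refs)
```

Both functions recurse through the same `children` helper, which mirrors the way the two modes share one matching path and differ only in what they emit.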
The stylesheet follows a few simple rules for building the controls. Please see the code for examples of how the rules are implemented.
Attributes, and leaf elements without enumerated content, become xforms:input elements.
Leaf elements with enumerated content that can appear more than once in an instance document become xforms:select elements.
Leaf elements with enumerated content that appear once only in an instance become xforms:select1 elements.
Elements of complex type that appear once in an instance become xforms:group elements.
Elements of complex type that can appear more than once in an instance become xforms:repeat elements. Additionally, the stylesheet will create xforms:trigger elements to add to and delete from the repeating list.
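The mapping rules listed above can be summarized as a small decision function. The Python sketch below is an illustration only; the function and parameter names are hypothetical, and it assumes that "can appear more than once" is decided from the maxOccurs attribute.

```python
def can_repeat(max_occurs="1"):
    """True when a declaration may occur more than once (assumed from maxOccurs)."""
    return max_occurs == "unbounded" or int(max_occurs) > 1


def choose_control(is_leaf, has_enumeration, repeats):
    """Pick the XForms control name for an element, following the rules above."""
    if is_leaf:
        if not has_enumeration:
            return "xforms:input"
        return "xforms:select" if repeats else "xforms:select1"
    return "xforms:repeat" if repeats else "xforms:group"


# A repeating element of complex type becomes an xforms:repeat.
print(choose_control(is_leaf=False, has_enumeration=False,
                     repeats=can_repeat("unbounded")))
```

The triggers that add to and delete from a repeating list are not modeled here.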
Labels for XForms controls are expressed as xforms:label elements inside the control. We copied this idea in our XSD and specified each element's label in its declaration. For the following example, the stylesheet will convert FirstName into an xforms:input which contains an xforms:label with value "First Name":
<xs:element name="FirstName" type="xs:string">
<xs:annotation>
<xs:appinfo>
<Label>First Name</Label>
</xs:appinfo>
</xs:annotation>
</xs:element>
One important and unintended consequence of the transformation and its rules for creating controls is that the rules require the schema designer to be explicit and declarative about the underlying data model. This goes beyond the requirement that there be no wildcards such as xs:any in the schema's model. In fact, the designer must even avoid using abstract types and placeholder elements for semantically equivalent but structurally divergent elements.
For example, our original design of the BABL Course modeled prerequisites in the following way. A prerequisite for a course might be the completion of another course, a minimum GPA, a major, etc. All these things might be expressed as Prerequisite elements, yet structurally their types share nothing in common. Our original design called for an abstract and empty PrerequisiteType, that the different types would all derive from, but the different derivations shared nothing in common with each other. The declaration of a placeholder Prerequisite element of the abstract base type would be replaced in the instance with elements of the derived types.
While elegant from a modeling perspective, this approach wreaked havoc on our attempts to use the schema to automatically build a user interface. Imparting the transform with enough intelligence to recognize this pattern and find all the types that could possibly be used in place of the abstract type seemed much more difficult an effort than to just give the elements in our schema meaningful names. This sort of discipline only served to make us better modelers, because it forced us to be explicit about our model and to understand how our model might be used beyond just validation of XML instance documents.
A large online form, such as the one central to the application this process was designed for, ought to be split over multiple screens so as to not overwhelm the user (or the user agent). XForms allows the developer to associate a screen of controls with an xforms:model and XML instance either embedded in the same file as the controls, or in an external file. Multiple screens can even point to the same xforms:model. As the user completes each screen, the external instance will be updated with that screen's results.
The initial XSD to XForms transformation produces a file that contains a complete xforms:model and set of controls. A second, very simple transform splits them into two files. The xforms:model element will be referenced by the screen(s) created with the controls.
The XForms file generated by this process can be quite large. One of our schemas for an application transforms into an XForms file over 2500 lines long. Many of the controls in that file mapped to elements which were not relevant to the end application user. For instance, since we used UBL types, metadata attributes such as "languageCode" corresponded to input controls. We're not about to ask users to specify the language code of every bit of text they type in, so such controls should be filtered out of the user interface. Also, the controls need to be styled using XHTML div elements and CSS. Finally, the controls might be broken up over multiple screens.
The CDE applications that currently use the XSD to XForms transform rely on custom stylesheets that perform all three of the above functions. These stylesheets identify controls by their "ref" attribute XPath, and copy them to a result XForms. We plan to eventually automate this process using a configuration file. The configuration file will contain a list of XPaths and styling information, and a processing program, either a stylesheet or Java, will select controls based on their XPaths, style them accordingly, and output them into an XForms document for a single screen.
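Since the configuration-driven selection described above is still a plan, the following Python sketch is speculative. The configuration format, the `css_class` key, the sample controls, and the `select_controls` function are all assumptions made for illustration; only the idea of selecting controls by their "ref" XPath and styling them comes from the text.

```python
import xml.etree.ElementTree as ET

XF = "http://www.w3.org/2002/xforms"

# Hypothetical generated controls; languageCode is metadata users never edit.
CONTROLS = ET.fromstring(f"""
<controls xmlns:xforms="{XF}">
  <xforms:input ref="/Course/Title"/>
  <xforms:input ref="/Course/Title/@languageCode"/>
  <xforms:input ref="/Course/Units"/>
</controls>
""")

# Hypothetical per-screen configuration: which refs to show and how to style them.
SCREEN_CONFIG = [
    {"ref": "/Course/Title", "css_class": "wide"},
    {"ref": "/Course/Units", "css_class": "narrow"},
]


def select_controls(controls, config):
    """Copy the configured controls into styled div wrappers for one screen."""
    by_ref = {control.get("ref"): control for control in controls}
    screen = ET.Element("div")
    for entry in config:
        control = by_ref.get(entry["ref"])
        if control is None:
            continue  # configured ref has no generated control; skip it
        wrapper = ET.SubElement(screen, "div", {"class": entry["css_class"]})
        wrapper.append(control)
    return screen


screen = select_controls(CONTROLS, SCREEN_CONFIG)
print([c.get("ref") for c in screen.iter(f"{{{XF}}}input")])
```

Controls that are not listed, such as the languageCode input, are simply left out of the screen.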
Several projects maintained by the CDE started with the XML Schema to XForms transformation described above. However, since no commercial quality user agents natively support XForms yet, an intermediate step must intervene to transform the XForms instances into XHTML or some other presentation-specific language. The CDE's application suite rests on top of OXF, an XML-based application development platform providing a server-side implementation of XForms, which displays XHTML forms to the end user and maps traditional form name-value pairs to XML instance documents. Since the majority of web browsers will probably not provide native XForms support in the near future, this workaround allows application developers to exploit the power and flexibility of XForms without relying on a client-side implementation.
One can imagine many valuable extensions to and applications of the existing XSD to XForms transformation process. For example, more flexibility over the look and feel of the final output could be achieved by incorporating an XPath-based configuration file to aid in determining fine-grained correlations between XForms controls and their GUI counterparts. The transformation could be modified to allow for multiple output target formats, such as SVG, PDF, or Microsoft's InfoPath. Or, a web-based application allowing users to edit the very schemas driving the application in real time could be developed.
Two applications within the CDE's domain rely heavily on a data model driven approach to software development, so XForms processing provides a convenient means for coupling schemas and interfaces. Though quite different in scope, both applications spent their nascent days as data modeling projects, ultimately producing schemas defined in XSD and subsequently transformed into XForms. One application rests on top of complicated schemas that contain many imports and type extensions, while the other uses a trivial schema definition with no imports or extensions at all. While transforming a simple schema into an XForms instance may seem unimportant, supporting more complex varieties proves extremely convenient.
OXF supports a limited subset of the XForms specification (the XForms "processor" is based on the W3C's Candidate Recommendation, not the current Proposed Recommendation), so certain features are lacking. Furthermore, since XForms was designed for use on the client side, some of its features have no obvious server side parallels. For example, the XForms Recommendation defines a straightforward way to validate user data on the client side before form submission, by associating data typing and other constraint information with controls. Server side XForms technologies by their nature do not include an analogous mechanism to achieve such results.
To solve this problem, staff researchers at the CDE developed a simple user interface schema and UI-specific transformation to aid in complex user interface generation. The transformation provides mechanisms to both mimic XForms' client side data validation capabilities and dictate XForms control appearance in the XHTML sent in the response. An XSLT template (shown below) called "InitJavascript" automatically generates JavaScript functions that check whether certain elements are required, and whether or not they conform to the data types mandated in the XForms instance. While only date and number formatting functions are defined, this toolset could easily be extended to include constraints on string length or ranges for numbers and dates. In addition, error message functions are defined in this context, so that errors are displayed inline to users immediately if they enter bad data, eliminating the need for irritating alert windows and, more importantly, round trip form submission. Finally, the XHTML form output is grouped together logically and styled with CSS, allowing the document to maintain most of the semantic information originally encoded in the XForms instance.
<xsl:template name="InitJavascript">
function Init() {
var el = null;
<!-- loop through each Control element with a declared
type attribute and with an XForms output descendant -->
<xsl:for-each select="//ui:Control[@type and descendant::xr:*]">
el = document.getElementById("<xsl:value-of
select="generate-id()"/>");
el.onchange = <xsl:choose>
<xsl:when test="@type='shortdate'">_HandleDate</xsl:when>
<xsl:when test="@type='float'">_HandleFloat</xsl:when>
<xsl:otherwise>function() {}</xsl:otherwise>
</xsl:choose>;
el.isRequired = <xsl:choose>
<!-- emit a JavaScript boolean; copying the raw attribute value
'required' would produce an undefined identifier -->
<xsl:when test="@required='true' or @required='required'">true</xsl:when>
<xsl:otherwise>false</xsl:otherwise>
</xsl:choose>;
el.isDirty = false;
el.isValid = null;
el.showErrors = "inline";
el.errorMsgId = "<xsl:value-of select="generate-id()"/>Err";
el.showErrorMsg = _ShowErrorMsg;
<xsl:if test="@type='float'">
//format number
if(el.value!="")
el.value = parseFloat(el.value).formatFloat();
</xsl:if>
</xsl:for-each>
}
if(document.createElement) window.onload = Init;
</xsl:template>
The state of the art of XML Schema to XForms transformation leaves much room for exploration and improvement. For example, methods could be incorporated to handle situations where elements can occur variable numbers of times. In addition, decoupling the processing logic from the templates controlling output format would facilitate support for multiple display formats, such as PDF or SVG.
The templates that build the xforms:model presently disregard the minOccurs and maxOccurs attributes of the elements they encounter. Instead they assume that any element type encountered during Schema processing should be instantiated as a single element of its specified type and copied to the output tree. While we could update the code to handle schemas in which these attributes have arbitrary values, OXF's lack of support for the xforms:repeat element limits implementation.
If an XML Schema were to dictate that an element must appear with minimum and maximum occurrence of n, our code could be easily changed without affecting working applications. The transformation would simply generate n elements of a certain type for output to the model. Another template would create n control elements assigned to the model in one-to-one correspondence using XPath predicates to bind positions 1, 2, ..., n. Unfortunately, this solution provides no insight into how to deal with elements whose XSD definition allows variable numbers of element instances. When OXF starts supporting xforms:repeat, this will no longer be a problem. If the minOccurs attribute equals zero or one, one element of that type will be copied to the output tree, along with a control that cues the user whether or not the input is necessary. If minOccurs is greater than one, the number specified in the minOccurs attribute will determine the number of elements copied to the model's portion of the output tree. Furthermore, maxOccurs will provide an upper bound as to how many elements of a certain type can successfully be added by a user.
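The occurrence rules proposed above are simple enough to state as arithmetic. The Python sketch below is an illustration of those rules only; none of it is implemented in the stylesheet described here, and the function names and the /Course/Instructor path are hypothetical.

```python
def initial_instance_count(min_occurs="1"):
    """Elements copied to the model: one for minOccurs 0 or 1, else minOccurs."""
    return max(1, int(min_occurs))


def may_add_another(current_count, max_occurs="1"):
    """maxOccurs is the upper bound on how many elements a user can add."""
    return max_occurs == "unbounded" or current_count < int(max_occurs)


def positional_refs(path, n):
    """Bind n controls one-to-one using XPath predicates [1], [2], ..., [n]."""
    return [f"{path}[{i}]" for i in range(1, n + 1)]


# A fixed occurrence count of 2 yields two positionally bound controls.
print(positional_refs("/Course/Instructor", initial_instance_count("2")))
```

The fixed-count case needs only `positional_refs`; the variable-count case is the one that depends on xforms:repeat support.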
Because we assumed that XForms would be the target output of the transformation, processing logic templates are tightly coupled with output types within the xforms namespace. We could change this assumption and separate program flow from output, allowing support for any number of forms languages. Decoupling would introduce several benefits. First, format specific language would be independent of processing flow. Second, a library of different stylesheets could be developed and imported into a master formatting stylesheet as needed. The CDE projects mentioned above use XHTML as a target display format; however, presentation is not limited to vanilla web forms. Instead, transformations could yield PDF, SVG or Microsoft InfoPath based forms that still tie themselves directly to an application's underlying data model.
<xsl:template match="xs:complexType" mode="buildUI">
  <xsl:choose>
    <xsl:when test="$output = 'xforms'">
      <xsl:apply-templates select="." mode="xforms"/>
    </xsl:when>
    <xsl:when test="$output = 'svg'">
      <xsl:apply-templates select="." mode="svg"/>
    </xsl:when>
    <xsl:when test="$output = 'xhtml'">
      <xsl:apply-templates select="." mode="xhtml"/>
    </xsl:when>
  </xsl:choose>
</xsl:template>
The authors wish to thank our fellow researchers Peter Charles, Bob Daly, Marc Gratacos, John Leon, Justin Makeig, Scott McMullan, and Calvin Smith.
We also would like to thank the CDE director Dr. Bob Glushko, and our co-advisor Brian Hayes.
[Gropp 2003] Gropp, Eric. "Transforming XML Schemas." XML.com. http://www.xml.com/pub/a/2003/01/15/transforming-schemas.html?page=1