Abstract
The Universal Business Library (UBL) has embraced many of the object-oriented aspects of W3C XML Schema (XSD) in its approaches to versioning, customization of schemas, and overall library management. The type-aware features of XSD are a critical element in enabling the broadest possible degree of interoperability across versions and between customized schemas sharing a common base. These features of XSD are clearly intended to offer users a great deal of power within applications, yet for a number of reasons they are not widely appreciated.
One of the main reasons is that the kinds of type-aware processors that this approach relies on for maximum benefit are only now starting to appear. XPath version 2.0 will embrace XSD's type-awareness, allowing matching not only on element names but on elements sharing a common type anywhere in their tree. Making XSLT type-aware will bring about a major shift, enabling the object-oriented direction embraced by the UBL design.
Other applications will also increasingly rely on a type-aware approach, using information found in the post schema validation infoset ("PSVI") to exhibit object-oriented behaviors consistent with the XSD specification.
Building on a series of talks that the authors have presented over the last couple of years both at USA XML200X and XMLEurope200X, this paper reviews UBL use cases for type-awareness, including versioning and namespace packaging, customization and the automation of "business context", as well as issues around interoperability. For each case, the behaviors needed in applications will be demonstrated, with an emphasis on schema and instance examples, the PSVI, and requisite application behaviors. Those standards that are expected to be important for the creation of type-aware applications will be highlighted (including XPath 2.0, XSLT, and XSD). The availability of tools and their functionality in this area will be covered. Further, the implications of this strategy for application architectures and business automation strategies will be explored.
The specific benefits and liabilities of this approach will be explored and discussed. The UBL design justifications for the choice of an object-oriented approach to these difficult issues will be presented from the perspective of those implementing e-business systems.
With the advent of type-aware specifications and their utilization by applications like UBL, we can truly say "now we're cooking" -- and this paper attempts to show implementors exactly how UBL (and, by extension, other type-aware XML applications) can serve as a powerful basis for creating interoperable e-business applications.
W3C's XML Schema Language (XSD) promised to make our XML applications more "object-oriented," a promise which has been kept. There are a large number of features touted as "object-oriented," and some of these features have been used in the design of the Universal Business Library (UBL). This paper examines the functional use of object-oriented aspects of XSD in the UBL schema design, and what the implications are for application design and implementation.
UBL is an initiative that has its roots in the ebXML initiative, particularly in the Core Components effort within ebXML, being based on the Core Components Technical Specification now available from the TBG group within UN/CEFACT. UBL is an OASIS Technical Committee, with a version 1.0 release of the library that should be available in a matter of weeks, as it goes through the OASIS standardization process.
The focus of UBL is on business-to-business e-commerce, and the initial release contains core e-business documents such as purchase orders and responses, invoices, despatch advices/advance ship notices, etc. It is also designed and intended to serve as a library of core XML types for use within other, non-UBL documents. An inherent part of the UBL design is the assumption that users will need to extend it to meet their own application needs within industry verticals, specific countries, etc.
The major object-oriented features used within the UBL library design are the XSD extension and restriction elements. These provide additive extension and subtractive refinement capabilities. The use of this approach requires that all constructs be described in types. Because the schemas contain information about type relationships – the type hierarchy – it is possible for applications to exhibit “polymorphic” behaviors. These behaviors consist of substituting one element construct for another, because the type hierarchy can guarantee that the substituted element is a complete instance of the one required by the content model. Substituted elements are “descendants” (in an inheritance sense) of their required parents.
There are two major use cases within UBL for the use of object-oriented features: capturing the relationships of objects across minor versions, and expressing the customizations necessitated by its use in various contexts. There are rules regarding how these versioning and customization relationships are expressed in UBL, which we will summarize briefly here:
All modifications made to schemas to reflect customizations or minor versioning changes must be expressed using XSD additive extension and/or subtractive refinement.
All minor versions exist in their own namespaces, and import the namespace of the preceding minor version.
All customizations exist within their own namespace, which imports the UBL namespace containing the types to be customized.
Note that the conventions around the use of namespaces - while very much in line with the intent of that standard - are required by the UBL specification, rather than being a part of the underlying XSD or XML namespaces specifications. XSD offers us features such as additive extension and subtractive refinement, and allows us to import namespaces into one another. The UBL conventions for versioning and customization rely on these features, but further constrain their use.
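As a rough illustration of these conventions, a minor-version schema might be laid out along the lines of the sketch below. The namespace URIs, the file name Order-1.0.xsd, and the OrderType and DeliveryDate names are placeholders invented for this example; they are not actual UBL identifiers.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical version 1.1 schema, packaged in its own namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:v10="urn:example:order:1.0"
            xmlns:v11="urn:example:order:1.1"
            targetNamespace="urn:example:order:1.1"
            elementFormDefault="qualified">

  <!-- The 1.1 namespace imports the namespace of the preceding minor version -->
  <xsd:import namespace="urn:example:order:1.0"
              schemaLocation="Order-1.0.xsd"/>

  <!-- The change is expressed as an additive extension of the 1.0 type -->
  <xsd:complexType name="OrderType">
    <xsd:complexContent>
      <xsd:extension base="v10:OrderType">
        <xsd:sequence>
          <xsd:element ref="v11:DeliveryDate"
                       minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:element name="DeliveryDate" type="xsd:date"/>
</xsd:schema>
```

A customization would follow the same pattern, with the customizer's own namespace as the target and the relevant UBL namespace as the import.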
The intent of these features in UBL is to provide for a maximum degree of interoperability across minor versions, and between users of customized and non-customized documents. The diagrams below illustrate these two cases.
For versioning, we can see that an application which supports an earlier version of a standard document type (in this case, Trading Partner B, whose application only understands versions 1.0 or 1.1 of the document type in question) can process later minor versions of that document type as if they were of the closest related version (in this case, Trading Partner A transmits a version 1.3 of the document type, so it can be processed as a version 1.1 document type, which is the closest thing to version 1.3 Trading Partner B understands).

This is because the type relationships between minor versions guarantee backward compatibility between these two related-but-different instances. Note that a type-aware system will allow this to happen without requiring any changes on the part of the receiving application: it was built to understand versions 1.1 and 1.0, and that is all it needs to understand. A standard XML parser "reports" the version 1.3 document as if it were a version 1.1 document. (Note that this is not without some danger of data loss when extensions are used: components introduced in 1.3 are ignored when the document is processed as if it were a 1.1 one - see Section 3, “Application Design”, below.)
Customization works along similar lines, but here the picture can become slightly more complex. If we know the type derivations of a given customization, and are capable of processing any ancestor in the type hierarchy of the received document, a standard parser can show us the customized document as if it were standard. So, in the diagram below, even though Trading Partner B only understands their own customization, or a standard UBL document, Trading Partner A can send their own, different customization. Although Trading Partner B may not have the customization they might like, they can at least process the incoming document as something they understand - the standard version of the document type.

Clearly, this is not an ideal solution to the need for interoperability between trading partners, but it does begin to reduce the burden of version transformations and transformation to handle customizations of document types.
Be aware that while these examples are couched in terms of document types, the leveraging of type hierarchies in this way operates at all levels of the XML structure, functioning as much for a lower-level construct as for a document type.
Without XSD, XML technologies do not have the basic awareness of types that is required for supporting polymorphic processing. When processing an XML instance in such fashion, knowledge of its type hierarchy is needed, so we must have access to the schemas. Only the schemas describe fully the set of object-relations that we need to support this functionality. But, since there is a set of relationships between namespaces (as described above) that also forms a critical part of the picture in a UBL application, this functionality can get complicated, as well.
While having access to the schemas at run-time may seem an obvious requirement, it should be noted that many processors today do not parse instances against their schemas at run-time, typically for reasons of speed of processing. For polymorphic processing, this is not an option.
There are implications for how schemas are designed, too. Schemas are required to declare types for all of their constructs. This is not a particularly onerous burden, as many common approaches to schema design advocate this already. There are three well-known designs, first named by Roger Costello of xfront.com. One of these well-known approaches is called the "Venetian Blind" approach, in which all elements reference declared types, rather than being directly declared within the content models of the elements that use them. A related - but slightly different - approach is used within UBL, called the "Garden of Eden" approach. (Note that this name was created by Eve [Maler], who seemed to prefer it to the suggested alternative, "Garden of the Blind.") This approach further requires the use of global names for all constructs, whether types or elements, and so - for our purposes - is functionally the same. (These approaches can be contrasted to the “Russian Doll” type of design, for example, where all elements are declared locally within their types, or to much flatter designs resembling DTDs where there is simply no type information provided at all. XSD allows many design choices.)

(Where the yellow squares are elements and the blue rounded lozenges are types)
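The difference between the two type-based designs can be sketched in a few declarations. The schema below is only an illustration: the namespace and all type and element names are invented for the purpose, and both styles are placed in one schema purely for side-by-side comparison.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ex="urn:example:party"
            targetNamespace="urn:example:party"
            elementFormDefault="qualified">

  <!-- Venetian Blind: the type is global, its child elements are local -->
  <xsd:complexType name="VenetianAddressType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="CityName" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Garden of Eden: types and elements are all global,
       and content models point at elements with ref -->
  <xsd:simpleType name="StreetType">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  <xsd:simpleType name="CityNameType">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>

  <xsd:element name="Street" type="ex:StreetType"/>
  <xsd:element name="CityName" type="ex:CityNameType"/>

  <xsd:complexType name="EdenAddressType">
    <xsd:sequence>
      <xsd:element ref="ex:Street"/>
      <xsd:element ref="ex:CityName"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="Address" type="ex:EdenAddressType"/>
</xsd:schema>
```

In both styles every construct has a named type, which is what extension and restriction need in order to have something to derive from.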
One of the keys to this use of features in the XSD specification is that conformant, generic XML parsers must exhibit a useful behavior: they allow an extended or refined XML element to be substituted for an element that is an ancestor in their type hierarchy.
Thus, even though our example schema requires element "X", the parser allows element "Y" - an extension of "X" - to be substituted for "X" in the XML instance.
For instance, given a schema that has a complex type called Address which references the elements Street, Number, CityName, and Country, one could have a different schema that imports the first one and then extends this type, adding an element called PostalZone. Having done this, one can then use the extended type as desired by using, instead of the element <Address>, the element <Address xsi:type="mod:MyAddress">.
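A minimal sketch of the extending schema might look as follows. The type names Address and MyAddress, the element PostalZone, and the prefix mod come from the description above; the two namespace URIs, the file name base.xsd, and the choice of xsd:string for PostalZone are assumptions made only for this example.

```xml
<!-- Hypothetical customization schema; both urn:example namespaces are placeholders -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:base="urn:example:base"
            xmlns:mod="urn:example:mod"
            targetNamespace="urn:example:mod"
            elementFormDefault="qualified">

  <xsd:import namespace="urn:example:base" schemaLocation="base.xsd"/>

  <!-- MyAddress inherits Street, Number, CityName and Country from
       base:Address and appends PostalZone to the end of the content model -->
  <xsd:complexType name="MyAddress">
    <xsd:complexContent>
      <xsd:extension base="base:Address">
        <xsd:sequence>
          <xsd:element name="PostalZone" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

An instance using the extended type could then be written as below, assuming that the base schema declares the four inherited children as simple string elements in this order. The values are invented.

```xml
<Address xmlns="urn:example:base"
         xmlns:mod="urn:example:mod"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:type="mod:MyAddress">
  <Street>South Hollow Road</Street>
  <Number>12</Number>
  <CityName>London</CityName>
  <Country>UK</Country>
  <mod:PostalZone>00000</mod:PostalZone>
</Address>
```

The element name stays Address, so a receiver that knows only the base schema still recognizes it; the xsi:type attribute is what announces the derived type to a validating parser.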
XPath 2.0 provides us with the critical standards basis on which to implement polymorphic processing, because it is fundamental to many common XML processes, notably transformation and formatting using XSLT tools. Today, we must match explicitly on elements and attributes, regardless of the structural similarities implicit in two elements derived from a single type. Thus, although I may use some common processing on all addresses, I would still need to use XPath expressions that explicitly match on all elements that are of that type, in our case "BuyerParty" and "SellerParty" despite the fact that both are derived from PartyType.
Let us look at this example more closely:
Let's say I am transforming a UBL Order document into a flat-file text format for use by a back-office application that does order fulfillment. With existing XPath 1.0 support in XSLT, my instance will look like this:
<BuyerParty>
  <ID>R300</ID>
  <PartyName>
    <Name>IDES Retail INC US</Name>
  </PartyName>
  <Address xsi:type="drvext:DRV2AddressType">
    <Street>West Chester Pike</Street>
    <CityName>Parsippany</CityName>
  </Address>
  <BuyerContact>
    <Name>Joe Bloggs</Name>
  </BuyerContact>
</BuyerParty>
<SellerParty>
  <ID>R3002</ID>
  <PartyName>
    <Name>Meyer Hardware Inc.</Name>
  </PartyName>
  <Address>
    <Street>South Hollow Road</Street>
    <CityName>London</CityName>
    <Country>
      <Code listID="3166-1" listAgencyID="ISO">UK</Code>
    </Country>
  </Address>
</SellerParty>
The XSLT code to handle the buyer and seller addresses would look like this:
<xsl:template match="BuyerParty/Address|SellerParty/Address|Party/Address|ThirdParty/Address"/>
In the XPath 2.0 world, I can reduce this code to a single type-based match, written here with the XPath 2.0 element(*, type) test, which matches any element whose type is the named type or is derived from it:
<xsl:template match="element(*, mod:Address)"/>
The advantage is not so much the compactness of the match expression as the fact that I do not have to examine the schema carefully in order to know what possible elements may be of a given type, as I would have to when using XSLT 1.0.
As you can see, because both address elements have a common structure, and are differentiated only by the names of the outermost element, I can use XPath expressions that match explicitly on the elements, or I can use code that matches any element of this type. Only the actual application functionality needed will tell you which approach is best, however: in this case, because all addresses are being treated to identical processing (transformation into a flat-file format for storage in a database system), the one bit of code will do double-duty.
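For a type-based match to work, the stylesheet has to know the schema and the source document has to be validated, so that its elements carry type annotations. The following is a sketch of what a complete schema-aware XSLT 2.0 stylesheet might look like. The namespace urn:example:base, the file name base.xsd, and the pipe-delimited output are assumptions made for illustration, and the sketch presumes that the address elements live in that namespace.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:base="urn:example:base">

  <!-- Makes the schema's type definitions available to match patterns -->
  <xsl:import-schema namespace="urn:example:base"
                     schema-location="base.xsd"/>

  <xsl:output method="text"/>

  <!-- Fires for every element whose type is base:Address or any type
       derived from it, whatever the element or its parent is called -->
  <xsl:template match="element(*, base:Address)">
    <xsl:value-of select="base:Street, base:CityName" separator="|"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress all other text so only address records reach the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

A stylesheet like this needs a schema-aware XSLT 2.0 processor and validated input; a basic processor will reject the xsl:import-schema declaration.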
Note that the reuse of code is not the primary benefit of this approach, but is a side-effect. The benefits are benefits of interoperability, as outlined above.
While XPath 2.0 is still very new, we already have an XSLT engine that supports it. Michael Kay's Saxon XSLT package provides support in versions 7.0 and later. For many XML applications, XSLT provides a large part of the functionality. With the availability of a type-aware XSLT processor, it should be possible to start making many applications type-aware through the simple expedient of upgrading XSLT stylesheets to use type-based path expressions rather than element-name-based path expressions.
The reliance on type-awareness to provide interoperability obviously has repercussions for the design and implementation of XML applications. Perhaps the biggest question here concerns the possibility of loss of critical data.
If a user extends a document type, this may indicate the need for data that is absolutely essential to the business application. Thus, a type-based system that allows a trading partner to ignore the extended data will not be acceptable. This is a distinct possibility with polymorphic processing, and applications need to be able to detect the cases where there is a threat of data loss.
In reality, there is a fairly straightforward solution, at least in those cases where the namespace conventions recognized by UBL are employed.
Because UBL uses namespaces as “packages” of unchanging constructs, the business requirements supported by any package can be absolutely known. Thus, if an application is capable of understanding which namespaces (that is, versions or customizations of core documents) are being received, it can determine whether or not a particular instance is acceptable.
For example, let’s say that version 1.0 of a document type does not support multiple delivery dates, but that this feature has been added to version 1.1. A trading partner could process a received version 1.3 document as if it were any minor version: 1.0, 1.1, 1.2, or 1.3, depending on their capabilities. If the transaction requires information about multiple delivery dates – only available in 1.1, 1.2, and 1.3, then a polymorphic rendering of the document as version 1.0, although possible, is not acceptable to business requirements. The application must be able to enforce this logic.
Obviously, trading partners will need to identify which minor versions and customizations are acceptable as part of the trading-partner agreement. While it might be technically possible to make this determination for each type within every schema, this approach is not feasible from a business perspective. The same is true if namespaces do not consist of unchanging sets of constructs that can be absolutely known – there must be a way for trading-partner agreements to be reached simply, and to be clear and explicit in terms of the information to be exchanged. UBL’s design provides this.
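One way an application might enforce such a rule is with a small gate at the system boundary that admits a document only when its namespace is on the agreed list. The XSLT 1.0 sketch below assumes that the namespace of the document element identifies the version, and uses placeholder URIs for the three minor versions that carry multiple delivery dates in the example above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/*">
    <xsl:variable name="ns" select="namespace-uri()"/>
    <xsl:choose>
      <!-- Placeholder namespaces for the versions named in the agreement -->
      <xsl:when test="$ns = 'urn:example:order:1.1'
                      or $ns = 'urn:example:order:1.2'
                      or $ns = 'urn:example:order:1.3'">
        <!-- Accepted: pass the document on unchanged -->
        <xsl:copy-of select="."/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Rejected: stop processing and report the offending namespace -->
        <xsl:message terminate="yes">
          <xsl:text>Rejected: namespace '</xsl:text>
          <xsl:value-of select="$ns"/>
          <xsl:text>' is not covered by the trading-partner agreement.</xsl:text>
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because the check is made on the namespace alone, it can run before any schema is loaded, and the accepted list can be maintained per trading partner.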
It must be remembered that e-commerce documents are often legal documents, and as such may need to be preserved to keep a record of what has occurred between trading partners. Thus, even in cases where the polymorphic processing has been allowed for specific document versions or customized versions, there may still be a requirement to store the received document as it was sent by the trading partner. It may be desirable to provide for applications to display this additional data as well.
Embedding the possibility for type-aware processing in the UBL design both adds a potential degree of complexity to the implementation of processing applications, and provides a higher level of interoperability than that found in non-type-aware systems. The benefits and costs of this approach can be weighed.
It must be remembered, however, that there is never a requirement for any application using UBL documents to process them in a type-aware fashion. While this application approach may provide some benefits, it has no downside for users who do not wish to leverage these features: UBL schemas will behave in non-type-aware systems exactly as would any other schema. The difference between the UBL design and some others is that UBL offers the possibility of controlled, type-aware processing, something that is absent in many other e-business (and general XML) vocabularies.
First, let us look at the negative aspects of building type-aware processors:
Run-time schema processing is required
Management of the type hierarchy to enforce business requirements
Preventing data loss and meeting legal requirements
The first of our negatives arguably is not a negative at all: many people feel that XML documents should be parsed against a schema whenever they cross system boundaries (rather than being just tested for well-formedness), and the receipt of a document from a trading partner certainly meets this criterion. However, this does have a negative impact at run time in situations where processing efficiency is critical. As mentioned above, many XSLT transformations - as an example - are done without benefit of parsing against the schema. In type-aware processing, this is no longer an option. The schema is the artifact that contains the bulk of the information about the type hierarchy, and consequently must be available not only at design time, but at run time.
Note, too, that the use of extension and refinement in UBL, combined with rules about the importing of namespaces, requires that all of the schemas describing the entire type hierarchy must be available. It is recommended that schemas be available locally - and, ideally, cached in memory - to overcome the potential processing burden this represents.
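One common way to keep the whole chain of schemas local is an OASIS XML catalog that redirects published schema locations to local copies. The fragment below is a sketch only: the example.org URLs and the local paths are placeholders, and whether and how a given parser consults a catalog when resolving schema imports depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Each entry maps a published schema location to a local file, so the
       full import chain resolves without a network fetch -->
  <system systemId="http://example.org/schemas/Order-1.0.xsd"
          uri="schemas/Order-1.0.xsd"/>
  <system systemId="http://example.org/schemas/Order-1.1.xsd"
          uri="schemas/Order-1.1.xsd"/>
  <system systemId="http://example.org/custom/MyOrder.xsd"
          uri="schemas/MyOrder.xsd"/>
</catalog>
```

Every schema in the derivation chain needs an entry, including those of the intermediate minor versions, since each one imports its predecessor.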
Our second point is less obvious: if one trading partner requires the extensions they have added to a base document in order to fully support the business transaction, then a recipient cannot be allowed to accept the extended document and yet be unable to process it fully. Obviously, this condition can be identified at the time the trading-partner agreements are made. Having identified this condition, however, systems must still enforce the attendant business rules.
UBL provides a mechanism for applications to perform this function: namespaces. Namespaces in UBL represent specific versions and/or ownership of the schemas, and are unchanging. Thus, when a specific document type (or other type) requires a certain set of features, then not all of its ancestors in the type hierarchy may be acceptable. The namespace provides a handle so that systems can accept or reject specific XML documents based on the specific namespace which they use. Of course, this does not prevent application developers from having to implement this business logic, which does represent a burden not seen with the use of non-type-aware XML.
The last negative concerns the legal transmission of information to trading partners. It is certainly bad practice - and in some cases legally ill-advised - to lose any of the information transmitted by a trading partner as part of a legal transaction. When using type-aware processing to cross minor-version boundaries, or to process a customized document as a standard one, there is a possibility that some critical information may be ignored: specifically, the additions made in the minor version or the customization that are beyond the ability of the receiving application to process.
It is considered good practice to store any unprocessed information in the incoming XML documents in buffers, so that it is available to some generic processing at run time. Typically, this type of processing would include notifying end-users about the existence of additional information, and making it visible in a notes field or other display.
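A simple way to collect such unprocessed information is sketched below in XSLT 1.0. It assumes that everything the receiving application understands sits in a single known namespace, and that additions made by a later version or a customization sit in other namespaces; the default namespace URI and the Notes and Note wrapper elements are hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Placeholder: the one namespace the receiving application understands -->
  <xsl:param name="known-ns" select="'urn:example:order:1.1'"/>

  <xsl:template match="/">
    <Notes>
      <!-- Select each outermost element from any other namespace, so that
           nested unknown content is copied once with its parent -->
      <xsl:for-each select="//*[namespace-uri() != $known-ns]
                               [not(ancestor::*[namespace-uri() != $known-ns])]">
        <Note>
          <xsl:copy-of select="."/>
        </Note>
      </xsl:for-each>
    </Notes>
  </xsl:template>
</xsl:stylesheet>
```

The resulting Notes document can be stored alongside the processed order and surfaced to end-users, while the application itself continues to see only the constructs it was built for.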
Further - and this is true of all XML e-commerce transactions, and not merely ones subject to type-aware processing - the actual instance received from a trading partner should always be archived. While this may not be important with informational transmissions, it is certainly true of those transmissions that represent actions concerned with contractual obligations. For type-aware systems, the actual document received - and not just the one processed by the type-aware system - must be the one that is archived.
The negative factors outlined above could be seen as daunting, but there is a significant upside in this picture as well. It is important to understand the vastness of the scale of a global e-commerce standard. There is no centralized control over which version might be supported by existing applications, nor of the mechanisms through which schemas are distributed (aside from making them available). Further, while the mechanism for expressing customizations can be specified, there is no way to control the content of customizations - users will make them to meet their own, non-standard requirements. This means that any built-in mechanism that can assist interoperability in these areas is perhaps more valuable than it might be in more controlled systems.
The possibility of type-aware processing gives us a new mechanism for enabling interoperability across version and customization boundaries.
It should be recognized that the majority of e-commerce in terms of volume is performed with the exchange of a very small amount of standard information: it is only the abnormal cases that require a large percentage of the constructs reflected in e-commerce schemas, even though their use is occasional. (For an example, think about the amount of information actually used in most ERP systems to record the ordering of a single line item: a few fields in most cases. Now, compare this set of data with any e-commerce XML vocabulary, and you will see that there are a lot of optional, needed-but-rarely-used fields allowed in these structures.)
Type-aware processing allows for interoperability to be enabled in the majority case where the standard data set is good enough. This is particularly valuable when going across domain boundaries: while it is reasonable to expect trading partners who do business within a single industry domain to support the customization and versions typically used within that domain, this is not the case for many trading partners. For those trading partners whose business is concerned with goods or services that are useful within many different industry domains, it is not reasonable to expect them to support the customizations and versions across a wide variety of domains. The further down a supply-chain you go, the more likely it is that required goods and services will be generic, rather than specific to that industry supply-chain. (A good example of this can be seen in the case of an adhesives manufacturer: their product might be used to glue together airplane seats, car seats, sneakers, and bags for carrying golf-clubs.)
This type of interoperability can be considered from a more technical perspective as well. The answer to a lack of version interoperability between trading partners in a non-type-aware scenario is often not to build support for a new version within the processing applications themselves, but to support new interfaces within the gateways, using transformation. This requires development work to identify the mappings between versions, and to implement those mappings. Further, there is the processing cost (and potential for data loss) inherent in performing the transformations themselves. With UBL's type-aware processing (at least for customizations and minor versions), this work is unnecessary. Further, the lack of transformations simplifies the trading-partner agreement process, as there are fewer points of contention about who performs the transformation.
As a last consideration, it is reasonable to expect world-wide e-commerce applications as a whole to gradually adopt later - and improved - versions of any standard. But the timing of this adoption will be driven by many factors - supported features, available commercial software, etc. It is reasonable to suppose that only a small number of major versions will be in use at any given point in time. The number of minor versions in use at the same point in time will be far greater. If transformations or other schemes to provide minor-version interoperability are not needed - because type-aware systems make them unnecessary - then the problem of interoperability generally becomes much more tractable.
It could be argued that the costs of this design, in terms of added complexity, are not worth the benefits. But this argument is not convincing. It must be borne in mind that users will never need to deal with the potential complexity of type-aware systems unless they choose to do so - there is no downside!
It is also true that the type-aware aspects of XML processing implicit in XSD will become more common in XML tools. We are already beginning to see it in some tools. And because a global standard like UBL must survive the evolution of XML technology, without becoming outdated, the possibility of type-aware processing built into its design is a major feature for ensuring its continued utility into the future.
Overall, the scope of a global e-commerce standard is such that issues of interoperability present a large barrier to adoption. Because type-aware processing can significantly reduce this burden, the approach taken by UBL seems beneficial. UBL has chosen the object-oriented features of XSD carefully in supporting this type of functionality, and married these with an approach to using namespaces that reflects real-world practice. It is the overall design - and not merely the use of XSD's extension and refinement mechanisms - that allows for the promise of much easier user implementation moving forward. Other XML applications might benefit from this approach as well - they should be aware that it is the whole design that provides the real value, not the adoption of any particular specification or technology.