Abstract
I propose a paper that addresses the question, "What is Knowledge of XML?" and answers that question in terms of a matrix of knowledge. I call that matrix an "XML Learning Framework".
This matrix is an analysis of XML knowledge as such. It consists of two axes: identification of the categories of XML knowledge, followed by assessment of degrees of knowledge for each category.
The paper will present in general terms the categories of XML knowledge, and then explicate the levels of learning possible for each category. The categories identified are:
Definition of XML
XML Processing
XML Syntax
XML Data Modeling
XML Companion Specifications
Other Technologies
System Implementation
Each category may be analyzed in terms of degrees of knowledge. The degrees of knowledge are important because they allow an exact assessment of where a given piece of knowledge fits into the matrix. The greatest value of the learning framework derives from this aspect of the matrix. The accompanying file, XML_Framework.doc, not only lays out the categories of knowledge, but also the degrees of learning for the Syntax category. It not only indicates these components of the Framework, but also illustrates what the remainder of the Framework looks like. Handouts will be provided that delineate fully the levels of learning for each category of knowledge of XML (as I analyze it, of course).
The XML Learning Framework provides an excellent opportunity to discuss what it means to know XML and how such knowledge could/should be organized.
The Framework is more than an intriguing account of what it means to know XML. It is also a useful tool that anyone can modify to their own needs or recreate as they see fit. As a methodical assessment of knowledge of XML, the Framework may be applied in any domain where knowledge of XML is required. For example, one can use it to develop individual or collective learning paths for XML. Other uses are obviously possible.
XML's status as a core technology means that it covers a wide range of technical and business issues. This fact constitutes the central problem in analyzing XML knowledge. It would be nice if XML were limited to the simple syntax embodied in the rules for well-formedness. But the number of XML-related specifications in the W3C continues to grow at an astonishing rate (not to mention the various specifications that have emerged outside the W3C). Starting with XML 1.0, the following list provides an indication of what is involved in knowing XML:
XML 1.0
XML Namespaces
XSLT (XSL Transformations)
XSL (Extensible Stylesheet Language)
XPath (XML Path Language)
XLink (XML Linking Language)
W3C Schemas
SAX (Simple API for XML)
DOM (Document Object Model)
CSS (Cascading Style Sheets)
RDF (Resource Description Framework)
XML Topic Maps
XHTML (Extensible Hypertext Markup Language)
The list could go on quite a ways, and I'm making no attempt here to be comprehensive (so don't be offended if your personal favorite isn't listed). The obvious issue is how to go about analyzing this mountain of information.
Our solution is to create a matrix that divides XML into categories of knowledge. We (Marcy Thompson, who co-created this learning tool, and I) call this matrix The XML Learning Framework. In the XML Learning Framework (XLF), each category may be analyzed for levels of learning. The combination of the categories and the levels of learning produces a useful account of XML knowledge.
The XLF identifies seven categories of XML knowledge. The categories are generally applicable to all students of XML, regardless of whether they are involved in managing XML projects, selecting technologies for implementation, creating XML in a publishing environment, or writing code which uses XML in an application-based information exchange environment. The categories are:
Definition of XML
XML Processing
XML Syntax
XML Data Modeling
XML Companion Specifications
Other Technologies
System Implementation
The aim of the categories is to cover all possible contexts of knowledge for XML and still be serviceable so far as learning it or teaching it is concerned. Obviously, the choices made here determine the outcome of the analysis in general. In brief, defining XML as a text-based syntax by which information can be modeled and processed covers the first four categories. The remainder address the fact that a plethora of companion specifications exist to aid in using XML and that XML must work with other technologies in system environments.
Placing the categories into a table with the levels of learning produces a general account of XML knowledge:
Table 1. The Basic Categories of XML Knowledge
| Learning Levels | Definition of XML | XML Processing | XML Syntax | XML Data Modeling | XML Companion Specifications | Other Technologies | System Implementation |
| 1 | Basic definition of XML | Basic XML Processing Model | Recognize | Concepts | Definition | Definition | Overview of n-tiered Architectures |
| 2 | | Complicating the Model | Read | Process Overview | Capabilities and Use | Key Issues for Use with XML | Overall Process |
| 3 | | Contextualizing the Model | Write | Detailed Practices | Recognize Syntax (vocabularies) | Detailed Practices | Detailed Practices |
| 4 | | | Parse | | Read | | |
| 5 | | | | | Write | | |
| 6 | | | | | Incorporate Others (optional) | | |
| 7 | | | | | Parse | | |
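One way to make the matrix concrete is to encode it as a data structure. What follows is a minimal sketch in Python, not part of the Framework itself: the names `XLF`, `level_name`, and `prerequisites` are illustrative assumptions. It also assumes that each level presupposes the levels before it in the same category.

```python
from __future__ import annotations

# Hypothetical encoding of Table 1: each category maps to its ordered
# learning levels (list index 0 is level 1). Level names come from the table.
XLF = {
    "Definition of XML": ["Basic definition of XML"],
    "XML Processing": [
        "Basic XML Processing Model",
        "Complicating the Model",
        "Contextualizing the Model",
    ],
    "XML Syntax": ["Recognize", "Read", "Write", "Parse"],
    "XML Data Modeling": ["Concepts", "Process Overview", "Detailed Practices"],
    "XML Companion Specifications": [
        "Definition",
        "Capabilities and Use",
        "Recognize Syntax (vocabularies)",
        "Read",
        "Write",
        "Incorporate Others (optional)",
        "Parse",
    ],
    "Other Technologies": [
        "Definition",
        "Key Issues for Use with XML",
        "Detailed Practices",
    ],
    "System Implementation": [
        "Overview of n-tiered Architectures",
        "Overall Process",
        "Detailed Practices",
    ],
}


def level_name(category: str, level: int) -> str:
    """Return the name of a 1-based learning level within a category."""
    levels = XLF[category]
    if not 1 <= level <= len(levels):
        raise ValueError(f"{category} has levels 1..{len(levels)}, got {level}")
    return levels[level - 1]


def prerequisites(category: str, level: int) -> list[str]:
    """Return the levels that precede the given one, in order."""
    level_name(category, level)  # validates category and level
    return XLF[category][: level - 1]
```

For example, `prerequisites("XML Syntax", 3)` lists the levels a student would normally pass through before reaching Write.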
The levels of learning for the categories are just as important for the analysis as are the categories of knowledge themselves. The levels of learning establish, in a pedagogically oriented way, the relationships between the components of knowledge for a given category. As is obvious from the table, the levels of learning range from one to seven for the particular categories. As one might guess from the vague contents of some cells in the table, it is at points necessary to explicate the content of the learning levels further than this initial table does. We do this for the XML Syntax and XML Companion Specifications categories, but one could do the same for any other categories as needed.
It is important to realize that the characterizations of the learning levels say nothing particular about how to teach a topic. Their purpose is to account generally for the relationships between concepts that make up knowledge of a given category, progressing from the most basic to the most advanced. This converts in most (maybe even all) cases to pedagogical dependencies between concepts, and so suggests a general order for moving through educational material, but even so a wide latitude in curriculum implementation remains. The analysis in no way limits your creativity in developing course materials.
In what follows, each category of the Learning Framework is presented in a table that explicates its levels of learning. Brief introductory remarks introduce each table, but no attempt is made to reproduce the full content of the table.
The analysis of the Definition category is straightforward, as it consists of a single learning level. This is another example of how the Framework implies nothing about how to teach the topic. One could teach the content of this category in any number of ways.
The XML Processing category consists of three learning levels: the first explains how an XML document is parsed into an XML information set and then passed on to a processing application. The second level explains how the output from an XML parser might take the form of a DOM tree or a series of SAX events. The third level explains the reason why anyone wants to put information in XML and parse it in the first place: in order to apply business rules.
Table 3. XML Processing
| Learning Levels | XML Processing |
| 1 | Basic XML Processing Model |
| The XML processing model is defined in its simplest form: an XML text object is parsed, producing some parser output, which is acted upon by one or more applications. | |
| 2 | Complicating the Model |
| The processing model is refined by an understanding of different parser outputs and considering different types of application processing that may occur. | |
| 3 | Contextualizing the Model |
| The processing model is placed in the context of business rules and processes, providing an understanding of the relationship between business needs and the processing of an XML document. | |
| 4 | |
| 5 | |
| 6 | |
| 7 |
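The three levels of the processing model can be illustrated with a short example. The sketch below uses Python's standard `xml.dom.minidom` and `xml.sax` modules to show the two kinds of parser output named at level two. The sample document, the `ItemCollector` class, and the closing "business rule" are hypothetical and chosen only for illustration.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = "<order id='42'><item>pen</item><item>ink</item></order>"

# Parser output 1: a DOM tree held in memory and navigated by the application.
dom = minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]


# Parser output 2: a stream of SAX events delivered to handler callbacks.
class ItemCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so collect it.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False


handler = ItemCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items

# Level three: the application applies a (hypothetical) business rule
# to the parser output: an order must contain at least one item.
order_ok = len(sax_items) > 0
```

Both routes yield the same list of items. What differs is whether the application walks a finished tree or reacts to events as the parser reports them.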
The XML Syntax category divides into four sub-categories: Well-formedness, DTDs, W3C Schema Structures and W3C Schema Datatypes. Further, we use the characterizations "Recognize," "Read," "Write," and "Parse" as summary terms for many of the knowledge components. We then make further discriminations within these: Basic, Intermediate, and Advanced. It should be noted that using these further distinctions is not mandatory, and not uniform. It really depends on the subject itself. It might be the case that even further discrimination is required on occasion.
An important point to note about these summary terms Read, Write, etc., is that, again, they do not suggest anything about the teaching process used for that portion of the Learning Framework. For example, in educating someone to the level of Read for W3C Schemas, you would almost certainly have them write one or more in class, but their knowledge level would still be categorized as Read.
Table 4. XML Syntax
| Learning Levels | Well-formed | DTD | W3C Schema (structures) | W3C Schema (datatypes) | ||
| 1 | Recognize | |||||
| Know what an XML tag is and understand that tags structure and label information. | Know what a DOCTYPE declaration looks like and what it is for, understand the use of internal and external subsets, know what declarations you are likely to find in a DTD. | Understand basic schema structure and namespace. Know what declarations look like. Understand schema namespace for XML document instances. | Know what a schema type definition looks like. Understand the difference between a named type and an anonymous type. | |||
| 2 | Read | |||||
| Be able to draw the structure tree associated with an XML document and understand what content appears where. | Be able to interpret all DTD declarations. Be able to determine whether an instance conforms to the DTD. | Be able to read a schema and understand all associations between declarations and definitions; understand all namespace qualifications; determine whether or not a document instance conforms to the declarations. | Be able to read type definitions and determine their schema validity. | |||
| 3 | Write | |||||
| (a) Basic: be able to create a well-formed document using elements, attributes and escaped characters. | (a) Basic: Be able to write a simple DTD that expresses the elements and attributes in a data model that has been provided to you. | (a) Basic: Be able to write a schema that declares global and local elements and attributes. Understand how to use occurrence indicators and set default and fixed content/values. Be able to create annotations for a schema. Know the three basic schema design patterns. | (a) Basic: Be able to write a schema that uses built-in types (e.g., string, integer, etc.). Be able to define new simple and complex types using derivation by extension and restriction. | |||
| | (b) Intermediate: Be able to define parameter entities and parsed entities. | (b) Intermediate: Understand the use of the target namespace and the qualification of local declarations. Be able to use references to declarations, model groups and attribute groups. Be able to use substitution groups. | (b) Intermediate: Be able to use all 12 facets. Be able to derive new types via list and union. | |||
| (b) Advanced: be able to create a well-formed document that uses CDATA sections. | (c) Advanced: Be able to define unparsed entities, notations and conditional sections. Be able to use parameter entities to control conditional sections and make a DTD extensible. | (c) Advanced: Understand the use of abstract types and control from the document instance using the xsi:type attribute. Understand the use of nillable content. Be able to use wildcards. Understand the use of schema include, import, and redefine. Know how to set uniqueness constraints using unique, key, and keyref. | (c) Advanced: Be able to define datatypes that reflect an object oriented approach to XML information objects. | |||
| 4 | Parse | |||||
| Understand how a parser processes a stream of text characters. | Understand how a parser processes a valid XML document, handles the prolog and validates the instance, finally producing its output. | Understand the Post Schema Validation Infoset (PSVI) and how it might be exploited by applications. | ||||
| 5 | ||||||
| 6 | ||||||
| 7 | ||||||
The XML Data Modeling category addresses Concepts, Processes, and Practices. This raises questions of XML data modeling and other modeling domains (such as object oriented and relational), as well as modeling containment and non-containment relations with XML. The processes and practices of data modeling point to the fact that this discipline is as much an art as a science.
Table 5. XML Data Modeling
| Learning Levels | XML Data Modeling |
| 1 | Concepts |
| What is an XML data model? Why is it fair to say that one exists for all XML documents? What does it mean to formalize a data model? Why might one want to do that? What are the options for instantiating an abstract XML data model? | |
| 2 | Process Overview |
| A clear understanding of the high-level process that results in a useful data model, including consideration of the logical model vs. the actual model, the importance of requirements, what document analysis and XML data design are, and how to ensure that models are maintainable. | |
| 3 | Detailed Practices |
| Exposure to and practice with designing XML data models, adapting data models, and instantiating data models. Includes consideration of element/attribute distinctions, determining needs for data typing, setting up a data model to allow the inclusion of non-text objects, building XML versions of relational data, etc. | |
| 4 | |
| 5 | |
| 6 | |
| 7 |
The Companion Specifications category is a classic example of hiding a lot of information behind a small topic. It's as near to a black hole as XML education gets. This presentation divides the category into twelve subcategories, but, as already indicated, there are many more things that could be addressed. For convenience, we've broken the category into three tables.
The first table for Companion Specifications deals with the stalwarts: the two portions of XSL (XSLT and the XSL formatting objects), XPath, and XLink. The success of XSLT is already clear, as is that of its helper specification XPath. XSL has also been well received. XLink came forth with remarkably little fanfare, but its use in applications such as native XML databases suggests that a broader audience will soon perceive the value of semantic links.
Table 6. Companion Specifications 1
| Learning Levels | XSLT | XSL | XPath | XLink | ||
| 1 | Definition | |||||
| What is this specification for? What does it do? | ||||||
| 2 | Capabilities and Use | |||||
| What are the features of this specification? How is it intended to be used with XML? Where does it fit in an XML system? | ||||||
| 3 | Recognize syntax (vocabularies) | |||||
| Be able to identify elements that function as instructions for the XSLT transformation and elements that act as output of the transformation. | Understand how an FO document is put together, know what the highest-level branches in an FO document do. | Be able to identify full and abbreviated syntax and know the difference between a node test and a filter. | Be able to identify attributes that establish XLink features. | |||
| 4 | Read | |||||
| Be able to manually apply an XSLT stylesheet to a source document and produce the output document. | Be able to manually apply an FO document and determine what the output would look like. | Be able to apply an XPath and determine what it will return. | Be able to identify all resources in a link (both simple and extended), any traversal rules between them, and the use and location of a linkbase. | |||
| 5 | Write | |||||
| (a) Basic: Write a simple XSLT stylesheet. | (a) Basic: Write a simple FO document that does not use complex page masters or page sequences. | (a) Basic: Write a relative and an absolute XPath using the abbreviated syntax, without filters. | (a) Basic: Define a simple XLink that uses semantic and behavior description. Define the data model for this link with appropriate default/fixed values. | |||
| (b) Intermediate: Write an XSLT stylesheet that uses modes, named templates and conditional processing. | (b) Intermediate: Write an FO document that uses complex page masters. | (b) Intermediate: Write relative and absolute XPaths using the full syntax. Apply filter to both full and abbreviated syntax node tests. | (b) Intermediate: Define an extended XLink for two or more resources that uses full semantic description and outbound arcs. Define the data model for this link with appropriate default/fixed values. | |||
| (c) Advanced: Be able to write an XSLT stylesheet which uses variables and parameters, which has a complex include tree, and which uses priority attributes to resolve collisions. | (c) Advanced: Be able to write an FO document that uses the full range of FO features. | (c) Advanced: Use XPath functions. | (c) Advanced: Define third party arcs and linkbases. Define the data model for the linkbase reference with appropriate fixed value. | |||
| 6 | Incorporate Others (optional) | |||||
| Incorporate a script. Know how to use CSS with output of XSLT. | ||||||
| 7 | Parse | |||||
| Understand how this kind of object is processed. | ||||||
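The Read level for XPath ("be able to apply an XPath and determine what it will return") lends itself to a small worked example. The sketch below uses Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath: abbreviated, relative paths with simple predicates (the table calls them filters). It does not accept the full axis syntax such as `child::book`. The catalog document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; `catalog` is the context node for every path.
catalog = ET.fromstring(
    "<catalog>"
    "<book id='b1'><title>XML Basics</title><price>20</price></book>"
    "<book id='b2'><title>XSLT in Depth</title><price>35</price></book>"
    "<journal id='j1'><title>Markup Notes</title></journal>"
    "</catalog>"
)

# Relative path, abbreviated syntax, no predicate: child steps only.
book_titles = [t.text for t in catalog.findall("book/title")]

# ".//" reaches descendants at any depth below the context node.
all_titles = [t.text for t in catalog.findall(".//title")]

# Predicate on an attribute value.
b2_title = catalog.find("book[@id='b2']/title").text

# Positional predicate: the first book child.
first_book_id = catalog.find("book[1]").get("id")
```

Predicting the value of each variable by hand before running the code is essentially the Read exercise described in the table.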
The second Companion Specifications table focuses on processing XML. SAX and DOM are mainstays for educating developers. Namespaces are an educational challenge for all the reasons that make them so controversial. Teaching people to think of XML information in terms of the Information Set is a key part of raising people's thinking about XML to a higher level of abstraction than they frequently start with.
Table 7. Companion Specifications 2
| Learning Levels | SAX | DOM | XML Namespaces | XML Information Set | ||
| 1 | Definition | |||||
| What is this specification for? What does it do? | ||||||
| 2 | Capabilities and Use | |||||
| What are the features of this specification? How is it intended to be used with XML? Where does it fit in an XML system? | ||||||
| 3 | Recognize syntax (vocabularies) | |||||
| Know what an event handler looks like and how to identify the actions to which it is bound. | Know what the basic classes of DOM interfaces are and what they are used for. | Recognize a namespace declaration in all its parts. Be able to distinguish the namespace prefix from the local part of the qualified name. Understand that namespace identity is established on the basis of character-level string matching of the namespace URI. | Understand the concepts of information items and properties and what it means to specify these. | |||
| 4 | Read | |||||
| Be able to read a SAX-enabled script or program and determine what it does. Be able to manually apply it to an XML document. | Be able to read a DOM-enabled script or program and determine what it does. Be able to manually apply it to an XML document. | Be able to determine what namespace an element or attribute belongs to, including elements that are in the default namespaces. | Be able to read an instance of XML markup and determine its information set. | |||
| 5 | Write | |||||
| (a) Basic: Create a SAX 2 application that implements ContentHandler and ErrorHandler interfaces. | (a) Basic: Create a DOM application that includes a recursive tree-walker class. | (a) Basic: Be able to write a namespace declaration, both with a prefix identifier and with a default namespace. | (a) Basic: Understand that XML applications are free to determine which information items and properties are relevant for processing, and may even add new properties to XML information items. | |||
| (b) Intermediate: Create a SAX 2 application that validates an XML document using the XMLReaderFactory class for both DTD and Schema constrained documents. | (b) Intermediate: Create a DOM application that generates XML from a DOM object. | |||||
| (c) Advanced: Create SAX applications that use the remaining interfaces of the org.xml.sax package and those from the org.xml.sax.helpers package. | (c) Advanced: Create a DOM application that creates and edits a DOM tree from scratch. | (b) Advanced: Be able to assign an attribute to a namespace different than the element it belongs to. | (b) Advanced: Write a formal description of the XML information set that a given application will use for processing. | |||
| 6 | Incorporate Others (optional) | |||||
| 7 | Parse | |||||
| Understand how this kind of object is processed. | ||||||
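The namespace levels in the table above can be made concrete with a short example. The sketch below uses Python's standard `xml.etree.ElementTree`, which reports every qualified name in the expanded form `{uri}local`. The document, the `example.com` URIs, and the helper `split_name` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a prefixed element, a default namespace,
# one prefixed attribute, and one unprefixed attribute.
DOC = """\
<inv:invoice xmlns:inv="http://example.com/invoice"
             xmlns="http://example.com/default"
             xmlns:x="http://example.com/extra"
             x:status="paid" number="17">
  <line>widget</line>
</inv:invoice>"""

root = ET.fromstring(DOC)


def split_name(expanded):
    """Split '{uri}local' into (uri, local); uri is None for an unqualified name."""
    if expanded.startswith("{"):
        uri, local = expanded[1:].split("}", 1)
        return uri, local
    return None, expanded


element_ns = split_name(root.tag)    # taken from the inv: prefix
child_ns = split_name(root[0].tag)   # taken from the default namespace
attribute_names = sorted(root.attrib)

# Identity rests on the URI string, not the prefix: two prefixes bound to the
# same URI name the same element, while URIs differing only in case do not.
same = ET.fromstring('<a:x xmlns:a="http://example.com/ns"/>').tag
other_prefix = ET.fromstring('<b:x xmlns:b="http://example.com/ns"/>').tag
other_case = ET.fromstring('<a:x xmlns:a="http://example.com/NS"/>').tag
```

Note that the unprefixed `number` attribute is in no namespace at all, since the default namespace does not apply to attributes, whereas `x:status` sits in a different namespace than the element that carries it.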
The third table of Companion Specifications provides a hodge-podge of user interface, stylesheet, and metadata specifications. Knowledge of XHTML and its modularization is important with the advent of new user agents. CSS is the venerable means for styling HTML that can be applied to XML as well. With its continuing development and implementation in standard web browsers, it will be used more and more with XML (especially with XML vocabularies like Scalable Vector Graphics).
Considerable buzz attends both Topic Maps and RDF these days. These metadata specifications are geared toward organizing knowledge and making inference by applications a common web experience. The future for both these specifications appears ensured as people are beginning to learn of their promise.
Table 8. Companion Specifications 3
| Learning Levels | XHTML | Topic Maps | RDF | CSS | ||
| 1 | Definition | |||||
| What is this specification for? What does it do? | ||||||
| 2 | Capabilities and Use | |||||
| What are the features of this specification? How is it intended to be used with XML? Where does it fit in an XML system? | ||||||
| 3 | Recognize syntax (vocabularies) | |||||
| Understand the differences between HTML 4.x and XHTML. | Understand the basic components of a Topic Map: topics, associations, and occurrences and be able to recognize their syntactic expression. | Understand the components of an RDF statement (resource, property, and value) and how they relate to one another. Know the expression of an RDF statement in XML. | Be able to identify selectors and rules. | |||
| 4 | Read | |||||
| Be able to read an XHTML document and predict what it will look like in a browser. | Be able to read a Topic Map and determine the ontology it represents, the individual topics in it (including their characteristics and taxonomic relationships), and the associations between the topics. | Be able to read an RDF document and identify statements it contains, the resources, properties, and values of those statements, and any statements about statements being made. | Be able to read a CSS and determine what will happen when it is applied to a specific XML document. | |||
| 5 | Write | |||||
| (a) Basic: Be able to create a basic XHTML document using the required DOCTYPE declaration and basic tags. | (a) Basic: Be able to create an ontology of topics (without characteristics) and associations. Be able to specify all taxonomies for the topics and types for the associations. | (a) Basic: Be able to create an RDF document that contains the statements that create classes, properties, and resources. | (a) Basic: Be able to write a simple CSS. | |||
| (b) Intermediate: Be able to create an XHTML document using any of the tags in the vocabulary. | (b) Intermediate: Be able to add characteristics to topics: names, occurrences, and roles in associations. Be able to implement scopes for characteristics. | (b) Intermediate: Be able to make statements about the declared resources using declared properties. Understand the use of containers and higher-order statements. | ||||
| (c) Advanced: Understand the modularization of XHTML and be able to create a document that conforms to any given module. | (c) Advanced: Understand the rules for merging Topic Maps. Be able to identify situations where maintaining multiple Topic Maps and merging them dynamically makes sense from both maintenance and user interface perspectives. | (c) Advanced: Understand how RDF Schema augments RDF documents. | (b) Advanced: Use CSS in combination with XSLT to produce browsable output. | |||
| 6 | Incorporate Others (optional) | |||||
| 7 | Parse | |||||
| Understand how this kind of object is processed. | ||||||
The category of Other Technologies is a cousin to XML Companion Specifications in that it too offers quite a large number of technologies to consider. In this presentation we have kept the subcategories to the obvious ones of relational databases and object oriented programming languages, keeping within the semantic domain of Java, JavaScript, and Java Server Pages. Obviously many other languages could be listed depending on the education requirements.
Other Technologies raises a question one is bound to encounter at some point in XML education: in teaching XML, how much of the other technologies with which XML works must one cover? People knowledgeable about other technologies frequently desire advanced instruction that pushes the limit of XML education.
Table 9. Other Technologies
| Learning Levels | RDBMS | Java | JavaScript | JSP | ||
| 1 | Definition | |||||
| What are the different possible relationships between XML and RDBMS? | What is Java and how can it be used with XML? | What is JavaScript and how can it be used with XML? | What is JSP and how can it be used with XML? | |||
| 2 | Key Issues for Use with XML | |||||
| How can XML conveniently interact with RDBMS? What are the key issues that always pose problems? How do the various RDBMS systems handle interfacing with XML? | What are the gotchas for using Java with XML? | What are the limitations of JavaScript as an embedded part of an XSLT document? | What are the limitations of JSP when it's used with XML? | |||
| 3 | Detailed Practices | |||||
| How can RDBMS data and relationships be expressed in XML syntax? How can XML objects be squirted into RDBMS? How can hybrid objects be created and processed? | Program with Java to manipulate and process an XML document. | Create a complex XSLT stylesheet with embedded JavaScript in both the XSLT document and the created XHTML document. | Create a JSP system that includes XML in some key way. | |||
| 4 | ||||||
| 5 | ||||||
| 6 | ||||||
| 7 | ||||||
System Implementation is the last category. Here the vision of XML as a pliable tool, capable of use in many ways across an information system, comes forward. As with Other Technologies, System Implementation pushes the limit of XML education because it raises fundamental questions about system design, implementation, and maintenance.
Table 10. System Implementation
| Learning Levels | System Implementation |
| 1 | Overview of n-tiered Architectures |
| Basic introduction to concepts related to the selection, design and implementation of an XML system, using n-tiered architectures as the conceptual model. | |
| 2 | Overall Process |
| Creating requirements, designing a system, planning for maintenance, updating a system. | |
| 3 | Detailed Practices |
| Exposure to and practice with the various aspects of successful implementation, including design, instantiation and maintenance. | |
| 4 | |
| 5 | |
| 6 | |
| 7 |
How can one use the XML Learning Framework? To this point we have primarily used it in two ways: as a guide in developing an XML curriculum and as a mapping tool between job roles/tasks and the knowledge of XML required for the role/task. The paper has addressed the former of these throughout, but not the latter.
Once an XML curriculum has been mapped to the XLF, determining a learning path for a particular person or group of people is straightforward. If Project Managers (for example) need to know enough XML to oversee projects that involve it, you determine the knowledge of XML they need by first stating the requirements of their project and then looking to the XLF's categories and learning levels. Identifying the course (or courses) that deliver that knowledge is now automatic. This method is obviously applicable to any role or task, and works equally well for one or one hundred people. There are no doubt other uses for the Learning Framework.
One aspect of the Framework is clear regardless of the use one puts it to: it is a living map. The Framework will continue to grow and be modified as long as XML is a dynamic technology. But the Framework constantly undergoes change not only because XML is still in an early stage of development, but also because your own understanding of XML continues to grow and becomes sharper. As you teach XML the Framework functions dialectically, alternately instructing you and then receiving modification as you gain clarity regarding the conceptual components of some point. So use the XML Learning Framework: adopt it, modify it, or completely remake it; but by all means, be clear about what it means to know XML.
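The mapping from roles to courses described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the role names, the course names, the level numbers, and the function `learning_path` are invented for illustration. The sketch also assumes that a course's level in a category is the highest level it brings a student to, and that courses are listed in teaching order.

```python
# Hypothetical requirements: role -> {XLF category: learning level needed}.
REQUIRED = {
    "Project Manager": {
        "Definition of XML": 1,
        "XML Processing": 3,
        "System Implementation": 2,
    },
    "Stylesheet Developer": {"XML Companion Specifications": 7},
}

# Hypothetical curriculum: course -> {XLF category: learning level delivered}.
COURSES = {
    "XML Overview": {"Definition of XML": 1, "XML Processing": 1},
    "XML Processing in Context": {"XML Processing": 3},
    "Planning XML Systems": {"System Implementation": 2},
    "Writing XSLT": {"XML Companion Specifications": 5},
}


def learning_path(role):
    """Return (courses to take, remaining gaps) for a role."""
    needs = REQUIRED[role]
    reached = {category: 0 for category in needs}
    path = []
    for course, delivers in COURSES.items():
        # A course helps if it raises a still-unmet required category.
        helps = [
            category
            for category, level in delivers.items()
            if category in needs
            and reached[category] < needs[category]
            and level > reached[category]
        ]
        if helps:
            path.append(course)
            for category in helps:
                reached[category] = delivers[category]
    gaps = {c: n for c, n in needs.items() if reached[c] < n}
    return path, gaps
```

A non-empty `gaps` result signals that the curriculum does not yet deliver everything a role requires, which is itself useful information when planning new courses.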
I'd like to thank my co-creator of the XML Learning Framework, Marcy Thompson, for her help developing the XLF and the initial account of how the XLF matrix works.