Abstract
XLink used to be the hot topic in the XML world, but lately, it seems to have been shadowed by just about everything, from ebXML to XSL, from XML Signature to SVG. Yet, XLink is in many ways the very essence of XML, making practical XML's promises of single-source publishing and extensive reuse of resources.
Here, XLinks are discussed from a practical perspective, although some of the necessary theory behind them is covered first. What's the difference between a simple and an extended XLink, really? When is a simple XLink not enough, and why would anyone want to describe linking completely outside the participating documents, anyway? Isn't that an implementor's nightmare? And if not, how can you do it using your favourite editor?
The practicalities of XLink implementation are discussed from (mainly) two perspectives: a low-cost yet fully functional authoring system using Simple XLink, and a state-of-the-art Extended XLink environment, where just about anything is possible. In many ways, these two systems are each other's opposites and offer very different views on XLink implementation and usage. They are also similar, however, and share a linking philosophy that has little to do with the price tag and everything to do with enabling efficient resource reuse and single-source publishing.
In discussing and comparing these two systems, the paper covers topics ranging from link usage contexts (for example, cross-referencing and resource inclusion) and linking strategies in DTDs to implementations in tools, highlighting such issues as link resolution in editors, link localization, and extensive resource reuse through conditional linking. How, exactly, do you write DTDs with cross-referencing mechanisms that actually work both on paper and online? Should you implement ID attribute generation? How do you display a linked image in the editing environment when the URI to the image is somewhere in a linkbase on a server and all you've got is an ID attribute? XLink is a hugely exciting technology, but even Simple XLink is anything but simple if implemented carelessly.
It is the hope of the author that this paper will inspire and encourage XML developers to implement XLink in their XML systems, regardless of whether the goal is to create a user-friendly XML editing environment for a documentation department consisting of two writers, or making practical the use of linkbases in documentation systems handling one and a half million XML documents. Especially since this paper describes both of these cases.
Not so long ago, XLink was the hot topic in the XML world. Recently, however, XLink has been overshadowed by almost everything, from XSLT to ebXML, from SVG to SOAP. There are books that deal with almost anything XML, but not a single one dedicated to XLink.
All too often, implementors choose ID/IDREF mechanisms and ENTITY references, or devise their own system-specific linking solutions, instead of using XLink. Even the World Wide Web Consortium (W3C) is guilty of this sin; instead of using XLink in its other recommendations, it devises new mechanisms. XInclude is a perfect example of this: while it defines the inclusion mechanism that was so sorely lacking in the XLink specification, the linking mechanism it uses is more akin to HTML hyperlinks than to XLink.
Why is this? Is the XLink recommendation too hard to understand? Too problematic to implement? And if so, what can we do about it?
I've been involved in designing and implementing a number of linking mechanisms in SGML and XML documentation systems, most of the latter being XLink implementations, and while the requirements for each and every one of these systems varied, there were also enough similarities to reuse methods and mechanisms from previous projects. Two of these projects I describe below, as I believe that much can be learned from them.
This whitepaper assumes a working knowledge of the XLink recommendation. For a refresher, see http://www.w3.org/TR/2001/REC-xlink-20010627/.
I've chosen not to include the names of the companies in my case studies. The information included here is not exclusive to any of them; I have removed or changed the information that is.
Company A is a small company specializing in certain subsystems within the automation industry. They're world leaders in what they do; among other things, they've delivered subsystems to Disney World. They're still a small company, however, with only two technical writers. Their documentation is published mainly as HTML and HTML Help, with context-sensitive help, and as PDFs. The information is written in XML, using SoftQuad's (now Corel's) XMetaL 2.1 as editor, and the files are stored in the file system, for now, because even though a database would simplify things, acquiring and implementing one is deemed to be too expensive.
XMetaL has been customized with functionality for HTML and HTML Help conversion, PDF output with FOP, and simple XLink support. Simple XLink is used for all kinds of links, that is, cross-references, fragment inclusions, and image references. Image references are absolute URLs, so even though the resulting image reference is, technically speaking, an XLink, the only thing required is an ActiveX file browser (that happens to be part of a DLL that comes in the box) and a few JScript snippets to invoke and use it. Cross-references and fragment inclusions do require an XLink application, but even so, this application consists only of a set of JScript macros and a dialog built using XMetaL 2.1's dialog editor.
To further ease linking, an ID generation mechanism is included.
Company B is an actor in the automotive industry and creates all of their aftersales information, from service manuals to user handbooks, in XML. The content is published in a number of ways: paper, promotional CDs, an online system for service information, and so on. The editor used is Arbortext's Epic Editor and files are stored using the Documentum 4i e-Content Server document management system and an Oracle database. Most publishing requirements are handled using Arbortext's Epic e-Content Engine (E3).
XLink is again used for all linking. There are the usual cross-references, XML fragment inclusions, and image references, but there are also links to software downloads, embedded software-based tools, and some other exotic online options. All links can be conditional, which means that, for example, one image may be displayed in one context and another image in another context, depending on what meta-data the links have in relation to the parent's meta-data. The ability to create conditional links is, of course, the reason why extended XLinks were preferred in the first place. A conditional link is really a multi-ended link since a single source may point out several targets, albeit in different contexts.
Conditionality can be expressed using simple XLink and the xlink:role attribute, as defined in the XLink recommendation, but this requires a very different method of defining URIs, an approach that causes other problems.
To make links easily searchable and processable in the database, and to make the documents themselves as version-independent as possible, the links are expressed out-of-line; they exist as link objects in Documentum, coupled with the meta-data that profile the links. The links are not always out-of-line, however: during editing, the links and their meta-data are included inline for processing purposes. A linkbase can easily be extracted from the link object model in the database, however, and linkbases are used for XML-to-XML conversions, to name just one use.
Most components in the system are heavily customized; a middleware component is used to contain the shared business logic (link functionality, meta-data handling, etc), but most of the components also contain various modifications on their own, in addition to interfaces and event handling. The editor, for example, handles ID generation, a number of dialogs, various client-side transformations for the editor view, and so on.
The two case studies are seemingly very different. The scope is very different, for one; Company A produces at most a couple of dozen documents each year, albeit customized for each customer, while the Company B document database presently contains more than half a million XML documents, with thousands of new or slightly changed versions of the old ones added each year.
In spite of the seemingly huge differences, both systems encountered a number of similar or even identical problems during design. These problems, and their solutions, are discussed in this chapter.
Here's a basic premise: if you can make do with only one linking mechanism, why use several? This is a basic similarity between the two case studies, a fact that alone lends the premise validity.
In both case studies, there are three basic uses for links (and this should not come as a big surprise):
Image referencing
Cross-referencing
Fragment inclusion
In Company A's case, this is all. Each type of Company A link has a defined and unique meaning; there is no conditionality, as in Company B's case, where the link points out different resources depending on the exact meta-data context of the document. A multi-end linking requirement affects the choice of linking mechanism as such, but the basic link uses are exactly the same. Company B does add others, though:
Pointers to software download and parts handling
[Online tool] links, that is, software-based instruments (such as a Volt meter) that are embedded in the online service documentation
Do these additional requirements affect the choice of linking mechanism? Let's have a look at the basic uses first, and then get back to this.
Traditionally, images are handled using unparsed file entities, an alias mechanism for including non-markup content. Editors support this, but increasingly, XML-based tools tend to leave out some entity support. The basic problem is how XML is often conceived: XML introduced the concept of well-formedness, which basically means that DTDs aren't necessarily required, as long as the XML instance follows some basic syntactical conventions. Unfortunately, many developers interpret this as meaning that DTDs aren't needed which means that the !DOCTYPE declaration is left out, which in turn means that file entities are rendered useless because there's no place to include the declaration.
This has led to diminishing support for anything entity-based in XML tools and recommendations (such as XSLT); entities are now increasingly difficult to handle using most tools (for example, Xerces). Therefore, using entities for image handling is in practical terms out of the question. A reference-based mechanism must be used.
Simple XLink is an excellent mechanism for this. In principle, simple XLink is a souped-up HTML link, with attributes thrown in to control when and how the content is displayed (the xlink:show and xlink:actuate attributes). The links can also be classified using the xlink:role attribute, offering additional functionality.
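As a rough illustration of how little is needed to read such a link, here is a minimal sketch using Python's standard ElementTree parser. The element names `figure` and `graphic`, the file path, and the helper name `simple_links` are assumptions made for the example; only the `xlink:` attributes and the XLink namespace come from the recommendation.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragment: "figure" and "graphic" are made-up element names.
DOC = f"""<figure xmlns:xlink="{XLINK}">
  <graphic xlink:type="simple" xlink:href="images/track.gif"
           xlink:show="embed" xlink:actuate="onLoad"/>
</figure>"""


def simple_links(root):
    """Return the XLink attributes of every simple link at or below root."""
    found = []
    for elem in root.iter():
        if elem.get(f"{{{XLINK}}}type") == "simple":
            found.append({
                "element": elem.tag,
                "href": elem.get(f"{{{XLINK}}}href"),
                "show": elem.get(f"{{{XLINK}}}show"),
                "actuate": elem.get(f"{{{XLINK}}}actuate"),
            })
    return found


links = simple_links(ET.fromstring(DOC))
```

The point of the sketch is that the same few attribute lookups serve images, cross-references, and inclusions alike; only what is done with the result differs.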
Note that there is very little to speak against the use of XLink as such when including an image, not counting the difficulties that occur when a multi-ended link is required.
Traditionally, cross-references in documents are created using ID/IDREF pairs, a built-in SGML and XML mechanism for linking within a physical file. The advantage is that these links are validated: for each source IDREF, a target ID must exist in the same file.
This is where the mechanism starts to fail, however. Increasingly, single-source publishing is desired, which means that XML is written in smaller fragments and the source and target elements are therefore seldom located in the same physical file, rendering the mechanism useless unless a normalization process (and a few tricks of the trade, such as replacing the IDREF attribute declarations in the fragment DTDs with CDATA) is implemented.
The mechanism is again severely strained, however, if multi-ended links are required (an IDREFS list of target values is allowed, admittedly), or if controlling attributes for link traversal are required.
XLinks do not provide an explicit mechanism for target validation, but are much easier to implement. Also, they can benefit from the same customizations as image references.
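Since the parser will not validate XLink targets for us, a check of this kind has to be written as part of the customization. The following is only a sketch under stated assumptions: the fragments are held in a dictionary rather than read from disk, the ID attribute is assumed to be called `id`, and the function name `broken_links` is invented for the example.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragments keyed by file name.
FILES = {
    "a.xml": (
        f'<section id="s1" xmlns:xlink="{XLINK}">'
        '<loc xlink:type="simple" xlink:href="b.xml#s2"/>'
        '<loc xlink:type="simple" xlink:href="#missing"/>'
        "</section>"
    ),
    "b.xml": '<section id="s2"><title>Target</title></section>',
}


def broken_links(files):
    """List (file, href) pairs whose fragment identifier matches no ID."""
    trees = {name: ET.fromstring(text) for name, text in files.items()}
    ids = {
        name: {e.get("id") for e in root.iter() if e.get("id")}
        for name, root in trees.items()
    }
    broken = []
    for name, root in trees.items():
        for elem in root.iter():
            href = elem.get(f"{{{XLINK}}}href")
            if href is None or "#" not in href:
                continue  # not a link, or a link to a whole resource
            target_file, _, target_id = href.partition("#")
            target_file = target_file or name  # "#id" stays in the same file
            if target_id not in ids.get(target_file, set()):
                broken.append((name, href))
    return broken


report = broken_links(FILES)
```

Unlike ID/IDREF validation, this check works across files, which is exactly what the fragment-based approach needs.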
In the easiest case, a direct reference-based cross-reference mechanism is easy to implement (but more on this in Section 4.4); a multi-ended link mechanism is more difficult. Section 4.5 discusses this.
In the olden (SGML) days, fragment inclusion was often implemented using file entities, but for the same reasons as with images, this solution is not practical today. Another, more advanced, solution involved SUBDOC documents, but it was fairly complex to implement and thus, editors sometimes lacked sufficient functionality for this. No directly corresponding mechanism is available in XML, even though XInclude can be said to be related to it. XInclude, unfortunately, uses its own linking mechanism, and it would be preferable to use the same mechanism for inclusions as for all other types of links.
XLink to the rescue, then. If XLink is used, the xlink:show attribute value [embed] fits in nicely, in effect defining the linked resource to be embedded in the source. Using XLink allows us to, again, use the same processes as for images and cross-references, and all that is left is to decide how to implement the embedding functionality.
The XLink recommendation does not define a processing model for link traversal, embedded or otherwise.
If simple XLink is used, a normalization of the resources during document loading is sufficient, provided that the fragment uses the same DTD as the source, and that the structure is allowed in the context. Alternatively, a presentation DTD may be used.
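A normalization pass of this kind can be sketched as follows. The sketch assumes that fragments are looked up by their `xlink:href` value through a caller-supplied `load` function, that the whole target document is embedded, and that no fragment includes itself; the element names and the `normalize` helper are made up for the example, and no DTD validation is attempted.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragment store and parent document.
FRAGMENTS = {"intro.xml": '<section id="s1"><title>Introduction</title></section>'}

PARENT = f"""<doc xmlns:xlink="{XLINK}">
  <include xlink:type="simple" xlink:href="intro.xml" xlink:show="embed"/>
</doc>"""


def load(href):
    """Stand-in for reading a fragment from the file system or a database."""
    return ET.fromstring(FRAGMENTS[href])


def normalize(root, load):
    """Replace every xlink:show="embed" element with the root of its target."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.get(f"{{{XLINK}}}show") == "embed":
                fragment = load(child.get(f"{{{XLINK}}}href"))
                normalize(fragment, load)   # fragments may include others
                fragment.tail = child.tail  # keep the surrounding whitespace
                parent[index] = fragment
    return root


normalized = normalize(ET.fromstring(PARENT), load)
```

Whether the result is valid against the authoring DTD is a separate question, which is why a looser presentation DTD may be needed.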
If the fragment uses another DTD, the use of XInclude should be considered as it defines a processing model for including such a structure.
Multi-ended links do not necessarily require a different mechanism, if the target fragments use the same DTD. The presentation may be a problem, however; how does one visualize a multi-ended and perhaps conditional link?
As I indicated previously, other link types may be required. At Company B, when accessing the service information, the service information tool not only includes a browser but also diagnostics hardware. Therefore, other link-related functionality is available such as embedded software downloads and software-based instruments (Volt meters, oscilloscopes, etc).
These, as well as other functions, are implemented using XLink (target resources do not need to be XML, remember). Also, the behaviour of these links can be controlled using the same attributes as for other link types: xlink:show and xlink:actuate are perfect for this, and what's even better is that the same customizations that are needed for images, cross-references, and fragment inclusions can be used. What differs here are presentation requirements, for example, the code (such as ActiveX objects, if the Internet Explorer browser is used) required to display a software-based Volt meter online. The linking mechanism is the same.
An often-overlooked detail is the generation of ID attribute values. The links must point at something, and that something needs to be (preferably) uniquely identified. The method often used in older documentation systems, where the author inserts IDs in the targets, quickly becomes impractical when dealing with hundreds or perhaps thousands of document fragments, with link arcs all over the place. Firstly, it is doubtful that the IDs will remain unique. Secondly, making users define ID attribute values is far from user-friendly and diverts them from more important issues. Thirdly, the method is error-prone since if the DTD allows ID attributes in other elements than those allowed as link targets, the links can simply point in the wrong places. For example, it is all too common to point at a section's title, when what was meant was a reference to the section that wraps the title and the section content.
Both of my case studies use—and need—ID generation software. In Company A's case, a JScript function that generates unique IDs for allowed target elements is invoked at element insert so when links are created, all possible targets already contain IDs. In Company B's case, every element contains an ID attribute value, generated at element insert, and thus, the allowed link targets again contain ID values when a link is created.
In the case of Company B, allowed target elements are the only ones displayed during lookup.
ID generation is easy enough to implement since most editors have sufficient event handling (and control) for element insert. Most also support a number of programming languages that in themselves can make the task considerably easier.
ID generation causes additional customization needs, however. Classic is the case of copy-paste: if an element is copied and pasted within a document (or in different document fragments that end up in the same document after a normalization), all IDs included in the element must be regenerated to avoid parsing errors.
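The copy-paste case can be sketched roughly like this. The ID format (element name, timestamp, counter) is an assumption loosely modelled on the ID shown later in this paper, the attribute name `id` is assumed, and the counter only guarantees uniqueness within one running process; a real system would need something stronger.

```python
import itertools
import xml.etree.ElementTree as ET
from datetime import datetime

_counter = itertools.count(1)


def new_id(element_name, now=None):
    """Build an ID from the element name, a timestamp, and a running counter."""
    stamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    return f"{element_name}-{stamp}-{next(_counter)}"


def regenerate_ids(subtree):
    """Give a pasted subtree fresh IDs and return the old-to-new mapping.

    The mapping lets the caller repoint links inside the copy that
    referred to the old IDs.
    """
    mapping = {}
    for node in subtree.iter():
        old = node.get("id")
        if old is not None:
            fresh = new_id(node.tag)
            node.set("id", fresh)
            mapping[old] = fresh
    return mapping


pasted = ET.fromstring('<section id="a"><title>T</title><figure id="b"/></section>')
id_map = regenerate_ids(pasted)
```

In an editor, `regenerate_ids` would be hooked to the paste event in the same way that `new_id` is hooked to element insert.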
As a link implementor, you must also decide which elements should contain IDs. At Company A, we took a minimalistic approach that works well in the context: IDs are only generated for allowed targets. At Company B, however, a different solution was required since the IDs are also used to enable document compare functionality.
The knowledgeable reader may now point out that XLinks aren't limited to elements containing IDs. This is true. However, a document management system that relies on indirect references expressed with the W3C XPointer recommendation is much more difficult to implement than an ID-based system.
The presentation of the links made is essential. Texts need to be generated, image URIs need to be retrieved and shown, and fragments need to be handled. So before we discuss lookup mechanisms, here are a few pointers on presentation.
A basic image inclusion mechanism does not require explicit XLink support in the editor or when publishing. Most editors allow simply defining a graphics element and reference attribute for displaying images, which means that only the additional functionality in XLink requires customization in the editor. When publishing, a simple XLink is again very easy to handle, requiring no explicit XLink support; XSLT, for example, does not need to be XLink-aware to handle an XLink reference to an image.
The built-in image display mechanism in an editor can in many cases be made to understand inline extended XLinks, but customizations are required as soon as multi-ended links are required, or if the extended XLinks are expressed out-of-line (see Section 4.5). Extended XLink link arcs are defined in arc-type elements that use a labeling mechanism to classify links, so an XLink arc identifies from and to labels for pointing out the source(s) and target(s), instead of including direct references. The references are defined in locator-type elements, which means that these links have to be resolved somehow. Here's a somewhat simplified multi-ended link example:
<links xlink:type="extended">
  <locator xlink:href="source.xml#id1"
           xlink:type="locator"
           xlink:label="source"/>
  <locator xlink:href="target.gif"
           xlink:type="locator"
           xlink:label="target-pic"/>
  <locator xlink:href="target2.gif"
           xlink:type="locator"
           xlink:label="target-pic2"/>
  <arc xlink:from="source"
       xlink:to="target-pic"
       xlink:type="arc"/>
  <arc xlink:from="source"
       xlink:to="target-pic2"
       xlink:type="arc"/>
</links>
If the above links are expressed inline, and if there are additional rules to explain when and how multi-ended links are displayed, the above example can be handled. Expressed out-of-line, however, they require a link resolution mechanism; see Section 4.5 for more on this.
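To make the label indirection concrete, here is a minimal sketch of resolving arcs into source and target URI pairs. It reuses the example link, with the XLink namespace declaration added so that it parses on its own; the helper names `xl` and `resolve_arcs` are invented, and arcs that omit `xlink:from` or `xlink:to` (which the recommendation allows) are not handled.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

LINKS = f"""<links xmlns:xlink="{XLINK}" xlink:type="extended">
  <locator xlink:type="locator" xlink:href="source.xml#id1" xlink:label="source"/>
  <locator xlink:type="locator" xlink:href="target.gif" xlink:label="target-pic"/>
  <locator xlink:type="locator" xlink:href="target2.gif" xlink:label="target-pic2"/>
  <arc xlink:type="arc" xlink:from="source" xlink:to="target-pic"/>
  <arc xlink:type="arc" xlink:from="source" xlink:to="target-pic2"/>
</links>"""


def xl(elem, name):
    """Read an attribute in the XLink namespace."""
    return elem.get(f"{{{XLINK}}}{name}")


def resolve_arcs(link):
    """Turn label-based arcs into (source URI, target URI) pairs."""
    by_label = {}
    for child in link:
        if xl(child, "type") == "locator":
            # Several locators may share one label, hence a list per label.
            by_label.setdefault(xl(child, "label"), []).append(xl(child, "href"))
    pairs = []
    for child in link:
        if xl(child, "type") == "arc":
            for source in by_label.get(xl(child, "from"), []):
                for target in by_label.get(xl(child, "to"), []):
                    pairs.append((source, target))
    return pairs


arcs = resolve_arcs(ET.fromstring(LINKS))
```

The result shows the multi-ended nature of the link: one source, two possible targets, and nothing yet to say which of them applies in a given context.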
Cross-references on paper are usually presented using a combination of a node count (listing, for example, a section number), the target's title or caption, and a page number. Additionally, generated text such as [Table] or [Section] may be used. Depending on the circumstances, some of these components may be left out; for example, a procedure step reference probably requires only the step number (and perhaps a page number).
Online, page numbers lose their meaning, and the target's title is often presented as a hyperlink that, when clicked, traverses the link.
Both environments, however, require the same basic presentation components:
A node count (although less useful online, especially if the link is presented as a hyperlink)
The target's title or caption
If the reference is made to an external document, the title (or other identification) of that document
Both case studies use the xlink:title attribute to store away the target's title (and sometimes a node count and other information) for formatting purposes. A linking element in a Company A document looks like this:
<loc xlink:type="simple"
     xlink:title="Automating the Tracks"
     xlink:href="#section-2002-7-7-9-34-58-45763502-8"
/>
On paper, the link formatting looks like this, complete with a node count, generated text, and a page number:
Section 4.2, "Automating the Tracks", Page 29
The node count is achieved using standard functionality in XSLT. Here's a basic example:
<xsl:template match="//body/section/title">
  <fo:block>
    <xsl:number level="multiple" count="section" format="1.1"/>
    ...
  </fo:block>
</xsl:template>
And before you raise your voice: the title element above is required in the DTD so it's perfectly OK to count it instead of the parent section.
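For readers who want to see the counting logic outside XSLT, the following sketch computes the same kind of multi-level section number and assembles the paper-style reference text. The sample document, its IDs, and the function names are assumptions; the page number is simply passed in, since only the formatter knows it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with nested sections.
DOC = """<body>
  <section id="s1"><title>Overview</title></section>
  <section id="s2"><title>Operation</title>
    <section id="s2-1"><title>Starting Up</title></section>
    <section id="s2-2"><title>Automating the Tracks</title></section>
  </section>
</body>"""


def section_numbers(parent, prefix=()):
    """Map each section ID to its ("1.2"-style number, title) pair."""
    numbers = {}
    for position, section in enumerate(parent.findall("section"), start=1):
        number = prefix + (position,)
        label = ".".join(str(part) for part in number)
        numbers[section.get("id")] = (label, section.findtext("title"))
        numbers.update(section_numbers(section, number))
    return numbers


def paper_reference(numbers, target_id, page):
    """Format a cross-reference as generated text, node count, title, page."""
    number, title = numbers[target_id]
    return f'Section {number}, "{title}", Page {page}'


numbers = section_numbers(ET.fromstring(DOC))
reference = paper_reference(numbers, "s2-2", 29)
```

Dropping the page number and wrapping the title in a hyperlink would give the online variant from the same data.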
Online... well, we don't show anything of this online even though we could easily make a presentable hyperlink out of the xlink:title and xlink:href information. Instead, we customize the document for both environments. See Section 4.6 for more.
Finally, cross-references created using extended XLink require more from us so I'll get back to it in Section 4.5.
How should fragments be presented after an inclusion reference? On paper, the fragments must be embedded in the parent since hyperlink traversal tends to fail on paper. A normalization of all material is required. Online, a hyperlink may suffice and even be preferable, since most readers prefer shorter chunks of text on a monitor.
Company A's system normalizes the fragments included in a parent document that really only is a glorified map: it consists of some meta-data and fragment inclusion links. James Clark's SP package does the job, invoked from a JScript snippet in an XMetaL macro when generating PDF or HTML Help. The fragments are always complete documents, however, and use the same top element as the parent document. Therefore, the publishing process uses its own DTD, a very loose—and very dangerous, if used directly by authors—version of the authoring DTD, to validate the normalized document.
Normalization is used at Company B as well, but normalization requirements vary greatly, depending on the presentation format and the meta-data context that dictates the conditionality of each link. For example, a procedure where the majority of steps is shared by three different products may use conditional links to include product-specific information. The parent document's meta-data indicates that the parent procedure is applicable to all three products, but the meta-data in each of the three included fragments indicates only the applicable product. Thus, the meta-data context when publishing will decide what is normalized and what is not.
Obviously, choosing all three products when publishing can still be interpreted in a number of ways. One is to ignore all product-specific information. Another is to include all three data profiles and format them so that each profile may be uniquely identified (for example, by using meta-data to generate a title or render all product-specific information using different colours). Fortunately, the majority of Company B's service information is published online, and the product's profile is chosen by the reader before the information is displayed, so the information may be presented in a number of different ways.
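The selection step itself can be reduced to a set comparison, as in the sketch below. It assumes the meta-data is a plain set of product names per fragment; the class name `ConditionalTarget`, the file names, and the product names are all invented, and Company B's actual meta-data model is certainly richer than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalTarget:
    href: str
    products: frozenset  # products this fragment applies to


def select_targets(targets, publishing_products):
    """Keep the fragments whose meta-data overlaps the publishing context."""
    wanted = set(publishing_products)
    return [target.href for target in targets if target.products & wanted]


# Hypothetical targets of one conditional inclusion link.
TARGETS = [
    ConditionalTarget("step-a.xml", frozenset({"A"})),
    ConditionalTarget("step-b.xml", frozenset({"B"})),
    ConditionalTarget("step-c.xml", frozenset({"C"})),
]

only_b = select_targets(TARGETS, {"B"})
everything = select_targets(TARGETS, {"A", "B", "C"})
```

When more than one fragment survives the selection, as in the second call, the formatting decisions discussed above still have to be made; the code only tells you that they are needed.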
How should a fragment be presented during editing? While it's possible to normalize the fragment directly, in effect by closing the present document and opening a new one, it costs processing power and time. A client-side fragment inclusion is more easily achievable, and might be recommended if the inclusions occur on more than two or three levels. In either case, the basic operation is a node inclusion, and should probably be implemented using DOM, making the process easier.
Normalization in the editing environment becomes a big problem in the above example with a parent procedure and three product-specific included fragments. All three fragments usually originate from the same linking element, which means that they cannot coexist when a specific profile is chosen. An unfortunate consequence of normalizing for editing purposes is that if the fragments are embedded in the parent, it becomes possible to create content in the parent between the fragments, unless the fragments are read-only (but if they are, then what's the point of including them in the first place?). There are programmatic solutions to this, of course, but they all cost processing power and, above all, time. A response time of more than a precious few seconds will result in displeased users unwilling to ever use the feature.
Better, and certainly cheaper, processing-wise, is to simply display the fragments in separate editor windows, and to use the link itself as the means of traversal. This method saves a lot of time while still being reasonably user-friendly. If organized well, locating the conditional fragments is not a problem.
Creating, editing, and refreshing a link requires reliable lookup and resolution mechanisms. In tools such as FrameMaker, this kind of linking functionality is taken for granted—who would accept having to locate the target in a MIF file, and be forced to enter an ID value manually—but how about XML and XLink?
Simple XLink lookup involves representing the allowed targets in the target document as a node tree and picking the right one using some kind of representation of this node tree. Most, but not all, editors offer DOM functionality, allowing this functionality to be implemented using the most common scripting languages. SoftQuad's XMetaL editor, for example, makes DOM available to any programming language that supports Windows Scripting Host. Arbortext's Epic Editor prefers its own scripting language, ACL, even though it allows scripting using other languages; the downside is that these languages must still use ACL method calls, which means that essentially, you're limited to what ACL can offer.
Company A's linking functionality includes an XLink dialog created using the dialog editor that comes with XMetaL. It is an interface to the DOM operations, and works quite well. It is shown in Figure 1.
A simple XLink implementation basically looks like this:
Invoke the linking functionality (usually a dialog such as the one displayed in Figure 1). The event should normally happen at link element insert.
Get the desired document node tree (either by looking at the current document or by browsing to one, and opening it as a node tree). This requires the target document to be opened. A client-side operation is often easily accessible using the editor's DOM functionality.
Create a list of the allowed target elements. In the simplest case, this is a list of all element nodes in the target document (again, an easy DOM operation). If only certain elements are required, a number of approaches will work, from an element name lookup table to methods that check whether or not the target elements will contain IDs.
Choose the desired element type and generate a list of those elements of the right type that contain an ID, shown in a lookup browser that displays the title or caption of each target (section titles, figure captions, etc; they are usually, but not always, the first or last children containing #PCDATA).
Choose the desired target element. The complete URI to it should be retrieved (for example, target.xml#id-123456), but also any generated text such as the title, and possibly a target element node count.
Insert the linking element with the retrieved URI. The generated text and the node count, if used, need to be stored someplace. The xlink:title attribute is perfect for this, and very easy to format.
To this process, it is fairly easy to add link behaviour functionality, for example, by allowing xlink:show and xlink:actuate values to be inserted from the lookup functionality.
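Stripped of the dialog, the steps above come down to two operations: listing candidate targets and building the linking element. The sketch below shows both under stated assumptions: the lookup table `ALLOWED`, the sample document, the attribute name `id`, and the default behaviour values are all invented for the example, while `loc` is the linking element name used earlier in this paper.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical target document.
TARGET_DOC = """<body>
  <section id="sec-1"><title>Automating the Tracks</title>
    <figure id="fig-1"><caption>Track layout</caption></figure>
    <figure><caption>No ID, so not offered as a target</caption></figure>
  </section>
</body>"""

# Element name lookup table: allowed target -> child element holding its label.
ALLOWED = {"section": "title", "figure": "caption"}


def candidates(root, element_type):
    """List (ID, title or caption) for allowed targets of one element type."""
    label_child = ALLOWED[element_type]
    return [
        (elem.get("id"), elem.findtext(label_child))
        for elem in root.iter(element_type)
        if elem.get("id")
    ]


def make_link(uri, target_id, title, show="replace", actuate="onRequest"):
    """Build the linking element, storing the generated text in xlink:title."""
    link = ET.Element("loc")
    link.set(f"{{{XLINK}}}type", "simple")
    link.set(f"{{{XLINK}}}href", f"{uri}#{target_id}")
    link.set(f"{{{XLINK}}}title", title)
    link.set(f"{{{XLINK}}}show", show)
    link.set(f"{{{XLINK}}}actuate", actuate)
    return link


root = ET.fromstring(TARGET_DOC)
sections = candidates(root, "section")
figure_id, figure_caption = candidates(root, "figure")[0]
link = make_link("target.xml", figure_id, figure_caption)
```

In the editor, the list returned by `candidates` would populate the lookup browser, and the user's choice would supply the arguments to `make_link`.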
A link refresh function is very similar to the above steps, except that the initial target node tree is already in place. No dialog is needed since we only need to check if the target is still in place, refresh node counts and generated text, and insert them in the link element.
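A refresh along these lines might look like the following sketch. It assumes that the target's label is always a `title` child, that IDs live in an `id` attribute, and that the caller supplies a `load` function returning the parsed target document; references within the current document (an href starting with `#`) would need the current node tree passed in instead, which is left out here.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK}}}href"
TITLE = f"{{{XLINK}}}title"

# Hypothetical document store.
DOCS = {"target.xml": '<body><section id="sec-1"><title>Automating the Tracks</title></section></body>'}


def load(uri):
    """Stand-in for fetching and parsing the target document."""
    return ET.fromstring(DOCS[uri])


def refresh_link(link, load):
    """Update xlink:title from the target; return False if the target is gone."""
    uri, _, target_id = link.get(HREF).partition("#")
    for elem in load(uri).iter():
        if elem.get("id") == target_id:
            link.set(TITLE, elem.findtext("title", default=""))
            return True
    return False


stale = ET.Element("loc", {HREF: "target.xml#sec-1", TITLE: "Old title"})
dangling = ET.Element("loc", {HREF: "target.xml#removed"})
stale_ok = refresh_link(stale, load)
dangling_ok = refresh_link(dangling, load)
```

A `False` result is the cue to flag the link for the author rather than to silently keep the old generated text.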
Link resolution when formatting is, in the easiest case, a matter of using what's in the xlink:title attribute. More advanced solutions may involve a variant of the refreshing functionality, or be a separate solution in, say, XSL-FO or a FOSI.
When extended XLinks are involved, the situation becomes more complex. Inline extended XLinks can use pretty much the same mechanism as the one outlined above; out-of-line XLinks are more difficult. At Company B, the basic lookup function is still a DOM operation: a node tree displaying allowed target elements is created and browsed using Java components, but there is also functionality to express and edit a link's conditionality.
The concept of out-of-line links is exciting: links, that is, relationships between resources, are described without any of the participants being aware that they in fact participate in a link. There are good reasons for using them, too, as I pointed out in the previous section. Implementation, however, can be very difficult.
Out-of-line link mechanisms can in the absence of an actual href only trigger on the node's ID, even though that is not necessarily required either (for example, an XPointer could be used to point out the element). For a basic implementation, let's assume that the linkbase is stored on a server and that the client contains as little actual linking software as possible (for a couple of reasons to this, see Section 5.4). Let's further assume that even though the participating resources will not contain actual linking information, we will use a dedicated linking element (out-of-line extended XLinks can be made between any type of elements but this requires a very different implementation from the one I'm describing).
The creation of a new link starts with the insertion of a linking element, together with an ID attribute value. The event invokes a target lookup dialog and sends the necessary information to the server. So what is necessary information?
The ID value, for one—we need to identify the starting resource and include that information in the linkbase.
The current document—again, this is required information for the linkbase.
Possibly, the linking element's type; this may be needed to refine allowed context, use, and so on. Also, the context of the linking element (block level, inline, etc). This is not necessary, strictly speaking, but may be needed to decide the types of allowed target elements.
Relevant meta-data for the current document, if this is required for the link. The meta-data context may influence the values returned by the server to process the link in the editor.
Quite possibly, the target resides in the current document, so the first node tree displayed by the dialog should be the current one. There are a number of solutions to this, but the important thing to remember here is that this node tree should be available on the client. It is more difficult, and above all time-consuming, to let the server dynamically build (and update) the node tree based on what's presently stored on the client. Therefore, the second, third, and fourth items on the above list could be replaced with one:
The complete node tree of the current document.
The server should include functionality for browsing and searching for other target documents, and functionality to represent the relevant parts of these as lookup node trees for the dialog. Obviously, there should also be functions to display the titles or captions of possible targets, and to fetch the target's ID once the target has been determined.
The target document's URI and the target element's ID attribute value go to the linkbase, making it a strictly server-based operation, even though the link can be made conditional or be otherwise modified or classified by the client-side query. The XLink recommendation alone contains plenty of such possibilities; meta-data used to make a link conditional will add even more.
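The exchange described so far can be sketched as a small data model. This is only an illustration in Python, not the implementation used in either case study: the names TargetLookupRequest, LinkbaseEntry, and register_link are hypothetical, and storing each end of the link as "document URI + # + ID" is my assumption about one convenient linkbase format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TargetLookupRequest:
    """What the client sends when a new linking element is inserted."""
    source_id: str                            # ID value of the new linking element
    document_uri: str                         # the document being edited
    link_element_type: Optional[str] = None   # e.g. "ref"; may restrict allowed targets
    context: Optional[str] = None             # e.g. "block" or "inline"
    metadata: dict = field(default_factory=dict)  # meta-data context, if the link needs it


@dataclass
class LinkbaseEntry:
    """One stored link: both ends live in the linkbase, not in the documents."""
    source_uri: str
    target_uri: str


def register_link(linkbase, request, target_document_uri, target_id):
    """Server-side step: record source and target once the target has been chosen."""
    entry = LinkbaseEntry(
        source_uri=f"{request.document_uri}#{request.source_id}",
        target_uri=f"{target_document_uri}#{target_id}",
    )
    linkbase[entry.source_uri] = entry
    return entry
```

A conditional link would add fields (for instance, the meta-data values under which the entry applies) to LinkbaseEntry; that part is deliberately left out here.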
What should be returned to the client?
In case of fragment inclusion, the URI to the target document and the complete document.
In case of an image reference, the URI pointing at the image file.
In case of a cross-reference, the target's URI, possibly a node count, and the title or other identifying string of the target.
For all types of links, anything dictating the behaviour of the link.
Other types of targets receive similar treatment; common to the above is that the URI is always returned. Why? Well, for one, for link traversal, but also so that the link can be refreshed.
The information returned to the client must be stored in some way; the problem here, remember, is that the linking elements do not contain any information about the link, only an ID. A good solution is to use a hash table. For those of you that do not program in Perl or other languages using the concept, a hash table is simply a list of key-value pairs. On the client, the keys are the ID values of the linking elements, and the values are the information returned by the server for each link. Note that a value may be a reference to another hash table, where the keys might subclass the information returned by the server in some convenient way.
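As a rough illustration of such a table, here is a minimal Python sketch (a Python dict is the same thing as a Perl hash). The function names and the layout of the stored values are my assumptions; the only things taken from the description above are that the keys are the linking elements' ID values and that a value may itself be a nested table.

```python
# Client-side link table: keys are the ID values of the linking elements,
# values are whatever the server returned for that link.
link_table = {}


def store_link(link_id, uri, kind, **details):
    """Remember the server's answer for one linking element."""
    # The nested "details" dict plays the role of the second-level hash table.
    link_table[link_id] = {"uri": uri, "kind": kind, "details": details}


def target_uri(link_id):
    """Return the URI needed for traversal or refresh, or None if unknown."""
    entry = link_table.get(link_id)
    return entry["uri"] if entry else None


store_link("ref-1", "specs.xml#sec-4", "cross-reference",
           title="Link Localization", node_count="5.4")
store_link("img-7", "images/scope.png", "image")
```

Since the URI is always returned by the server, it is kept at the top level of each entry; everything that varies by link type goes into the nested table.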
Target lookup is, in principle, not different from the simple XLink lookup outlined in Section 4.4 even though the linking information is used in a different manner.
Different editors will cause different client-side implementations; for an example, see Section 4.5.2. Also, there are editing requirements to be considered, such as how to display included fragments so that they may be conveniently edited, and how to display a conditional cross-reference.
Finally, note that the outline given here is sketchy at best; for more detailed solutions, I probably ought to write a book.
Company B's extended XLink solution is, for storage purposes, truly out-of-line. No linking information is retained in the documents themselves when they are stored in the database. The links and their meta-data are handled as link objects, separately from the file objects that contain the actual resources. In Documentum, this results in a database object model that represents the concept of a link base reasonably well. It's necessary, too; most resources are extensively reused, with very different meta-data contexts.
Problems arise, however, when attempting to handle out-of-line links in Arbortext's Epic Editor. It's extremely difficult to make a hash table solution, such as the one outlined in the previous section, work—a lookup mechanism based on the ID value as a key in a hash table rather than a direct, attribute-based reference doesn't seem to work. Fragment inclusion might have worked, but only by normalizing; storing a URI to the fragment elsewhere than in the inclusion element was unthinkable.
What we did was to place the linking information inline, but only during editing. When documents are checked out, the relevant linking information (depending on meta-data context) is lifted into the document. Thus, instead of this
<!ELEMENT ref EMPTY>
we get this:
<!ELEMENT ref (link-grp*)>
where link-grp is the wrapper element for extended XLinks (of type [extended], as specified in the XLink recommendation) and contains all [locator] and [arc] type elements necessary. In this way, the client-side processing can be handled while keeping the concept of out-of-line links intact where it's essential. It's not an ideal solution, but it works. When the documents are checked in, the linking elements (the link-grp trees) are again removed from the documents, and the Documentum link objects are updated.
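The check-out and check-in steps can be sketched roughly as follows. This is a simplified Python illustration using the standard library's ElementTree, not the Documentum or Epic code we actually wrote: the link objects are modelled as a plain dict from a ref element's ID to a list of link-grp elements, and the selection by meta-data context is assumed to have happened before check_out is called.

```python
import copy
import xml.etree.ElementTree as ET


def check_out(doc, link_objects):
    """Lift the stored link-grp trees into the matching ref elements."""
    for ref in list(doc.iter("ref")):
        for grp in link_objects.get(ref.get("id"), []):
            ref.append(copy.deepcopy(grp))
    return doc


def check_in(doc):
    """Strip link-grp trees from the document and return them keyed by ref ID."""
    lifted = {}
    for ref in list(doc.iter("ref")):
        groups = ref.findall("link-grp")
        for grp in groups:
            ref.remove(grp)
        if groups:
            lifted[ref.get("id")] = groups
    return lifted
```

The dict returned by check_in is what would be used to update the link objects in the database; the document itself goes back into storage with empty ref elements.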
Single-source publishing is a fancy term implying that the same source document can be published in all target environments, from paper to online. In practical terms, however, while it is perfectly feasible (and not that difficult) to create both PDFs and HTML from the same source XML, the results aren't always pretty. Consider the following example:
Link localization is an important issue when designing a flexible linking mechanism. For more information, see [Section 5.4, "Link Localization", Page 16].
On paper, the above will work just fine. Online, however, simply leaving out node counts and page numbers, and making the target section's title a hyperlink, will still cause a lot of excess information to be presented:
Link localization is an important issue when designing a flexible linking mechanism. For more information, see Link Localization.
Preferred would be
Link localization is an important issue when designing a flexible linking mechanism.
where the paper-based link and the sentence required to introduce it are removed, and the words [Link localization] become a hyperlink.
Where does this leave us? Well, the paper and online requirements are very different from each other; we need to create two links, not one, and remove the sentence intended for the paper cross-reference version. Fortunately, this is easy to solve in a DTD. Here are a few declarations that do the trick (I've included all XLink-related attributes in the %xlink.atts; parameter entity):
<!ELEMENT para (#PCDATA|hlink|xref)*>
<!ELEMENT hlink (#PCDATA)>
<!ATTLIST hlink %xlink.atts;>
<!ELEMENT xref (#PCDATA|locator)*>
<!ELEMENT locator EMPTY>
<!ATTLIST locator %xlink.atts;>
So our example above is tagged as follows:
<para><hlink xlink:href="#target-id">Link localization</hlink> is an important issue when designing a flexible linking mechanism.<xref> For more information, see <locator xlink:href="#target-id" xlink:title="Link Localization"/>.</xref></para>
The PDF generation process (XSL-FO, for example) accepts everything within the para element, but since we're publishing on paper, the hlink element doesn't receive any special formatting. After all, what would you do with a hyperlink on paper? The contents of the xref element are included, and the locator element and its target are processed for a node count and, later in the XSL-FO process, a page number.
When publishing online, however, the XSLT process knows that the xref element and its contents must be removed, and the hlink element and its contents be converted into a hyperlink. Pretty easy, huh? An author who is aware of the publishing processes and the required target formats can easily author content that is truly single-source.
And please note that this mechanism is just as easy to implement using extended XLink, as it is when using simple XLink. Also, the wrapper xref inline element can easily be made to handle conditionality.
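To make the online half of this concrete, here is a small Python sketch of the same transformation. In practice this would be an XSLT template; I'm using Python's ElementTree only to keep the example self-contained, and the function name to_online is made up. The sample para is the one tagged above, with the XLink namespace declaration added so that it parses on its own.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

SOURCE = (
    '<para xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<hlink xlink:href="#target-id">Link localization</hlink> is an important '
    'issue when designing a flexible linking mechanism.<xref> For more '
    'information, see <locator xlink:href="#target-id" '
    'xlink:title="Link Localization"/>.</xref></para>'
)


def to_online(para):
    """Build an HTML p element: hlink becomes a hyperlink, xref is dropped."""
    p = ET.Element("p")
    p.text = para.text
    last = None
    for child in para:
        if child.tag == "hlink":
            last = ET.SubElement(p, "a", href=child.get(XLINK_HREF, ""))
            last.text = child.text
            last.tail = child.tail
        elif child.tag == "xref":
            # Drop the paper-only sentence, but keep any text that follows it.
            tail = child.tail or ""
            if last is None:
                p.text = (p.text or "") + tail
            else:
                last.tail = (last.tail or "") + tail
    return p


html = ET.tostring(to_online(ET.fromstring(SOURCE)), encoding="unicode")
```

The paper process is the mirror image: keep the xref contents, resolve the locator to a node count and page number, and render hlink as ordinary text.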
An important consideration when reusing links is that included resources are often shared. This means that more than one writer at a time might need to edit the same linked resource in different contexts, and one of the writers could conceivably make changes unacceptable to the others. In the Company A case, this is not an issue. There are only two technical writers, so it's fairly simple to create authoring processes minimizing the risks. In Company B's case, with several dozen writers, we decided to implement what is called [optimistic check-out].
Optimistic check-out means that resources are not locked for editing by other users when checked out. Instead, the system notifies the new user if the resource is already in use, forcing that writer to communicate with the others so that incompatible changes are not inserted in the shared resource, or if such changes cannot be avoided, that the incompatible parts are lifted out from the shared resource and referenced using a conditional link applicable only to those that need the change.
The alternative is to lock the files, or implement automated means of breaking out the incompatibilities, both of which are impractical and expensive, processing-wise. Sometimes it's easier to simply force the writers to talk to each other.
Again, please note that this solution is as applicable to simple XLink systems as it is to those using extended XLink.
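The core of optimistic check-out fits in a few lines. The following Python sketch is only a model of the idea, with a made-up Repository class; a real system would of course keep this state in the database and notify the writers through the editor.

```python
class Repository:
    """Tracks who has a resource checked out, without ever locking it."""

    def __init__(self):
        self._checked_out = {}  # resource name -> set of user names

    def check_out(self, resource, user):
        """Always succeeds; returns the other users already editing the resource."""
        holders = self._checked_out.setdefault(resource, set())
        others = sorted(holders - {user})
        holders.add(user)
        return others

    def check_in(self, resource, user):
        """Register that the user is done with the resource."""
        self._checked_out.get(resource, set()).discard(user)


repo = Repository()
first = repo.check_out("warning.xml", "anna")   # nobody else is editing
second = repo.check_out("warning.xml", "ben")   # ben is told about anna
```

A non-empty return value is the cue for the system to tell the new writer who to talk to before changing the shared resource.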
Link localization is really a matter of hooking a language attribute (preferably xml:lang) to the link or to an ancestor element, and using that information to change, for example, the generated text's language, the numbering in the node count, or some other pertinent information.
It is important to realize that localization mechanisms are not connected to XLink mechanisms. They touch on presentation only. This doesn't mean that you don't need to include localization in your link lookup and resolution functionality, such as the one outlined in Section 4.4; you do. However, including the functionality is an add-on, just as conditionality is, and it is wise to separate the language function or functions in your code.
Language attributes can also be used as additional conditionality; for example, a fragment inclusion may be valid in one market, but not in another. This should not be confused with link localization even though using standard xml:lang functionality in tools could well be used to achieve the correct behaviour in the conditional link.
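A separate language function of the kind suggested above might look like this in Python. It is a sketch under stated assumptions: the TEMPLATES table and its two entries are invented for the example, and because ElementTree elements carry no parent pointer, the caller supplies a child-to-parent map.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical generated-text templates, keyed by primary language tag.
TEMPLATES = {
    "en": 'see Section {number}, "{title}"',
    "sv": 'se avsnitt {number}, "{title}"',
}


def effective_lang(element, parents, default="en"):
    """Return xml:lang from the element or its nearest ancestor that has one."""
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang:
            return lang
        node = parents.get(node)
    return default


def generated_text(link, parents, number, title):
    """Produce the cross-reference text in the language that applies to the link."""
    primary = effective_lang(link, parents).split("-")[0]
    template = TEMPLATES.get(primary, TEMPLATES["en"])
    return template.format(number=number, title=title)


root = ET.fromstring(
    '<doc xml:lang="sv"><para><ref id="r1"/></para>'
    '<para xml:lang="en-GB"><ref id="r2"/></para></doc>'
)
parents = {child: parent for parent in root.iter() for child in parent}
refs = {ref.get("id"): ref for ref in root.iter("ref")}
```

Note that nothing here touches xlink:href or the link resolution itself; the function only decides how an already resolved link is worded.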
The previous chapter listed a number of problems and their solutions when implementing XLink. A lot of things can be learned from these; while seemingly very different, both case studies use many of the same solutions (and should have used still others). This chapter gives a few pointers that we would have killed for, had we known...
Here's something obvious: Don't use several linking mechanisms when one is enough! In other words, don't mix XLink with, say, unparsed file entities. It leads to duplicated functionality in your software. And here's something else equally obvious: Use a standard mechanism, don't invent your own (unless you absolutely have to)! XLink is a de facto standard. It isn't perfect but it's by far more powerful than other available standards. Use it, we need more practical XLink implementations!
When deciding on which XLink flavour to use, ask yourself the following questions:
Do you need multi-ended links? That is, do you need to point out more than one resource from a single starting point? Multi-ended links are not that hard to implement but they do bring with them various difficulties when presented onscreen or on paper.
Do you need conditionality in your links? That is, do you need to express a condition such as [if condition A, point to target 1, if condition B, point to target 2]?
Do you need to express your links out-of-line, in a linkbase?
A [yes] to any of the above means that you probably have to implement extended XLink. Healthy [no's] to the above mean that simple XLink is enough. And in most cases, it should be. Company A in my case study does well using simple XLink; they have little need for conditionality or a linkbase. Company B, however, requires both, plus multi-ended links (the online embedded tools, such as the software-based oscilloscopes, require thumbnail images that correspond to the tools, among other things). But remember why Company B really needs extended XLink: their multi-ended links express link conditionality based on meta-data and avoid making the XML fragments more version-dependent than necessary.
Simple XLink, however, is sufficient for most linking, regardless of the type of target. For example, pointing out Company B's [online tool] is easy enough to implement using simple XLink. The problem is not in the linking mechanism as such, it is in embedding the target when publishing, which is a separate problem, unrelated to XLink as such.
Extended XLink is far more difficult to implement than simple XLink. And if the links are out-of-line, client-side implementations will be much more time-consuming and expensive to achieve, requiring lookup and presentation mechanisms as queries to the linkbase. Also keep in mind the limitations the editor will set: can you format the link onscreen? If the editor has difficulties with fetching the required URIs and other necessary data (such as generated text or node counts), you might well end up having to use simple XLink anyway, or at least inline extended XLink.
A DTD that includes a single XLink mechanism is no more difficult to write than one containing three different linking mechanisms. It can be considerably simpler, however:
Parameterize your linking components! For example, regardless of how many elements you decide to use XLink in, see to it that the relevant attributes are listed only once.
Don't use more linking elements than you have to! For example, I've seen too many DTDs with different elements for different target element types: figureref, tableref, and topicref are typical examples, quoted from a well-known industry standard DTD. Using different elements can lead to duplicated code and added confusion among both writers and implementors. It's easy to implement a lookup mechanism that can distinguish between all different types of targets, even though only one linking element is used.
On the other hand, do use dedicated linking elements! While it's possible to link from anything to anything else using XLink, a dedicated linking element is easier to grasp for writers and programmers alike.
Don't forget to include ID attributes for all allowed targets (and sources)! XPointer-based referencing is quite possible, but let's do this one step at a time, huh?
Plan for single-source! For example, witness the hlink/xref combo I described in Section 4.6. The mechanism as such is uncomplicated, yet allows writers to efficiently plan for single-source. And if you mean to allow extensive fragment inclusion, see to it that links are allowed wherever your fragments are.
Plan for reuse! (But if you want to reuse single paragraphs or inline elements, make sure to secure project financing first.)
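To back up the claim that a single linking element is enough, here is a hedged Python sketch of a lookup that tells the target types apart by looking at the target rather than at the link. The element names in TARGET_KINDS and the function name are assumptions made for the example, and only same-document references of the form #id are handled.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical mapping from target element names to link kinds.
TARGET_KINDS = {"figure": "figure", "table": "table", "section": "topic"}


def classify_target(root, ref):
    """Return the kind of element the link points at, or None if unresolved."""
    target_id = ref.get(XLINK_HREF, "").lstrip("#")
    for element in root.iter():
        if element.get("id") == target_id:
            return TARGET_KINDS.get(element.tag, "other")
    return None


root = ET.fromstring(
    '<doc xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<figure id="f1"/><table id="t1"/>'
    '<ref xlink:href="#f1"/><ref xlink:href="#t1"/><ref xlink:href="#nope"/>'
    '</doc>'
)
kinds = [classify_target(root, ref) for ref in root.iter("ref")]
```

The same ref element serves figures, tables, and topics; the formatting code branches on the returned kind instead of on the name of the linking element.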
XLink is today largely regarded as theoretical. Few have actually implemented it in a production environment so when it's discussed, the discussion tends to focus on how XPath or XPointer can be used in a link, how semantic webs can be created, and so on.
We have, however, so here are some pointers:
Isolate business logic from the tools. That is, don't write code that is irrevocably integrated with any specific tool. Some day, you may want to replace the tool with something else. Also, since we're dealing with XLink here, we're discussing something that encompasses the whole system, not just an editor (or database, or presentation system, or whatever). It is likely that you'll need the same basic XLink functionality in many, or even most, of your applications. Thus, it is a good idea to share the same basic functionality instead of duplicating it.
Don't confuse the logic associated with linking with that of profiling, meta-data, and conditionality. While the profiling mechanism should certainly be connected to the linking ditto in the DTD, the profiling functionality should have nothing whatsoever to do with XLink functionality as such.
Develop an XLink Application Programmer's Interface (API) and have your developers use it instead of forcing them to unnecessarily duplicate DOM operations client-side when all they really need is to access some well-defined XLink-related task. See Section 5.5.
Choose among programming languages with existing XML libraries. The less you have to develop from scratch, the better. (Obvious, isn't it? Why do so many insist on reinventing the wheel?) Don't sacrifice string handling, pattern matching, and the like, though, just because you happen to love coding in a particular language.
Beware of different, and differing, parsers. While most parsers will do the job just fine in 90% of the cases, there are annoying differences in the details, say, in the treatment of namespaces. So decide on one parser and disable all others, to the extent possible. For example, even though an editor may have a perfectly OK internal parser, a few quirks may cause you a lot of headache if you use something else on the server and still something else in a publishing environment.
And do force all of your (and your client's) subcontractors to use that parser, or some very strange things may occur, late in the project, or long after it's finished.
Here are some relevant pointers on DTD design:
Place section titles, figure captions, and so on, so that your lookup software doesn't have to use a customized path to find each and every one ([third #PCDATA element if a table is present, second if not...]). A nice rule of thumb is to set the caption or title first or last in the content model (a very easy rule to implement: [first or last child containing #PCDATA]).
Don't complicate text generation (unless you have to) by allowing for, say, other cross-references in the generated text. Multi-level resolution is not a whole lot of fun.
Keep in mind your presentation requirements, and the tools required to publish, and remember that the requirements stemming from them may differ from those you got used to in the editing environment.
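The title rule of thumb above translates into very little code. The following Python sketch is one possible reading of [first or last child containing #PCDATA]: it tries the first child, then the last, and accepts the one that has character data of its own. The function name and the sample elements are invented for illustration.

```python
import xml.etree.ElementTree as ET


def find_title(element):
    """Return the title or caption text using the first-or-last-child rule."""
    children = list(element)
    for candidate in children[:1] + children[-1:]:
        if candidate.text and candidate.text.strip():
            # Include text inside inline markup such as emphasis.
            return "".join(candidate.itertext()).strip()
    return None


section = ET.fromstring(
    '<section id="s1"><title>Link Localization</title><para>Text.</para></section>'
)
figure = ET.fromstring(
    '<figure id="f1"><graphic/><caption>The <b>scope</b> tool</caption></figure>'
)
```

With a DTD that follows the rule, the same function serves every target type; without it, each target needs its own path expression.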
A huge problem when implementing Company B's extended XLink solution was the sheer number of developers writing very similar XLink-related DOM operations from their various applications in and around the server. Taking the time to define an XLink API would have saved months of work and bug fixes, and resulted in a reusable code base.
Company A's simple XLink system was fairly uncomplicated so there were hardly months or money to save by creating an API to be used instead of the numerous direct DOM operations in the JScript macros. But we do plan on using similar solutions again so an API is certainly useful. Developing a reasonably generic API for Company A (as it happens, they were first) would have made the task of creating extended, out-of-line XLink functionality for Company B considerably easier.
Extended and simple XLink are very different, you may now say, but they really aren't. Simple XLink is, as has been stated in the recommendation and elsewhere, a special case of extended XLink, so the obvious solution is to create something that handles extended XLink, and then leave out some of the processing when encountering the xlink:type="simple" attribute value.
Here are a few similarities worth noting when identifying definable functions:
Target identification is the same: xlink:href is used for both, and behaves identically.
Also consider source identification in extended XLink: it's very similar to a simple XLink target description (and it happens to use the same attribute, too). This is not by accident, so if you leave out the implicit arc from the linking element to the target in a simple XLink (I say implicit because apart from the fact that a non-XLink namespace element or attribute happens to share the tag boundaries with the xlink:href attribute and that there is an attribute stating that the link is of type [simple], there is nothing to bind them together, semantically), there's again something to remember in the API design.
The handling of xlink:role is identical. What differs is xlink:arcrole—don't confuse them. But if you take a look at the role mechanism as such, and remove the coupling between an arcrole and an arc, say, more similarities become apparent.
The use of xlink:title is identical; the title is there to give the XLink semantics represented by an element a title.
The behaviour attributes (xlink:show, xlink:actuate) are the same. They dictate the behaviour of the link, stating when and how, rather than changing the linking mechanism itself.
What's different is how link arcs are defined. A labelling system is used for extended XLinks, as an aliasing and classifying mechanism. The result is a sort of multi-dimensional link: An arc element implicitly points out (creates links between it and) any resource (a [locator] or [resource] type XLink element) using the labels identified in the arc element's xlink:from and xlink:to attribute values. The [locators] and [resources] in turn point out the actual link participants.
Obviously there's more to the definition if roles are used (either the roles in the locators and arcs, or the [arcroles] defined for the arc elements themselves). But these are definable modifications to the default behaviour, and can be implemented separately (or not at all, depending on your requirements).
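As an illustration of how one API function could cover both flavours, here is a Python sketch that expands a link into (source, target) pairs. It is a simplification, not a complete XLink processor: roles, arcroles, and behaviour attributes are ignored, the names traversals and LOCAL are made up, and the element names in the sample are arbitrary, since only the xlink:type attributes matter. The treatment of a missing from or to as [all labels] follows the XLink recommendation.

```python
import xml.etree.ElementTree as ET

X = "{http://www.w3.org/1999/xlink}"
LOCAL = "(local resource)"  # stands for content inside the linking element itself


def _resolve(by_label, label):
    """A missing from/to value stands for all labels in the extended link."""
    if label is None:
        return [res for group in by_label.values() for res in group]
    return by_label.get(label, [])


def traversals(link):
    """Expand a simple or extended link into (source, target) pairs."""
    if link.get(X + "type") == "simple":
        # The implicit arc of a simple link: from the element to its href.
        return [(LOCAL, link.get(X + "href"))]
    by_label = {}
    for child in link:
        label = child.get(X + "label")
        if label is None:
            continue
        if child.get(X + "type") == "locator":
            by_label.setdefault(label, []).append(child.get(X + "href"))
        elif child.get(X + "type") == "resource":
            by_label.setdefault(label, []).append(LOCAL)
    pairs = []
    for arc in link:
        if arc.get(X + "type") == "arc":
            sources = _resolve(by_label, arc.get(X + "from"))
            targets = _resolve(by_label, arc.get(X + "to"))
            pairs.extend((s, t) for s in sources for t in targets)
    return pairs


EXTENDED = ET.fromstring(
    '<link-grp xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">'
    '<loc xlink:type="locator" xlink:label="src" xlink:href="manual.xml#r1"/>'
    '<loc xlink:type="locator" xlink:label="tgt" xlink:href="specs.xml#a"/>'
    '<loc xlink:type="locator" xlink:label="tgt" xlink:href="specs.xml#b"/>'
    '<arc xlink:type="arc" xlink:from="src" xlink:to="tgt"/>'
    '</link-grp>'
)
SIMPLE = ET.fromstring(
    '<hlink xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:type="simple" xlink:href="#target-id"/>'
)
```

Callers get the same list-of-pairs result for both link types, which is the point: the simple case is just the extended case with one implicit arc.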
I should also point out that there's nothing here that contradicts the lookup steps outlined in Section 4.4, quite the contrary. A well-planned factoring of the individual components of the API will make implementation fairly straightforward, and accessible to anyone needing the XLink functionality.
Outlining a complete XLink API is beyond the scope of this paper, but the above hopefully illustrates how a common XLink API could take form. A link is a description of the relationship(s) between the participating resources. Adding participating resources does not change this, it only makes the task of defining a processing model, and above all, a presentation model, more complex. Thus, if you can implement simple XLink properly, extending it shouldn't be a problem; deciding how to present the resulting links is, but that particular problem is defined more by how our minds work.
There is a lot to say about XLinks from the practical point of view, far more than is possible within the confines of a whitepaper (approx. 8000 words or thereabouts), so I've only touched upon most of the topics central to the issue. However, I hope that there's enough here to inspire you to consider implementing XLink instead of yet another ID/IDREF or file entity solution. XLink is the standard we have. It comes with the territory, so to speak, and it's well worth using.
Myself, I would probably have been stuck with those nasty ID/IDREF pairs, had it not been for my friend and colleague, Mr Henrik Mårtensson, who wrote the original JScript macros that ended up as the simple XLink implementation at Company A. Henrik, like me, likes XLink and wants to see it in action. Therefore, thanks, Henrik.
Also, thanks to my colleagues at Information & Media. They have listened to me and moved from the traditional and easy solutions to XLink dittos more than once, in spite of the extra work I've caused, and I have learned much from them.