Markup of overlapping structures is, depending on one's point of view, either a perpetual hot topic or a negligeable edge case in the theory of markup. Starting from the belief that overlapping structures are not just common but important, not only in the analysis of literary works but also in the management of changing content, we explore ways to represent overlapping structures in tractable ways. The requirements for representing overlapping structures differ, depending on what kinds of structures we expect to represent. The similarities and differences among existing proposals for overlapping structures are illuminated by defining ways to represent them in relational form. The experiment shows that surprising variations in structure can be created by very restricted variations on a schema for the relational representation of XML.
Keywords: Concurrent Markup/Overlap
| XML Source | PDF (for print) | Author Package | Typeset PDF |
Recent work on overlapping structures has described a number of different models for representing and thinking about textual structures: concurrent trees, multi-colored trees, LMNL documents, MultiX documents, and Goddag structures among them.
One way to improve understanding of a data structure is to consider different ways of representing it. This paper outlines relational representations of several of these data structures, as a way of getting a slightly different angle on them and allowing more or less direct comparisons.
As a side effect, relational representation of these structures also provides a simple way to build proof-of-concept systems for searching and manipulating them. It should be noted, however, that the relations described here have been designed for simplicity in exhibiting the similarities and differences among the different data structures, and not to optimize performance of existing relational systems.
It is assumed that the reader is familiar with SGML and XML markup and the kinds of tree data structures commonly used to represent SGML and XML documents.
The other data structures mentioned here have been well described in their original publications, or in reviews (see, for example, [Witt 2004] and [DeRose 2004]); the notes which follow provide only a cursory introduction to each.
Although originally introduced largely with the thought of providing both ‘logical’ and ‘formatted’ views of documents, the SGML feature called CONCUR was specifed with enough generality and abstraction that it is not limited to that original application. Multiple concurrent views of the same document can be provided, without restriction as to number or kind of view. Each view takes the form of a conventional document tree, and all views share the same frontier of character data.
Although seldom exploited by SGML users, concurrent markup has seen something of a revival in recent years, both in linguistic annotation (see for example [Witt 2004], [Hilbert et al. 2005], and [Schonefeld 2007]) and manuscript studies ([Dekhtyar / Iacob 2005]).
A simple example from [Schonefeld / Witt 2006] may illustrate the mechanism:1
<(L1)div type="dialog" org="uniform">
<(L2)text>
<(L1)u who="Peter">
<(L2)s>Hey Paul! </(L2)s>
<(L2)s>Would you give me
</(L1)u>
<(L1)u who="Paul">
the hammer? </(L2)s>
</(L1)u>
</(L2)text>
</(L1)div>This presents two trees. First, the L1 tree, whose conventional XML representation would be:
<div type="dialog" org="uniform">
<u who="Peter">
Hey Paul!
Would you give me
</u>
<u who="Paul">
the hammer?
</u>
</div> <text>
<s>Hey Paul! </s>
<s>Would you give me
the hammer? </s>
</text>Multi-colored trees are collections of trees, each a more or less conventional XML tree when viewed individually, but stored in such a way as to share structure wherever possible [Jagadish et al. 2004]. A collection of information may thus be addressed in several different XML forms, without having to store the information multiple times (which wastes storage space as well as causing problems when updating the material).
A collection of information about movies, directors, actors, and so on, for example, might in conventional XML be organized either by actor (with <movie> elements appearing as children of <actor> elements), or by movie (with <actor> as child of <movie>), or in some other way. Almost any organizational principle may involve redundancy and may be inconvenient for certain kinds of searches, while making other searches very simple (to find the movies in which an actor has appeared, just find the appropriate <actor> element and then select all of the children named <movie>). The use of multi-colored trees allows the information to be stored and addressed not as a single XML document, but as several documents, each optimized for certain kinds of query and retrieval. In XML databases where traversal of the parent-child or next-sibling links is fast, but value-based lookups of other elements are slow), multi-colored trees can help provide faster retrieval.
Multi-colored trees resemble CONCUR in providing multiple tree structures over the same collection of data. They differ from CONCUR in that they do not require or guarantee that the different trees have the same sequence of characters at the leaves, nor that all character data appear in every tree, nor that shared structures occur in the same sequence in different trees. They do not forbid these things, either, so the example given above for concurrent markup could also be stored as a pair of multi-colored trees.
A more characteristic example, however, might be the movie database described in [Jagadish et al. 2004] (as shown here, the example has been augmented slightly in the interests of verisimilitude with information from the Internet Movie Database). The user may view the database as a set of three different XML documents, one showing actors and their roles:
<actors>
<actor>
<name>Bette Davis</name>
<movie-role id="RB023">
<!--* Payment on Demand (1951) *-->
<name id="RB024">Joyce Ramsey (nee Jackson)</name>
</movie-role>
<movie-role id="RB117">
<!--* All about Eve (1950) *-->
<name id="RB125">Margo Channing</name>
</movie-role>
<movie-role id="RB927">
<!--* Beyond the Forest (1949) *-->
<name id="RB925">Rosa Moline</name>
</movie-role>
<!--* ... *-->
</actor>
<actor>
<name>Audrey Hepburn</name>
<!--* ... *-->
<movie-role id="RB097">
<!--* The Children's Hour, 1961 *-->
<name id="RB475">Karen Wright</name>
</movie-role>
<movie-role id="RB157">
<!--* Breakfast at Tiffany's, 1961 *-->
<name id="RB175">Holly Golightly</name>
</movie-role>
<movie-role id="RB837">
<!--* The Nun's Story, 1959 *-->
<name id="RB975">Sister Luke</name>
</movie-role>
<!--* ... *-->
</actor>
</actors>
<movie-genre>
<name>all</name>
<movie-genre>
<name>comedy</name>
<movie id="RG012">
<name id="RG015">All about Eve</name>
<movie-role id="RB117">
<!--* Bette Davis *-->
<name id="RB125">Margo Channing</name>
</movie-role>
<movie-role>
<!--* Anne Baxter *-->
<name>Eve Harrington</name>
</movie-role>
<movie-role>
<!--* George Sanders *-->
<name>Addison DeWitt</name>
</movie-role>
<!--* ... *-->
</movie>
<movie-genre>
<name>slapstick</name>
<!--* ... *-->
</movie-genre>
</movie-genre>
<!--* ... *-->
<movie-genre>
<name>drama</name>
<movie>
<name>Breakfast at Tiffany's</name>
<movie-role id="RB157">
<!--* Audrey Hepburn *-->
<name id="RB175">Holly Golightly</name>
</movie-role>
<movie-role>
<!--* George Peppard *-->
<name>Paul 'Fred' Varjak</name>
</movie-role>
<movie-role>
<!--* Patricia Neal *-->
<name>2-E (Mrs. Fallenson)</name>
</movie-role>
<movie-role>
<!--* Buddy Ebsen *-->
<name>Doc Golightly</name>
</movie-role>
<!--* ... *-->
</movie>
</movie-genre>
</movie-genre>
A third document shows awards won by movies:
<movie-award>
<name>Oscar</name>
<movie-award>
<name>1950s Oscars</name>
<movie id="RG012">
<name id="RG015">All about Eve</name>
<votes>14</votes>
</movie>
<!--* ... *-->
</movie-award>
</movie-award>The nodes shown with id attributes are shared among the different trees: the movie-role element whose ID is “RB117”, for example, appears both in the actors tree and in the movie-genre tree, as does its name child.2 This redundancy makes queries against the XML easier to specify (exploiting the fact that XML query languages typically make specification of parent : child or ancestor : descendant relations especially convenient), and (at least in the test implementation) also faster (presumably because traversal of parent : child links is highly optimized).
When a node is shared between trees, it may have different children in the different trees; in the example, element RG012 illustrates this phenomenon. When several trees of different colors do share a great deal of structure, with elements in different trees having the same children in the same sequence, then (notionally, at least) multi-colored trees will store the information about the parent : child and previous : next sibling relationships several times.
Goddag structures are directed acyclic graphs in which the directed arcs represent a parent/child relation. They were developed to aid in the understanding, markup, and processing of documents with overlapping structures; they are rooted especially in work on manuscript transcription, where important phenomena (deletions, marks in the margin, paragraphs) often overlap each other. As in the trees used to represent XML documents, the children of any node in a Goddag are ordered (or typically so; there are exceptions). Unlike XML trees, however, Goddag structures allow any node to have multiple parents.
Several forms of Goddag structure may be distinguished:
For example, consider the following TexMecs document, which has two roots, one for the scene (the dramatic view) and one for the embedded haiku.
<* The dramatic view *> <scene| <sp who="HUGHIE"|<p|How did that translation go?|p> <lg type="haiku"|<l|da de dum de dum,|l> <l@frog|gets a new frog,|l> <l|...|l>|lg> |sp> <sp who="LOUIS"|<p|Er ...|p> <lg|<l@new|it's a new pond.|l>|lg> |sp> <sp who="DEWEY"| <p|Ah ...|p> <lg|<l@pond|When the old pond|l>|lg> <p|Right. That's it.|p> |sp> |scene> <* The embedded poem *> <haiku| <lg|<^l^pond><^l^frog><^l^new>|lg> |haiku>
For further details, the reader is directed to [Sperberg-McQueen / Huitfeldt 2000].
This section describes simple relational representations of XML, concurrent XML, multi-colored trees, and Goddag structures.
The representations have two goals: primarily, they provide a clear and explicit definition of the data structures which exhibits clearly ways in which they are the same and ways in which they differ. As a secondary matter it provides an obvious mapping to widely available technology which can be used to support search and retrieval operations on overlapping structures. The requirements to be met by the representations defined here include:
Each relation is specified in the form
For example
nodes(node_id nID, node-type type)
Some conventional column names are used repeatedly with the same meaning:
For most column names, a simple form of ‘Hungarian notation’ is used: some names have a short prefix indicating the domain from which the values are drawn:
When the type of a column is clearly indicated by its name, the explicit type indicator is often omitted. For example, the following two expressions specify the same relation:
child_parent(node_id nCh, node_id nPar) child_parent(nCh, nPar)
The values in the relations sketched here are drawn from the following domains:
The data structures described here have in common that they represent the document as a graph, i.e. as a set of nodes and a set of arcs connecting those nodes (with nodes and arcs possibly ordered in different ways). We postulate several kinds of node: documents, elements, attributes, comments, pcdata (‘text nodes’), and processing instructions.
The representation of these basic nodes is the same for XML, concurrent XML, colored XML, and Goddag structures. It uses the following relations:
For example, the nodes in the example of section 2-1 could be represented thus (trimming whitespace, for convenience, using “-” to indicate null values, and omitting the uninformative node relation for brevity):
node_type(L1,document). node_type(L2,document). node_type(n1,element). node_type(n2,attribute). node_type(n3,attribute). node_type(n4,element). node_type(n5,element). node_type(n6,attribute). node_type(n7,element). node_type(n8,pcdata). node_type(n9,element). node_type(n10,pcdata). node_type(n11,element). node_type(n12,attribute). node_type(n13,pcdata). element(n1,-,div). element(n4,-,text). element(n5,-,u). element(n7,-,s). element(n9,-,s). element(n11,-,u). attribute(n2,n1,-,type,"dialog"). attribute(n3,n1,-,org,"uniform"). attribute(n6,n5,-,who,"Peter"). attribute(n12,n11,-,who,"Paul"). pcdata(n10, " the hammer? "). pcdata(n10, "Would you give me "). pcdata(n6,"Hey Paul! ").
The different data structures described here differ in the graph structures in which they embed the nodes of the document. This and the following sections describe the representation of those structures.
XML defines a single tree over the document. Conventionally, n-ary trees like those of XML are often described as trees in which each node as a variable-length sequence of children. A straightforward translation of this idea into our notation would give us
For our purposes here, however, it is more convenient to factor the information into two relations, each simpler:
Cycles involving any mixture of these two relations are forbidden. And there must be exactly one document node (per XML document).
Various refinements and controlled redundancies are possible, which may make different operations or traversals of the tree more convenient. For example
For example, given the node identities specified above for the examples in section 2-1, the L1 tree could be described thus using node_firstchild and node_next:
node_firstchild(L1,n1). node_firstchild(n1,n5). node_firstchild(n5,n8). node_firstchild(n8,-). node_firstchild(n10,-). node_firstchild(n11,n13). node_firstchild(n13,-). node_next(L1,-). node_next(n1,-). node_next(n10,-). node_next(n11,-). node_next(n13,-). node_next(n5,n11). node_next(n8,n10).
node_relations(L1,-,-,-,n1,-). node_relations(n1,L1,-,-,n5,n11). node_relations(n5,n1,-,n11,n8,n10). node_relations(n8,n5,-,n10,-,-) node_relations(n10,n5,n8,-,-,-) node_relations(n11,n1,n5,-,n13,n13). node_relations(n13,n11,-,-,-,-).
Concurrent XML constructs several trees over the same character data. Each of those trees may be described in the same way as any XML tree; for simplicity and regularity we assume here that all pcdata nodes in the document are shared among all trees, and that no other nodes are shared.3 The first change needed is to change some of the constraints on the relations by specifying that:
The second necessary change is to specify, in the node_next relation, an additional column to indicate which tree the relation applies to. For convenience, we use the identifier used in the start- and end-tags of elements in that example. (In SGML, this is the identifier of the root element of that tree; in XCONCUR it is the arbitrary identifier used for the document node.):
No similar change is needed for the node_first relation, since the only nodes which appear in more than one tree are PCDATA nodes, which have no children and thus no first children, in any tree.
The concurrent trees for the example in section 2-1 are represented thus:
/* document L1 */ node_firstchild(L1,n1). node_firstchild(n1,n5). node_firstchild(n5,n8). node_firstchild(n8,-). node_firstchild(n10,-). node_firstchild(n11,n13). node_firstchild(n13,-). node_next(L1,L1,-). node_next(n1,L1,-). node_next(n5,L1,n11). node_next(n8,L1,n10). node_next(n10,L1,-). node_next(n11,L1,-). node_next(n13,L1,-). /* document L2 */ node_firstchild(L2,n4). node_firstchild(n4,n7). node_firstchild(n7,n8). node_firstchild(n9,n10). node_next(L2,L2,-). node_next(n4,L2,-). node_next(n7,L2,n9). node_next(n8,L2,-). node_next(n9,L2,-). node_next(n10,L2,n13). node_next(n13,L2,-).
Like CONCUR, colored XML defines multiple trees over the document, distinguished by color so that a collection of data may have (for example) a red, a green, and a blue tree sharing nodes. The n-ary representation of multi-colored trees is almost the same as the n-ary representation of XML trees, but adds a color column as part of the key:
The simpler relations are similarly augmented:
The more general description of a node and all of its neighbors is again similar to that for XML trees, with the addition of a color column:
It follows from the rules of construction that in a set of n colored trees, any node may have up to n parents, no two of which share a color.
The relational representation of the example of section 2-1 as a set of multi-colored trees would be very similar to the one given above for concurrent trees, if we substitute “red” for “L1” and “blue” for “L2” in the node_next relation, and add a column with the appropriate value to the rows of the node_first relation. The individual node relations remain unchanged.
A more interesting example, however, would be the actors / movies / awards example of section 2-2. It is left as an exercise for the reader.
Like multi-colored trees, Goddag structures allow nodes to have multiple parents. But there is no requirement analogous to the rule that the parents of a node must have different colors. The n-ary representation of Goddag structures is thus the same as the n-ary representation of XML trees:
Restricted Goddags have the same parent_children relation, with the constraint that parents who share children are required to have them in the same sequence.4
In colored Goddags, as might be expected from the name, the relation is augmented with a color column:
The node_firstchild relation is, for normal and restricted Goddags, the same as in XML.
The node_next relation for restricted Goddags is the same as for XML: for any node, there is at most one right sibling:
The more general node_relations relation described above for the other data structures has no single analog for Goddag structures. We could specify
It is useful, although not strictly speaking essential, to record the identifiers used to identify certain nodes:
The nodes of the sample Goddag of section 2-3 could be represented thus, using the relations for normal Goddags. I omit the node_type relation as redundant:
document(DD).
document(DV).
element(n01,-,"scene").
element(n02,-,"sp").
element(n04,-,"p").
element(n06,-,"lg").
element(n08,-,"l").
element(n10,-,"l").
element(n12,-,"l").
element(n14,-,"sp").
element(n16,-,"p").
element(n18,-,"lg").
element(n19,-,"l").
element(n21,-,"sp").
element(n23,-,"p").
element(n25,-,"lg").
element(n26,-,"l").
element(n28,-,"p").
element(n99,-,"haiku").
element(n98,-,"lg").
element(n97,-,"l").
element(n96,-,"l").
element(n95,-,"l").
attribute(n03,n02,-,who,"HUGHIE").
attribute(n07,n06,-,type,"haiku").
attribute(n15,n14,-,who,"HUGHIE").
attribute(n22,n21,-,who,"HUGHIE").
pcdata(n05,"How did that translation go?").
pcdata(n09,"da de dum de dum,").
pcdata(n11,"gets a new frog,").
pcdata(n13,"...").
pcdata(n17,"Er ...").
pcdata(n20,"it's a new pond.").
pcdata(n24,"Ah ...").
pcdata(n27,"When the old pond").
pcdata(n29,"Right. That's it.").
id_node("frog",n10).
id_node("new", n08).
id_node("pond",n26).
The graph structure of the example can be represented thus. For brevity I omit rows for nodes which lack children, or which lack following siblings (except for the PCDATA nodes which have multiple parents):
node_firstchild(DD,n01). node_firstchild(n01,n02). node_firstchild(n02,n04). node_firstchild(n04,n05). node_firstchild(n06,n08). node_firstchild(n08,n09). node_firstchild(n10,n11). node_firstchild(n12,n13). node_firstchild(n14,n16). node_firstchild(n16,n17). node_firstchild(n18,n19). node_firstchild(n19,n20). node_firstchild(n21,n23). node_firstchild(n23,n24). node_firstchild(n25,n26). node_firstchild(n26,n27). node_firstchild(n28,n29). node_firstchild(n99,n98). node_firstchild(n98,n97). node_firstchild(n97,n27). node_firstchild(n96,n11). node_firstchild(n95,n20). node_next(n02,n01,n14). node_next(n04,n02,n06). node_next(n08,n06,n10). node_next(n10,n06,n12). node_next(n11,n10,-). node_next(n11,n96,-). node_next(n14,n01,n21). node_next(n16,n14,n18). node_next(n20,n19,-). node_next(n20,n95,-). node_next(n23,n21,n25). node_next(n25,n21,n28). node_next(n27,n26,-). node_next(n27,n97,-). node_next(n97,n98,n96). node_next(n96,n98,n95).
It may be worth comparing the representations given here to some other approaches to overlap.
First, there is the rather different relational representation of overlapping structures proposed by [Durusau / O'Donnell 2004]. Durusau and O'Donnell postulate no graph structure, but they do define a relation for representing documents:
The Trojan Horse markup used by Durusau and O'Donnell specifies start- and end-points for elements. The elements thus identified may be taken as nodes in a graph. But Durusau and O'Donnell explicitly refrain from specifying rules for determining the arcs of the graph. “The table here simply records the information required to resolve any ambiguity and position within a hierarchy, deferring the resolution of any ambiguity to [...] later processing.”
The markup postulated by Durusau and O'Donnell corresponds fairly directly to that of MECS; it follows that documents represented in the segments relation can be represented as restricted Goddag structures, and that any restricted Goddag structure can be represented using the segments relation.
The segments relation can also clearly be used to represent concurrent trees over the same frontier, so it can also be used to generate XConcur documents, and vice versa. For any one sequence of segments, there may be many corresponding sets of concurrent trees, but some representation as concurrent trees is guaranteed to be possible [Sperberg-McQueen / Huitfeldt 1999].
The Layered Markup and Annotation Language (LMNL) proposed by Jeni Tennison and Wendell Piez and their collaborators specifies a data structure for overlap, but not one whose relationship to the graph structures described elsewhere in this paper is obvious [Tennison / Piez 2002], [Anonymous 2004], [Anonymous 2006], [Cowan 2006].
A naive translation of the data model described by [Anonymous 2006], removing some of the more inconvenient redundancies, might use the following domains:
Using these relations, a naive representation of the example in section 2-1 might be as follows (attributes are omitted for simplicity).
document(D0,'http://www.example.com/lmnl/hammer',L0).
limen(L0,D0,
"Hey Paul! Would you give me the hammer?",
{d1,s1,s2,t1,u1,u2},).
range(u1,L0,-,u,0,28).
range(u2,L0,-,u,28,39).
range(d1,L0,-,div,0,39).
range(s1,L0,-,s,0,10).
range(s2,L0,-,s,10,39).
range(t1,L0,-,text,0,30).
The LMNL data model does not explicitly define documents as graphs; there may be some way of viewing instances of the LMNL data model usefully as graphs; such a view might help elucidate the similarities and differences between LMNL and the graph-based data structures described elsewhere in this paper. It seems plausible that a graph may be defined by taking atoms and ranges as vertices, and some relations among them as edges. But it is not clear what relations among ranges, if any, may best be taken as corresponding to the parent : child and previous : next relations in the other data structures described here; that remains work for the future.
A surprising variety of variations on XML can be described by simple modifications to the sequence relation on nodes of the document.
If we allow multiple document nodes, and make it possible for PCDATA nodes to have following siblings in some but not all documents, then we have concurrent XML.
If instead we add a color value as part of the key for the node_firstchild and node_next relations, then the result is multi-colored trees.
If instead of color we add to the node_next relation a new nPar (parent node) column, again as part of the key, then the result is normal Goddag structures.
Further consideration of the expressive power afforded (and not afforded) by these variations may shed light on the nature of the overlap problem, and may help analyse the problem further. Many discussions of overlapping structures have assumed that overlap is the name of a single problem (or at least have passed in silence over the possible differences among examples of overlap); but if each of the data structures described here successfully handles a different set of problems (not necessarily disjoint), with different degrees of naturalness or redundancy, then it seems natural to suppose that there are different kinds of overlapping structures, and that different mechanisms may be appropriate to different problems.
While that outcome may deal a blow to any dream of a unified solution to ‘The Overlap Problem’, it may also mark a stage in the maturation of our understanding of the problem area.
|
The identifiers L1 and L2 have been uppercased to avoid confusion with the numbers eleven and twelve. |
|
|
The reader may find it strange that the movie-role element seems to indicate neither the movie the role is in, nor the actor who played the role. It is quite true that an XML structure intended as a stand-alone representation either of movies or of actors and their careers would represent this information differently, using perhaps a structure of the form <actor>
<name>Bette David</name>
<role>
<movie>All about Eve</movie>
<character>Margo Channing</character>
</role>
<!--* ... *-->
</actor>In the actor context, the name of the actor is already known, given by the movie-role element's name sibling; it may be found the traversing the XPath “parent::actor/child::name” (or more simply “../name”). Similarly, in the movie-genre context, the name of the movie is also given by a name sibling and is the value of the expression “../name”. In each case, the relevant information is stored in a parent-child (actor/movie-role or movie/movie-role) link and can be retrieved by traversing the XML tree in a predicable way. In the example database, in which the actor tree is colored red and the movie tree blue, the two pieces of information can be retrieved from any movie-role element by traversing the two extended XPath expressions “{red}parent::actor/{red}child::name” and “{blue}parent::movie/{blue}child::name”. |
|
|
Sharing structures among trees may be possible, but the effect on the relational representation may vary depending on what constraints are assumed to hold among the trees. If two trees share a div element, for example, are they required, or not required, to share the same children for the div, in the same order? To use traditional SGML terminology: do they share the element? or just the tags? Concurrent trees which share elements will tend to resemble the Goddag structures described below; concurrent trees which share tags but not necessarily elements will tend to resemble multi-colored trees. |
|
|
This constraint is hard to check in relations of this form; that is one disadvantage of this formulation. |
[Anonymous 2004] Anonymous [Jeni Tennison? and Wendell Piez?]. 2002 (rev. 2004?). “LMNL Data Model”. Available on the Web at http://www.lmnl.net/prose/data-model/data-model-spec.html.
[Anonymous 2006] Anonymous [Jeni Tennison? and Wendell Piez?]. 2006. “LMNL Data Model”. Available on the Web at http://www.lmnl.org/wiki/index.php/LMNL_data_model.
[Cowan 2006] Cowan, John. 2006. “LOM”. Available on the Web at http://www.lmnl.org/wiki/index.php/LOM.
[Dekhtyar / Iacob 2005] Dekhtyar, Alex, and Ionut Emil Iacob. 2005. “A Framework For Management of Concurrent XML Markup”. Data and Knowledge Engineering 52.2: 185-215.
[DeRose 2004] DeRose, Steven. 2004. “Markup overlap: A review and a Horse”. Paper given at Extreme Markup Languages 2004, Montréal, sponsored by IDEAlliance. Available on the Web at http://www.mulberrytech.com/Extreme/Proceedings/html/2004/DeRose01/EML2004DeRose01.html.
[Durusau / O'Donnell 2002a] Durusau, Patrick, and Matthew Brook O'Donnell. 2002. “Concurrent markup for XML documents”. Paper given at XML Europe 2002, Barcelona, sponsored by IDEAlliance. Available on the Web at http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-03-07/03-03-07.html.
[Durusau / O'Donnell 2002b] Durusau, Patrick, and Matthew Brook O'Donnell. 2002. “Coming down from the trees: Next step in the evolution of markup?” Late-breaking paper given at Extreme Markup Languages 2002, Montréal, sponsored by IDEAlliance. Slides (but no full text) available on the Web at http://www.mulberrytech.com/Extreme/Proceedings/author-pkg/2002/Durusau01/EML2002Durusau01.zip.
[Durusau / O'Donnell 2004] Durusau, Patrick, and Matthew Brook O'Donnell. 2004. “Tabling the overlap discussion”. Paper given at Extreme Markup Languages 2004, Montréal, sponsored by IDEAlliance. Available on the Web at http://www.mulberrytech.com/Extreme/Proceedings/html/2004/Durusau01/EML2004Durusau01.html.
[Eckart / Teich 2007] Eckart, Richard, and Elke Teich. 2007. “An XML-based data model for flexible representation and query of linguistically interpreted corpora”. In Datenstrukturen für linguistische Ressourcen und ihre Anwendungen / Data structures for linguistic resources and applications: Proceedings of the Biennial GLDV Conference 2007, ed. Georg Rehm, Andreas Witt, Lothar Lemnitzer. Tübingen: Gunter Narr Verlag. Pp. 327-336.
[Hilbert et al. 2005] Hilbert, Mirco, Oliver Schonefeld, and Andreas Witt. 2005. “Making CONCUR work”. Paper given at Extreme Markup Languages 2005, Montréal, sponsored by IDEAlliance. Available on the Web at http://www.mulberrytech.com/Extreme/Proceedings/html/2005/Witt01/EML2005Witt01.xml.
[Huitfeldt / Sperberg-McQueen 2001] Huitfeldt, Claus, and C. M. Sperberg-McQueen. 2001. “TexMECS: An experimental markup meta-language for complex documents”. 2001, rev. 2003. http://decentius.aksis.uib.no/mlcd/2003/Papers/texmecs.html
[Huitfeldt 1999] Huitfeldt, Claus. 1999. MECS—A multi-element code system. Working papers of the Wittgenstein Archives at the University of Bergen.
[Jagadish et al. 2004] Jagadish, H. V., Laks V. S. Lakshmanan, Monica Scannapieco, Divesh Srivastava, and Nuwee Wiwatwattana. 2004. “Colorful XML: One hierarchy isn't enough”. Proceedings of the 2004 ACM SIGMOD International conference on management of data, Paris, sponsored by the Association for Computing Machinery Special Interest Group on Management of Data. New York: ACM Press.
[Schonefeld / Witt 2006] Schonefeld, Oliver, and Andreas Witt. 2006. “Towards validation of concurrent markup”. Extreme Markup Languages 2006.
[Schonefeld 2007] Schonefeld, Oliver. 2007. “XCONCUR and XCONCUR-CL: A constraint-based approach for the validation of concurrent markup”. In Datenstrukturen für linguistische Ressourcen und ihre Anwendungen / Data structures for linguistic resources and applications: Proceedings of the Biennial GLDV Conference 2007, ed. Georg Rehm, Andreas Witt, Lothar Lemnitzer. Tübingen: Gunter Narr Verlag. Pp. 347-356.
[Sperberg-McQueen / Huitfeldt 1999] Sperberg-McQueen, C. M., and Claus Huitfeldt. 1999. “Concurrent document hierarchies in MECS and SGML”. Literary & Linguistic Computing 14.1: 29-42.
[Sperberg-McQueen / Huitfeldt 2000] Sperberg-McQueen, C. M., and Claus Huitfeldt. 2000. “GODDAG: A Data Structure for Overlapping Hierarchies”. In DDEP-PODDP 2000, ed. P. King and E.V. Munson, Lecture Notes in Computer Science 2023 (Berlin: Springer, 2004), pp. 139-160. Available on the Web at http://www.springerlink.com/index/98J1VBU5NBY73UL3 and http://www.w3.org/People/cmsmcq/2000/poddp2000.html.
[Tennison / Piez 2002] Tennison, Jeni, and Wendell Piez. 2002. “LMNL Syntax”. Available on the Web at http://www.lmnl.net/prose/syntax/index.html.
[Witt 2004] Witt, Andreas. 2004. “Multiple hierarchies: new aspects of an old solution”. Paper given at Extreme Markup Languages 2004, Montréal, sponsored by IDEAlliance. Available on the Web at http://www.mulberrytech.com/Extreme/Proceedings/html/2004/Witt01/EML2004Witt01.html