Abstract
In an important recent article on XML.com entitled "Identity Crisis", Kendall Clark addresses the issue of "identity" as it pertains to the World Wide Web. Clark quotes the description of the Web by the W3C's Technical Architecture Group (TAG) as a "universe of resources", where "resource" is to be understood according to the definition given in RFC 2396 as being "anything that has identity". Clark points out that the concept of "identity" itself is nowhere defined and moreover is severely problematic.
Clark's article is part of a long-standing and ongoing discussion in the Semantic Web community. Tim Berners-Lee, after finding himself in a minority in the W3C TAG, considered the issue important enough to justify a position paper of his own, entitled "What do HTTP URIs Identify?". Other important contributions include David Booth's "Four Uses of a URL" and Sandro Hawke's "Disambiguating RDF Identifiers", among many others. Most recently, a W3C mailing list devoted to "the social meaning of RDF and URIs" generated over 150 postings in a single month.
The heart of the matter is the question "What do URIs identify?" Today the Semantic Web community has no consistent answer to this question, as one of their number notes: "To date, RDF has not been clear about whether a URI like http://www.w3.org/Consortium identifies the W3C or a web page about the W3C. Throughout RDF, strings like http://www.w3.org/1999/02/22-rdf-syntax-ns#type are used with no consistent explanation of how they relate to the web."
Why is this important? Because without clarity on this issue, it is impossible to solve the challenge of the Semantic Web or to implement scalable Web Services. It is equally impossible to achieve the goals of "global knowledge federation", or even to begin to enable the aggregation of information and knowledge by human and software agents on a scale large enough to control infoglut.
Ontologies and taxonomies will not be reusable unless they are based on a reliable and unambiguous identification mechanism for the things about which they speak. The same applies to classifications, thesauri, registries, catalogues, and directories. Applications (including agents) that capture, collate or aggregate information and knowledge will not scale beyond a closely controlled environment unless the identification problem is solved. And technologies like RDF and Topic Maps that use URIs heavily to establish identity will simply not work (and certainly not interoperate) unless they can rely on unambiguous identifiers.
A solution to the "identity crisis of the Web" is clearly essential. The purpose of this controversial paper is to explain the root causes of the problem and to show how concepts originally developed as part of XML Topic Maps (XTM) offer a solution that can be applied to the Semantic Web in general.
Keywords
Since this was a late-breaking talk, the author did not have time to complete the paper for the proceedings.