metadata
Neylon, Eamonn
Manifest Solutions
Oxford
Oxfordshire
United Kingdom
Email: eneylon@manifestsolutions.com
Web site: http://www.manifestsolutions.com/
Eamonn Neylon is the founder and main consultant of Manifest Solutions, a consultancy working on strategic issues in electronic publishing. Eamonn has worked in both the publishing and software development industries. During eight years with the Thomson Corporation, he developed several innovative systems for publishing on the Internet. He then joined a small software developer where he oversaw several software maintenance releases and created the Lynkbase metadata management system. More recently he has been engaged in rights management activities.
Eamonn holds a degree in Electronic Systems Engineering from the University of East Anglia and is editor of interChange, the newsletter of the International SGML/XML Users' Group.
The DOI is a system that uses identifiers to resolve to services made available on a global basis by content owners and their partners in an information supply chain. Until recently the DOI System was limited to a simple redirection to a networked resource, but the capabilities of the underlying resolution system have now been exposed using an XML-based interface to allow arbitrary services and relationships to be expressed through the resolution of an identifier. The services that are made available can be any information about the entity being identified, or manifestations of the entity itself.
The DOI system uses a set of primitives that allow basic questions to be asked of an identifier. An application profile describes the extended set of services that are available for a particular content genre (also defined by the application profile). These two levels of service allow any service or set of services to be presented to the user of an identifier according to their interests. The DOI is a global service that returns the same information to all users regardless of their context. A complementary technology, which works with DOIs to support the discovery of locally available services using the OpenURL system for metadata transportation, is also described.
OpenURL is a system that supports the local resolution of identified entities using contextual information supplied along with a request to an OpenURL-aware service. Integration of the DOI and OpenURL systems has been achieved through a project which addresses the location of an appropriate copy of an information resource in a library environment. The DOI system can be used to access metadata which can then be supplied to a user's local service that determines whether the resource is available locally.
Both these systems are described and their relationship to each other is explored. By bringing these two approaches together it is possible to satisfy the needs of both producers and consumers of information resources.
Identifiers are special names that are used for particular purposes. They have many uses, but those that persist over periods of time and remain associated with the same thing (entity) are the most powerful. So identifiers that relate to people, physical objects and information are particularly valuable, as these can all be used in different ways at different times.
In computer systems, identifiers are often used to keep track of generated objects for the duration of their existence within a program. But identifiers can also be externalized from a particular program, and used to allow interactions between heterogeneous computer systems. This paper considers such persisted identifiers and how they can be of use in developing service-based applications that can communicate across systems. SGML has system identifiers and public identifiers, but these are different types of identifiers to those that are considered herein.
Two types of identification are considered: direct (by-reference) and through metadata (by-description). Systems that support the use of each of these types of identifier are presented. A proposal to support the predictability and interoperability of identifier resolution is presented, and the integration with the emerging web-services platform is considered.
How do we know when we are dealing with an identifier? A persisted identifier needs to be usable within some domain of application; otherwise it is not really an identifier at all. Consider the ISBN. This string of numbers and a check character is universally recognized as the correct way to identify monographic material. It can be transcribed and plugged into a number of diverse systems. However, although it is a public identifier, there are still no real global applications of ISBNs. Book identification is widely deployed both in print and electronically, but a regional administration system has resulted in territory-specific maintenance. Most applications, like online retailing or CD-based catalogs, only comprehend a particular subset of ISBNs.
But globalization, particularly through the Internet, is affecting our expectations. We no longer constrain ourselves to products that are delivered to convenient markets, but are able to find the products wherever they may be available. Combined with an increasing expectation of predictability from computational systems, the role of identifiers is set to become both more substantial and also less obvious.
The IETF has been at the forefront of public name and identifier standardization. This body is responsible for the existence of URI, URN and URL, all of which are really identifiers. However, there is still much confusion as to the difference between these things. In brief, a URL is a locator. What can be found at a particular location can vary, but if a URL is used as an identifier then there is a requirement for maintenance of the resource at the URL. URNs are names that are given to networked resources. The URN is a namespacing mechanism that allows the generation of identifiers within a particular domain. Just like a real name, it is possible for the same URN to be applied to more than one resource. URIs are globally unique identifiers: they specify a unique identifier for a resource which cannot be changed with time. Equivalent URIs indicate equivalent resources. For both URNs and URIs, there is a registration process set up by IANA.
Before proceeding it may be useful to describe some of the potential applications of the types of identifiers that are being considered. Increasingly, automation requires less ambiguity than we are used to as humans. It is important that resources can be addressed with precision and hence identification schemes linked to clear transactional capabilities are a fundamental building block for future applications.
An activity that takes identifiers very seriously is the Semantic Web. This uses URIs to enable its processing model; equivalence of URIs is used for deductive reasoning to create network interoperability. So if identifiers are the cornerstone of this activity, isn’t it time that we looked more closely at them?
Identifiers are also important in enabling Topic Maps and RDF. Both of these descriptive technologies rely on the use of uniqueness as a means of enabling functionality, albeit in different ways. The key identifier within Topic Maps is a PSI, which provides a definition of a subject that is being described. Although there is a requirement to make PSIs publicly available, there is no agreed manner to achieve this publication. Use of DOIs to resolve PSIs might be one way to publish these PSIs. OASIS has recently established an initiative to address the issue of making PSIs available. A decision is yet to be made by the Topic Map community as to whether there will be a specified means of publishing PSIs. If there is a move to do so, then the DOI is likely to be considered as a means of doing so. There is already some awareness within this community of DOI, and the flexibility of what a DOI can resolve to should appeal as PSIs can take any form.
RDF places identifiers in the form of URIs at the heart of its processing system. All resources within an RDF instance must be identified. So at an architectural level the need for identifiers is well understood. As well as requiring identifiers to allow the processing of their graphs, these knowledge representation capabilities may be useful in allowing the resources related to a DOI to be expressed in a consistent manner. In particular, RDF annotates resources directly rather than relying on a greater topic-centric view of resources. RDF has been successfully applied to DOIs in the proposed implementation described herein.
Identifiers are also being embraced within the content industry. The scientific journal publishing community has been the first to adopt a system for identification that has a resolution system supporting the creation of predictable responses to usage of the identifier. Other sectors are now actively considering the use of resolvable identifiers within their communities, including book publishers, the music industry and distance learning activities. These communities have diverse requirements ranging from content locating to integration with rights management applications, but all have some common requirements.
Before considering a dedicated identifier system, we should look at an alternative approach that affords some functionality not currently available in global identifier systems and which may work well in conjunction with identifier-based systems.
OpenURL is a mechanism for providing localized responses to requests for resolution. It is a syntax that uses a description of an intellectual property resource to yield references to local holdings. Metadata (possibly including identifiers) about one or more resources is transported embedded in URLs to service providers that can perform some action using that information. The information transported in an OpenURL describes a resource that is being requested (and optionally where the request has originated). The OpenURL syntax specifies how metadata can be encoded in a URL.
The term OpenURL is used with respect to both a syntax and a system. But the system is not defined: anything that uses the transferred data could be said to provide an OpenURL system. In practice, the transferred data is used in conjunction with information about the context of a user interested in a particular resource. This user-contextual information is not part of the OpenURL syntax but is instead conveyed by other means when the URL is activated (such as HTTP header information, a digital certificate, cookies or some other identification process).
The target of the service request component of an OpenURL is a user's local service provider. The parameter component of an OpenURL transports the target object's metadata. An OpenURL consists of a base URL followed by a query for one or more objects. So:
http://resolver.local.org/getlocal?author=Shelley
sends an OpenURL-compliant request to the receiving service provided by getlocal at the location specified, carrying a query with the parameter author and the value Shelley. What is not seen in the syntax is that the service will also receive any information about the user that may be sent along by default with the request as part of any authentication that has taken place between the user's client and the server. The local service can then decide, based on the metadata sent and what the server knows of the user’s credentials, how to respond to the request.
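The construction of such a request can be sketched in a few lines of Python. This is a minimal illustration only: the function names build_openurl and read_openurl are hypothetical, and the single author parameter simply mirrors the example above rather than any standardized OpenURL key set.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(base_url, metadata):
    """Append metadata as a URL-encoded query string to the resolver's base URL."""
    return f"{base_url}?{urlencode(metadata)}"


def read_openurl(url):
    """Split an OpenURL back into (base URL, metadata) on the receiving side."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key because a key may repeat in a query;
    # this sketch keeps only the first value of each key.
    metadata = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return base, metadata


url = build_openurl("http://resolver.local.org/getlocal", {"author": "Shelley"})
base, metadata = read_openurl(url)
```

Note that nothing about the user appears in the URL itself; as described above, that context reaches the service by other means.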
In order to allow for the delivery of context-sensitive services information, recipients of an OpenURL should implement a technique to distinguish between a user who has access to a service component that can deliver context-sensitive services and a user who does not. The mechanism used to determine a user's membership of a particular group could be cookies, digital certificates, part of a user's stored profile in an information service, an IP address range, or something else. This user recognition is not a part of the OpenURL syntax and is separate from OpenURL.
The localization capability of OpenURL provides a unique feature that addresses a fundamental issue with global identification systems: what if a user already has rights to a local copy of a globally identified resource? The OpenURL syntax is currently being standardized under the auspices of NISO. Current implementations deal with scholarly publishing, but there is a proposal to generalize the syntax to accommodate any type of media object. This could provide a powerful systematic localization with application much broader than the current academic library users.
The system considered for identifier resolution is the DOI System. This is a multi-layered system that defines the syntax, resolution, policies and governance of identifier resolution for identifiers registered with the system. DOI is an open system that uses the handle system technology to make identifiers actionable. The syntax is very flexible, so legacy identifiers can easily be registered within the DOI system, and the underlying technology provides a powerful infrastructural component for wide-scale usage.
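As one illustration of such a mechanism, the IP address range option might be sketched as follows. The network ranges and the function names are assumptions made for the example (the ranges are reserved documentation addresses); a real deployment could equally rely on cookies, certificates or stored profiles.

```python
import ipaddress

# Hypothetical address ranges of an institution that runs a local resolver.
LOCAL_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def has_local_resolver(client_ip):
    """Return True when the requesting address falls inside a known local range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in LOCAL_NETWORKS)


def choose_target(client_ip, local_url, default_url):
    """Pick the context-sensitive service for recognised users, else the default."""
    return local_url if has_local_resolver(client_ip) else default_url
```

The decision sits entirely outside the OpenURL itself, which is consistent with user recognition being separate from the syntax.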
The DOI is a persistent identifier of intellectual property entities. The DOI can be used to identify any of the various physical objects that are manifestations of intellectual property: for example, printed books, CD recordings, videotapes, journal articles. A DOI can also be used to identify less tangible manifestations, the digital files that are the common form of intellectual property in the network environment. But the use of a DOI can go beyond the identification only of manifestations – it can also be used to identify performances of intellectual property or the abstractions that underlie the different manifestations. It does not mandate how identifiers are constructed or what they are applied to.
The IDF, a not-for-profit open membership organization set up in 1998 for the purposes of developing and governing the DOI System, sets out the rules that govern the implementation and operation of the DOI system. The IDF manages development, policy and licensing of the DOI to registration agencies and technology providers and advises on usage and development of related services and technologies.
The DOI system is scalable for size (there are no fixed field lengths and an infinite number of DOIs may be conceived) and performance (handle is a distributed technology). As persistence is a function of policy, not technology, DOI policies are designed to enforce persistence and to prevent deletion or renaming of identifiers. The global uniqueness of each DOI is encouraged through syntax and administrative procedures and is enforced through the technology, which does not allow the entry of duplicate DOIs.
Identifiers and metadata (associated data about the identified entity) are interdependent. Globally resolved opaque identifiers (such as DOIs) alone are not enough to provide a reference to what is being identified. In addition it may be important to consider the context of a particular user when providing services based on the entity being identified. This interdependence is recognised and is related to the need to uniquely identify intellectual property entities: providing kernel metadata for disambiguation of entities is central to the IDF's policies. Minimal levels of metadata are therefore an essential component of well-designed actionable identifiers such as the DOI, and the DOI system requires that identifiers are registered with mandatory structured descriptive data. An XML expression of the kernel metadata requirements has been published to make it easier for registration agencies to be conformant.
The DOI system consists of four distinct components. The syntax is a flexible structure which accommodates existing identifiers. The system is a means of making the identifier actionable so that predictable access to resources related to the content being identified can be achieved. The policies are enforced to ensure that the principles upon which the DOI system is based are maintained. Finally, a governance layer works to establish a federation of registration agencies and promotes mechanisms to maximize interoperability across implementations of DOIs.
The implementation of the DOI infrastructure requires that appointed registration agencies work with user communities to define the metadata requirements and enforce both global and application level policy. In doing so, these registration agencies are establishing the administrative environment that supports the maintenance of DOIs, and the creation of applications that rely upon the resolution services. The creation of administrative interfaces allows for the possibility of different parties making assertions about a DOI. This provides a means of administering the information about a DOI by all recognized participants in a supply chain. It is therefore important that registration agencies are established with the support of the communities that they represent and that they reflect the needs of those communities so that appropriate permission structures can be established. As the DOI is a persistent identifier, if the ownership of the entity identified or rights in it change, the identification of that entity should not (and does not) change. The responsibility for managing the DOI changes, but not the DOI itself.
Is the digital object identifier an identifier for digital objects or a digital identifier of objects? The answer is that the DOI is not restricted to the electronic networked environment and so is a digital identifier for objects. Those objects that are identified do not need to exist in a manifestation but can be abstract ideas or physical manifestations outside of an electronic environment. The DOI is not about distributed content management but the distributed management of the services relating to the identified content.
DOIs use a resolution system to provide access to a variety of services. Resolution is the service that makes metadata and identifier based approaches useful. It is the means by which a request can be responded to and requires agreement about what can be asked of a resolution service. OpenURL uses webservers to provide a form of resolution for data encoded in a URL. DOIs use a dedicated resolution service based upon the Handle system technology.
The Handle System provides a globally distributed capability for assigning, managing, and resolving persistent identifiers, known as “handles”, to facilitate the access of digital objects and other resources on networks such as the Internet over long periods of time. Handle resolution enables an identifier (DOI) to resolve to multiple pieces of current state data such as type(s) and location(s) of instances of the identified entity, type(s) and location(s) of associated metadata, public keys, accessibility, etc. The Handle System provides the underlying capabilities of persistence and state maintenance.
DOI, being based on handle, is a global system. All information about a resource is the same in the global system wherever the DOI is resolved from (although it may be filtered to provide different views on that data). Data associated with a DOI can be modified, and extended, and is not locked into embedded implementations. The DOI system provides an authoritative resolution service with careful control of the results of the resolution process. Additionally, a DOI can be resolved to multiple pieces of information, including pointers to well-structured metadata.
Existing resolution systems tend to return a single value of a certain type and have limited application. The DOI, being based on the handle system, can return arbitrarily complex data depending on what has been deposited in the system. This ability to return more than one value is called multiple resolution and allows rich option sets to be returned to a user.
However, it is becoming increasingly clear that the same resolution information is not equally valid for the same identifier across all situations. Consider a library holding a local copy of an entity identified by a DOI: the associated information in the global DOI resolution system does not, and should not, account for the specifics of that local copy, but that is precisely the information required by a patron of that library. Contextualisation is a general issue for all identifiers in many contexts, e.g. IP-telephony (enterprise dialing schemes taking precedence over, but linked to, global numbering). This is why DOI and OpenURL are seen as complementary technologies.
The DOI is often compared with the DNS system. The main differences are the complex data that can be returned from a resolution and the lack of localization available in the handle system. The complex data that can be returned from a DOI resolution is analogous to directory services and is a strength of DOI over DNS. The lack of localization would be an issue were it not for the complementary localization capabilities provided by the OpenURL system. Indeed, interoperability experiments have taken place which demonstrate how global requests could be redirected to local services and vice versa through cooperation between DOI and OpenURL.
What a DOI resolves to is known as a service. There can be many services for a given DOI, and these services are not necessarily web services (although often, they can be represented as such – see later). As with dynamic HTML there is more than one definition of what a service is, and it is important that we define ours. A DOI service is a predictable response to a resolution request that affords some functionality. This means that a service may implement any functionality, but that it needs to be presented in a consistent manner. The DOI architecture is distributed, and thus services can be made available at a variety of locations. The ability of DOI services to pass requests to each other depending on their ability to respond to a particular request thus allows a flexible and extendible implementation of the system. So heterogeneity in the capabilities of the service providers is not critical, and scalability in the deployment of resolution services as requirements and capabilities evolve is supported.
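Multiple resolution can be pictured as a lookup that returns a list of typed values rather than a single location. The following is a minimal in-memory sketch, not the Handle System protocol: the record contents, the type names and the resolve function are assumptions for illustration, with only the DOI and the first URL borrowed from the example later in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandleValue:
    index: int   # position of the value within the record
    type: str    # what kind of data this value holds
    data: str    # the value itself


# Hypothetical record store standing in for a resolution service.
RECORDS = {
    "10.1006/abio.2001.5293": [
        HandleValue(1, "URL", "http://idealibrary.com/links/doi/10.1006/abio.2001.5293"),
        HandleValue(2, "EMAIL", "support@example.org"),
        HandleValue(3, "URL", "http://mirror.example.org/abio.2001.5293"),
    ]
}


def resolve(doi, wanted_type=None):
    """Return every value held for a DOI, optionally filtered by type."""
    values = RECORDS.get(doi, [])
    if wanted_type is None:
        return list(values)
    return [value for value in values if value.type == wanted_type]
```

A client that only understands single resolution could take the first URL value, while a richer client can present the whole option set to the user.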
The current position with DOI is that one DOI resolves to one URL. This is what is implemented in Crossref and is an example of resolution to a single resource. This provides similar functionality to a PURL - a Persistent URL. But the underlying data store, the handle system, allows for much richer data structures. The next step in the evolution of DOI as a network identifier is the "One to Many" scenario. The problem with this approach is that one doesn't know what to return to the end user - a single URL, multiple URLs, or some other form of data entirely. And there is no presentation assistance - URLs in their raw state are at best an unpalatable mouthful for the end user. It is important at this point to reiterate the point that what is being resolved when a DOI is used is not a content object (the entity itself) but a new type of object – the service object which describes those services available for a given identified entity.
What was needed is a general framework for linking multiple resources. A common infrastructure interface has been proposed that allows application builders to create, access and maintain DOIObjects (DObjects) in their applications enabling the network retrieval of a set of related services. In creating a framework to instantiate DObjects within applications, care was taken not to preclude or require any business models that might be applied to the use or administration of DOIs.
A layered approach was adopted for the interface, with existing technologies applied to particular problems encountered during its development. This approach allows the layers to be replaced with functionally equivalent approaches as technology evolves. The central concept embodied in a DObject is a resource group. Each resource group has one resource with associated information that qualifies the meaning of the resource. A resource group is not an atomic unit of information but may be a composite set of data that defines items such as the resource's label, type, access name, etc. The group allows properties to be added to a resource. Resource groups bind together items of data to form resources (for example a resource and a label). So a resource can comprise many data items, but each resource must be one of:
"Literals" - inline resources identified by URI of scheme "data:" (RFC 2397), ie literals are represented as resources.
"Resources" - familiar web resources identified by URI; properties: anonymous, maintained by a single organization or enterprise.
"DObjects" - managed resources identified by URI of scheme "doi:" (the DOI); properties: named, maintained across a common business sector.
Groups bind together related resources. They exist to allow relationships between objects to be expressed and provide a sibling relationship between resources.
Hierarchies are means of expressing parent-children type relationships between resources. They allow the construction of nested data structures.
Ordering sequences are means of expressing preferred ordering amongst siblings. They ensure that resources are made available in an intended order.
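The resource kinds and relationships described above can be modelled with a small data structure. The sketch below is an assumption-laden illustration rather than the proposed interface itself: the class name ResourceGroup, the order field and the kind_of helper are hypothetical, while the "data:" and "doi:" schemes follow the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def kind_of(uri):
    """Classify a resource by its URI scheme, following the three cases in the text."""
    if uri.startswith("data:"):
        return "literal"    # inline resource (RFC 2397)
    if uri.startswith("doi:"):
        return "dobject"    # managed resource named by a DOI
    return "resource"       # ordinary web resource


@dataclass
class ResourceGroup:
    resource: str                      # the URI at the heart of the group
    label: Optional[str] = None        # qualifying data bound to the resource
    type: Optional[str] = None
    access: Optional[str] = None
    order: int = 0                     # hypothetical field expressing sibling ordering
    children: List["ResourceGroup"] = field(default_factory=list)  # hierarchy

    def ordered_children(self):
        """Return sibling groups in their intended order."""
        return sorted(self.children, key=lambda child: child.order)


root = ResourceGroup(
    "doi:1014/10.1006/abio.2001.5293",
    children=[
        ResourceGroup("http://idealibrary.com/links/doi/10.1006/abio.2001.5293",
                      label="Publisher copy", order=2),
        ResourceGroup("data:,Microsoft%20Reader", label="Format", order=1),
    ],
)
```

Here the children list carries both the sibling grouping and the parent-child hierarchy, and the order field supplies the preferred sequence.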
The DObject resource description is serialized as an XML document. In particular an RDF/XML serialization provides the necessary constraints. This is purely a "behind the scenes" exchange syntax which is not intended to be exposed to DOI end users, although it does allow for downstream interoperability with other RDF-aware applications. RDF allows the resources related to a DOI to be expressed in a consistent manner. In particular, RDF annotates resources directly rather than relying on a greater topic-centric view of resources. RDF has been successfully applied to DOIs in the proposed YADS implementation.
Resources are typed against a limited set of primitive data types. A goal of the interface design is to keep the implementation as simple as possible, so the set contains the minimum number of types possible. However, to facilitate the extension of types, a type of type "type" is implemented. This facility allows type extensions to be bound to a resource in order to type that resource. This extension mechanism does not remove the typing problem but transfers it from the primary storage system through an indirection to an extended type registry.
This approach to data typing has been called "neutral semantics" or "late semantic binding". The semantics that are to be exchanged between a DOI client application and a DOI service are primarily structural. Application specific semantics can be recovered from an application-specific schema, which is registered within the DOI system. This mechanism allows the generic representation for a DOI to be specialized (or enriched) into an application specific representation. As an example consider how a new eBook format type could be introduced:
| This resource: | Can be translated to: |
| <rdf:li rdf:parseType="Resource"> <doi:resource>Microsoft Reader</doi:resource> <doi:type>xyz:ebook_format</doi:type> </rdf:li> | <rdf:li rdf:parseType="Resource"> <xyz:ebook_format>Microsoft Reader</xyz:ebook_format> </rdf:li> |
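The translation shown in the table can be sketched as a small transformation. This is an illustration under stated assumptions: the TYPE_REGISTRY dictionary stands in for the extended type registry, the xyz namespace URI is invented for the example, and the doi namespace is taken from the SOAP example given later in this paper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOI_NS = "doi:1014/10.1000/system.schema.2001-07-26#"
# Hypothetical registry mapping extension prefixes to namespaces.
TYPE_REGISTRY = {"xyz": "http://example.org/xyz#"}

GENERIC = f"""<rdf:li xmlns:rdf="{RDF_NS}" xmlns:doi="{DOI_NS}" rdf:parseType="Resource">
  <doi:resource>Microsoft Reader</doi:resource>
  <doi:type>xyz:ebook_format</doi:type>
</rdf:li>"""


def specialize(item):
    """Rewrite a generic (doi:resource, doi:type) pair as an application-typed element."""
    resource = item.find(f"{{{DOI_NS}}}resource")
    type_name = item.find(f"{{{DOI_NS}}}type")
    if resource is None or type_name is None:
        return item
    prefix, _, local_name = type_name.text.partition(":")
    namespace = TYPE_REGISTRY.get(prefix)
    if namespace is None:
        return item  # unregistered extension: keep the generic form
    specialized = ET.Element(item.tag, dict(item.attrib))
    typed = ET.SubElement(specialized, f"{{{namespace}}}{local_name}")
    typed.text = resource.text
    return specialized


specialized = specialize(ET.fromstring(GENERIC))
```

The generic form remains valid for clients that know nothing about the extension, which is the point of binding the semantics late.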
However, some types need to be defined to support the core requirements of the interface. At the DObject level, these have been identified as resource, label, detail, access, profile, type, role alias and resolves. It is also necessary to provide some system-level DObject types that can be located through use of the profile. These are genre: agent:, schema service: and null.
In this model profiles link particular resource sets of a DObject to a group of specifying resources that say something about what that set of resources relates to (an application profile). It is proposed that all types should be registered in the system so that they are discoverable by application developers. A possible implementation of this approach is demonstrated within YADS.
Figure: DOI Resolution Response Structured Data According to the YADS Proposed Model, courtesy Tony Hammond (Elsevier Science)
Each resource hierarchy is to be interpreted by a particular DOI application. Multiple hierarchies can be supported by adding a system property "profile" which is a DObject describing the application profile.
YADS is a tool which has been developed to show how DOIs as public identifiers can be resolved to multiple resources and how those resources are related. YADS does not represent a complete DOI API, but it does address the description, structuring, relationships and typing requirements of resolution information. Some of the outstanding issues that have not been addressed yet include parameterization, concurrent editing of a DObject, and authentication.
YADS maps a DObject onto the handle system by serializing it into discrete handle values, along with a map type that allows the whole to be put back together. An alternative approach might be to keep it as a single construct such as RDF and move the management responsibilities to the registration agencies.
Figure: How DObjects are Serialized into the Handle System, courtesy Corporation for National Research Initiatives
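The idea of splitting a structured object into discrete values plus a map that allows reassembly can be illustrated as follows. This is not the YADS serialization: the value layout, the "ITEM" and "MAP" type names and the use of JSON for the map are all assumptions chosen to keep the sketch short.

```python
import json


def flatten(tree):
    """Turn a nested {'data', 'children'} tree into flat values plus one map value."""
    values, structure = [], {}

    def visit(node):
        index = len(values) + 1
        values.append({"index": index, "type": "ITEM", "data": node["data"]})
        # Record which value indexes are the children of this one.
        structure[str(index)] = [visit(child) for child in node["children"]]
        return index

    visit(tree)
    values.append({"index": len(values) + 1, "type": "MAP",
                   "data": json.dumps(structure)})
    return values


def rebuild(values):
    """Reassemble the original tree from the flat values using the map value."""
    items = {v["index"]: v["data"] for v in values if v["type"] == "ITEM"}
    structure = json.loads(next(v["data"] for v in values if v["type"] == "MAP"))

    def build(index):
        return {"data": items[index],
                "children": [build(child) for child in structure[str(index)]]}

    return build(1)  # the root is always the first value written


tree = {"data": "doi:1014/10.1006/abio.2001.5293", "children": [
    {"data": "http://idealibrary.com/links/doi/10.1006/abio.2001.5293", "children": []},
    {"data": "doi:1014/10.1006/abio.2000.4984", "children": []},
]}
values = flatten(tree)
```

The trade-off noted above is visible here: each item can be read or updated on its own, but the map value must be kept in step with the items it describes.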
Web services are emerging as the preferred means of making message-based services that use the HTTP protocol publicly available. Web services provide a gateway to other applications that can use any protocol. Just as the Common Gateway Interface allowed web-servers to communicate with back-end systems to deliver legacy applications through a new and standardized interface, web services provide a well-defined interface to allow machines to respond to requests for information. This new interface technology represents an evolution of the systems that we now know, to a more robust and distributed infrastructure where issues such as reliability can be addressed. The ability to resolve a DOI through a web service (or set of web services) would provide a standardized interface, which can be implemented in various ways as back-end technology evolves.
SOAP provides a means of packaging and unpacking information objects for transportation in a distributed environment. SOAP can then be used to provide a generic interface to DOI resolution services, and this would allow new service providers to become available as the use of the infrastructure increases. Within a SOAP envelope, it is necessary to have a standard way to expose DOI Objects. The data model that has been employed to provide access to the DObject is RDF. This provides a means of defining and constraining the allowed types and extensions through RDF Schema, as well as interoperability with other applications that choose this data model. The interface that allows developers to access DOI objects presents the data as textual items packaged in XML syntax. This allows the serialization of the DOI object for transport and system interoperability with other XML applications. In turn, the XML is wrapped in a SOAP envelope to allow the deployment of the resolution service as a web service, if required. The following is a simple example of how this can be expressed:
<?xml version="1.0" encoding="UTF-8" ?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <DOI-SERVICE:readDObjectStoreResponse xmlns:DOI-SERVICE="http://dx.doi.org/1014/doi.service">
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:doi="doi:1014/10.1000/system.schema.2001-07-26#">
        <doi:Object rdf:about="doi:1014/10.1006/abio.2001.5293">
          <doi:contains>
            <rdf:Bag>
              <rdf:li rdf:parseType="Resource">
                <doi:profile rdf:resource="doi:1014/10.1000/system.profile.crossref" />
                <doi:contains>
                  <rdf:Bag>
                    <rdf:li rdf:parseType="Resource">
                      <doi:resource rdf:resource="http://idealibrary.com/links/doi/10.1006/abio.2001.5293" />
                      <doi:access>default</doi:access>
                      <doi:label>Anal. Biochem. 2001; 295(1): 127-127</doi:label>
                    </rdf:li>
                    <rdf:li rdf:parseType="Resource">
                      <doi:resource rdf:resource="doi:1014/10.1006/abio.2000.4984" />
                      <doi:role>relative</doi:role>
                      <doi:type>xref:Original</doi:type>
                    </rdf:li>
                  </rdf:Bag>
                </doi:contains>
              </rdf:li>
            </rdf:Bag>
          </doi:contains>
        </doi:Object>
      </rdf:RDF>
    </DOI-SERVICE:readDObjectStoreResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
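A client receiving such a response might extract the resources it describes along the following lines. This is a sketch using only the Python standard library; the function name list_resources is hypothetical, and SAMPLE is a copy of the response above with the XML declaration and the encodingStyle attribute omitted for brevity.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doi": "doi:1014/10.1000/system.schema.2001-07-26#",
}
RDF_RESOURCE = "{" + NS["rdf"] + "}resource"

SAMPLE = """
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
 <SOAP-ENV:Body>
  <DOI-SERVICE:readDObjectStoreResponse xmlns:DOI-SERVICE="http://dx.doi.org/1014/doi.service">
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:doi="doi:1014/10.1000/system.schema.2001-07-26#">
    <doi:Object rdf:about="doi:1014/10.1006/abio.2001.5293">
     <doi:contains><rdf:Bag>
      <rdf:li rdf:parseType="Resource">
       <doi:profile rdf:resource="doi:1014/10.1000/system.profile.crossref" />
       <doi:contains><rdf:Bag>
        <rdf:li rdf:parseType="Resource">
         <doi:resource rdf:resource="http://idealibrary.com/links/doi/10.1006/abio.2001.5293" />
         <doi:access>default</doi:access>
         <doi:label>Anal. Biochem. 2001; 295(1): 127-127</doi:label>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
         <doi:resource rdf:resource="doi:1014/10.1006/abio.2000.4984" />
         <doi:role>relative</doi:role>
         <doi:type>xref:Original</doi:type>
        </rdf:li>
       </rdf:Bag></doi:contains>
      </rdf:li>
     </rdf:Bag></doi:contains>
    </doi:Object>
   </rdf:RDF>
  </DOI-SERVICE:readDObjectStoreResponse>
 </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def list_resources(soap_xml):
    """Collect each resource in the response together with its qualifying data."""
    root = ET.fromstring(soap_xml)
    found = []
    for item in root.iter("{" + NS["rdf"] + "}li"):
        resource = item.find("doi:resource", NS)   # direct children only
        if resource is None:
            continue  # the profile-level item has no doi:resource of its own
        entry = {"resource": resource.get(RDF_RESOURCE)}
        for name in ("access", "label", "role", "type"):
            child = item.find(f"doi:{name}", NS)
            if child is not None:
                entry[name] = child.text
        found.append(entry)
    return found


resources = list_resources(SAMPLE)
```

The result is a list with one dictionary per resource, which a client could use to build a menu of options or to follow the related DOI.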
WSDL is another component of the web services framework. It can provide a way to specify what a particular DOI service does, and particular communities may find it useful for describing the services provided upon resolution of a DOI. WSDL could also be used to describe web-based access to DOI resolution services (i.e. provide a means of formulating requests as well as responses).
DOI can also be presented as an XLink application: extended XLink supports many of the concepts also available in the DOI System. Extended XLinks relate resources that can be local (addressed by value, in relation to the linking element) or remote (addressed by reference, using a URI). XLink arcs can be outbound, inbound or third-party, allowing bi-directional links to be maintained. Third-party links are those in which the actual link is external to the documents in which it is used. Documents containing collections of inbound and third-party links are called link databases, or linkbases. Extended XLink allows multiple resources to be referenced within a link, mirroring the multiple resolution capabilities of the DOI System.
Extended XLink allows authors to create a link that emanates from a resource to which the author does not have (or chooses not to exercise) write access, or from a resource that offers no way to embed linking constructs. When such external links are used, the requirements for discovery of the link are greater than for outbound arcs (i.e. the infrastructure needs to be more advanced than that which is currently widely available). As a result, extended XLink tends not to be used in practice. However, the infrastructure of the DOI System addresses some of the management issues surrounding the gathering and presentation of links with multiple resources at their nodes.
In practice, extended XLinks have been supported in external databases where the resolution of the links is performed on the server side before delivery to the user. This mirrors how DOIs with multiple resources can currently be resolved. Links are vital to the IDF, as the resolution system is ultimately about linking resources to an identifier. Because of this common interest, reconciliation of DOI and XLink might afford many benefits and move the market along.
Also related to XLink are the XPointer and XPath standards. Both allow specification of which part of a modeled entity is of interest once it has been retrieved, by supporting addressing into parts of a located XML instance. This ability to address components of a data model could provide a useful means of addressing into a DOI resource set, but it requires an established model of the DOI resource set to work with. No such model has yet been agreed.
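As a purely illustrative sketch of what such addressing might look like, the Python fragment below treats the RDF/XML serialization from the earlier example response as if it were the agreed model, which it is not, and uses the limited XPath subset in Python's standard xml.etree.ElementTree module to pick out resources by one of their properties. The function name select_by and the choice of properties to query are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example response earlier in the paper.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doi": "doi:1014/10.1000/system.schema.2001-07-26#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# RDF payload of the example response, without the SOAP wrapper.
RESOURCE_SET = """
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doi="doi:1014/10.1000/system.schema.2001-07-26#">
 <doi:Object rdf:about="doi:1014/10.1006/abio.2001.5293">
  <doi:contains><rdf:Bag>
   <rdf:li rdf:parseType="Resource">
    <doi:profile rdf:resource="doi:1014/10.1000/system.profile.crossref" />
    <doi:contains><rdf:Bag>
     <rdf:li rdf:parseType="Resource">
      <doi:resource rdf:resource="http://idealibrary.com/links/doi/10.1006/abio.2001.5293" />
      <doi:access>default</doi:access>
      <doi:label>Anal. Biochem. 2001; 295(1): 127-127</doi:label>
     </rdf:li>
     <rdf:li rdf:parseType="Resource">
      <doi:resource rdf:resource="doi:1014/10.1006/abio.2000.4984" />
      <doi:role>relative</doi:role>
      <doi:type>xref:Original</doi:type>
     </rdf:li>
    </rdf:Bag></doi:contains>
   </rdf:li>
  </rdf:Bag></doi:contains>
 </doi:Object>
</rdf:RDF>
"""


def select_by(rdf_xml, prop, value):
    """Return resource URIs of items whose doi:<prop> child equals value.

    Hypothetical helper. The path expression is the XPath-style address:
    any rdf:li with a matching property child, then its doi:resource.
    """
    root = ET.fromstring(rdf_xml)
    path = ".//rdf:li[doi:%s='%s']/doi:resource" % (prop, value)
    return [el.get(RDF_RESOURCE) for el in root.findall(path, NS)]


if __name__ == "__main__":
    print(select_by(RESOURCE_SET, "role", "relative"))
    print(select_by(RESOURCE_SET, "access", "default"))
```

The path expression depends entirely on the element structure of this one serialization, which is why an agreed model of the resource set would be a precondition for using XPath or XPointer in earnest.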
Conventional web links do not take into account the identity of the user: they take all users to the same target. This can cause problems when the most appropriate copy of a resource differs from one user to another. Both the DOI and OpenURL address these issues, but with a different focus. DOI creates a globally accessible, resolvable identifier that can be maintained independently of embedded links. OpenURL provides a resolution service that considers local user information when directing users to a resource. The two systems provide different functions that together support the overall user experience by enabling a more robust and contextual Internet.
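The difference in focus can be sketched in a few lines of Python. The fragment below builds two links for the same DOI: one through the global DOI proxy, which is the same for every user, and one through a local resolver chosen according to the user's institution. The resolver addresses, the institution names and the lookup table are hypothetical, and the id=doi: query form is an assumption based on the early OpenURL syntax rather than something specified in this paper.

```python
from urllib.parse import quote, urlencode

# Global DOI proxy; identical for every user.
GLOBAL_PROXY = "http://dx.doi.org/"

# Hypothetical table of local resolvers, keyed by the user's institution.
LOCAL_RESOLVERS = {
    "example-university": "http://resolver.example.edu/menu",
    "example-institute": "http://links.example.org/openurl",
}


def global_link(doi):
    """Context-free link: every user is sent to the same place."""
    return GLOBAL_PROXY + quote(doi, safe="/")


def local_link(doi, institution):
    """Context-sensitive link: the DOI is handed to the user's own resolver.

    Falls back to the global proxy when no local resolver is known.
    The 'id=doi:...' parameter is an assumed OpenURL-style encoding.
    """
    base = LOCAL_RESOLVERS.get(institution)
    if base is None:
        return global_link(doi)
    return base + "?" + urlencode({"id": "doi:" + doi}, safe=":/")


if __name__ == "__main__":
    doi = "10.1006/abio.2001.5293"
    print(global_link(doi))
    print(local_link(doi, "example-university"))
    print(local_link(doi, "unknown-institution"))
```

The identifier is the same in both links; only the service that interprets it changes, which is the sense in which the two systems complement rather than replace each other.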
DOI and OpenURL are both efforts to implement structured identifiers and descriptions of information entities. They have sometimes been seen as competing technologies, but they are in fact complementary: DOI creates a globally accessible, resolvable identifier that can be maintained independently of embedded links; OpenURL provides a resolution service that considers local user information when directing users to a resource.
The two efforts could be wrongly caricatured as "the DOI community is for publishers and content owners; the OpenURL community is for libraries and intermediaries". This extreme, and incorrect, view may then lead to a belief that publishers and libraries do not see the interdependencies that exist between them. The idea that publishers want to kill libraries by forcing all electronic traffic to go through opaque identifiers, instead of normal bibliographic references that libraries can use without the publishers' cooperation, is simply not true. Similarly, any belief that libraries want to kill current commercial publishers by removing their revenue stream and helping universities self-organize scholarly publishing should be seen in the light of existing commercial reality. In fact, each system provides necessary functionality in enabling more robust and context-sensitive access to information for users.
As well as introducing these systems, we have considered the use of RDF as a proposed general model for DOI services. The expressive power of RDF allows very flexible data structures to be created, while the necessary constraints can be specified using RDF Schema. This elegant solution affords many benefits to the DOI community, not least interoperability with related technologies that may use the same data model. A potential application in Topic Maps and a likely relationship with XLink have also been suggested. There is still much work to be done in these areas, but it seems likely that this work will result in a powerful, interoperable identifier system that will be made available as web services.
The DOI model, which is the most substantial component of this paper, was developed by Tony Hammond of Elsevier Science (formerly of Academic Press). Without Tony's insight, determination and boundless energy, the work reported in this paper would not have happened.
[YADS] Tony Hammond, YADS: Yet Another DOI Service http://dx.doi.org/1014/yads
[OPENURL] Herbert Van de Sompel and Oren Beit-Arie, Generalizing the OpenURL Framework beyond References to Scholarly Works; The Bison-Futé Model, D-Lib Magazine July/August 2001, http://www.dlib.org/dlib/july01/vandesompel/07vandesompel.html
[NEYLON] Eamonn Neylon, Managing Intellectual Property with the DOI System, XML Europe 2001, Berlin, Germany, GCA Conference Proceedings
[SYNTAX] ANSI/NISO Z39.84-2000, Syntax for the Digital Object Identifier, http://www.techstreet.com/cgi-bin/pdf/free/247384/z39.84.pdf
[DOI] The DOI Handbook, Version 1.0.0, February 2001, http://www.doi.org/
[INDECS] Godfrey Rust and Mark Bide, The <indecs> Metadata Framework, Principles, model and data dictionary, June 2000, http://www.indecs.org/pdf/framework.pdf
[MPEG] ISO/IEC JTC1/SC29/WG11 N3942, January 2001, Revised Call for Proposals for Digital Item Identification and Description.
[IANA] registered URN Namespaces, http://www.iana.org/assignments/urn-namespaces