XML 2003

New Public Domain Journal Article Archiving and Interchange DTDs

Abstract

In March 2003, the National Library of Medicine (NLM) released into the public domain a suite of DTD modules for describing journal literature, books, and many other kinds of textual material. The full suite was developed by the National Center for Biotechnology Information (NCBI) and the XML consulting firms Inera, Inc. (funded by the Andrew W. Mellon Foundation) and Mulberry Technologies, Inc. (funded by NCBI). The modular DTD library and several complete DTDs built from the modules are in the public domain and may be used by any organization or individual without permission from NLM.

The first two public DTDs developed from the suite, the Journal Archiving and Interchange DTD and the Journal Publishing (authoring) DTD, were also released in March 2003 along with tag set documentation. The Publishing DTD defines a common format for the creation of journal content in XML. The advantages of a common format are portability, reusability, and the creation and use of standard tools. The Archiving DTD also defines journal articles, but it was created to provide a common, public format in which publishers, aggregators, and archives can exchange journal content and store it in large, commonly tagged repositories.

The DTD Suite was developed from work begun by NCBI in support of PubMed Central (PMC). The DTDs will form the basis for the PMC repository as well as the JSTOR (Journal STORage: The Scholarly Journal Archive) Electronic Archive Project. In the months since their release:

  • How have the DTDs been accepted and used in the journal publishing community? By electronic archival projects and repositories? By the large publishing conglomerates that have DTDs of their own? By first-time XML publishers who have never had a DTD? By conversion vendors? By the content aggregators and abstracting and indexing services that must deal with multiple DTDs and schemas? In the wider publishing world beyond that of scientific, technical, and medical (STM) journals?

  • What new DTDs are available for public use? What DTDs are planned?

  • NLM has set up an advisory board of publishers, academics, aggregators, and consultants to oversee the DTD suite and ensure that the direction of growth and modification of the modules and new public DTDs is in everyone's best interests. What direction has come from the advisory board?

  • When will schema versions of the suite be available, and what schema languages will be supported?

  • What tools have been developed to support the DTDs? What tools are planned?

(Note: Most of the update material is late-breaking and will therefore be given in the presentation but is not present in this paper.)

Keywords