XML 2002 logo

Normalized Metadata Format: RDF Meets XML Schema

Abstract

Normalized Metadata Format (NMF) is an open specification for XML Schema based metadata that can be mechanically interchanged with RDF processors. This paper will discuss NMF and how it provides the metadata leg of the OSTA MultiPhoto/Video initiative.


Table of Contents

1. Introduction
2. What is the MultiPhoto/Video Initiative?
3. NMF Concepts
3.1. Round-tripping Metadata
3.2. Design Approach
3.2.1. Partial Validation
3.2.2. Naming Patterns and Reserved Names
3.3. Host Contexts for NMF
4. NMF Encoding
5. Examples
5.1. Dublin Core
5.1.1. RDF/XML Representation
5.1.2. NMF Representation
5.2. RSS
5.2.1. RDF/XML Representation
5.2.2. NMF Representation
Bibliography
Biography

1. Introduction

Normalized Metadata Format (NMF) is an open specification that describes an XML Schema based representation for metadata. This representation has several primary goals:

  • To provide a simple, flexible way to define and interchange metadata using mainstream XML tools and technologies.

  • To provide a mechanical, schema-less mapping between NMF and a subset of Resource Description Framework (RDF) based metadata.

  • To provide a straightforward mechanical mapping to relational databases.

The NMF specification defines a base layer of functionality, which provides the core mechanisms necessary for the definition and use of metadata that conforms to the NMF model and syntax. A set of profiles that make use of the NMF model and syntax is also being defined. These profiles will be separate specifications that rely on the NMF specification.

These profiles are composed of both schemas and best practices for use of those schemas. Some of the NMF schemas in these profiles represent a mechanical mapping of existing metadata formats that are specified using RDF and other metadata representations. Others have been specified directly in NMF and do not have explicit specifications outside of NMF although they can be interchanged and processed using RDF toolsets once they have been mechanically converted to RDF.

NMF based modules may be specified via a stand-alone specification or they may be specified as part of a larger specification such as a MultiPhoto/Video Profile. The following NMF based specifications are being developed:

  • Dublin Core Profile: A profile that mirrors the Dublin Core metadata set. The Dublin Core profile is specified in [DC-NMF].

  • MultiPhoto/Video Core Specification: A module that specifies NMF equivalents to the identifiers that are defined by the MultiPhoto/Video core specification. This module is specified as part of the MultiPhoto/Video core specification [MPV-Core].

  • OSTA Manifest Profile: A specification that enables an OSTA Manifest to declare profile information [MANIFEST].

  • MultiPhoto/Video Presentation Profile: A profile that provides metadata that is useful for describing aspects of multimedia presentations such as slideshows [MPV-Pres].

The primary driver for the development of NMF was to provide a metadata representation for information about the assets that are described in the MultiPhoto/Video initiative. A short overview of the MultiPhoto/Video initiative is provided below. A complete description is available in the various specifications and white papers that are accessible from the OSTA web site [OSTA-WEB].

2. What is the MultiPhoto/Video Initiative?

The MultiPhoto/Video (MPV) initiative provides specific manifest and metadata formats and implementation practices that support existing industry specifications such as the World Wide Web Consortium's SMIL and I3A's DIG35. MPV is compatible with and supports the DCF and Exif specifications from JEITA and JCIA that are widely used in digital cameras. New metadata elements will be developed as necessary. The work is oriented toward delivering tangible and useful results in the near term. The goal of the initiative is to enhance interoperability, ease of use, and the ability to play and manipulate collections of photo/video content, including still images, stills with audio, still sequences, video clips, audio-only clips, and related files.

This is done by defining a basic mechanism for specifying metadata about assets and collections of assets, and by defining a rich set of metadata that makes use of this mechanism. The collections and assets are defined by the MultiPhoto/Video (MPV) specification, which provides a context for interpreting the metadata that is specified using the NMF encoding.

MPV is made available at low cost and without royalty from the Optical Storage Technology Association (OSTA) and the International Imaging Industry Association (I3A). OSTA is an industry association promoting the use and interoperability of recordable CD and DVD discs in computer and consumer electronics devices. I3A is an industry association promoting digital and film imaging technologies.

MPV enables PC software and consumer electronics devices such as DVD players to play back and manipulate collections of digital photo/video content, including still images, stills with audio, still sequences, video clips, audio-only clips, and related files. The emphasis is on personal content that originates from many sources, including digital cameras, film, scanners, and video digitizers, and that is stored on a range of media, including memory cards, recordable or stamped CDs and DVDs, and even computer hard disks or internet services.

NMF metadata is primarily designed to support a lossless mapping to and from RDF and relational databases. This allows several significant RDF-based applications, where suitable, to be translated to NMF and then processed in the normalized XML Schema environment that NMF enables.

3. NMF Concepts

3.1. Round-tripping Metadata

There are two primary representations for metadata in the NMF approach. One is the NMF representation, which is based on XML Schema, and the other is the RDF representation, which may or may not be based on a schema. These representations may also be mapped to a relational database schema.

NMF is designed to allow a natural description of metadata in NMF while also allowing as large a subset of RDF metadata as possible to be translated into NMF. This translation between the NMF representation and the RDF representation must be supported in both type-aware and type-unaware environments. In practice, even a type-aware environment (one with schemas available) may not have access to the type information for all of the metadata to which the mapping is applied.

The NMF model and syntax are an attempt to balance the requirement for a natural approach to specifying metadata using an XML Schema based definition with support for a natural mechanical mapping of RDF metadata to and from NMF.

Many of the constraints and stylized patterns of definition that NMF employs derive from the need to support these requirements in both type-aware and type-unaware processing environments.

3.2. Design Approach

NMF employs a specific approach to schema design and extension. This approach tries to maximize the utility of strong typing while allowing runtime interoperability with metadata for which type information is not available. In addition, NMF attempts to provide as large an amount of interoperability as possible with RDF metadata.

The approach NMF employs provides a simple, flat, mix-in style of object-oriented design. This approach can be found in some object-to-relational mapping layers, which have several alternative ways to address the mapping of class hierarchies to relational tables [AMBLER].

NMF uses a single-table-per-schema approach in its mapping. This means that if a type is conceptually derived from multiple base types, each base type’s contribution to the data that the derived type encapsulates is maintained in a separate table in the relational model and in a separate BySchema container element in the NMF model.

3.2.1. Partial Validation

NMF provides a general extension model that allows for partial validation because it allows the mix-in of metadata for which type information is not available in specific locations in the NMF data model.

The locations at which partial validation may occur are:

  • Manifest (specified using the OSTA Manifest file:Manifest element) [MANIFEST]

  • Top-level Composite Property (nmf:Metadata)

  • Nested Composite Property (instances of nmf:CompositePropType)

The Manifest and the top-level Composite Property are weakly typed in the NMF model in order to allow applications to mix in both metadata and (in the case of the Manifest) non-metadata content for which type information may or may not be available. The Nested Composite Properties can be typed either strongly or weakly, based on the processing assumptions of the schema that defines the composite property. See best practices for more information.

For more information, please refer to “Open Content Model Helpers” (section 3.11).

3.2.2. Naming Patterns and Reserved Names

There are several reserved local-name suffixes for element names that are used by the NMF encoding. The reserved local-name suffixes are any of these character sequences:

  • Bag

  • Seq

  • Alt

  • QVal

  • Ref

  • AnyXML

If these suffixes are to be used in a property name, they must have a trailing underscore. The RDF to NMF mapping defines the algorithm for handling these reserved values when they are encountered in the RDF encoding and how to maintain them in a lossless manner. The algorithm uses a simple escaping mechanism where an underscore character is appended to the suffix string when mapping from RDF to NMF and removed when mapping from NMF to RDF.
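The escaping rule described above can be sketched in Python. This is a minimal illustration rather than the algorithm from the NMF specification: the function names are hypothetical, and the handling of names that already end in a reserved suffix followed by underscores (one more underscore is appended, so that the reverse step stays lossless) is an assumption.

```python
import re

# Reserved local-name suffixes listed in this section.
RESERVED_SUFFIXES = ("Bag", "Seq", "Alt", "QVal", "Ref", "AnyXML")

# Matches a reserved suffix followed by zero or more underscores at the
# very end of a local name; group 1 captures the underscores.
_RESERVED_TAIL = re.compile(r"(?:%s)(_*)$" % "|".join(RESERVED_SUFFIXES))


def escape_local_name(rdf_name):
    """RDF -> NMF: append one underscore if the name would otherwise be
    read as carrying a reserved type suffix (hypothetical helper)."""
    if _RESERVED_TAIL.search(rdf_name):
        return rdf_name + "_"
    return rdf_name


def unescape_local_name(nmf_name):
    """NMF -> RDF: remove the underscore added by escape_local_name.
    Names without a trailing underscore are returned unchanged."""
    match = _RESERVED_TAIL.search(nmf_name)
    if match and match.group(1):
        return nmf_name[:-1]
    return nmf_name
```

Under these assumptions, an RDF property whose local name is "mailAlt" is carried in NMF as "mailAlt_", which keeps it distinct from an Alt array of "mail" properties.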

3.3. Host Contexts for NMF

NMF metadata describes a particular resource, which can be identified through a range of mechanisms. These mechanisms include:

  1. Explicit identification using the equivalent of the RDF about attribute whose value is a URI.

  2. Implicit identification provided by the host context in the form of a base URI.

  3. Explicit identification using the MPV core identification properties.

  4. Implicit identification using MPV identification properties specified by the host context.

  5. Implicit identification using some other host context mechanism.

NMF processors are only required to support the first two mechanisms. MPV processors will also support mechanisms 3 and 4. Finally, specialized processors that are aware of a particular host context mechanism will support mechanism 5.

4. NMF Encoding

A particular NMF property has a base name and one or more types. The most likely usage of NMF properties will be where a property has a single type. In addition, there will be some scenarios that involve interoperation with existing systems where a property might take on different variant forms.

NMF Properties can be of the following base types:

  • Simple Properties whose values are textual.

  • Composite Properties whose value is a set of properties.

  • Ref Properties whose values are a URI visible to the RDF representation (not textual content).

  • AnyXML Properties whose value is well formed XML.

These base types can be encapsulated in one of the following higher-order types:

  • Qualified Properties whose values are qualified by additional contextual metadata.

  • Array Properties whose values are an array of the base property type (Seq, Bag, Alt).

Each property has an initial base local name that is independent of the type of the property. In fact, some properties can be specified using several alternate value types. As an example, in Dublin Core, some of the properties can take on either a single value or an array of values. If an array property has a single entry, then an alternate form of specifying the property can be employed. Depending on the type of the underlying array element, the property might then be of any of the non-array property types.

Another common variation in property type is encountered in general purpose RDF where any composite property can be specified either inline as a nested resource or out of line as a Ref.

NMF uses a system of naming patterns to explicitly indicate the property type in the name of the property. This is used to allow mapping algorithms to operate even when there isn’t access to schema information that indicates the property types.

The naming patterns append a suffix to the base local-name of the property that indicates the property type. Here are the naming patterns with a hypothetical property whose local-name is “Destination”:


Property Type   Suffix   Example
Simple          none     Destination
Qualified       QVal     DestinationQVal
Composite       none     Destination
Ref             Ref      DestinationRef
Array(Bag)      Bag      DestinationBag
Array(Seq)      Seq      DestinationSeq
Array(Alt)      Alt      DestinationAlt
XML Literal     AnyXML   DestinationAnyXML
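The naming patterns in the table can be expressed as a small lookup, sketched here in Python. The dictionary mirrors the table; the function names are hypothetical and not part of the NMF specification. Note that a name without a suffix may be either a Simple or a Composite property, so a schema-less reader has to inspect the element content to tell the two apart.

```python
# Suffix for each property type, taken from the table above.
SUFFIX_BY_TYPE = {
    "Simple": "",
    "Qualified": "QVal",
    "Composite": "",
    "Ref": "Ref",
    "Array(Bag)": "Bag",
    "Array(Seq)": "Seq",
    "Array(Alt)": "Alt",
    "XML Literal": "AnyXML",
}


def typed_local_name(base_name, property_type):
    """Build the NMF local name for a base name and a property type."""
    return base_name + SUFFIX_BY_TYPE[property_type]


def split_typed_local_name(local_name):
    """Recover (base name, property type) from an NMF local name.

    Returns (local_name, None) when no reserved suffix is present, which
    covers both Simple and Composite properties as well as escaped names
    that end in an underscore.
    """
    for property_type, suffix in SUFFIX_BY_TYPE.items():
        if suffix and local_name.endswith(suffix) and len(local_name) > len(suffix):
            return local_name[: -len(suffix)], property_type
    return local_name, None
```

Because the type is recoverable from the name alone, a mapping algorithm can decide how to translate an element without consulting a schema.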


Properties are packaged together based on the schema that they are defined in. One or more BySchema Properties elements are then contained in a Properties container. These containers can either occur at the top-level or at a nested level.

5. Examples

This section contains various examples of RDF based document formats and their equivalent representation in NMF.

5.1. Dublin Core

This example shows the use of stand-alone Dublin Core metadata. It purposely makes use of an unusual mix of array-valued properties in order to demonstrate the mapping onto NMF constructs.

5.1.1. RDF/XML Representation

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://www.foo.com/cool.html">
    <dc:creator>
      <rdf:Seq ID="CreatorsAlphabeticalBySurname">
        <rdf:li>Mary Andrew</rdf:li>
        <rdf:li>Jacky Crystal</rdf:li>
      </rdf:Seq>
    </dc:creator>
    <dc:identifier>
      <rdf:Bag ID="MirroredSites">
        <rdf:li rdf:resource="http://www.foo.com.au/cool.html"/>
        <rdf:li rdf:resource="http://www.foo.com.it/cool.html"/>
      </rdf:Bag>
    </dc:identifier>
    <dc:title>
      <rdf:Alt>
        <rdf:li xml:lang="en">The Coolest Web Page</rdf:li>
        <rdf:li xml:lang="it">Il Pagio di Web Fuba</rdf:li>
      </rdf:Alt>
    </dc:title>
  </rdf:Description>
</rdf:RDF>

5.1.2. NMF Representation

<file:Manifest xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:file="http://ns.osta.org/manifest/1.0/"
  xmlns:nmf="http://ns.osta.org/nmf/1.0/">

  <nmf:Metadata nmf:about="http://www.foo.com/cool.html">
    <Properties xmlns="http://purl.org/dc/elements/1.1/">
      <creatorSeq>
        <creator>Mary Andrew</creator>
        <creator>Jacky Crystal</creator>
      </creatorSeq>
      <identifierBag>
        <identifierRef>http://www.foo.com.au/cool.html</identifierRef>
        <identifierRef>http://www.foo.com.it/cool.html</identifierRef>
      </identifierBag>
      <titleAlt>
        <title xml:lang="en">The Coolest Web Page</title>
        <title xml:lang="it">Il Pagio di Web Fuba</title>
      </titleAlt>
    </Properties>
  </nmf:Metadata>
</file:Manifest>
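
The container mapping illustrated by the two Dublin Core representations can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustrative reading of the example, not the mapping algorithm from the NMF specification: the function name is hypothetical, the property element is assumed to be namespace-qualified with a single rdf:Seq, rdf:Bag or rdf:Alt child, and the ID attribute on the RDF container is simply dropped, as it is in the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def array_property_to_nmf(prop):
    """Turn <p><rdf:Seq|Bag|Alt>...</...></p> into an NMF array element
    named pSeq, pBag or pAlt (hypothetical helper)."""
    # ElementTree tags look like "{namespace}local".
    namespace, local = prop.tag[1:].split("}")
    container = prop[0]
    kind = container.tag.rpartition("}")[2]
    if kind not in ("Seq", "Bag", "Alt") or container.tag != "{%s}%s" % (RDF_NS, kind):
        raise ValueError("expected an rdf:Seq, rdf:Bag or rdf:Alt child")

    array = ET.Element("{%s}%s%s" % (namespace, local, kind))
    for li in container.findall("{%s}li" % RDF_NS):
        resource = li.get("{%s}resource" % RDF_NS)
        if resource is not None:
            # A resource-valued member becomes a Ref property.
            entry = ET.SubElement(array, "{%s}%sRef" % (namespace, local))
            entry.text = resource
        else:
            # A literal member becomes a Simple property; xml:lang is kept.
            entry = ET.SubElement(array, "{%s}%s" % (namespace, local))
            entry.text = li.text
            if li.get(XML_LANG) is not None:
                entry.set(XML_LANG, li.get(XML_LANG))
    return array
```

Applied to the dc:identifier element of the RDF/XML example, the sketch yields an identifierBag element containing one identifierRef per mirrored site.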

5.2. RSS

This example shows the use of RSS 1.0 [RSS1]. This format makes use of typed nodes, RDF references, and sequences. It also has properties from multiple namespaces describing the channel resource, as provided by RSS 1.0 modules.

Note that the properties associated with each resource are grouped by namespace and alphabetically reordered by the mapping between the RDF/XML and NMF representations. This is done in order to allow a deterministic schema to be specified on the NMF side of the mapping.
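
The grouping and reordering step can be sketched in Python as follows. The function name and the (namespace, local name, value) tuple layout are hypothetical. Keeping the namespaces in order of first appearance is an assumption: the text only states that properties are grouped by namespace and reordered alphabetically.

```python
def group_by_namespace(properties):
    """Group (namespace, local_name, value) triples by namespace and sort
    each group alphabetically by local name (hypothetical helper)."""
    groups = {}
    for namespace, local_name, value in properties:
        groups.setdefault(namespace, []).append((local_name, value))
    # dicts preserve insertion order, so namespaces stay in the order in
    # which they were first seen; only the members of each group are sorted.
    return [
        (namespace, sorted(members, key=lambda member: member[0]))
        for namespace, members in groups.items()
    ]
```

Whatever order the properties have in the RDF/XML source, the output order within each namespace is the same, which is what allows a fixed element sequence to be declared in a schema.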

5.2.1. RDF/XML Representation

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"  
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:co="http://purl.org/rss/1.0/modules/company/"
  xmlns:ti="http://purl.org/rss/1.0/modules/textinput/"
  xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.xml.com/xml/news.rss">
     <title>XML.com</title>
     <link>http://xml.com/pub</link>
     <description>
      XML.com features a rich mix of information and services 
      for the XML community.
     </description>
     <image rdf:resource="http://xml.com/universal/images/xml_tiny.gif"/>
     <items>
       <rdf:Seq>
          <rdf:li resource="http://xml.com/pub/2000/08/09/xslt/xslt.html"/>
       </rdf:Seq>
     </items>
     <dc:publisher>The O'Reilly Network</dc:publisher>
     <dc:creator>Rael Dornfest (mailto:rael@oreilly.com)</dc:creator>
     <dc:rights>Copyright &#169; 2000 O'Reilly &amp; Associates, Inc. </dc:rights>
     <dc:date>2000-01-01T12:00+00:00</dc:date>
     <sy:updatePeriod>hourly</sy:updatePeriod>
     <sy:updateFrequency>2</sy:updateFrequency>
     <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
   </channel>
   <image rdf:about="http://xml.com/universal/images/xml_tiny.gif">
      <title>XML.com</title>
      <link>http://www.xml.com</link>
      <url>http://xml.com/universal/images/xml_tiny.gif</url>
   </image>
   <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
       <title>Processing Inclusions with XSLT</title>
       <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
       <description>
         Processing document inclusions with general XML tools ...
       </description>
   </item>
</rdf:RDF>

5.2.2. NMF Representation

<file:Manifest xmlns="http://purl.org/rss/1.0/"
              xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:dc="http://purl.org/dc/elements/1.1/"
              xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
              xmlns:co="http://purl.org/rss/1.0/modules/company/"
              xmlns:ti="http://purl.org/rss/1.0/modules/textinput/"
              xmlns:nmf="http://ns.osta.org/nmf/1.0/"
              xmlns:file="http://ns.osta.org/manifest/1.0/">

  <nmf:Metadata nmf:about="http://www.xml.com/xml/news.rss">
    <channel xmlns="http://purl.org/rss/1.0/">
       <description>
         XML.com features a rich mix of information and services
         for the XML community.
       </description>
       <imageRef>http://xml.com/universal/images/xml_tiny.gif</imageRef>
       <link>http://xml.com/pub</link>
       <title>XML.com</title>

       <itemsSeq>
         <itemsRef>http://xml.com/pub/2000/08/09/xslt/xslt.html</itemsRef>
       </itemsSeq>
    </channel>
    <dc:Properties>
      <dc:creator>Rael Dornfest (mailto:rael@oreilly.com)</dc:creator>
      <dc:date>2000-01-01T12:00+00:00</dc:date>
      <dc:publisher>The O'Reilly Network</dc:publisher>
      <dc:rights>
        Copyright &#169; 2000 O'Reilly &amp; Associates, Inc.
      </dc:rights>
    </dc:Properties>
    <sy:Properties>
      <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
      <sy:updateFrequency>2</sy:updateFrequency>
      <sy:updatePeriod>hourly</sy:updatePeriod>
    </sy:Properties>
  </nmf:Metadata>

   <nmf:Metadata nmf:about="http://xml.com/universal/images/xml_tiny.gif">
      <image xmlns="http://purl.org/rss/1.0/">
         <link>http://www.xml.com</link>
         <title>XML.com</title>
         <url>http://xml.com/universal/images/xml_tiny.gif</url>
      </image>
   </nmf:Metadata>
   <nmf:Metadata nmf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
      <item xmlns="http://purl.org/rss/1.0/">
        <description>
          Processing document inclusions with general XML tools ...
        </description>
        <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
        <title>Processing Inclusions with XSLT</title>
      </item>
   </nmf:Metadata>
</file:Manifest>

Bibliography

[AMBLER] “Mapping objects to relational databases”, Scott Ambler, July, 2000. Available at: http://www-106.ibm.com/developerworks/library/mapping-to-rdb/#h9

[DC-NMF] "Dublin Core Normalized Metadata Format Profile Specification 1.0"; OSTA, 2002. Available at http://www.osta.org/mpv/

[DC-XML] “Guidelines for implementing Dublin Core in XML” Andy Powell, http://www.ukoln.ac.uk/metadata/dcmi/dc-xml-guidelines/

[DCF-1999] “Design rule for Camera File system, Version 1.0”, JEIDA standard, English Version 1999.1.7, Japanese Electronic Industry Development Association (JEIDA).

[DIG35-2001] “DIG35 Specification – Metadata for Digital Images, Version 1.1”, June 18, 2001, International Imaging Industry Association (I3A) [recently formed by combining the Digital Imaging Group and PIMA]. http://www.i3a.org

[DPOF] “DPOF (Digital Print Order Format) Specification Version 1.1”, July 17, 2000, Canon Inc, Eastman Kodak Company, Fuji Photo Film Co., Ltd., Matsushita Electric Industrial Co., Ltd.

[Exif2002] "Exchangeable image file format for digital still cameras: Exif Version 2.2", JEITA CP-3451, Japan Electronics and Information Technology Industries Association (JEITA), February 19, 2002.

[DCQ-RDF] “Expressing Qualified Dublin Core in RDF / XML”, Stefan Kokkelink and Roland Schwänzl, Dublin Core Metadata Initiative Proposed Recommendation, August 29th, 2001. Available at http://dublincore.org/documents/dcq-rdf-xml/

[DC-RDF] “Guidance on expressing the Dublin Core within the Resource Description Framework (RDF)”, Eric Miller, Paul Miller and Dan Brickley. Dublin Core Metadata Initiative Draft, July 1999. Available at http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf/

[DCMI-NS] “Namespace Policy for the Dublin Core Metadata Initiative”, Andy Powell and Eric Wagner, Dublin Core Metadata Initiative Recommendation, October 26th, 2001. Available at http://dublincore.org/documents/dcmi-namespace/

[DC-RDF-Simple] “Expressing Simple Dublin Core in RDF/XML”, Dave Beckett, Eric Miller and Dan Brickley, Dublin Core Metadata Initiative Proposed Recommendation, November 28th, 2001. Available at http://dublincore.org/documents/dcmes-xml/

[MANIFEST] "XML Manifest Specification 1.0"; OSTA, 2002. Available at http://www.osta.org/mpv/

[MPV-Core] "MultiPhoto/Video Core Specification 1.0"; OSTA, 2002. Available at http://www.osta.org/mpv/

[MPV-Pres] “MultiPhoto/Video – Presentation Profile Specification”, OSTA, 2002. Available at http://www.osta.org/mpv/

[NMF] "Normalized Metadata Format Specification 1.0"; OSTA, 2002. Available at http://www.osta.org/mpv/

[RDF] "Resource Description Framework (RDF) Model and Syntax Specification", Ora Lassila and Ralph R. Swick. W3C Recommendation 22 February 1999, Available at http://www.w3.org/TR/REC-rdf-syntax/

[RDFschema] "Resource Description Framework (RDF) Schema Specification", Dan Brickley and R.V. Guha. W3C Proposed Recommendation 03 March 1999, Available at http://www.w3.org/TR/PR-rdf-schema/

[RSS1] “RDF Site Summary (RSS) 1.0”, RSS Working Group, December 6th, 2000. Available at http://purl.org/rss/1.0/spec.

[SMIL20] "Synchronized Multimedia Integration Language (SMIL 2.0) Specification". W3C Working Draft, work in progress. Available at http://www.w3.org/TR/smil20/

[OSTA-WEB] “OSTA MultiPhoto/Video Initiative”, 2002, Available at http://www.osta.org/mpv/

[XML-NS] "Namespaces in XML", Tim Bray, Dave Hollander, Andrew Layman. W3C Recommendation 14 January 1999, Available at http://www.w3.org/TR/REC-xml-names

[XSCHEMA] "XML Schema, XML Schema Part 1: Structures". W3C Recommendation 2 May, 2001. Available at http://www.w3.org/TR/xmlschema-1/

[XSCHEMA2] "XML Schema, XML Schema Part 2: Datatypes". W3C Recommendation, 2 May, 2001. Available at http://www.w3.org/TR/xmlschema-2/

Biography

Gabe Beged-Dov is a software architect in the Photo/Video Solutions area of the Imaging and Printing Group of Hewlett Packard. One of his primary focuses is the MultiPhoto/Video initiative, which is developing a suite of XML based, royalty-free specifications (and an accompanying open source tool chain) that allow interchange and processing of rich media collections between devices such as cameras, PCs, and DVD players. He has been active in the development of XML related technologies for many years, both as an independent consultant and as an employee of various companies including Rogue Wave Software and Hewlett Packard. He has contributed to several XML related standards efforts including XML Schema, RDF, XLink and RSS. He lives in Corvallis, Oregon, where spring is wonderful as long as you have your allergy pills handy.