Using RDF for Knowledge Management

Keywords: RDF, Knowledge Management, Semantic Web, Metadata, Taxonomy, Electronic Publishing, Publishing, Database, Data representation, Enterprise applications, Enterprise Content Management, Topic Maps, XTM

Roger Sperberg
Consultant, Information Architecture
Topical Web
Montclair
New Jersey
United States of America
roger.sperberg@topicalweb.com

Biography

Roger Sperberg is a consultant in electronic publishing and information architecture. He was formerly manager of electronic publishing systems at Aspen Publishers, a legal publisher that is publishing an increasing number of its books directly from XML. Prior to that he was director of content services for The Ballantine Publishing Group at Random House. He is the principal instigator of the RDF Index project, an initiative to distribute mergeable indexes using RDF.

Rajeev Voleti
Solution Architect
Ovitas, Inc.
Woburn
Massachusetts
Rajeev.Voleti@ovitas.com

Biography

Rajeev Voleti is a Solution Architect for Ovitas, Inc. He works with Fortune 500 companies to implement knowledge and content management solutions.


Abstract


Using new tools, RDF can be used for knowledge management, maintaining all the data’s relations, automatically building tables for RDBMS deployment, and supporting graphical navigation, multiple navigation trees, and linking across diverse content sets.


Table of Contents


1. Introduction
2. Publishing application
3. Terrorist/intel application
     3.1 Search
     3.2 KM Challenges
     3.3 Permissions
4. Where is the RDF?
5. Other issues
6. Conclusions

1. Introduction

Using new tools, we have constructed knowledge management systems, both document-based and restricted to metadata, using RDF as the underlying technology. This means that RDF was used for all the essentials for the publishing system and intelligence-agency system we will discuss here — maintaining all the data’s relations, automatically building tables for RDBMS deployment, navigating graphically, supporting multiple navigation trees, and linking across diverse content sets.

Why bother devising new ways to do these things? Well, simply put, to deal with information explosion. According to the Gartner Group,

The “Google” solution (aka fulltext search) to this explosion just doesn’t suffice. As useful and wonderful as it has proven, fulltext search has serious flaws — it lacks context and doesn’t provide any mechanism for navigating content. In our view, RDF deals with these flaws and thus provides the future for search on the web and in the enterprise.

The systems described in this paper were built in RDF, but this type of system can obviously also be built directly in a relational database. The point we want to make is that doing so would mean building everything manually. We used Empolis’ MDS, whose automated features allowed us to import and maintain information in RDF; MDS then built the RDBMS tables for us automatically, so that the information could be served on the web from a traditional RDBMS server.

In part, this is kind of like the situation with an electric car — to replace a gasoline-powered car, first you have to prove you don’t lose any capabilities, so the manufacturer builds it to look and drive like an internal-combustion vehicle. Our system looks and drives like an RDBMS system, but it has the same kind of superiorities the electric car has when you compare the systems side by side.

For one thing, RDF makes it easier to bring together information from different organizations. Because RDF schemas don’t need to match, you can import other parties’ models alongside your own. This lets you use content generated by other entities, both within and outside your organization, and it makes information easier to find.
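As a minimal sketch of why mismatched schemas need not block an import, the Python fragment below treats each model as a set of (subject, predicate, object) statements and merges two of them with a plain set union. The identifiers and the `tax:` and `acct:` prefixes are hypothetical and are not taken from the applications described in this paper.

```python
# Hypothetical triples; the "tax:" and "acct:" prefixes are illustrative only.
publisher_model = {
    ("doc:1001", "tax:hasTopic", "tax:Goodwill"),
    ("tax:Goodwill", "rdfs:label", "Goodwill"),
}
partner_model = {
    ("doc:1001", "acct:hasTopic", "acct:Intangibles"),
    ("acct:Intangibles", "acct:mapsTo", "tax:Goodwill"),
}

# Merging is a plain set union: no schema alignment step is required first.
merged = publisher_model | partner_model


def describe(graph, subject):
    """Return every (predicate, object) pair stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(describe(merged, "doc:1001"))
```

Because both models name the same resource (`doc:1001`), the statements about it join up after the merge even though each model uses its own vocabulary.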

And the metadata stored in RDF makes it possible to understand the relationships in your content. This applies both when metadata is already assigned and when you add your own metadata and create your own classification.

RDF in essence can make everyone part of a global system, and it allows you to identify a resource in many different ways based on context.

In this paper we want to show you multiple models with data, explain how RDF can provide clarity based on context, demonstrate some of the graphical navigation possibilities of RDF and actually peek a little under the hood to show you the engine driving the applications.

2. Publishing application

The first application we have to show is from a publisher with millions of documents in tax, accounting, insurance and legal areas. We’ve chosen it to illustrate how a company with large document sets, each requiring separate taxonomies, benefits from using an RDF-based system. In this type of situation, not only does each document set have its own terms and logic, but each constituency — user, author, internal employees — is likely to require its own taxonomy. Moreover, externally supplied taxonomies must also be brought into the equation.

RDF not only provides more accurate search, but it permits alternate modes of navigation as each document is classified against multiple taxonomies. As you examine the application, you might look and — like the electric car — wonder what’s different and where the RDF is. To spare you the suspense, we’ll note here that it is found in the taxonomy hierarchies, in representing relations between topics and relations between the documents and topics, as well as in permissions, identifying which documents can be seen by a user.

The specific goals of a publishing application are:

This takes some re-orientation in the thinking of the editors and users of the publisher’s website. They are accustomed to treating a document’s source, such as a book, journal or online publication, as somehow separating it from semantically related material. But as Figure 1 illustrates, these disparate documents are readily associated.

306-rdf-km-01b.jpg

Figure 1:

A taxonomy of all the topics covered is similar to a table of contents in how it aids navigation, but it is more a table of contents for a whole library than for a single book. It’s important to keep in mind that multiple taxonomies can be created, with the topics (or nodes) in each pointing to the same documents. And topics in one taxonomy can appear in other taxonomies or can map to equivalent or similar nodes.

306-rdf-km-02b.jpg

Figure 2:

Figure 2 shows a federal-tax taxonomy, which has a goodwill node in its fifth level; the Internal Revenue Code, whose section 197 deals with amortization of goodwill and other intangibles; and a shallow accounting taxonomy, which includes goodwill among its several hundred top-level topics and which subdivides the topic into intangible issues and issues resulting from business combinations, such as mergers or acquisitions. The goodwill node appears in both the tax and accounting taxonomies and maps to Section 197 of the IRC.

In our application, we included three tax taxonomies — the aforementioned federal-tax taxonomy, the IRC, and a table of contents from a book about the federal tax statutes, in which part 21 of chapter G deals with amortization of goodwill. Because the nodes are equivalents, they provide three alternative routes to the same set of documents. These three navigation paths, and how they lead to the same documents (with different relations, they might lead to overlapping but not identical sets) are shown in Figure 3.
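The idea of equivalent nodes offering alternative routes to one document set can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's implementation: the node identifiers, the document identifiers and the `equivalent` list are all hypothetical.

```python
# Documents classified directly against each node (hypothetical identifiers).
classified = {
    "fedtax:Goodwill": {"doc:A", "doc:B"},
    "irc:Section197": {"doc:B", "doc:C"},
    "book:ChapterG.Part21": set(),
}

# Pairs of nodes declared equivalent across taxonomies (assumed mapping).
equivalent = [
    ("fedtax:Goodwill", "irc:Section197"),
    ("irc:Section197", "book:ChapterG.Part21"),
]


def equivalence_class(node):
    """Follow equivalence links in both directions until no new node appears."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for a, b in equivalent:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return seen


def documents_for(node):
    """Union of the documents classified against the node or any equivalent."""
    docs = set()
    for member in equivalence_class(node):
        docs |= classified.get(member, set())
    return docs
```

Whichever of the three nodes a user starts from, `documents_for` returns the same set. Replacing an equivalence with a weaker relation would be one way to obtain the overlapping but not identical sets mentioned above.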

306-rdf-km-03b.jpg
306-rdf-km-04b.jpg

Figure 3:

Each document has metadata fields recording which nodes in which taxonomies point to it. In Figure 4, the document whose metadata is being viewed is related to three separate topics in the Federal Tax taxonomy: goodwill, debt and amortization period. We exposed this list of metadata to users, so that a single click on a metadata value displays all the other documents assigned to that topic, even though it wasn’t the topic originally navigated to. In Figure 4, this is shown by the goodwill node still being displayed in the navigation pane while a list of documents associated with amortization period is displayed in the document results pane. When a user navigates to a likely topic and views the metadata for a promising document, this sideways navigation lets the user move to an associated topic by treating the document itself as a bridge between the two topics. The two topics are related in exactly this way: the document contains content relevant to both.
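Sideways navigation of this kind amounts to two lookups over the same document-to-topic relations. The sketch below assumes a simple in-memory mapping; the document identifiers are hypothetical, and only the three topic names come from the example above.

```python
# Topics assigned to each document (hypothetical document identifiers).
doc_topics = {
    "doc:1": ["goodwill", "debt", "amortization period"],
    "doc:2": ["amortization period"],
    "doc:3": ["goodwill"],
    "doc:4": ["amortization period", "debt"],
}


def documents_for_topic(topic):
    """All documents that carry the given topic in their metadata."""
    return sorted(d for d, topics in doc_topics.items() if topic in topics)


def sideways(doc, current_topic):
    """Topics reachable through a document, each with its own result list."""
    return {
        topic: documents_for_topic(topic)
        for topic in doc_topics[doc]
        if topic != current_topic
    }


# A user who reached doc:1 under "goodwill" can step across to two other topics.
print(sideways("doc:1", "goodwill"))
```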

306-rdf-km-05b.jpg

Figure 4:

An alternate route to the same document result list can be placed in the document itself, as shown in Figure 5. In this use, the application grabs the topics for each popup menu from the document’s RDF store and constructs the menus on the fly. (Note that the document shown in Figure 5 has relations to five different nodes in the federal tax taxonomy and isn’t the same one shown in Figure 4.) Choosing any of the topics causes the full list of documents associated with that topic to be displayed.

306-rdf-km-06b.jpg

Figure 5:

Search capabilities go beyond simple full-text or string searches. A text search can be restricted to documents that have a specific topic assigned to them, from any taxonomy. Or the text can be omitted, and just those documents located that carry one or more desired metadata values. The two captures shown in Figure 6 illustrate two such searches, the first constraining metadata and full text, the second constraining multiple pieces of metadata only.

306-rdf-km-07b.jpg
306-rdf-km-08b.jpg

Figure 6:
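The two kinds of search just described (text constrained by metadata, and metadata alone) can be sketched with one small function. The sample documents are hypothetical, and requiring every listed topic to be present is an assumption made for this sketch; the application may combine metadata constraints differently.

```python
# Hypothetical documents with full text and assigned topics.
documents = [
    {"id": "doc:1", "text": "Amortization of goodwill under section 197",
     "topics": {"goodwill", "amortization period"}},
    {"id": "doc:2", "text": "Goodwill in business combinations",
     "topics": {"goodwill"}},
    {"id": "doc:3", "text": "Amortization schedules for debt",
     "topics": {"debt", "amortization period"}},
]


def search(text=None, topics=()):
    """Return ids of documents matching the optional text and all given topics."""
    required = set(topics)
    hits = []
    for doc in documents:
        if not required <= doc["topics"]:
            continue  # a required metadata value is missing
        if text is not None and text.lower() not in doc["text"].lower():
            continue  # the text constraint is not satisfied
        hits.append(doc["id"])
    return hits
```

Calling `search(text="amortization", topics=["goodwill"])` corresponds to the first kind of search, and `search(topics=["amortization period", "debt"])` to the second.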

Taxonomies used for navigation reflect the content that they were created for. Figure 7 shows how an accounting user would approach documents using different taxonomies than the tax user. Of course, initially these taxonomies would lead to more accounting-oriented documents.

306-rdf-km-09a.jpg

Figure 7:

But a tax document can be assigned an “accounting” topic and vice-versa, when appropriate. So moving sideways from tax documents to accounting documents doesn’t require starting at the top of the navigation tree. Figure 8 shows how a user starting out from a document located with a tax taxonomy can click on an accounting topic and move to a result list of documents from the accounting content set. Essentially, this means that clicking a value in the metadata screen shows a user “more documents like this one.”

306-rdf-km-10a.jpg

Figure 8:

MDS enabled us to build another capability based on the documents’ metadata. A search, for instance, results not only in a results list but in a clustering of the results, shown in a separate tab for the so-called Document Tree. This Document Tree takes a result list and shows all the relations among all the documents in that list, so the user can see what “like” means in that suggestive phrase “more documents like this one.” All the values for all the metadata fields for the documents in the result list are collated here, enabling clusters of commonality to be located without knowing in advance where they lie. Then those documents can be accessed directly from this screen.

This is perhaps more significant than it appears. An expert user might know in advance, for instance, that “amortizable basis” (the original search term resulting in the document list shown in Figure 9) overlaps significantly with such other topics in the taxonomy as “property,” “amortization,” “going concern,” “goodwill,” and “intangibles.” The clustering done by the application enables a much less experienced user to identify every commonality across the complete taxonomy (at least as evidenced in the available documents), as well as to steer clear of unrewarding avenues of research, such as “inventory,” “equity,” “cash basis,” and “deferred taxes.”

306-rdf-km-11a.jpg

Figure 9:
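The collation behind a Document Tree can be approximated by inverting the metadata of a result list: for every field and every value, record which documents carry it. The following is a sketch only; the field names and document identifiers are hypothetical and MDS's actual clustering is not documented here.

```python
from collections import defaultdict

# Metadata of the documents in a hypothetical result list.
result_list = {
    "doc:1": {"topic": ["goodwill", "intangibles"], "source": ["journal"]},
    "doc:2": {"topic": ["goodwill", "amortization"], "source": ["book"]},
    "doc:3": {"topic": ["amortization"], "source": ["journal"]},
}


def document_tree(results):
    """Group the documents of a result list under every metadata value they carry."""
    tree = defaultdict(lambda: defaultdict(list))
    for doc_id, metadata in results.items():
        for field, values in metadata.items():
            for value in values:
                tree[field][value].append(doc_id)
    return {field: dict(values) for field, values in tree.items()}


tree = document_tree(result_list)
print(tree["topic"])
```

Values that never occur in the result list simply do not appear in the tree, which is how a user can tell at a glance that an avenue is unrewarding.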

In discussions of the advantages of a relational technology like RDF, this aspect seems to be somewhat underplayed. Yet it is not the ability of an RDF-based system to emulate existing RDBMS systems that justifies moving to a new technology, just as the electric car shouldn’t be evaluated simply by how well it replaces the gasoline-powered car. Instead, in both cases the evaluation should point to the critical element that the new technology provides and the competing technology can’t manage.

3. Terrorist/intel application

The terrorist/intel application illustrates the flexibility of RDF in describing many different types of information. In this application we use RDF to describe relationships between metadata in order to navigate and search knowledge. We have created metadata relating to organizations, geographical locations, and terrorist events that have occurred.

The terrorist/intel application demonstrates the following:

306-rdf-km-30a.jpg

Figure 10:

Figure 10 shows a typical view that users of the system would have. When users enter the system, they are taken to a screen that displays organizing principles on the left-hand side of the browser. Organizing principles are ways in which information can be logically classified or grouped. In this example the information is categorized by two organizing principles: organization and places (geographical locations).

Organizing principles are an important concept in knowledge management because they mimic the way people organize information in their minds. In this case we use the organizing principles and represent them in a folder structure to help users navigate the document sets based on metadata. For example, in geography we use the organizing principles to organize the document sets based on location of the terrorist activity. We can use the organization organizing principle to group document sets by the terrorist group that produced the event.

On the right side of the screen in Figure 11 we have information relating to the ASG node, which has been selected in the organization organizing principle. There is information about the ASG node such as Organization Type (terrorist), contains (if it has a subnode), and description. In the bottom half of the screen, the events related to the ASG are listed. These are links that can be followed to display information about each event. All of the information displayed is represented in RDF, samples of which appear in Section 4.

In Figure 11 we can navigate the events metadata by clicking on the entries, which link to event pages. For instance, clicking on Zamboanga - 10/08/2001 takes a user to the events screen which shows related information.

306-rdf-km-31a.jpg

Figure 11:

In this screen we defined a metadata object for events. This object shows the user information about the event that took place. The important concept to take away from this screen is that the metadata is a source of knowledge for the user. The metadata gives the user a description of the event along with its location, sponsor and target. This information may be as important to the user as the actual document sets in the system.

This knowledge may be represented in many different ways. It is important to make navigating the information as easy as possible, since there may be large quantities of data in the system.

306-rdf-km-32a.jpg

Figure 12:

In Figure 12, the event object is displayed in a visual manner, which in many circumstances makes the information easier to navigate. This StarTree is generated directly from the RDF on the fly. In this example we see the event at Zamboanga - 10/08/2001 in the graph’s center, with various arcs coming from the event. Each arc is labelled and gives the target node some content. For example if we look at the “sponsor” arc it has a target of ASG, so we learn that ASG sponsored this event. The target of this terrorist activity is Private Citizens and Property and the location is Philippines and Zamboanga. Clicking on the nodes expands them, so we can discover other relationships.

306-rdf-km-33a.jpg

Figure 13:

Figure 13 shows the ASG node expanded, displaying its other relationships. For example, we can quickly see all of the other terror events that were committed by the ASG by following the “sponsor” arc. We can see that this group is in the “favorite” or recently viewed category of the FBI. We also see that ASG has been identified as a terrorist organization.
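The labelled arcs drawn around a node can be read straight off the statements that mention it. The sketch below lists the arcs for a chosen center node, which is roughly what expanding a node requires. The four statements about the Zamboanga event come from the description above; the second event is invented purely for illustration, and the StarTree rendering itself is not shown.

```python
# (subject, arc label, object) statements; the last event is hypothetical.
triples = [
    ("Zamboanga - 10/08/2001", "sponsor", "ASG"),
    ("Zamboanga - 10/08/2001", "target", "Private Citizens and Property"),
    ("Zamboanga - 10/08/2001", "location", "Philippines"),
    ("Zamboanga - 10/08/2001", "location", "Zamboanga"),
    ("Hypothetical event B", "sponsor", "ASG"),
]


def arcs(node):
    """Arcs touching a node as (label, direction, other node) tuples."""
    outgoing = [(p, "out", o) for s, p, o in triples if s == node]
    incoming = [(p, "in", s) for s, p, o in triples if o == node]
    return outgoing + incoming


# Expanding ASG reveals every event it sponsored via incoming "sponsor" arcs.
print(arcs("ASG"))
```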

Again, by modeling the information in RDF, capabilities beyond the typical ones become available to the information designer, whether to help less-experienced searchers or to stimulate a different way of thinking about or approaching the content. Readily providing visualization of data and graphical navigation such as StarTrees is, in our opinion, among the primary advantages of an RDF approach. In Section 4, we discuss what type of RDF needs to be generated for this.

3.1 Search

So far we’ve seen the application from a navigation perspective. Another way to find information is through search. In this application we provide three types of search. The first is Metadata (RDF), the second method is CBR (which uses attributes and similarities), and the third is fulltext.

306-rdf-km-34a.jpg

Figure 14:

In Figure 14 we see that the term “bin laden” is searched. A list of results is returned to the user with a list of metadata associated with the document along with a summary. The terms which are found in the summary are highlighted. We can see another representation of this list in the “Tree” tab.

306-rdf-km-35a.jpg

Figure 15:

When the Tree tab is selected, the documents returned from the search are clustered based on metadata identified during the search. In this example we searched for “bin laden,” and we get a tree of organizations that are related to the term. From the Tree we can gain knowledge about relationships between bin laden and these terrorist groups. Not only can we see the relationship, but we can gain context based on the grouping. We can see documents that are related to both bin laden and Hamas.

306-rdf-km-36a.jpg

Figure 16:

When we expand the branch we can see bin laden in the context of Hamas (Figure 16). We can see all of the documents which are related to both. In this case bin laden is searched via fulltext and the metadata for the documents which are returned is used to cluster the documents. This will make it easier for the user to find information which is contextual.

3.2 KM Challenges

Some of the challenges of KM in the intelligence community are very similar to those of large corporations. These challenges include large amounts of inconsistent data, classification of information into meaningful structures, and distribution.

When we talk about the challenge of large amounts of inconsistent data, we mean the following problem. Large organizations have data in structured (XML) and unstructured (Word, PDF, ...) formats, in different languages, and in different repositories which don’t “speak” to each other. This problem is complex and hard to solve, but solving it is possible. The application demonstrated here can store documents in any format and can also hold metadata about information that is not stored in the system. Metadata on external objects works much like the card catalog in a library: the catalog contains metadata on the books located in the library, not the books themselves. We can likewise attach metadata to people in order to locate information about them. By putting metadata on objects external to the system, we can manage knowledge held in external systems.

Metadata becomes especially important when there is a large volume of unstructured information. For example, if you have PDFs, the information inside them is textual. With XML there is context to the information based on the tagging structure; with PDFs this is highly limited. There need to be other ways to get to the information, and metadata provides them. RDF allows us to place metadata on the content and find it more easily. Some of these problems can be solved through search, but what happens when 500 or more documents are returned? Users can’t be expected to page through so many results. This is where the metadata also comes in handy for clustering, which makes the data more manageable because you can classify the documents in many dimensions.

In terms of getting data into the system the biggest challenge is classification of the data with the different metadata. It is very important to classify the content correctly and consistently. Without classification of information you can’t search for information. The application has a classification engine which can be applied to the documents which are imported into the system. This will autoclassify the documents with metadata. This metadata can be changed by the administrators if necessary. The system will only be useful if there isn’t a heavy burden on the administrators of the system to keep it running properly.

Distribution of information is a difficult problem in KM. Multiple methods of access to the information are very important if the system is to be intuitive. Some people like to navigate to information while others prefer to search. It is important to provide both means for finding information.

3.3 Permissions

The system uses RDF relationships to store permission information. On this basis, complex permission models can be designed. In the Empolis e:kms platform you can have multi-dimensional permissions: you can state that people in certain groups may view documents with certain metadata on them. For example, only people in the FBI who work in terrorism with ASG can see information with the ASG metadata on it.
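A multi-dimensional permission of this kind can be thought of as a rule that pairs a metadata value with the attributes a user must have. The sketch below is a guess at the general shape only, not the e:kms permission model: the user names, attribute names and rule table are hypothetical, and treating untagged or unruled documents as visible is an assumption.

```python
# Attributes of two hypothetical users.
user_attrs = {
    "agent_a": {"agency": "FBI", "unit": "terrorism", "assignment": "ASG"},
    "agent_b": {"agency": "FBI", "unit": "terrorism", "assignment": "other"},
}

# Documents tagged "ASG" require all three attributes to match (assumed rule).
rules = {
    "ASG": {"agency": "FBI", "unit": "terrorism", "assignment": "ASG"},
}


def can_view(user, doc_tags):
    """True if the user satisfies the rule for every restricted tag on the document."""
    attrs = user_attrs[user]
    for tag in doc_tags:
        required = rules.get(tag)
        if required is None:
            continue  # assumption: tags without a rule do not restrict access
        if any(attrs.get(key) != value for key, value in required.items()):
            return False
    return True
```

Each key in a rule is one dimension of the permission, so adding a dimension means adding a key rather than restructuring the check.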

4. Where is the RDF?

There are a few things which are required for representing RDF information within the system. The first file will contain the schema for the metadata which needs to be represented. In the Terrorist/intel application, for “places” we have the following:

<?xml version='1.0' encoding='UTF-8'?>

<!DOCTYPE rdf:RDF [
   <!ENTITY rdf    'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <!ENTITY rdfs   'http://www.w3.org/2000/01/rdf-schema#'>
   <!ENTITY emps   'http://www.empolis.com/2002/ekms/project-schema#'>
   <!ENTITY ekms   'http://www.empolis.com/2002/ekms/schema#'>
   <!ENTITY emp    'http://www.empolis.com/'>
   <!ENTITY ovitas 'http://www.empolis.com/2004/ekms/ovitas-project-schema#'>
]>

<rdf:RDF
   xmlns:rdf="&rdf;"
   xmlns:rdfs="&rdfs;"
   xmlns:emps="&emps;"
   xmlns:ekms="&ekms;"
   xmlns:emp="&emp;"
   xmlns:ovitas="&ovitas;"
>

  <rdfs:Class rdf:about="&ovitas;Place">
    <rdfs:label xml:lang="en">Place</rdfs:label>
    <rdfs:subClassOf rdf:resource="&ekms;CreatableClass"/>
    <ekms:searchOP rdf:resource="&ovitas;PlaceOrgPrinciple"/>
  </rdfs:Class>

  <rdf:Property rdf:about="&ovitas;hasSubElementOfPlace">
    <ekms:propertyType rdf:resource="&ekms;editableProperty"/>
    <ekms:propertyType rdf:resource="&ekms;orderedProperty"/>
    <rdfs:label xml:lang="en">contains</rdfs:label>
    <ekms:reverseLabel xml:lang="en">part of</ekms:reverseLabel>
    <rdfs:domain rdf:resource="&ovitas;Place"/>
    <rdfs:range rdf:resource="&ovitas;Place"/>
  </rdf:Property>

  <rdf:Property rdf:about="&ovitas;descriptionOfPlace">
    <ekms:propertyType rdf:resource="&ekms;editableProperty"/>
    <rdfs:label xml:lang="en">description</rdfs:label>
    <ekms:maxCardinality>1</ekms:maxCardinality>
    <rdfs:domain rdf:resource="&ovitas;Place"/>
    <rdfs:range rdf:resource="&rdfs;Literal"/>
  </rdf:Property>

</rdf:RDF>

The first thing that you’ll notice in the schema definition is that we define the &ovitas;Place RDF class. We give it a label and state the language of that label. We then declare Place to be a subclass of &ekms;CreatableClass. Finally, the ekms:searchOP property ties the class to its organizing principle, &ovitas;PlaceOrgPrinciple.

Next we define the hasSubElementOfPlace property, which allows a Place to have subregions (e.g., Israel contains Jerusalem). We define both the domain and the range of the relationship to be Place. The descriptionOfPlace property is also defined in the schema, with a maximum cardinality of 1.

With this schema definition we can now generate instances of Place which will hold the information about all of the different locations.
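To make the role of domain, range and cardinality concrete, the following Python sketch checks a handful of statements against the two property definitions. It rests on several assumptions: local names stand in for full URIs, each subelement is treated as its own statement rather than as a member of one sequence, and domain and range are enforced as constraints (in RDF Schema proper they license inference rather than validation, and how e:kms interprets ekms:maxCardinality is not documented here).

```python
# Constraints transcribed from the Place schema; local names only, for brevity.
SCHEMA = {
    "hasSubElementOfPlace": {"domain": "Place", "range": "Place", "max_cardinality": None},
    "descriptionOfPlace": {"domain": "Place", "range": "Literal", "max_cardinality": 1},
}


def check(statements, types):
    """Return a list of problems; an empty list means no violations were found."""
    problems = []
    counts = {}
    for subject, prop, value in statements:
        rule = SCHEMA[prop]
        if types.get(subject) != rule["domain"]:
            problems.append(f"{subject}: not a {rule['domain']}")
        if rule["range"] != "Literal" and types.get(value) != rule["range"]:
            problems.append(f"{value}: not a {rule['range']}")
        counts[(subject, prop)] = counts.get((subject, prop), 0) + 1
        limit = rule["max_cardinality"]
        if limit is not None and counts[(subject, prop)] == limit + 1:
            problems.append(f"{subject}: more than {limit} value(s) for {prop}")
    return problems


types = {"Place.Brazil": "Place", "Place.Santos": "Place", "Place.SaoPaulo": "Place"}
good = [
    ("Place.Brazil", "descriptionOfPlace", "Brazil"),
    ("Place.Brazil", "hasSubElementOfPlace", "Place.Santos"),
    ("Place.Brazil", "hasSubElementOfPlace", "Place.SaoPaulo"),
]
bad = good + [("Place.Brazil", "descriptionOfPlace", "A second description")]
```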

Below we have an excerpt from the RDF data file, which contains instances of Place.

<?xml version='1.0' encoding='UTF-8'?>

<!DOCTYPE rdf:RDF [
   <!ENTITY rdf  'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>  
   <!ENTITY emps 'http://www.empolis.com/2002/ekms/project-schema#'>
   <!ENTITY ekms 'http://www.empolis.com/2002/ekms/schema#'>  
   <!ENTITY emp  'http://www.empolis.com/'>
   <!ENTITY ovitas 'http://www.empolis.com/2004/ekms/ovitas-project-schema#'>
]>

<rdf:RDF
   xmlns:rdf="&rdf;"
   xmlns:rdfs="&rdfs;"
   xmlns:emps="&emps;"
   xmlns:ekms="&ekms;"  
   xmlns:emp="&emp;"    
   xmlns:ovitas="&ovitas;"   
>
 <!-- root node -->
 <rdf:Description rdf:about="&ovitas;Place.Places">
  <rdfs:label xml:lang="en">Places</rdfs:label>  
  <rdf:type rdf:resource="&ovitas;Place"/>
  <ovitas:descriptionOfPlace>This is a description
                         </ovitas:descriptionOfPlace> 
  <ovitas:hasSubElementOfPlace>
       <rdf:Seq>  
       <rdf:li rdf:resource='&ovitas;Place.Oceans'/>
       <rdf:li rdf:resource='&ovitas;Place.Regions'/>
       </rdf:Seq>
    </ovitas:hasSubElementOfPlace>  
 </rdf:Description>  
 
 <rdf:Description rdf:about="&ovitas;Place.Oceans">
  <rdfs:label xml:lang="en">Oceans</rdfs:label>  
  <rdf:type rdf:resource="&ovitas;Place"/>  
  <ovitas:descriptionOfPlace>This is a description of Oceans
                                  </ovitas:descriptionOfPlace> 
  <ovitas:hasSubElementOfPlace>
       <rdf:Seq>  
       <rdf:li rdf:resource='&ovitas;Place.Atlantic'/>
       <rdf:li rdf:resource='&ovitas;Place.Arctic'/>
       <rdf:li rdf:resource='&ovitas;Place.Indian'/>
       <rdf:li rdf:resource='&ovitas;Place.Pacific'/>
       </rdf:Seq>
    </ovitas:hasSubElementOfPlace>  
 </rdf:Description>


 <rdf:Description rdf:about="&ovitas;Place.Regions">
  <rdfs:label xml:lang="en">Regions</rdfs:label>  
  <rdf:type rdf:resource="&ovitas;Place"/>  
  <ovitas:descriptionOfPlace>The regions of the world
                                          </ovitas:descriptionOfPlace> 
  <ovitas:hasSubElementOfPlace>
       <rdf:Seq>  
    <rdf:li rdf:resource='&ovitas;Place.WesternEurope'/>
    <rdf:li rdf:resource='&ovitas;Place.SEAsiaOceania'/>
    <rdf:li rdf:resource='&ovitas;Place.SouthAsia'/>
    <rdf:li rdf:resource='&ovitas;Place.NorthAmerica'/>
    <rdf:li rdf:resource='&ovitas;Place.MiddleEast'/>
    <rdf:li rdf:resource='&ovitas;Place.LatinAmerica'/>
    <rdf:li rdf:resource='&ovitas;Place.EasternEurope'/>
    <rdf:li rdf:resource='&ovitas;Place.CentralAsia'/>
    <rdf:li rdf:resource='&ovitas;Place.Africa'/>
       </rdf:Seq>
    </ovitas:hasSubElementOfPlace>  
 </rdf:Description>  


<rdf:Description rdf:about="&ovitas;Place.Brazil">
 <rdfs:label xml:lang="en">Brazil</rdfs:label>
 <rdf:type rdf:resource="&ovitas;Place"/>
 <ovitas:descriptionOfPlace>Brazil</ovitas:descriptionOfPlace>
 <ovitas:hasSubElementOfPlace>
  <rdf:Seq>
   <rdf:li rdf:resource='&ovitas;Place.Santos'/>
   <rdf:li rdf:resource='&ovitas;Place.SaoPaulo'/>
  </rdf:Seq>
 </ovitas:hasSubElementOfPlace>
</rdf:Description>

<rdf:Description rdf:about="&ovitas;Place.CostaRica">
 <rdfs:label xml:lang="en">Costa Rica</rdfs:label>
 <rdf:type rdf:resource="&ovitas;Place"/>
 <ovitas:descriptionOfPlace>Costa Rica</ovitas:descriptionOfPlace>
 <ovitas:hasSubElementOfPlace>
  <rdf:Seq>
   <rdf:li rdf:resource='&ovitas;Place.Heredia'/>
  </rdf:Seq>
 </ovitas:hasSubElementOfPlace>
</rdf:Description>

Here we see the contents of the RDF data file which shows how the Place nodes in the application were generated. As we can see from each instance of Place the RDF XML matches the schema that we saw earlier. We first describe the Place node and give it a label for each language we are using. Then we define the type of the class that we are creating (in this case it is Place), then we give it a description. Finally we list the subelements of Place. For example Brazil has &ovitas;Place.Santos and &ovitas;Place.SaoPaulo as subelements.
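For readers who want to inspect such a data file outside e:kms, the containment hierarchy can be pulled out with Python's standard XML parser. This is a sketch that reads the RDF/XML as ordinary XML, which works for the regular layout used here but is not a general RDF parser; the function name `sub_places` is hypothetical, and the sample is cut down to the Brazil instance.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OVITAS_NS = "http://www.empolis.com/2004/ekms/ovitas-project-schema#"

# A cut-down copy of the Brazil instance, with the same entity declarations.
sample = """<!DOCTYPE rdf:RDF [
  <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <!ENTITY ovitas 'http://www.empolis.com/2004/ekms/ovitas-project-schema#'>
]>
<rdf:RDF xmlns:rdf="&rdf;" xmlns:ovitas="&ovitas;">
  <rdf:Description rdf:about="&ovitas;Place.Brazil">
    <ovitas:hasSubElementOfPlace>
      <rdf:Seq>
        <rdf:li rdf:resource="&ovitas;Place.Santos"/>
        <rdf:li rdf:resource="&ovitas;Place.SaoPaulo"/>
      </rdf:Seq>
    </ovitas:hasSubElementOfPlace>
  </rdf:Description>
</rdf:RDF>"""


def sub_places(xml_text):
    """Map each described Place URI to the ordered list of its subelement URIs."""
    root = ET.fromstring(xml_text)
    path = (
        f"{{{OVITAS_NS}}}hasSubElementOfPlace/"
        f"{{{RDF_NS}}}Seq/{{{RDF_NS}}}li"
    )
    result = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        parent = description.get(f"{{{RDF_NS}}}about")
        items = description.findall(path)
        result[parent] = [li.get(f"{{{RDF_NS}}}resource") for li in items]
    return result


hierarchy = sub_places(sample)
```

The parser expands the entity references declared in the DOCTYPE, so the keys and values of `hierarchy` are full URIs rather than `&ovitas;` shorthand.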

The RDF is imported into the system (e:kms) and translated into the tables which are located inside an Oracle RDBMS. The RDF is stored inside the RDBMS for use in the application. The RDF can be generated at any time and exported with the content. An important concept to note here is that RDF can be stored in various ways but the most important point is that information and metadata can be portable from one system to another. The metadata which is generated for the content can be used by a web browser to find relevant information. This idea of portability will make it easier for knowledge management in the future. The portability is also important in the context of intelligence applications in sharing information across disparate systems.
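The round trip from RDF to relational tables and back can be illustrated with a single statement table. This sketch uses SQLite from the Python standard library as a stand-in for Oracle, and the one-table layout is an assumption made for illustration; the tables that e:kms actually generates are not described in this paper.

```python
import sqlite3

# Statements taken from the Place data, with local names standing in for URIs.
triples = [
    ("Place.Brazil", "rdf:type", "Place"),
    ("Place.Brazil", "hasSubElementOfPlace", "Place.Santos"),
    ("Place.Brazil", "hasSubElementOfPlace", "Place.SaoPaulo"),
    ("Place.CostaRica", "hasSubElementOfPlace", "Place.Heredia"),
]


def load(statements):
    """Create an in-memory database holding one row per RDF statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE statement (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany("INSERT INTO statement VALUES (?, ?, ?)", statements)
    conn.commit()
    return conn


def sub_elements(conn, place):
    """Answer a navigation query with ordinary SQL."""
    rows = conn.execute(
        "SELECT object FROM statement WHERE subject = ? AND predicate = ? ORDER BY rowid",
        (place, "hasSubElementOfPlace"),
    )
    return [row[0] for row in rows]


def export(conn):
    """Read the statements back out, ready to be serialized as RDF again."""
    query = "SELECT subject, predicate, object FROM statement ORDER BY rowid"
    return conn.execute(query).fetchall()


conn = load(triples)
```

Because `export` returns exactly what `load` stored, nothing is lost by serving the data from the RDBMS, which is the property that makes the metadata portable.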

5. Other issues

As with any technology there are implementation issues. The main implementation issue that people may face is understanding how to classify their content. The second biggest implementation issue may be the scalability of a large RDF database.

Classification of content may be quite difficult depending on the content which is being classified. Luckily RDF is very flexible and allows many ways to classify even difficult information structures.

The second implementation issue to look out for is the scalability of the RDF database. You must make sure that it can support large numbers of RDF relationships. This can become a limiting factor for organizations with complex metadata and large sets of documents. If a company has a large set of documents, it will also need a large set of metadata to discriminate between them.

6. Conclusions

In this paper we have tried not only to show “industrial-strength,” real-world applications of RDF but also to discuss the rationale for turning to RDF and its advantages and to explain something of its construction and use in the applications.

From its ability to utilize an extra dimension of the documents’ data, as shown with the clustering capability, to its ease in providing for visual methods of navigation, RDF has shown itself a superior technology for knowledge management.
