Abstract
Topic maps have received great attention at many XML conferences and other events since their publication as an ISO standard in 2000 and their XML-isation as XTM in 2001. Most of the conference presentations related to topic maps were about the technology and not about its applicability. We heard about proper handling of scope, published subjects, graph theory, constraints, query languages, and the like. But we have rarely heard how to apply topic maps in a real-world solution and which role topic maps are able to play in such a solution. Consequently, this paper is about topic maps as part of the larger picture of a holistic solution. After presenting typical topic maps applications, we introduce the three major categories of topic map applications and illustrate them with examples.
There is no doubt that we are living in the information age and are part of the information society. Millions of people all over the world surf the Internet every day – for fun or as part of their job. Large parts of the economic system of the industrialised world, if not all of it, rely on electronic information that is stored and managed in computers and interchanged via networks. The amount of electronically available information doubles every two years, perhaps even faster. It is not a lack of information but an overabundance of rich information that causes discomfort. Sorting through this info glut to identify exactly what is needed is increasingly becoming a daily challenge.
But what has led to this information avalanche? It is simply that we have access to not just a single information resource on any given topic but to many. Information on a given topic can reside in company database records, company documents, parts of documents, web pages, images, videos, all coming from different sources and different repositories. Increasingly, the context of the information resource is needed to validate its relevance. We need pointers to related information as well to find out how the resource fits in the larger picture. We want to extract knowledge from information in context.
How did humans manage the ‘info glut’ challenge in the past before computers were available? Information was collected in books (physical containers) and books were collected in even larger physical containers (called libraries). A library catalogue helped us to locate the books on the shelves that related to our topic of interest. An important attribute is the classification code assigned to the book. The code addresses a node in a (sometimes large) subject classification schema and states what the book is about (= which subjects are covered by the book).
To actually locate the needed text portion in the book, we used one or more of four different paradigms: following the hierarchical structure of the table of contents, reading the text from beginning to end, browsing through the pages, or looking up the back-of-book index.
The previous section identified the three historically most common forms of information location referencing still in use today: catalogue reference, table of contents, and indexing.
When the committee of the International Organization for Standardization (ISO), Geneva, Switzerland, started its design work on the topic maps paradigm about ten years ago, it had an electronic version of a back-of-book index in mind. As such, topic maps are designed to manage the 'info glut', build valuable information networks over any kind of information resources, and enable the structuring of unstructured information. A topic map can be seen as an electronic super index that implements the back-of-book index paradigm.
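The back-of-book index analogy can be made concrete with a minimal sketch. The Python below is illustrative only: the class name `Topic`, its fields, and the example addresses are assumptions made for this sketch and are not taken from the ISO standard or from any topic map product. Index entries become topics, page references become occurrences that may point into any repository, and 'see also' references become links between topics.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Topic:
    """One index entry: a named subject with pointers to resources."""
    name: str
    occurrences: List[str] = field(default_factory=list)  # where the subject is discussed
    see_also: Set[str] = field(default_factory=set)       # names of related topics


def build_index(entries) -> Dict[str, Topic]:
    """Build a name -> Topic lookup from (name, resource) pairs."""
    index: Dict[str, Topic] = {}
    for name, resource in entries:
        index.setdefault(name, Topic(name)).occurrences.append(resource)
    return index


# Hypothetical resources; unlike page numbers, they may live in different repositories.
index = build_index([
    ("XTM", "https://example.org/docs/xtm-intro.html"),
    ("XTM", "file:///reports/2001/interchange.pdf"),
    ("scope", "https://example.org/docs/scope.html"),
])
index["XTM"].see_also.add("scope")

for resource in index["XTM"].occurrences:
    print(resource)
```

Unlike a printed index, the structure is stored separately from the resources it points to, so one index can cover many documents at once.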
Topic maps are explicitly modelled ontologies[2], which are key to semantic structuring of information and to seamless integration of navigation, searching, and notification in knowledge management. Dubbed the ‘GPS of the information universe’[3], topic maps are a means of organising and accessing large and continuously growing information pools. They provide a ‘bridge’ between the domains of knowledge management and information management.
Coming from back-of-book indices, the topic maps paradigm defines the necessary concepts to model explicit knowledge structures over resources. Topic maps are simple but not too simple. They are lightweight but have the potential to grow with the demands of the information society. They are an international ISO standard, ensuring stability, reliability and openness – all essential for a secure Information Technology investment. ISO is working on a family of topic map standards to complete the existing standard with a data model, a query language, and a schema language.
Topic maps are a number of technologies wrapped up in one:
Complex metadata: A topic map contains information about information resources. It is not part of the information resources; it is created, managed, and stored separately from them, but can be closely connected to them.
Search index: Just as looking up a back-of-book index is a very precise search method, searching in a topic map provides better search results than searching in a full text index. A topic map can be seen as an intelligent search index, which cooperates perfectly with intelligent search engines.
Information organising principle: Subject classifications, taxonomies, and thesauri help to organise information resources. Topic maps are well suited to modelling these organising principles.
Knowledge structure: Topic maps are a base technology for explicit knowledge modelling and knowledge navigation – hence their value to Knowledge Management. A visualisation of topic maps as 2D or 3D graphs provides the user with an easy-to-grasp illustration of the modelled knowledge, as shown by Figure 1.
Their great flexibility is both the advantage and the disadvantage of topic maps. It is an advantage that topic maps allow the modelling of many arbitrary information and knowledge structures. It is a disadvantage that no predefined guidelines or schemas exist (yet) to help model the many possible structures.
Although quite a new standard, topic maps have left the ‘ivory tower’ of theoretical thinking and become a ‘real world’ phenomenon – with real products, real projects, real solutions, and a growing list of usage experiences as shown in the next chapters. Their wide applicability in many domains makes topic maps an emerging technology.
This chapter lists some of the most important applications of topic maps.
Topic maps are a technology to model and code knowledge explicitly and, as such, they provide a standardised format and paradigm for the codification of knowledge. Both are required to share, acquire, create and exchange knowledge in an organisation or between organisations. Usually, explicit knowledge is codified in metadata, taxonomies or thesauri. Also, much richer ontologies with lots of non-hierarchical relationships are typical topic maps applications.
Furthermore, topic maps are able to ‘bridge’ between regular record/document/content management and more sophisticated knowledge management. They help to integrate many diverse data sources under comprehensive semantic views, reflecting the business views on available data and information. Topic maps provide the additional semantic layer on top of the data/information necessary to better understand the data/information.
The most obvious topic maps application is subject classification based on a classification schema (also called a taxonomy). Classifications organise resources and simplify access to them. They are a key feature of knowledge management.
A topic map fulfils two functions at the same time: 1) it represents the classification schema with its classes, class hierarchy, class codes, and cross relationships; and 2) it assigns the resources to the classes of the schema as occurrences. As these occurrences can point into various repositories, a topic map based classification can easily span multiple systems, providing one classified view on all resources.
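The two functions can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not an implementation of the standard: the class codes, labels, and resource addresses are invented, and the names `schema`, `classify`, `subclasses`, and `resources_under` are hypothetical.

```python
from collections import defaultdict

# Function 1: the classification schema, as class code -> (label, parent code or None).
# Codes and labels are invented for this sketch.
schema = {
    "T": ("Technology", None),
    "T.1": ("Markup languages", "T"),
    "T.1.1": ("XML", "T.1"),
    "T.2": ("Knowledge management", "T"),
}

# Function 2: occurrences, as class code -> resource addresses in any repository.
occurrences = defaultdict(list)


def classify(resource, code):
    """Assign a resource to a class of the schema."""
    if code not in schema:
        raise KeyError(f"unknown class code: {code}")
    occurrences[code].append(resource)


def subclasses(code):
    """Return the code itself plus all codes below it in the hierarchy."""
    result = [code]
    for child, (_, parent) in schema.items():
        if parent == code:
            result.extend(subclasses(child))
    return result


def resources_under(code):
    """One classified view: every resource filed at or below the given class."""
    return [r for c in subclasses(code) for r in occurrences.get(c, [])]


classify("dms://contracts/4711", "T.1.1")          # document management system
classify("https://example.org/xml-primer", "T.1")  # web page
classify("db://projects/row/42", "T.2")            # database record

print(resources_under("T.1"))
# → ['https://example.org/xml-primer', 'dms://contracts/4711']
```

Because the occurrences are plain addresses, the resources themselves stay in their original systems; only the classification lives in the topic map.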
A scoped topic map or multiple topic maps can model different classifications of the same resources, showing the different aspects of the resource by such a multi-dimensional classification as illustrated by Figure 2.
Users can navigate the classification in a topic map navigator to find resources. They can also browse the resources and see the assigned classifiers as well as related classes.
A topic map can represent the knowledge about a specific application domain. It is a simple ontology, an explicit model of the domain knowledge. [4] Such an ontology provides explicit access to explicit knowledge structures in order to:
navigate the knowledge structures;
visualise the knowledge structures as graphs;
query the knowledge structures;
derive new knowledge structures through inferencing;
connect resources from various repositories to knowledge structures;
analyse the knowledge structures (e.g. with statistics);
publish or sell the knowledge structures.
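Three of the operations listed above (navigating, querying, and deriving new structures through inferencing) can be illustrated with a minimal sketch. It assumes a deliberately reduced model in which associations are plain triples. The function names and the example domain are invented for illustration and do not correspond to any standardised topic map query language.

```python
# Associations as (topic, association type, topic) triples; all names are invented.
associations = {
    ("Tosca", "composed-by", "Puccini"),
    ("Puccini", "born-in", "Lucca"),
    ("Lucca", "located-in", "Tuscany"),
    ("Tuscany", "located-in", "Italy"),
}


def query(assocs, subject=None, kind=None, obj=None):
    """Return all associations matching the given pattern; None is a wildcard."""
    return {
        (s, k, o)
        for (s, k, o) in assocs
        if subject in (None, s) and kind in (None, k) and obj in (None, o)
    }


def neighbours(assocs, topic):
    """Navigation step: topics directly associated with the given topic."""
    linked = {o for (s, _, o) in assocs if s == topic}
    linked |= {s for (s, _, o) in assocs if o == topic}
    return linked


def infer_transitive(assocs, kind):
    """Derive new associations by treating one association type as transitive."""
    derived = set(assocs)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for (s, k, o) in derived if k == kind]
        for a, b in links:
            for c, d in links:
                if b == c and (a, kind, d) not in derived:
                    derived.add((a, kind, d))
                    changed = True
    return derived


print(neighbours(associations, "Puccini"))
enriched = infer_transitive(associations, "located-in")
print(query(enriched, subject="Lucca", kind="located-in"))
```

The inference step adds the association that Lucca is located in Italy, which was not stated explicitly in the original set.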
A typical application of knowledge representation is the corporate memory used in enterprise knowledge management. It models the knowledge about products, projects, people, policies, processes, and practices and provides it to employees. Again, a topic map offers an ideal means of presenting and accessing the corporate memory.
Another application is business-to-consumer web sites (online shops). The represented knowledge about sold goods and services is the base technology behind an intelligent virtual sales assistant. Such a virtual sales assistant helps the customer to explore the stock, navigate through it, ask queries and get proper answers, and communicate with the business-to-consumer system via feedback dialogs.
It is a good idea to distinguish between lightweight ontologies and heavyweight ontologies. [5] Topic maps are seen as lightweight ontologies because they are able to model knowledge in a very ‘shallow’ way (e.g. just topics, their classes, occurrences, and associations, but no class hierarchies, constraints, or inference rules). Even ‘shallow’ topic maps are already very useful without large investments in their creation. Heavyweight ontologies, by contrast, contain class hierarchies, constraints, and inference rules. It takes a long time and many resources to develop and maintain them, and it is uncertain whether there will be a benefit from this extra effort. Resource Description Framework (RDF) and Web Ontology Language (OWL) of the World-Wide Web Consortium (W3C) are technologies designed to model heavyweight ontologies.
‘Searching powered by topic maps’ could be the slogan of search engines that use the topic maps paradigm to improve their query results. Intelligent ‘find’ technologies are the result. However, not every search technology is prepared to benefit from topic maps. Only those that make use of an explicit knowledge model can easily migrate to topic maps. Others, based for example on statistical algorithms, have no knowledge model and cannot apply the paradigm.
Typical knowledge models of intelligent search engines are based on concept hierarchies with synonyms and possibly weighted similarities between concepts. For example, Case-Based Reasoning (CBR), a result of artificial intelligence research, is based on such models. With topic maps, the knowledge models could be represented in a standardised notation instead of in a proprietary format. Furthermore, all features listed in Section 3.2 can be combined with intelligent searching, leading to a seamless integration of the two access methodologies, navigation and searching.
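A knowledge model of this kind can be sketched as follows. The Python below is an assumption-laden illustration, not the model of any particular search engine or CBR system: the terms, the weights, the fixed weight for broader concepts, and the function name `expand_query` are all invented. It shows how synonyms, a concept hierarchy, and weighted similarities might be used to expand a search term before the index is consulted.

```python
# Knowledge model of a hypothetical search engine; all terms and weights are invented.
synonyms = {
    "notebook": "laptop",   # variant name -> preferred concept name
    "portable": "laptop",
}
parents = {                  # concept -> broader concept
    "laptop": "computer",
    "desktop": "computer",
}
similarity = {               # weighted similarity between concepts, 0.0 to 1.0
    ("laptop", "tablet"): 0.6,
    ("laptop", "desktop"): 0.4,
}


def expand_query(term, threshold=0.5):
    """Map a search term to weighted concepts to look up in the index."""
    concept = synonyms.get(term, term)
    weights = {concept: 1.0}
    # Concepts similar enough to the query concept are added with their weight.
    for (a, b), weight in similarity.items():
        if weight >= threshold and concept in (a, b):
            other = b if concept == a else a
            weights[other] = weight
    # The broader concept is added with a fixed weight, an arbitrary choice here.
    if concept in parents:
        weights.setdefault(parents[concept], 0.3)
    return weights


print(expand_query("notebook"))
# → {'laptop': 1.0, 'tablet': 0.6, 'computer': 0.3}
```

In a topic map, the synonyms would correspond to additional topic names, and the hierarchy and similarities to typed associations, so the same model could be exchanged in a standard notation.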
The Knowledge Management concepts suggest that an organisation should acquire knowledge if it is not available in the organisation. This assumes that someone is publishing the knowledge. Publishing knowledge is about selling added-value – the core business of a publisher. Publishers – commercial and corporate publishers – gather, verify, assemble and distribute resources as publications.
The new information age forces them to change their publishing paradigm from being product-centric to becoming information-centric. The focus is on the information, which must be continuously updated. A variety of publications can be generated from the central information pool through various channels on different media.
A publisher can make use of topic maps by:
supporting the editorial work as well as the information selection process; and
enriching the information published online.
Editorial work and the selection process benefit from subject classification and similar techniques. Enrichment of online information (= resources) is mainly about optimising access to the resources. This covers clever navigation paths, searching and hyperlinks.
Topic maps are key to interactive access. They provide different views on the same content resources (e.g. personalised views based on user profiles or even user-defined publications). However, they are also able to provide one view on many different resources from various repositories. Because resources can be separated from the topic map data, business models can be developed in which content (= resources) and added-value (= topic map) are created, packaged, syndicated and sold independently.
Topic maps are quite a new phenomenon, but several industries already apply them or will soon be making use of them. Their flexibility and expressiveness, as well as the fact that topic maps are an ISO standard, make them very attractive.
Commercial publishers are very interested because topic maps give them a standard with which to add value to their content. Encyclopaedia publishers, legal publishers as well as publishers from e-learning, media, and news domains are early adopters of topic maps.
Web portal providers use topic maps to organise their web sites and to provide clear and consistent navigation patterns.
Topic maps are applied in call centres, in knowledge gateways/portals, and as corporate memory. Topic maps in call centres support the call centre agent or enable the customer to directly find the relevant answer to a certain question. So-called knowledge gateways/portals provide answers to many typical questions that customers and partners have about the products and services of a company. A corporate memory holds a company’s relevant internal information about its people, products, projects and policies. Very innovative companies already use topic maps as a next-generation content management paradigm or are at least investigating the approach.
The wide range of topic maps applications can be divided into three major categories:
ontology management;
organising principle in document and content management;
knowledge modelling in enterprise knowledge management.
Depending on the application’s category, the user interaction with the topic map differs. Different interaction consequently leads to different user interfaces and software tools. The different categories also show that the role topic maps play in business solutions can vary from ‘lead actor’ through ‘supporting actor’ to a mere ‘extra’.
The first category of topic maps applications brings the users in ‘direct contact’ with a topic map. Typically, such a topic map models a subject classification, taxonomy, thesaurus, or – most general – an ontology. The only or main purpose of these topic maps applications is the explicit creation, maintenance, and usage of the ontologies. They are not hidden by business logic from the users. The ontology and its components like topics, classes, associations, occurrences, and scope are the business objects.
The second category applies topic maps to document management, content management, and web content management systems. Topic maps play the role of complex metadata helping to organise and manage the information objects or describing how the information objects should be assembled to build publications or web sites. The topic map is mostly hidden behind the system’s user interface. The business logic maps the topic map components to business objects like information objects, effectivity parameters, and publishing templates.
The third category utilises topic maps in the domain of information access and knowledge management. A key requirement for information access is that it has to be somehow ‘intelligent’. Explicitly modelled ontologies, that is, topic maps, provide the necessary ‘intelligence’. But ontologies not only support information access; they are also a central part of a holistic knowledge management solution. Knowledge mining software utilising algorithms from statistics and linguistics helps the knowledge engineers to build and maintain the ontologies. The ontologies are the background technology for intelligent information retrieval. They can be visualised to the user to illustrate parts of the knowledge model and explain certain circumstances.
Topic maps are a number of technologies wrapped up in one. Topic maps applications may range from a simple electronic index or thesaurus through complex metadata, subject classification, and web site organisation to knowledge representation and ontologies. Topic maps can be the ‘brains’ of intelligent search engines and drive the next generation of content management systems. Corporate memory and other key aspects of a knowledge management solution can also benefit from topic maps.
The paper has shown that topic maps are an interesting and important technology, but not a solution. They are always part of a larger picture, and only their combination with other technologies plus the necessary business logic fulfils the user expectations.
Part of this work has been conducted in the context of the German project LIKE (http://www.like-projekt.de), which is supported by the German Ministry of Education and Research (BMBF) under the funding number 01HW0160.
[1] The heading of this section is a little bit provocative to show that many concepts used in ‘modern’ information technology are in fact quite old. Despite their antiquity they are still in regular use.
[2] An ontology is an explicit specification of a shared conceptualisation.
[3] The slogan was coined by Charles F. Goldfarb, the inventor of SGML and father of mark-up languages.
[4] We use the term ‘explicit’ to emphasise the fact that a topic map contains explicit knowledge structures, which are human and machine interpretable. Documents, by contrast, only contain ‘implicit’ knowledge, which is interpretable (readable) by humans but rarely by machines. ‘Tacit’ knowledge is not coded at all (i.e. knowledge only ‘in the heads’ of the employees).
[5] The terms ‘lightweight’ and ‘heavyweight’ ontologies were introduced by Rudi Studer, University of Karlsruhe, Germany.