Semantic Thumbnails - Summarizing XML Documents and Collections

Extended Abstract

Track: Core Technologies, Storing XML, Web Services

Audience Level: Technical View

Time: Thursday, November 18 at 14:45

Author: Dr Mehmet Dalkilic, Assistant Professor, Indiana University

Author: Dr Arijit Sengupta, Assistant Professor, Indiana University

Keywords: Content Repurposing, Content Management, Conversion, Data Representation, Graphic, Ontology, Search, Semantic Web, Document Semantics, Document Summarization, Thumbnails

Abstract:

The concept of thumbnails is common in image representation. A thumbnail is a highly compressed version of an image that provides a small yet complete visual representation to the human eye. We propose adapting this concept to the domain of documents, whereby a thumbnail of any document can be generated from its semantic content and provides an adequate amount of information about that document. Unlike image thumbnails, however, document thumbnails are intended mainly for consumption by software such as search engines and other content processing systems. With the advent of the semantic web, machine processing of documents has become extremely important. We give particular attention to electronic documents in XML and RDF/XML, with a view towards processing documents in the semantic web.
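As a rough illustration only, a document thumbnail for software consumption could be as simple as a small record of structural facts extracted from an XML document. The sketch below is not the method proposed in this abstract: the function name `thumbnail`, the chosen fields (root tag, element count, depth, most frequent tags), and the sample document are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def thumbnail(xml_text: str, top_n: int = 3) -> dict:
    """Hypothetical thumbnail: a compact structural summary of an XML document.

    The fields chosen here are illustrative assumptions, not the authors' design.
    """
    root = ET.fromstring(xml_text)
    tag_counts = Counter()
    max_depth = 0

    # Iterative depth-first walk that tracks the depth of each element.
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        tag_counts[node.tag] += 1
        max_depth = max(max_depth, depth)
        stack.extend((child, depth + 1) for child in node)

    return {
        "root": root.tag,
        "elements": sum(tag_counts.values()),
        "depth": max_depth,
        "top_tags": [tag for tag, _ in tag_counts.most_common(top_n)],
    }


# Made-up sample document used only to exercise the sketch.
SAMPLE = (
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "</library>"
)

if __name__ == "__main__":
    print(thumbnail(SAMPLE))
```

A purely structural summary like this ignores the semantic content that the abstract emphasizes; for RDF/XML documents, a richer thumbnail would presumably draw on the statements the document makes rather than on tag frequencies alone.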