Abstract
In this paper we propose a query language for retrieving MPEG-7 documents. It is defined as a specialisation of W3C XQuery and is well adapted to searching audiovisual repositories such as TV news program archives. We describe the need for this query language, the motivation for such an adaptation, and the details of our implementation strategy. Finally, we present the TV news querying tool of the COALA (Content Oriented Audiovisual Library Access) project, which supports our proposed query language.
An increasing amount of audiovisual data is nowadays produced and transferred between a variety of application domains. This creates significant difficulties in exchanging and accessing audiovisual data. The MPEG-7 standard [MPEG-7 Home Page] was intended as an answer to this problem by providing an interoperable description format for audiovisual content based on W3C XML Schema [XML Schema, Parts 0, 1, and 2. W3C Recommendation]. MPEG-7 provides a standard library of content descriptions which are required in the majority of audiovisual applications. Applying such standard descriptions can considerably improve the exchange of and access to audiovisual data; however, such a representation is a necessary but not a sufficient condition for making the audiovisual data searchable according to specific application contexts and users' needs. Indeed, MPEG-7 provides a generic library of descriptions intended to cover almost all application domains, but it is not designed to take into account any specific user model required for a given audiovisual retrieval process.
As MPEG-7 documents are XML-based, XQuery is a natural and flexible query language for extracting data from them. However, using XQuery directly has the disadvantage that users get lost in describing the details of MPEG-7 rather than focusing on the essence of their retrieval requirements. To retrieve MPEG-7 documents, we need an abstract model which reflects the user’s retrieval needs and behaviours. A query language adapted to MPEG-7 retrieval should on the one hand reflect such a high-level abstract model and on the other hand be adequate for the effective retrieval of MPEG-7 content descriptions in XML format. In order to fulfil these two requirements, we propose in this paper the Semantic Views Query Language (SVQL), which is an adaptation of W3C XQuery for retrieving MPEG-7 descriptions in a TV news retrieval application.
The paper is organised as follows. In Section 2, “Related works”, we provide an overview of classical audiovisual query languages and of the query languages designed for MPEG-7 retrieval. In Section 3, “Semantic Views Query Language”, we propose SVQL by first introducing the Semantic Views Model and then describing the principles of the SVQL syntax. In Section 4, “Implementation”, we describe the details of the implementation of SVQL using XQuery, as well as an SVQL interface for querying TV news documents described in MPEG-7. Section 5, “Conclusions and future directions”, concludes the paper and describes the future directions of our work.
Related works
Querying audiovisual documents has been the subject of many research studies. Two main categories of audiovisual querying approaches can be distinguished in the literature: feature-based querying and semantic querying. The former refers to techniques which focus on low-level audiovisual features (colour, shape, etc.), such as query-by-example [Flinker, M., et al.], [Chang, S.-F., and al.], [Hampapur, A., and al.] and query-by-sketch [Ahanger G., Benson D., and Little T.D.C.]. The latter refers to querying based on higher-level semantics which are closer to users’ interpretations and usage contexts. Various semantic query languages have been proposed. The simplest approaches use only traditional data-based retrieval, i.e. they use keywords to describe the content [Gibbs S. and Breitender C.]. This way of querying is however very limited, as it does not allow a detailed specification of the content and of the type of desired results. Several semantic query languages have been proposed as extensions of classical database query languages such as SQL and OQL [OOmoto, E. and Tanaka K.], [Li Q., Yang Y., and Chung W.-K.], [Hwang, E. and Subrahmanian V.S.]. The most important problem with these languages is that none of them is defined on the basis of a study of users’ requirements. Each of them covers only a subset of the rich structural and content-based descriptions with which users can query audiovisual content.
The arrival of the MPEG-7 standard was an important step forward in modelling and representing audiovisual content. Such descriptions are, however, useful only if they can be correctly and easily retrieved via an adapted query language. A new challenge for audiovisual querying is therefore to provide a query language capable of retrieving audiovisual content described by MPEG-7, based on the high-level requirements of users in different application areas.
A few querying techniques are currently being proposed with the aim of retrieving audiovisual content described by MPEG-7. We can classify these approaches into two main groups. The first group [Kang, J. H and al], [Tjondronegoro, D. and Chen, Y.P.P] uses XQuery directly as the retrieval language. The choice of XQuery is motivated by the fact that it is well suited to the retrieval of XML-based documents. However, these approaches have the disadvantage that they require advanced knowledge of the details of MPEG-7 in order to express a precise query. Moreover, they do not take into account the different possible interpretations of the content which are related to various users’ requirements.
The second group focuses on the extraction of semantic information from MPEG-7 documents according to the users’ needs. In some cases this information is not directly represented in MPEG-7 but is deduced, for example by using an inference network model [Graves, A. and Lalmas, M.]. A specification of crucial query issues in MPEG-7 is proposed in [Liu, P. and Suhsu L.], which takes into account the implicit information to be extracted from MPEG-7 descriptions, such as spatio-temporal relations deduced from point coordinates. Such semantic aspects are not directly expressible in XQuery, and therefore each of these approaches proposes its own specialised query language.
In the next sections we propose an adaptation of XQuery called SVQL, which has the goal of providing a high-level query language for retrieving audiovisual documents described by MPEG-7 in a TV news production and archiving environment.
Semantic Views Query Language
SVQL (Semantic Views Query Language) is a high-level query language which allows different TV news users (journalists, archivists, producers, etc.) to express their professional requirements in an abstract and precise way. The language was defined as the result of a specific user study carried out in a TV news production and archiving environment [Fatemi, N.]. Before describing the principles of SVQL, we present the Semantic Views Model, the underlying data model on which SVQL has been designed.
In order to provide a high-level query language which takes into account the various requirements and viewpoints of users in the process of audiovisual retrieval, we carried out a detailed study of the TSR (Télévision Suisse Romande) production and archiving environment. We studied the various queries put forward by professional and non-professional TV news users. Analysing these queries showed that users adopt five different Views to express their requirements: PhysicalView, ProductionView, ThematicView, VisualView and AudioView. The following example shows one such query and analyses how the user expresses the query via different Views.
The user is in fact searching for a video segment, and describes this video segment from different viewpoints, as shown in Figure 2.
In the Semantic Views Model each View is described using five elements: BasicViewEntity, ViewDescriptions, IntraViewRelations, InterViewRelations and ViewOperators. BasicViewEntity is the atomic unit of description in each View. ViewDescriptions express different characteristics of the BasicViewEntity in each View. InterViewRelations express the correspondence relation between the BasicViewEntities that belong to different Views. IntraViewRelations express the relations between the BasicViewEntities that belong to the same View. A detailed description of the Semantic Views Model can be found in [Fatemi, N.].
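The elements listed above can be pictured as a small data structure. The following Python sketch is only an illustration under our own assumptions: the class and field names are hypothetical, the assignment of Shot to the VisualView and of NewsItem to the ProductionView is a guess made for the example, and the correspondence relation is stored explicitly rather than derived. The actual model is defined in [Fatemi, N.].

```python
from dataclasses import dataclass, field


@dataclass
class BasicViewEntity:
    """Atomic unit of description in one View (hypothetical representation)."""
    entity_id: str
    view: str            # name of the View this entity belongs to
    kind: str            # entity type, e.g. "Shot" or "NewsItem"
    descriptions: dict = field(default_factory=dict)  # ViewDescriptions


@dataclass
class View:
    name: str
    entities: list = field(default_factory=list)
    # IntraViewRelations: (relation name, source id, target id) within this View
    intra_relations: list = field(default_factory=list)


@dataclass
class SemanticViewsDocument:
    views: dict = field(default_factory=dict)
    # InterViewRelations: unordered pairs of entity ids from different Views
    inter_relations: set = field(default_factory=set)

    def add(self, entity):
        view = self.views.setdefault(entity.view, View(entity.view))
        view.entities.append(entity)

    def relate(self, a, b):
        if a.view == b.view:
            raise ValueError("InterViewRelations link entities of different Views")
        self.inter_relations.add(frozenset((a.entity_id, b.entity_id)))

    def corresponds(self, a, b):
        """ViewOperator: do two entities of different Views correspond?"""
        return frozenset((a.entity_id, b.entity_id)) in self.inter_relations


# Usage: two entities in two Views (View assignments assumed), linked together.
doc = SemanticViewsDocument()
shot = BasicViewEntity("s1", "VisualView", "Shot", {"duration_s": 12})
item = BasicViewEntity("n1", "ProductionView", "NewsItem", {"title": "Election night"})
doc.add(shot)
doc.add(item)
doc.relate(shot, item)
```

Keeping the InterViewRelations at document level rather than inside a View reflects the fact that they cross View boundaries, while IntraViewRelations stay local to one View.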
SVQL is a high-level query language that allows users to express their requirements following the Semantic Views Model in a concise, abstract and precise way. Moreover, SVQL is designed specifically to make possible the retrieval of audiovisual data described by the MPEG-7 standard. SVQL allows users to formulate high-level queries on top of MPEG-7 descriptions without getting involved in the implementation details of MPEG-7. Compared to general-purpose query languages such as SQL, OQL, and XQuery, SVQL has the advantage of being specially designed for the retrieval of audiovisual data based on a semantic model of users’ requirements. Query languages such as SQL, OQL, and XQuery are designed to fit specific structures of data, and are not suited to expressing the abstract level of the Semantic Views Model. Using XQuery and XML Schema is an advantageous method for the “implementation” of the Semantic Views Model and the Semantic Views Query Language on top of MPEG-7 instances. Nevertheless, from a conceptual point of view, XQuery does not allow users to directly express their abstract requirements following the Semantic Views Model.
The language can be used both by end users and by application programmers. For the former, it provides a high-level query language which is abstract and relatively easy to use. For the latter, it is a useful tool which can facilitate and speed up the development of applications based on the Semantic Views Model.
Figure 3 shows an example of a query in SVQL. We refer to the same query example as the one given in Section 3.1, “Semantic Views Model”:
As can be observed in the example, the query is based on a "LET-WHERE-RETURN" structure, very close to the FLWR structure of XQuery ("FOR-LET-WHERE-RETURN"), with the small difference that we do not use the FOR keyword.
This type of query syntax, referred to as "keyword oriented syntax" [Cotton, P. and Robie J.], is used in the most well-known query languages, such as SQL, OQL, and XQuery, and is a familiar mode of query expression for specialized end users and application programmers.
As can be seen in the above query, the LET clause contains two types of expressions:
In the first expression, the SemanticViews() function is called with the name of the file containing the MPEG-7 instances to be retrieved. This function creates the Semantic Views Document corresponding to the MPEG-7 file on the fly.
In the next series of expressions, a set of functions is called to get the required BasicViewEntities of the Semantic Views Document and to assign them to a set of variables. Each function name indicates the type of the BasicViewEntity, e.g. NewsItem, Fact, Shot, VideoSegment, and Speech.
The WHERE clause also contains two different types of expressions:
The first four expressions represent a set of conditions that should hold on the BasicViewEntities of different Views. These conditions are expressed via a set of ViewOperators: here match() and greaterThan() are used.
The last expression represents the condition concerning the InterViewRelation of BasicViewEntities via the corresponds() operator. It determines which of the cited BasicViewEntities correspond to each other. This expression is used if more than one View is used in the query.
Finally, the RETURN clause identifies a variable containing a BasicViewEntity to be returned to the user. In the above query the variable containing the VideoSegment is returned.
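The shape of such a query can be mimicked in ordinary code. The Python sketch below is not SVQL and does not reproduce the query of Figure 3; it only mirrors the LET, WHERE and RETURN steps described above on a hypothetical in-memory document. The sample data, the snake_case function names, and in particular the assumption that corresponds() means "overlapping in time" are ours.

```python
from itertools import product

# Hypothetical Semantic Views Document: entity type -> list of entities.
# Times are in seconds; the time interval is our assumed basis for correspondence.
DOCUMENT = {
    "NewsItem": [
        {"id": "n1", "title": "Election results", "start": 0, "end": 90},
        {"id": "n2", "title": "Weather", "start": 90, "end": 150},
    ],
    "Shot": [
        {"id": "s1", "duration": 25, "start": 10, "end": 35},
        {"id": "s2", "duration": 5, "start": 100, "end": 105},
    ],
    "VideoSegment": [
        {"id": "v1", "start": 0, "end": 60},
        {"id": "v2", "start": 95, "end": 140},
    ],
}


def match(value, keyword):
    """ViewOperator sketch: case-insensitive keyword containment."""
    return keyword.lower() in value.lower()


def greater_than(value, threshold):
    """ViewOperator sketch: numeric comparison."""
    return value > threshold


def overlaps(a, b):
    return a["start"] < b["end"] and b["start"] < a["end"]


def corresponds(*entities):
    """InterViewRelation sketch: every pair of entities overlaps in time (assumed)."""
    return all(
        overlaps(a, b)
        for i, a in enumerate(entities)
        for b in entities[i + 1:]
    )


def run_query(document):
    # LET: bind one variable per BasicViewEntity type.
    news_items = document["NewsItem"]
    shots = document["Shot"]
    segments = document["VideoSegment"]
    # WHERE: conditions on single Views, then the inter-View condition.
    # RETURN: the VideoSegment variable.
    return [
        seg
        for item, shot, seg in product(news_items, shots, segments)
        if match(item["title"], "election")
        and greater_than(shot["duration"], 20)
        and corresponds(item, shot, seg)
    ]


result_ids = [seg["id"] for seg in run_query(DOCUMENT)]
```

With this sample data only the segment `v1` satisfies all three conditions, since it is the only one overlapping both the matching news item and the long shot.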
As shown in Figure 4, SVQL allows formulation of queries following the Semantic Views Model which will be mapped through different layers into the MPEG-7 instances.
The syntax of SVQL [Fatemi, N.] is a specialization of the syntax of XQuery. This design feature of SVQL has two main advantages. Firstly, as mentioned above, this type of syntax offers a familiar mode of query expression for specialized end users and application programmers. Secondly, it allows a straightforward implementation of SVQL based on XQuery. The specialisation of XQuery into SVQL is realized by defining a set of functions and operators which represent the basic functionalities of the Semantic Views Model at an abstract level. SVQL queries are processed by an XQuery processor, which handles the SVQL functions and operators as internal XQuery function calls.
As can be seen in Figure 4, four classes of functions are involved in the processing of an SVQL query:
MPEG-7 API Functions: a set of functions [Schroeter R., Fatemi N., and Abou Khaled O.] that facilitate access to the MPEG-7 instances, and which are called by the three classes of functions cited below. These functions, which are implemented in XQuery and/or JDOM, can be reused in any application requiring access to and manipulation of MPEG-7 descriptions.
Semantic Views Document Creation: the on-the-fly creation of the Semantic Views Document via the SemanticViews() function. A Semantic Views Document is an XML document containing the description of the different Views which describe a given audiovisual document.
BasicViewEntity Location: a set of functions (such as NewsProgram(), Shot(), etc.) which locate BasicViewEntities (of type NewsProgramType, ShotType, etc.) inside the Semantic Views Document whose path is passed as the parameter of the function.
Semantic Views Operators: all ViewOperators, such as match(), corresponds(), etc.
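The layering of these four classes of functions can be sketched outside XQuery as well. The Python example below is a minimal illustration, not the actual implementation: the input is a simplified, namespace-free stand-in for an MPEG-7 description (it is not schema-valid MPEG-7), all element names in the generated Semantic Views Document are hypothetical, and mapping each video segment to a Shot of the VisualView is an assumption made only to keep the example short.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an MPEG-7 description (NOT schema-valid MPEG-7).
MPEG7_SAMPLE = """
<Mpeg7>
  <VideoSegment id="seg1">
    <FreeTextAnnotation>Election results in Geneva</FreeTextAnnotation>
    <DurationSeconds>45</DurationSeconds>
  </VideoSegment>
  <VideoSegment id="seg2">
    <FreeTextAnnotation>Weather forecast</FreeTextAnnotation>
    <DurationSeconds>15</DurationSeconds>
  </VideoSegment>
</Mpeg7>
"""


# Class 1: MPEG-7 API functions (low-level access, reused by the layers above).
def mpeg7_segments(root):
    return root.findall("VideoSegment")


def mpeg7_annotation(segment):
    return segment.findtext("FreeTextAnnotation", default="")


def mpeg7_duration(segment):
    return float(segment.findtext("DurationSeconds", default="0"))


# Class 2: Semantic Views Document creation, done on the fly from the MPEG-7 text.
def semantic_views(mpeg7_xml):
    root = ET.fromstring(mpeg7_xml)
    views = ET.Element("SemanticViews")
    visual = ET.SubElement(views, "VisualView")
    for seg in mpeg7_segments(root):
        ET.SubElement(visual, "Shot", {
            "id": seg.get("id"),
            "text": mpeg7_annotation(seg),
            "duration": str(mpeg7_duration(seg)),
        })
    return views


# Class 3: BasicViewEntity location inside the Semantic Views Document.
def shot(views_doc):
    return views_doc.findall("./VisualView/Shot")


# Class 4: ViewOperators.
def match(entity, keyword):
    return keyword.lower() in entity.get("text", "").lower()


def greater_than(entity, attribute, threshold):
    return float(entity.get(attribute)) > threshold


# Usage: the query code only touches classes 2 to 4, never the MPEG-7 paths.
views_doc = semantic_views(MPEG7_SAMPLE)
hits = [
    s.get("id")
    for s in shot(views_doc)
    if match(s, "election") and greater_than(s, "duration", 30)
]
```

The point of the layering is visible in the last lines: if the MPEG-7 structure changed, only the class 1 functions and the body of `semantic_views` would need to be adapted, while the query itself would stay the same.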
The implementation architecture described above makes it possible to perform SVQL queries directly against MPEG-7 instances. This has three benefits. It allows querying MPEG-7 in different sources (database, stream, etc.). It allows querying partial MPEG-7 descriptions directly during the production phase, without needing to translate and store them according to the Semantic Views Document Model. Finally, it allows an easier extension of the Semantic Views Model, as the mapping between MPEG-7 and the Semantic Views Model is represented via a set of XQuery functions: only the body of these functions has to be modified, while the stored data remain intact. However, this approach also has disadvantages. The on-the-fly translation of MPEG-7 instances into Semantic Views instances can make querying more expensive in terms of time and memory. The trade-off between these measures (flexibility and extensibility on the one hand, memory and time on the other) determines whether it is better to create the Semantic Views Document on the fly or in advance.
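The trade-off can be made concrete with a toy example. In the Python sketch below, `build_views` is a hypothetical stand-in for the translation of MPEG-7 into Semantic Views (it merely splits lines), and memoisation with `functools.lru_cache` stands in for creating the Semantic Views Document in advance; neither is taken from the COALA implementation.

```python
from functools import lru_cache

# Counts how many times the (expensive) translation is performed.
CONVERSIONS = {"count": 0}


def build_views(mpeg7_text):
    """Hypothetical stand-in for the MPEG-7 to Semantic Views translation."""
    CONVERSIONS["count"] += 1
    return tuple(line.strip() for line in mpeg7_text.splitlines() if line.strip())


def query_on_the_fly(mpeg7_text, keyword):
    # Translation is repeated for every query: always up to date, but costly.
    return [v for v in build_views(mpeg7_text) if keyword in v]


@lru_cache(maxsize=None)
def cached_views(mpeg7_text):
    # Translation is done once per distinct input and kept in memory.
    return build_views(mpeg7_text)


def query_precomputed(mpeg7_text, keyword):
    return [v for v in cached_views(mpeg7_text) if keyword in v]


SOURCE = "election report\nweather forecast\nelection debate"

query_on_the_fly(SOURCE, "election")
query_on_the_fly(SOURCE, "weather")
on_the_fly_conversions = CONVERSIONS["count"]

CONVERSIONS["count"] = 0
first = query_precomputed(SOURCE, "election")
second = query_precomputed(SOURCE, "weather")
precomputed_conversions = CONVERSIONS["count"]
```

The on-the-fly variant translates the source once per query, whereas the cached variant translates it once in total at the price of holding the result in memory, and of having to invalidate it when the mapping functions change.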
A TV news querying interface based on SVQL is provided in the COALA (Content Oriented Audiovisual Library Access) project [COALA Home Page]. COALA is an experimental project that provides various tools for both indexing and retrieval of audiovisual programs at the TSR broadcasting company. The indexing tool of COALA, called LogCreator [Fatemi, N.], creates MPEG-7 descriptions of TV news programs. These descriptions are stored in a native XML database, TAMINO [Tamino Database]. The Semantic Views querying tool provides an interface adapted to the expression of SVQL queries (Figure 5). The queries are resolved using Quip, the XQuery processor provided by TAMINO.
Conclusions and future directions
In this paper we presented the SVQL query language, which is an adaptation of W3C XQuery for retrieving audiovisual documents described by MPEG-7. XQuery is designed specifically for retrieving data in XML format, such as MPEG-7. However, using XQuery directly has the disadvantage that users get lost in describing the details of MPEG-7 rather than focusing on the essence of their retrieval requirements. SVQL is a specialised query language designed on the basis of the requirements of professional TV news users. Users of SVQL therefore have the advantage of querying in a high-level language which is adapted to their professional requirements. Moreover, by implementing SVQL using XQuery, we retrieve MPEG-7 descriptions with a query language that is itself specialised for XML.
The current implementation of SVQL takes into account mainly those MPEG-7 descriptions which are used in the domain of TV news retrieval. We are interested in studying the extension of SVQL to cover a wider range of MPEG-7 descriptions used in a wider variety of audiovisual document types. Finally, we believe that the approach proposed in this paper for the implementation of SVQL can be adopted as a generic way of adapting XQuery to define new specialised query languages.
We would like to express our gratitude to the production and archiving experts of TSR TV news (http://www.tsr.ch) for their valuable collaboration during the study of their system.
[MPEG-7 Home Page] http://www.darmstadt.gmd.de/mobile/MPEG7/index.html
[XML Schema, Parts 0, 1, and 2. W3C Recommendation] http://www.w3.org/TR/2001/REC-xmlschema-0-20010502, http://www.w3.org/TR/2001/REC-xmlschema-1-20010502, http://www.w3.org/TR/2001/REC-xmlschema-2-20010502
[Flinker, M., et al.] Query by Image and Video Content: The QBIC system. IEEE Computer, 1995. 28: p. 23-32.
[Chang, S.-F., and al.] A fully automated content based video search engine supporting spatio-temporal queries. IEEE Transactions on Circuits and Systems for Video Technology, 1998. 8: p. 5.
[Ahanger G., Benson D., and Little T.D.C.] Video query formulation. In Storage and Retrieval for Images and Video Databases II. 1995. San Jose.
[Gibbs S. and Breitender C.] Audio/video database: an object-oriented approach. In International Conference on Data Engineering. 1993.
[OOmoto, E. and Tanaka K.] OVID: Design and implementation of a video-object database system. IEEE Transactions on Knowledge and Data Engineering, 1993. 5(4): p. 629-643.
[Li Q., Yang Y., and Chung W.-K.] CAROL: Towards a declarative video data retrieval language. In Proceedings of SPIE. 1998.
[Hwang, E. and Subrahmanian V.S.] Querying video libraries. Journal of Visual Communication and Image Representation, 1996. 7(1).
[Fatemi, N.] A Semantic Views Model for Audiovisual Indexing and Retrieval, PhD thesis, Swiss Federal Institute of Technology of Lausanne, Lausanne, March 2003.
[Kang, J. H and al] An XQuery engine for digital library systems, in Proceedings of 3rd ACM/IEEE-CS Joint Conference on Digital Libraries, Houston, Texas, May 2003.
[Tjondronegoro, D. and Chen, Y.P.P] Content-based indexing and retrieval using MPEG-7 and XQuery in video data management systems, World Wide Web: Internet and Web Information Systems, 5, p. 207-227, 2002.
[Graves, A. and Lalmas, M.] Video retrieval using an MPEG-7 based inference network, in Proceedings of ACM SIGIR’02, Tampere, Finland, August 2002.
[Liu, P. and Suhsu L.] Queries of digital content descriptions in MPEG-7 and MPEG 21 XML documents, XML Europe 2002, Barcelona, Spain, May 2002.
[Cotton, P. and Robie J.] The W3C XML Query Working Group, XML Europe 2001, Berlin, Germany, May 2001.