MPEG-7 and the Semantic Web

W3C Incubator Group Editor's Draft 14 August 2007

This version: http://www.w3.org/2005/Incubator/mmsem/XGR-mpeg7-20070814/
Latest version: http://www.w3.org/2005/Incubator/mmsem/XGR-mpeg7/
Previous version: This is the first public version.
Contributors: Oscar Celma, Universitat Pompeu Fabra; Stamatia Dasiopoulou, Informatics and Telematics Institute; Michael Hausenblas, JOANNEUM RESEARCH; Suzanne Little, Italian National Research Center (CNR Pisa); Chrisa Tsinaraki, Technical University of Crete; Raphaël Troncy, Center for Mathematics and Computer Science (CWI Amsterdam)
Also see Acknowledgments.

Document Roadmap

After reading this document, readers may turn to separate living documents discussing individual multimedia annotation vocabularies, and other relevant tools and resources.

This document targets developers and researchers in multimedia semantics. It describes the four current OWL/RDF proposals of MPEG-7 and compares the different modeling approaches in the context of practical applications. Harmonizing these proposals into a single MPEG-7 ontology is outside the scope of this XG.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of Final Incubator Group Reports is available. See also the W3C technical reports index at http://www.w3.org/TR/.

This document was developed by the W3C Multimedia Semantics Incubator Group.

Publication of this document by W3C as part of the W3C Incubator Activity indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. Participation in Incubator Groups and publication of Incubator Group Reports at the W3C site are benefits of W3C Membership.

Incubator Groups have as a goal to produce work that can be implemented on a Royalty Free basis, as defined in the W3C Patent Policy. Participants in this Incubator Group have made no statements about whether they will offer licenses according to the licensing requirements of the W3C Patent Policy for portions of this Incubator Group Report that are subsequently incorporated in a W3C Recommendation.

Discussion of this document is invited on the XG public mailing list public-xg-mmsem@w3.org (public archives). Public comments should include "comments: [MPEG-7]" at the start of the Subject header.

1. Introduction
2. Existing MPEG-7 ontologies
- 2.1. Using the MPEG-7/ABC Ontology
- 2.2. Using the MPEG-7/Tsinaraki Ontology
3. Solving the interoperability problems
4. Conclusions
References
Acknowledgments

1. Introduction

MPEG-7 was developed to provide standardized tools for describing different aspects of multimedia at different levels of abstraction. Its XML-based syntax enables smooth interchange across applications and over the web, but the lack of precise semantics hinders metadata interoperability. Two representative examples include:

Semantically identical metadata can be represented in multiple ways. For example, an image depicting a player scoring a goal can be annotated using the free text tag ("Zinedine Zidane scoring against England"), the keyword tag (Zidane, goal, France, England), the label tag, etc.

<FreeTextAnnotation xml:lang="en">Zinedine Zidane scoring against England.</FreeTextAnnotation>

Using the free text annotation

<KeywordAnnotation xml:lang="en"> 
  <Keyword>Zinedine</Keyword>
  <Keyword>Zidane</Keyword>
  <Keyword>scoring</Keyword>
  <Keyword>England</Keyword>
  <Keyword>goal</Keyword>
</KeywordAnnotation>

Using the keyword annotation

<StructuredAnnotation> 
  <Who>
    <Name xml:lang="en">Zinedine Zidane</Name>
  </Who>          
  <WhatAction>
    <Name xml:lang="en">Zinedine Zidane scoring against England.</Name>
  </WhatAction>
</StructuredAnnotation>

Using a structured annotation with labels

<Semantic id="FormalAbstractionDescription"> 
  <SemanticBase xsi:type="AgentObjectType" id="Zidane">
    <Label><Name>Zidane</Name></Label>
    <Agent xsi:type="PersonType">
      <Name>
        <GivenName>Zinedine</GivenName>
        <FamilyName>Zidane</FamilyName>
      </Name>
    </Agent>
  </SemanticBase>
  <SemanticBase xsi:type="EventType" id="scoring">
    <Label> 
      <Name>Zinedine Zidane scoring against England.</Name>
    </Label>
  </SemanticBase>
</Semantic>

Using the MPEG-7 built-in (non-formal) semantic descriptor
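
The practical consequence of these alternative encodings can be sketched in a few lines of Python. The snippet is an illustration only: it embeds simplified, namespace-free copies of the first three descriptions above under a hypothetical Annotations wrapper element (not part of MPEG-7) and shows that a query written against one structure misses the others.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free copies of the annotations shown above.
# The <Annotations> wrapper is a hypothetical container used only in this sketch.
DOC = """
<Annotations>
  <FreeTextAnnotation>Zinedine Zidane scoring against England.</FreeTextAnnotation>
  <KeywordAnnotation>
    <Keyword>Zinedine</Keyword>
    <Keyword>Zidane</Keyword>
    <Keyword>scoring</Keyword>
    <Keyword>England</Keyword>
    <Keyword>goal</Keyword>
  </KeywordAnnotation>
  <StructuredAnnotation>
    <Who><Name>Zinedine Zidane</Name></Who>
    <WhatAction><Name>Zinedine Zidane scoring against England.</Name></WhatAction>
  </StructuredAnnotation>
</Annotations>
"""

root = ET.fromstring(DOC)


def matches_by_path(path, term):
    """Tags of the annotations in which `path` selects text containing `term`."""
    hits = []
    for annotation in root:
        if any(term in (node.text or "") for node in annotation.findall(path)):
            hits.append(annotation.tag)
    return hits


def matches_anywhere(term):
    """Structure-agnostic fallback: scan all text inside each annotation."""
    return [a.tag for a in root if term in "".join(a.itertext())]


# A query written for the structured form finds only that form.
structured_hits = matches_by_path("Who/Name", "Zidane")
# A query written for the keyword form finds only that form.
keyword_hits = matches_by_path("Keyword", "Zidane")
# Only a plain text scan, which ignores the structure entirely, finds all three.
all_hits = matches_anywhere("Zidane")

print(structured_hits)
print(keyword_hits)
print(all_hits)
```

The text scan recovers all three descriptions only because it discards the structure, which is exactly the information a structured annotation is meant to preserve.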

The intended semantics underlying the structure of MPEG-7 descriptions, for example the decomposition relation between an image and its constituent segments, are not formally defined and therefore cannot be exploited automatically. An image annotated as depicting Zidane and an image of which only a segment is annotated as depicting Zidane will not both be retrieved by a corresponding 'semantic' query unless customized query expansion is performed to cover both cases.
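
The query expansion mentioned here can be sketched as follows. This is a minimal illustration under stated assumptions: the identifiers (image01, image02, region02a) and the two dictionaries are hypothetical stand-ins for MPEG-7 segment decomposition, not the MPEG-7 schema itself.

```python
# Hypothetical data: which concepts each item is annotated with,
# and which segments each image decomposes into.
depicts = {
    "image01": {"Zidane"},    # annotation attached to the whole image
    "region02a": {"Zidane"},  # annotation attached to a segment of image02
}
segments_of = {
    "image01": [],
    "image02": ["region02a"],
}


def direct_query(concept):
    """Images whose own annotation mentions the concept."""
    return sorted(img for img in segments_of if concept in depicts.get(img, set()))


def expanded_query(concept):
    """Images depicting the concept directly or through any nested segment."""

    def depicted(item):
        if concept in depicts.get(item, set()):
            return True
        return any(depicted(seg) for seg in segments_of.get(item, []))

    return sorted(img for img in segments_of if depicted(img))


print(direct_query("Zidane"))    # ['image01']
print(expanded_query("Zidane"))  # ['image01', 'image02']
```

The expansion has to be written by hand for every application because nothing in the description itself states that a segment's annotation also applies to the enclosing image.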

2. Existing MPEG-7 ontologies

To alleviate the resulting interoperability issues, efforts have been undertaken to translate MPEG-7 into an ontology and, through appropriate frameworks, to enable its integration with other ontologies, thus enhancing interoperability. The two main methodologies are the proposals by Hunter et al. and Tsinaraki et al. Both approaches aim to provide a framework for interoperable, MPEG-7 compliant multimedia metadata. However, given the continuously growing research interest in formalizing multimedia-related semantics and building a common metadata framework, the question of how interoperable these proposals are with each other becomes particularly important.

2.1. Using the MPEG-7/ABC Ontology

In the approach proposed by Hunter, the ABC ontology is used as the core ontology that provides attachment points for integrating the MPEG-7 ontology and domain-specific ontologies. More specifically, the mpeg7:MultimediaContent class (and the multimedia and segment hierarchy below it) is defined as a subclass of the abc:Manifestation class, while the domain ontologies are assumed to be attached to the corresponding ABC classes.

A first observation is that, apart from the structure-related description schemes, MPEG-7 also includes descriptions of other aspects (e.g., the semantic part), for which it is unclear how they should be mapped to ABC and how they relate to possibly relevant domain-specific definitions. For example, mpeg7:Agent could be mapped to abc:Agent. A domain-specific class o:Person would then also have to be linked to abc:Agent, as an equivalent class, as a subclass, or through some property. This leaves the semantics of the relation between mpeg7:Agent and o:Person open, which in turn reduces interoperability between pre-existing MPEG-7 based annotation metadata and metadata newly created under the ABC core ontology framework.
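
The effect of this open choice can be illustrated with a small Python sketch. It is not taken from either proposal: the individuals :agent01 and :person01 and the two candidate schemas are assumptions made for illustration, and the reachability routine is a minimal stand-in for a reasoner that treats an equivalence as a subclass relation in both directions.

```python
TYPE, SUB, EQ = "rdf:type", "rdfs:subClassOf", "owl:equivalentClass"

# A pre-existing MPEG-7 based annotation and a newer domain-ontology annotation.
# Both individuals are hypothetical.
DATA = {
    (":agent01", TYPE, "mpeg7:Agent"),
    (":person01", TYPE, "o:Person"),
}

# Candidate A: mpeg7:Agent is equivalent to abc:Agent, o:Person is a subclass of it.
SCHEMA_A = {
    ("mpeg7:Agent", EQ, "abc:Agent"),
    ("o:Person", SUB, "abc:Agent"),
}
# Candidate B: both classes are merely subclasses of abc:Agent (siblings).
SCHEMA_B = {
    ("mpeg7:Agent", SUB, "abc:Agent"),
    ("o:Person", SUB, "abc:Agent"),
}


def instances_of(cls, schema):
    """Individuals whose asserted type reaches `cls` through the schema."""
    edges = set()
    for s, p, o in schema:
        if p in (SUB, EQ):
            edges.add((s, o))
        if p == EQ:
            edges.add((o, s))  # equivalence holds in both directions

    def reachable(start):
        seen, todo = {start}, [start]
        while todo:
            current = todo.pop()
            for a, b in edges:
                if a == current and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return seen

    return sorted(s for s, p, o in DATA if p == TYPE and cls in reachable(o))


print(instances_of("mpeg7:Agent", SCHEMA_A))  # [':agent01', ':person01']
print(instances_of("mpeg7:Agent", SCHEMA_B))  # [':agent01']
```

The same query over the same annotations returns different answers depending on how the classes were attached to the core ontology, which is the interoperability risk described above.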

Let us assume that someone follows the approach by Hunter, using the Multimedia Description Scheme (MDS) part of the MPEG-7 ontology to address the structural aspects, in order to annotate an image depicting Zidane scoring. Assuming a soccer ontology s, the classes involved would be s:Goal, s:Player, s:Scoring and mpeg7:Image (at least in the simple case where spatiotemporal decomposition is not taken into account). One possible way to represent this annotation would be the following statements:

:image01 rdf:type mpeg7:Image 
:goal01 rdf:type s:Goal
:scoring01 rdf:type s:Scoring

:image01 mpeg7:depicts :goal01
:goal01 abc:hasAction :scoring01
:scoring01 abc:hasAgent _:b1
_:b1 :hasName 'Zinedine Zidane'

where additionally the following hold:

mpeg7:Image rdfs:subClassOf mpeg7:MultimediaContent
mpeg7:MultimediaContent rdfs:subClassOf abc:Manifestation
s:Scoring rdfs:subClassOf abc:Action
s:Goal rdfs:subClassOf abc:Event

Notice that under this framework, had this annotation been attached to a specific image region rather than to the whole image, i.e.

:region01 rdf:type mpeg7:StillRegion 
:region01 mpeg7:depicts :goal01

we would still be able to retrieve the corresponding image when querying for images depicting Zinedine Zidane scoring, due to the subclass relation
mpeg7:StillRegion rdfs:subClassOf mpeg7:Image,
something that is not inherently possible with MPEG-7 itself.
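
The inference involved can be sketched in Python. The triples are taken from the statements above; the closure routine is only a minimal stand-in for an RDFS reasoner, and the function names are assumptions made for this illustration.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

# Triples taken from the statements above.
triples = {
    (":image01", TYPE, "mpeg7:Image"),
    (":image01", "mpeg7:depicts", ":goal01"),
    (":region01", TYPE, "mpeg7:StillRegion"),
    (":region01", "mpeg7:depicts", ":goal01"),
    ("mpeg7:StillRegion", SUBCLASS, "mpeg7:Image"),
    ("mpeg7:Image", SUBCLASS, "mpeg7:MultimediaContent"),
    ("mpeg7:MultimediaContent", SUBCLASS, "abc:Manifestation"),
}


def superclasses(cls):
    """All classes reachable from `cls` via rdfs:subClassOf, including itself."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen


def instances_of(cls, infer=True):
    """Individuals typed as `cls`, optionally including inferred membership."""
    result = set()
    for s, p, o in triples:
        if p == TYPE and (o == cls or (infer and cls in superclasses(o))):
            result.add(s)
    return result


def depicting(cls, target, infer=True):
    """Instances of `cls` that depict `target`."""
    return sorted(
        x for x in instances_of(cls, infer) if (x, "mpeg7:depicts", target) in triples
    )


print(depicting("mpeg7:Image", ":goal01", infer=False))  # [':image01']
print(depicting("mpeg7:Image", ":goal01", infer=True))   # [':image01', ':region01']
```

With inference enabled, the region is itself classified as an mpeg7:Image and therefore matches the query. Whether that classification is the right modeling choice is a separate question.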

Leaving aside individual modeling decisions (e.g., whether still regions should be modeled as a subclass of image or related to it only through partonomic decomposition relations), one sees evidence for the value of using an upper ontology that is generic enough to allow the consistent integration of an MPEG-7 ontology with domain-specific ones.

2.2. Using the MPEG-7/Tsinaraki Ontology

In the approach by Tsinaraki, on the other hand, the semantic part of MPEG-7 is translated into an ontology that serves as the core ontology to which domain-specific ontologies are attached, in order to achieve MPEG-7 compliant domain-specific annotations. A first observation is that under this approach the initial conceptualization of the domain-specific ontologies needs to be "mapped" to the MPEG-7 modeling rationale. Consequently, annotation metadata produced following this approach would not be interoperable with metadata from approaches that couple domain-specific ontologies with an MPEG-7-like ontology through a procedure similar to the one proposed by Hunter.

3. Solving the interoperability problems

In this section we present possible solutions to the interoperability problems that arise from the different translations/formalisations of the MPEG-7 standard. The specific interoperability problems have been illustrated in the motivating example. Three approaches in the literature try to overcome such interoperability problems. These approaches are:

TODO: Michael to describe syntactic (XML, XML-Schema) and semantic (RDF/OWL/rules) aspects.

Create syntactic mappings between terms of two or more standards (e.g. CIDOC-CRM vs. Dublin Core). The proposed solution exploits the expressive power and reasoning support of OWL and SWRL (or another rule language on top of ontologies) in order to create syntactic as well as semantic mappings.
Align the domain ontologies in a multimedia core ontology (or framework) that ensures interoperability. This approach covers the work that is in progress in the K-Space project.
Use MPEG-7 profiles. This approach will be mainly covered by Michael.

The aim of this section is not to present the solutions in detail but rather the mechanisms that ensure interoperability in MPEG-7 based multimedia applications. In addition, we will present the interoperability problems that each approach solves and the new ones that it introduces.

References

[Dublin Core]: The Dublin Core Metadata Initiative, Dublin Core Metadata Element Set, Version 1.1: Reference Description
[Hunter, 2001]: J. Hunter. Adding Multimedia to the Semantic Web — Building an MPEG-7 Ontology. In International Semantic Web Working Symposium (SWWS 2001), Stanford University, California, USA, July 30 - August 1, 2001
[MPEG-7]: Information Technology - Multimedia Content Description Interface (MPEG-7). Standard No. ISO/IEC 15938:2001, International Organization for Standardization (ISO), 2001
[Ossenbruggen, 2004]: J. van Ossenbruggen, F. Nack, and L. Hardman. That Obscure Object of Desire: Multimedia Metadata on the Web (Part I). In: IEEE Multimedia 11(4), pp. 38-48 October-December 2004
[Ossenbruggen, 2005]: F. Nack, J. van Ossenbruggen, and L. Hardman. That Obscure Object of Desire: Multimedia Metadata on the Web (Part II). In: IEEE Multimedia 12(1), pp. 54-63 January-March 2005
[OWL Guide]: OWL Web Ontology Language Guide, Michael K. Smith, Chris Welty, and Deborah L. McGuinness, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-guide-20040210/. Latest version available at http://www.w3.org/TR/owl-guide/
[OWL Semantics and Abstract Syntax]: OWL Web Ontology Language Semantics and Abstract Syntax, Peter F. Patel-Schneider, Patrick Hayes, and Ian Horrocks, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-semantics-20040210/. Latest version available at http://www.w3.org/TR/owl-semantics/
[RDF Primer]: RDF Primer, F. Manola, E. Miller, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-primer-20040210/. Latest version available at http://www.w3.org/TR/rdf-primer/
[RDF Syntax]: RDF/XML Syntax Specification (Revised), Dave Beckett, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/. Latest version available at http://www.w3.org/TR/rdf-syntax-grammar/
[Stamou, 2005]: G. Stamou and S. Kollias (eds). Multimedia Content and the Semantic Web: Methods, Standards and Tools. John Wiley & Sons Ltd, 2005
[Troncy, 2003]: R. Troncy. Integrating Structure and Semantics into Audio-visual Documents. In Second International Semantic Web Conference (ISWC 2003), pages 566 – 581, Sanibel Island, Florida, USA, October 20-23, 2003. Springer-Verlag Heidelberg
[Tsinaraki]: Tsinaraki, C.: OWL soccer ontology available at http://elikonas.ced.tuc.gr/ontologies/soccer.zip
[VDO]: aceMedia Visual Descriptor Ontology, available from http://www.acemedia.org/aceMedia/reference/resource/index.html
[XML NS]: Namespaces in XML, Bray T., Hollander D., Layman A. (Editors), World Wide Web Consortium, 14 January 1999. This version is http://www.w3.org/TR/1999/REC-xml-names-19990114/. The latest version is http://www.w3.org/TR/REC-xml-names/

Acknowledgments
