Introduction to MPEG-7

Multimedia Content Description Interface

Edited by

B. S. Manjunath
University of California, Santa Barbara, USA

Philippe Salembier
Universitat Politecnica de Catalunya, Barcelona, Spain

Thomas Sikora
Heinrich-Hertz-Institute (HHI), Berlin, Germany

JOHN WILEY & SONS, LTD

Copyright © 2002 John Wiley & Sons Ltd,
The Atrium, Southern Gate, Chichester,
West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777

Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wileyeurope.com or www.wiley.co.uk

Reprinted March 2003

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or
transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or
otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms
of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T
4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be
addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate,
Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to
(+44) 1243 770571.

This publication is designed to provide accurate and authoritative information in regard to the subject
matter covered. It is sold on the understanding that the Publisher is not engaged in rendering
professional services. If professional advice or other expert assistance is required, the services of a
competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN 0 471 48678 7

Typeset in 10/12pt Times Roman by Laserwords Private Ltd, Chennai, India
Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire
This book is printed on acid-free paper responsibly manufactured from sustainable forestry
in which at least two trees are planted for each one used for paper production.
Contributors

Adam Lindsay
Computing Department
Lancaster University
Lancaster LA1 4YR
UK

Ajay Divakaran, Ph.D.
Mitsubishi Electric Research Laboratories
Murray Hill Laboratory
571 Central Avenue, Suite 115
Murray Hill, NJ 07974
ajayd@merl.com

Akio Yamada
Computer & Communication Research, NEC Corp.
Miyazaki 4-1-1, Miyamae
Kawasaki 216-8555, Japan
a-yamada@da.jp.nec.com

Ana Belen Benitez
Electrical Engineering Department
Columbia University
1312 Mudd, #F6, 500 W. 120th St, MC 4712
New York, NY 10027, USA
Tel: +1 212 854-7473
Fax: +1 212 932-9421
ana@ee.columbia.edu

Benoit Mory
Laboratoires d'Electronique Philips
22 avenue Descartes, BP 15
94453 Limeil-Brevannes Cedex
France
benoit.mory@philips.com

Chi Sun Won
Department of Electrical Engineering
Dong Guk University
26 Beon-Ji, 3 Ga, Pil-Dong, Joong-Gu
Seoul, South Korea
cswon@dgu.ac.kr

Claude Seyrat
Expway c/o Acland
18 avenue Georges V
75008 Paris, France
cseyrat@acland.fr

Cédric Thiénot
Expway c/o Acland
18 avenue Georges V
75008 Paris, France
cthienot@acland.fr

Dean S. Messing
Information Systems Technologies Dept.
Sharp Laboratories of America
5750 N.W. Pacific Rim Blvd.
Camas, WA 98607, USA
deanm@sharplabs.com

Fernando Manuel Bernardo Pereira, Professor
Instituto Superior Técnico - Instituto de Telecomunicações
Av. Rovisco Pais, 1049-001
Lisboa, Portugal
Fernando.Pereira@lx.it.pt

Francoise Preteux
Institut National des Télécommunications
Unité de Projets ARTEMIS
9, Rue Charles Fourier
91011 Evry Cedex, France
Francoise.Preteux@int-evry.fr

Hawley K. Rising III
Sony MediaSoft Lab, USRL
MD# SJ2C4
3300 Zanker Road
San Jose, CA 95134-1901
hawley.rising@am.sony.com

Heon Jun Kim
MI Group, Information Technology Lab.
LG Electronics Institute of Technology
16 Woomyeon-Dong, Seocho-Gu
Seoul, Korea 137-724
hjk@lge.co.kr

J. P. A. Charlesworth
Reech Capital PLC
1 Undershaft
London EC3P 3DQ, England
jason-charlesworth@reech.com

Jane Hunter
DSTC Pty Ltd
Distributed Systems Technology CRC
Level 7, General Purpose South
The University of Queensland
Queensland 4072, Australia
jane@dstc.edu.au

Jens-Rainer Ohm
Institute of Communication Engineering
Aachen University of Technology
Melatener Str. 23, D-52072
Aachen, Germany
ohm@ient.rwth-aachen.de

John Smith
IBM T. J. Watson Research Center
30 Saw Mill River Road
Hawthorne, NY 10532, USA
jrsmith@watson.ibm.com

Jose M. Martinez
Grupo de Tratamiento de Imágenes
Dpto. Señales, Sistemas y Radiocomunicaciones
E.T.S.Ing. Telecomunicación (C-306)
Universidad Politécnica de Madrid
Ciudad Universitaria s/n
E-28040 Madrid, Spain
jms@gti.ssr.upm.es

Jörg Heuer
Siemens AG
CT IC 2
81730 München, Germany
Tel: +49 89 636 52957
Fax: +49 89 636 52393
Joerg.Heuer@mchp.siemens.de

Kyoungro Yoon
LG Electronics Institute of Technology
16 Woomyeon-dong, Seocho-gu
Seoul 137-724, Korea
Tel: +82-2-526-4133
Fax: +82-2-526-4852
yoonk@lg-elite.com

Leonardo Chiariglione
Telecom Italia Lab
Via G. Reiss Romoli, 274
I-10148 Torino, Italy
leonardo.chiariglione@tilab.com

Leszek Cieplinski
Mitsubishi Electric ITE-VIL
20 Frederick Sanger Road
Guildford, Surrey GU2 7YD
United Kingdom
Leszek.Cieplinski@vil.ite.mee.com

Michael Casey
MERL
201 Broadway, 8th Floor
Cambridge, MA 02139
mkc@merl.com

Michael Wollborn
Robert Bosch GmbH
FV/SLM
PO Box 777777
D-31132 Hildesheim, Germany
Michael.Wollborn@de.bosch.com

Mikio Sasaki
Research Laboratories, DENSO CORPORATION
500-1 Minamiyama, Komenoki-cho, Nisshin-shi
Aichi-ken, 470-0111 Japan
msasaki@rlab.denso.co.jp

Miroslaw Bober
Visual Information Laboratory
Mitsubishi Electric Information Technology Center Europe
20 Frederick Sanger Road
Guildford, Surrey GU2 7YD, UK
miroslaw.bober@vil.ite.mee.com

Mufit Ferman
Sharp Laboratories of America
5750 N.W. Pacific Rim Blvd.
Camas, WA 98607, USA
mferman@sharplabs.com

Neil Day
MPEG-7 Alliance
Dublin, Ireland
dneil@bluemetrix.com

Shun-ichi Sekiguchi
Multimedia Signal Processing Lab
Multimedia Labs, NTT DoCoMo Inc.

Olivier Avaro
France Télécom R&D
38/40 rue General Leclerc
92794 Issy Moulineaux Cedex 9
France
Olivier.avaro@francetelecom.com

Peter van Beek
Sharp Labs of America
5750 N.W. Pacific Rim Blvd.
Camas, WA 98607, USA
Tel: +1 360-817-7622
Fax: +1 360-817-8436
pvanbeek@sharplabs.com

Philip N. Garner
Canon Research Centre Europe Ltd
1 Occam Court, Occam Road
Surrey Research Park
Guildford, Surrey GU2 7YJ
United Kingdom
philg@cre.canon.co.uk

Philippe Salembier
Universitat Politecnica de Catalunya
Campus Nord, Modulo D5
Jordi Girona, 1-3
08034 Barcelona, Spain
Tel: +34 9 3401 7404
Fax: +34 9 3401 6447
philippe@gps.tsc.upc.es

Rob Koenen
InterTrust Technologies Corporation
4750 Patrick Henry Drive
Santa Clara, CA 95054, USA
rkoenen@intertrust.com

Santhana Krishnamachari
Philips Research
345 Briarcliff Manor
New York 10510, USA
Santhana.krishnamachari@philips.com

Schuyler Quackenbush
AT&T Labs, Rm E133
180 Park Avenue, Bldg. 103
Florham Park, NJ 07932, USA

Sylvie Jeannin
Philips Research USA
345 Scarborough Road
Briarcliff Manor, NY 10510, USA

Thomas Sikora
Heinrich-Hertz-Institute for Communication Technology
Einsteinufer 37, D-10587 Berlin
Germany
Sikora@hhi.de

Toby Walker
Media Processing Division
Network and Software Technology Center of America
3300 Zanker Road, MD #SJ2C4
San Jose, California 95134
tobyw@usrl.sony.com

Whoi-Yul Yura Kim
School of Electrical and Computer Engineering
Hanyang University, Korea
wykim@email.hanyang.ac.kr

Yanglim Choi
Digital Media R&D Center
Samsung Electronics Co., Ltd.
416, Maetan 3-Dong, Paldal-Gu
Suwon, Kyungki-Do, S. Korea
yanglimc@samsung.com

YongMan Ro
School of Engineering
Information and Communications University
Yusong-Gu, P.O. Box 77
Taejon, South Korea
yro@icu.ac.kr

Preface

This book provides a comprehensive introduction to the new ISO MPEG-7 standard. The
individual chapters are written by experts who have actively participated in and contributed
to the development of the standard. The chapters are organized in an intuitive way, with
clear explanations of the underlying tools and technologies contributing to the standard. A
large number of illustrations and working demonstrations should make this book a valuable
resource for a wide spectrum of readers, from graduate students and researchers interested
in state-of-the-art media analysis technology to practicing engineers interested in
implementing the standard.

SEARCH AND RETRIEVAL OF MULTIMEDIA DATA

Multimedia search and retrieval has become a very active research field because of the
increasing amount of audiovisual (AV) data that is becoming available, and the growing
difficulty of searching, filtering and managing such data. Furthermore, many new practical
applications, such as large-scale multimedia search engines on the Web, media asset
management systems in corporations, AV broadcast servers, and personal media servers for
consumers, are about to be widely available. This context has led to the development of
efficient processing tools that are able to create the description of AV material or to
support the identification or retrieval of AV documents. Besides the research activity on
processing tools, the need for interoperability between devices has also been recognized
and several standardization activities have been launched. MPEG-7, also called "Multimedia
Content Description Interface", standardizes the description of multimedia content
supporting a wide range of applications. Standardization activities do not focus so much on
processing tools but concentrate on the selection of the features that have to be described,
and on the way to structure and instantiate them with a common language.

As an emerging research area of wide interest, multimedia content description has a
large audience. There are many workshops and conferences related to this topic every year,
and their number is growing. The MPEG-7 technology covers the most recent developments in
multimedia search and retrieval.
This book presents a comprehensive overview of the principles and concepts involved
in a complete chain of AV material indexing, metadata description (based on the MPEG-7
standard), information retrieval and browsing. The book offers a practical step-by-step
walk-through of the components, from systems to schemas to audio-visual descriptors. It
addresses the selection of the multimedia features to be described, the organization and
structuring of the description, the language to instantiate the description, as well as the
major processing tools used for indexing and retrieval of images and video sequences. The
accompanying electronic documentation will include numerous examples and working
demonstrations of many of these components.

Researchers and students interested in multimedia database technology will find this
book a valuable resource covering a broad overview of the current state of the art in
search and retrieval. Practicing engineers in industry will find this book useful in
building MPEG-7 compliant systems, as the only resource outside of the MPEG community
available to the public at the time of publication.

ORGANIZATION

The book is organized into six sections: Introduction, Systems, Multimedia Description
Schemes, Visual Descriptors, Audio Descriptors and Applications.

Section I: Introduction

This section introduces the MPEG-7 standardization activity and the history behind this
new standard. In Chapter 1, Leonardo Chiariglione, the convenor of MPEG, provides the
motivation for the new standard. Chapter 2, by Pereira and Koenen, outlines the various
activities within MPEG-7 that gained momentum towards the end of 1998, culminating in the
final standard in 2002.
Section II: Systems

The systems section covers three major areas: Systems Architecture, Description Definition
Language and the Binary Format for MPEG-7. The chapter on Systems Architecture discusses
the design principles behind MPEG-7 Systems and highlights the most important processing
steps for transport and consumption of MPEG-7 descriptions. The second chapter focuses on
the language used to define the various description elements called Descriptors or
Description Schemes that are presented in Sections III, IV and V. Finally, the Binary
Format for MPEG-7 is described in Chapter 5. This format has been designed so as to
efficiently compress and transport MPEG-7 descriptions.
Section III: Multimedia Description Schemes

Section III describes the organization of features that can be described with MPEG-7. The
organization of this section is based on the functionality provided by the various
Description Schemes. Chapter 6 provides an overview of the entire section. Chapter 7
discusses elementary Description Schemes or Descriptors that are used as building blocks
for more complex Description Schemes. The tools available for description of a single
multimedia document are reviewed in Chapter 8. The most important features related to
content management and description, including low-level as well as high-level features, are
analyzed. Purely audio or visual features are very briefly mentioned in this chapter. A
detailed presentation of the corresponding set of tools is given in Section IV (visual
features) and Section V (audio features). The main functionalities supported by the tools
of Chapter 8 include search, retrieval and filtering. Navigation and browsing are supported
by a specific set of tools described in Chapter 9. Furthermore, the description of
collections of documents or of descriptions is presented in Chapter 10. Finally, for some
applications, it has been recognized that it is necessary to define in a normative way the
user preferences and the usage history pertaining to the consumption of the multimedia
material. This allows, for example, matching between user preferences and MPEG-7 content
descriptions in order to facilitate personalization of the processing. These tools are
described in Chapter 11.
Section IV: Visual Descriptors

This section begins with an overview in Chapter 12. Chapter 13 describes color descriptors
that represent different aspects of color distribution in images and video. These include
descriptors for a color histogram of a single image as well as a collection of images,
color structure, dominant color, and color layout. Chapter 14 presents three texture
descriptors: a homogeneous texture descriptor, a coarse level browsing descriptor and an
edge histogram descriptor. Chapter 15 presents descriptors that represent contour shape,
region shape and 3-D shapes. The section concludes with motion descriptors in Chapter 16.
Section V: Audio Descriptors

An overview of the audio descriptors is provided in Chapter 17. Chapter 18 describes the
spoken content technology in more detail. Sound recognition and sound similarity tools are
outlined in Chapter 19.

Section VI: Applications

Finally, we conclude with a section on the potential applications of MPEG-7. The
applications are broadly classified into search and browsing related applications, and
mobile applications. Chapter 20 covers some interesting search and browsing applications
that include real-time video retrieval, browsing of TV news broadcasts using MPEG-7 tools,
and audio and music retrieval. Chapter 21 discusses two interesting mobile applications.

DVD

The accompanying DVD contains additional material, including technical reports, some
working demonstrations and the official MPEG-7 reference software. The demonstrations on
the DVD include video browsing and shot retrieval, and search and browsing of images using
texture. We hope that researchers and graduate students will find this useful in their
work.
`
ACKNOWLEDGMENTS

We would like to express our gratitude and sincere thanks to all the contributors, without
whose dedication and timely contributions this work would not have been possible. Our
special thanks to Leonardo Chiariglione, the convenor of MPEG, for his encouragement and
support throughout the course of this project. We would like to thank the International
Organization for Standardization (ISO), and in particular Jacques-Olivier Chabot and Keith
Brannon, for allowing us to publish the MPEG-7 Reference software on the accompanying DVD.

We would also like to thank Dr. Lutz Ihlenburg of Heinrich-Hertz-Institut, Germany,
for assisting on editorial issues and for providing many valuable comments and suggestions.
Our thanks to Shawn Newsam and Lei Wang for organizing the material for the DVD. We extend
our thanks to the many reviewers who helped edit individual chapters.

BSM would also like to thank Samsung Electronics for its support in facilitating the
participation in the MPEG-7 activities. Special thanks to Dr. Hyundoo Shin and Dr. Yanglim
Choi for their support during the past three years. Thanks to Shawn Newsam, Xinding Sun,
Gomathi Sankar, Ying Li, Ashish Agarwal, and Lei Wang for reviewing some of the chapters.
He would like to thank all the members of the Vision research laboratory at UCSB for their
help in putting together this manuscript.

B. S. Manjunath
Philippe Salembier
Thomas Sikora

2 Context, Goals and Procedures

Fernando Pereira and Rob Koenen

Instituto Superior Técnico, Lisbon, Portugal; InterTrust Technologies Corp., CA, USA
2.1 MOTIVATION AND OBJECTIVES

Producing multimedia content today is easier than ever before. Using digital cameras,
personal computers and the Internet, virtually every individual in the world is a potential
content producer, capable of creating content that can be easily distributed and published.
The same technologies allow content, which would in the past remain inaccessible, to be
made available on-line.

However, what would seem like a dream can easily turn into an ugly nightmare if no
means are available to manage the explosion in available content. Content, analogue and
digital alike, has value only if it can be discovered and used. Content that cannot be
easily found is like content that does not exist, and potential revenues are directly
dependent on users finding the content. The easier it becomes to produce content, the
faster the amount of content grows and the more complex the problem of managing content
gets. The same digital technology that lowers the thresholds for producing and publishing
content can also help in analyzing and classifying it, in extracting and manipulating
features for specific applications and in searching and discovering content. Be it with or
without automated support, information about content is a prerequisite for being able to
find and manage it.

To date, people looking for content have used text-based browsers with very moderate
retrieval performance; typically, these search engines yield much noise around the hits.
The fact that they are in widespread use nonetheless indicates that a need exists. These
text-based engines rely on human operators to manually describe the multimedia content
with keywords and free annotations. For two reasons this is increasingly unacceptable.
First, it is a costly process, and the cost increases with the growing amount of content.
Second, these descriptions are inherently subjective and their usage is often confined to
the application domain that the descriptions were created for. Hence, it is necessary to
automatically and objectively describe, index and annotate multimedia information, notably
audiovisual data, using tools that automatically extract (possibly complex) audiovisual
features from the content to substitute or complement manual, text-based descriptions.
These automatically extracted audiovisual features will have three advantages over human
annotations: (1) they will be automatically generated, (2) they can be more objective and
domain-independent and (3) they can be native to the audiovisual content. Native
descriptions would use nontextual data to describe content, using features such as color,
shape, texture, melody and sound envelopes, in a way that allows the user to search by
comparing descriptions. Even though automatically extracted descriptions will be very
useful, it is evident that descriptions, the 'bits about the bits', will always include
textual components. There are many features that can only be expressed through text, for
example, authors and titles.
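The idea of native, nontextual descriptions that are searched by comparison can be illustrated with a small sketch. The following Python fragment is purely illustrative and not part of MPEG-7: the three-bin color histograms, the file names and the choice of L1 distance are all invented for the example.

```python
# Illustrative sketch: searching by comparing feature-based descriptions
# instead of text. A "description" here is a tiny 3-bin color histogram
# (red, green, blue proportions); similarity is the L1 distance.
# All file names and histogram values are invented placeholders.

def l1_distance(a, b):
    """L1 (city-block) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_by_similarity(query_desc, database):
    """Return (name, descriptor) pairs sorted from most to least similar."""
    return sorted(database.items(),
                  key=lambda item: l1_distance(query_desc, item[1]))

database = {
    "sunset.jpg":   [0.70, 0.20, 0.10],   # mostly red
    "forest.jpg":   [0.10, 0.75, 0.15],   # mostly green
    "seascape.jpg": [0.15, 0.25, 0.60],   # mostly blue
}
query = [0.65, 0.25, 0.10]  # a reddish query image

ranking = rank_by_similarity(query, database)
print([name for name, _ in ranking])  # sunset.jpg ranks first
```

Real descriptors are of course far richer, but the pattern is the same: extract a feature vector from the content once, then rank database entries by the distance between their descriptors and the query's descriptor.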

The situation depicted above has been recognized for a number of years now, and a
lot of work has been invested in recent years in researching relevant technologies. Several
products addressing this problem have already emerged in the market, such as Virage's
VideoLogger [1]. These products, as well as the large number of papers in journals,
conferences and workshops, were an indication that the time was ripe to address the
multimedia content description problem at a much larger scale.

The aforementioned problem and the technological situation were recognized by MPEG
(the Moving Picture Experts Group) [2] in July 1996, when it decided, at the Tampere MPEG
meeting, to start a standardization project, generally known as MPEG-7, and formally called
Multimedia Content Description Interface (ISO/IEC 15938) [3]. The MPEG-7 project has the
objective to specify a standard way of describing various types of multimedia information:
elementary pieces, complete works and repositories, irrespective of their representation
format and storage medium. The objective is to facilitate the quick and efficient
identification of interesting and relevant information and the efficient management of that
information [4]. These descriptions are both textual (annotations, names of actors etc.)
and nontextual (statistical features, camera parameters etc.). Like the other members of
the MPEG family, MPEG-7 defines a standard representation of multimedia information
satisfying a set of well-defined requirements. But MPEG-7 is quite a different standard
from its predecessors. MPEG-1, MPEG-2 and MPEG-4 all represent the content itself ('the
bits'), while MPEG-7 represents information about the content ('the bits about the bits').
While the former reproduce the content, the latter describes the content. The requirements
for these two purposes are very different [5], although there is also some interesting
overlap in technologies, and sometimes the frontiers are not that sharp.

Even without MPEG-7, there are many ways to describe multimedia content in use today
in various digital asset management systems. Such systems, however, generally do not allow
a search across different repositories and do not facilitate content exchange between
different databases using different description systems. These are interoperability issues,
and creating a standard is an appropriate way to address them. A standard way to do
multimedia content description allows content and its descriptions to be exchanged across
different systems. Also, it sets an environment in which tools from different providers can
work together, creating an infrastructure for transparent management of multimedia content.
The main results of the MPEG-7 standard are this increased interoperability and the
prospect of offering lower-cost products through the creation of a sizable market with new,
standard-based services and a rapidly growing user base [6]. This agreement (a standard is
no more and no less than an agreement between its users) will stimulate both content
providers and users and simplify the entire content-identification process. Of course, the
standard needs to be technically sound, since otherwise proprietary solutions will prevail,
which will hamper interoperability. The challenge in MPEG-7 was matching the needs with the
available technologies, or, in other words, reconciling what is possible with what is
useful.

Participants in the development of MPEG-7 represent broadcasters, equipment and
software manufacturers, digital content creators and managers, telecommunication service
providers, publishers and intellectual property rights managers, as well as university
researchers.

2.2 DRIVING PRINCIPLES

The MPEG-7 standardization project was preceded by an exploration phase, in which some
fundamental principles appeared to be generally shared among all participants [4]. These
driving principles are more than just high-level requirements, as they express the vision
behind MPEG-7. This vision has guided the requirements gathering process [5] and the
subsequent tools development work [4].

The guiding principles, which set the foundations of the MPEG-7 standard, are as
follows [4]:

• Wide application base: MPEG-7 shall be applicable to the content associated with any
  application domain, real-time generated or not; MPEG-7 shall not be tuned to any
  specific type of application. Moreover, the content may be stored, and may be made
  available on-line, off-line or streamed.

• Relation with content: MPEG-7 shall allow the creation of descriptions to be used:

  - stand-alone, for example, just providing a summary of the content;
  - multiplexed with the content itself, for example, when broadcast together with the
    content;
  - linked to one or more versions of the content, for example, in Internet-based media.

• Wide array of data types: MPEG-7 shall consider a large variety of data types (or
  modalities) such as speech, audio, image, video, graphics, 3-D models, synthetic audio
  and so on. Since the MPEG-7 emphasis is on audiovisual information, no new description
  tools should be developed for textual data. Rather, existing solutions shall be
  considered, such as Standard Generalized Markup Language (SGML), Extensible Markup
  Language (XML) or Resource Description Framework (RDF) [5].

• Media independence: MPEG-7 shall be applicable independently of the medium that carries
  the content. Media can include paper, film, tape, CD, a hard disk, a digital broadcast,
  Internet streaming and so on.

• Object-based: MPEG-7 shall allow the object-based description of content. The content
  can be represented, in this case described, as a composition of multimedia objects, and
  it shall be possible to independently access the descriptive data regarding specific
  objects in the content.

• Format independence: MPEG-7 shall be applicable independently of the content
  representation format, whether analogue or digital, compressed or uncompressed.
  Therefore, audiovisual content could be represented in Phase Alternate Line (PAL),
  National Television Standards Committee (NTSC), MPEG-1, MPEG-2 or MPEG-4 and so forth.
  There is, however, a special relation with MPEG-4 [7, 8], since both MPEG-7 and MPEG-4
  are multimedia representation standards that are built using an object-based data
  model. As such, they are both unique and they complement each other very well, allowing
  very powerful applications to be created.

• Abstraction level: MPEG-7 shall include description capabilities with different levels
  of abstraction, from low-level, often statistical, features to high-level features
  conveying semantic meaning. Often the low-level features can be extracted
  automatically, whereas the more semantically meaningful features need to be extracted
  manually or semiautomatically. Also, different levels of description granularity shall
  be possible within each abstraction level. Note that higher-level conclusions often
  find evidence in lower-level features.

• Extensibility: MPEG-7 shall allow the extension of the core set of description tools in
  a standard way. It is recognized that a standard such as MPEG-7 can never contain all
  the structures needed to address every single application domain, and thus it shall be
  possible to extend the standard in a way that guarantees as much interoperability as
  possible.

These principles not only characterize the MPEG-7 vision but they also indicate what sets
MPEG-7 apart from other similar standardization efforts.
`
`2.3 WHAT IS STANDARDIZED?
`
Technology and standards, like so many things in life, may become old and obsolete. Moreover, since the less flexible and dynamic they are, the easier it is for them to become obsolete, it is essential that standards be as flexible and minimally constraining as possible, while still serving their fundamental objective: interoperability. To MPEG, this means that a standard must specify the minimum necessary, but no more than that. This approach allows industrial competition and further evolution of the technology in the so-called ‘nonnormative’ areas, the areas that the standard does not fix. For MPEG-7, this implies that only the description format (its syntax and semantics) and its decoding are standardized. Elements that are explicitly not specified are the techniques for extraction and encoding, and the ‘consumption’ (description usage) phase. Although good analysis and retrieval tools will be as essential for a successful MPEG-7 application as motion estimation and rate control are for MPEG-1 and MPEG-2 applications, and as video segmentation is for some MPEG-4 applications, their standardization is not required for interoperability; indeed, the consumer of the descriptions does not care much how the descriptions were created, provided that they can be understood and used.

The specification of content analysis tools, whether automatic or semiautomatic, is outside the scope of the standard, as are the programs and machines that ‘consume’ MPEG-7 descriptions. Developing these tools is a task for the industries that build and sell MPEG-7-enabled products. This approach ensures that good use can be made of the continuous improvements in the relevant technical areas. New technological developments can be leveraged to build improved automatic analysis tools, matching engines and so on, and the descriptions they produce or consume will remain compliant with the standard. Progress therefore need not stop at the moment the standard is frozen; it is possible to rely on technical competition to obtain ever better results. This is already happening for MPEG-2, where improvements in encoding
techniques have slashed bit rates for digital television by almost half over the past four years.

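The normative/nonnormative split described above can be illustrated with a toy sketch. The code below is not conformant MPEG-7: the element names (`Mpeg7`, `Description`, `DominantColor`) and the two extractors are simplified placeholders invented for illustration, not the actual MPEG-7 schema. The point is that two competing, nonstandardized analysis tools can emit the same standardized description format, which is all an interoperable consumer needs.

```python
import xml.etree.ElementTree as ET

def build_description(dominant_color):
    """Wrap an extracted feature in a simplified MPEG-7-style XML
    description. Only this output format would be normative; how
    dominant_color was computed is left to each product."""
    root = ET.Element("Mpeg7")
    desc = ET.SubElement(root, "Description")
    dc = ET.SubElement(desc, "DominantColor")
    dc.text = " ".join(str(c) for c in dominant_color)
    return ET.tostring(root, encoding="unicode")

# Two competing, nonnormative 'analysis tools' (deliberately naive):
def extractor_mean(pixels):
    # Average each RGB channel over all pixels.
    n = len(pixels)
    return tuple(sum(ch) // n for ch in zip(*pixels))

def extractor_first(pixels):
    # A cruder tool: just take the first pixel.
    return pixels[0]

pixels = [(10, 20, 30), (30, 40, 50)]
# Different extraction techniques, same interoperable description format:
print(build_description(extractor_mean(pixels)))
print(build_description(extractor_first(pixels)))
```

A consumer that understands the description format can use either output without knowing, or caring, which extractor produced it; that is the sense in which only the format needs to be standardized.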
The first edition of the MPEG-7 standard is commonly designated Version 1. The standard will be extended in the future with additional tools that address more requirements and provide more functionality. This will happen in the form of amendments to the standard. It is common to designate as Version N of a part of the standard the set of tools in Version 1 extended with the tools specified in Amendment N-1 for that part; for example, Amendment 1 of a part of the MPEG-4 standard is commonly known as Version 2 of that part of the standard.

`.
`
2.4 MPEG STANDARDS DEVELOPMENT PROCESS

When content representation changed from analogue to digital, the technology development process also changed, if only in terms of the speed of developments and the fact that it was no longer sufficient to simply designate vertical columns of technology for well-defined applications. It is thus essential for standardization bodies such as MPEG to take this environment into account in the standards they create and in the way they set those standards. For a decade now, it has no longer been possible to employ the ‘sys