`ORGANISATION INTERNATIONALE DE NORMALISATION
`ISO/IEC JTC1/SC29/WG11
`CODING OF MOVING PICTURES AND AUDIO
`
`ISO/IEC JTC1/SC29/WG11 N4668
`
`March 2002
`
Source: WG11 (MPEG)
Status: Final
Title: MPEG-4 Overview - (V.21 – Jeju Version)
Editor: Rob Koenen (rob.koenen@m4if.org)
`
All comments, corrections, suggestions and additions to this document are welcome, and should be sent
`to both the editor and the chairman of MPEG’s Requirements Group: Fernando Pereira, fp@lx.it.pt
`
`MPEG-4 Overview
`
`© MPEG 1999-2002 – unlimited reproduction permitted if not modified
`
`Overview of the MPEG-4 Standard
`
`Executive Overview
`
`MPEG-4 is an ISO/IEC standard developed by MPEG (Moving Picture Experts Group), the
`committee that also developed the Emmy Award winning standards known as MPEG-1 and
`MPEG-2. These standards made interactive video on CD-ROM, DVD and Digital Television
`possible. MPEG-4 is the result of another international effort involving hundreds of researchers
and engineers from all over the world. MPEG-4, whose formal ISO/IEC designation is
’ISO/IEC 14496’, was finalized in October 1998 and became an International Standard in the
first months of 1999. The fully backward compatible extensions under the title of MPEG-4
Version 2 were frozen at the end of 1999 and acquired formal International Standard status
early in 2000. Several extensions have been added since, and work on some specific work items
is still in progress.
`
`MPEG-4 builds on the proven success of three fields:
• Digital television;
• Interactive graphics applications (synthetic content);
• Interactive multimedia (World Wide Web, distribution of and access to content).
`MPEG-4 provides the standardized technological elements enabling the integration of the
`production, distribution and content access paradigms of the three fields.
`
`More information about MPEG-4 can be found at MPEG’s home page (case sensitive):
http://mpeg.telecomitalialab.com. This web page contains links to a wealth of information
about MPEG, including much about MPEG-4, many publicly available documents, several lists
of ‘Frequently Asked Questions’ and links to other MPEG-4 web pages. The standard can be
bought from ISO; send mail to sales@iso.ch. Notably, the complete software for MPEG-4
Version 1 can be bought on a CD-ROM for 56 Swiss Francs. It can also be downloaded for free
from ISO’s website: www.iso.ch/ittf - look under publicly available standards and then for
“14496-5”. This software is free of copyright restrictions when used for implementing
MPEG-4 compliant technology. (This does not mean that the software is free of patents.) Much
information is also available from the MPEG-4 Industry Forum, M4IF, http://www.m4if.org. See
`section 7, The MPEG-4 Industry Forum.
`
`This document gives an overview of the MPEG-4 standard, explaining which pieces of
`technology it includes and what sort of applications are supported by this technology.
`
`Table of Contents
`
1 Scope and features of the MPEG-4 standard
1.1 Coded representation of media objects
1.2 Composition of media objects
1.3 Description and synchronization of streaming data for media objects
1.4 Delivery of streaming data
1.5 Interaction with media objects
1.6 Management and Identification of Intellectual Property
2 Versions in MPEG-4
3 Major Functionalities in MPEG-4
3.1 Transport
3.2 DMIF
3.3 Systems
3.4 Audio
3.5 Visual
4 Extensions Underway
5 Profiles in MPEG-4
5.1 Visual Profiles
5.2 Audio Profiles
5.3 Graphics Profiles
5.4 Scene Graph Profiles
5.5 MPEG-J Profiles
5.6 Object Descriptor Profile
6 Verification Testing: checking MPEG’s performance
6.1 Video
6.2 Audio
7 The MPEG-4 Industry Forum
8 Licensing of patents necessary to implement MPEG-4
8.1 Roles in Licensing MPEG-4
8.2 Licensing Situation
9 Deployment of MPEG-4
10 Detailed technical description of MPEG-4 DMIF and Systems
10.1 Transport of MPEG-4
10.2 DMIF
10.3 Demultiplexing, synchronization and description of streaming data
10.4 Advanced Synchronization (FlexTime) Model
10.5 Syntax Description
10.6 Binary Format for Scene description: BIFS
10.7 User interaction
10.8 Content-related IPR identification and protection
10.9 MPEG-4 File Format
10.10 MPEG-J
10.11 Object Content Information
11 Detailed technical description of MPEG-4 Visual
11.1 Natural Textures, Images and Video
11.2 Structure of the tools for representing natural video
11.3 The MPEG-4 Video Image Coding Scheme
11.4 Coding of Textures and Still Images
11.5 Synthetic Objects
12 Detailed technical description of MPEG-4 Audio
12.1 Natural Sound
12.2 Synthesized Sound
13 Detailed Description of the Animation Framework eXtension (AFX)
14 Annexes
A The MPEG-4 development process
B Organization of work in MPEG
C Glossary and Acronyms
`
1 Scope and features of the MPEG-4 standard
`The MPEG-4 standard provides a set of technologies to satisfy the needs of authors, service
`providers and end users alike.
• For authors, MPEG-4 enables the production of content that has far greater reusability and
flexibility than is possible today with individual technologies such as digital
television, animated graphics, World Wide Web (WWW) pages and their extensions. It also
makes it possible to better manage and protect content owner rights.
`• For network service providers MPEG-4 offers transparent information, which can be
`interpreted and translated into the appropriate native signaling messages of each network
`with the help of relevant standards bodies. The foregoing, however, excludes Quality of
`Service considerations, for which MPEG-4 provides a generic QoS descriptor for different
`MPEG-4 media. The exact translations from the QoS parameters set for each media to the
`network QoS are beyond the scope of MPEG-4 and are left to network providers. Signaling
`of the MPEG-4 media QoS descriptors end-to-end enables transport optimization in
`heterogeneous networks.
`• For end users, MPEG-4 brings higher levels of interaction with content, within the limits
`set by the author. It also brings multimedia to new networks, including those employing
`relatively low bitrate, and mobile ones. An MPEG-4 applications document exists on the
`MPEG Home page (www.cselt.it/mpeg), which describes many end user applications,
`including interactive multimedia broadcast and mobile communications.
`For all parties involved, MPEG seeks to avoid a multitude of proprietary, non-interworking
`formats and players.
`
`MPEG-4 achieves these goals by providing standardized ways to:
`1. represent units of aural, visual or audiovisual content, called “media objects”. These media
`objects can be of natural or synthetic origin; this means they could be recorded with a
`camera or microphone, or generated with a computer;
`2. describe the composition of these objects to create compound media objects that form
`audiovisual scenes;
`3. multiplex and synchronize the data associated with media objects, so that they can be
`transported over network channels providing a QoS appropriate for the nature of the
`specific media objects; and
4. interact with the audiovisual scene generated at the receiver’s end.
`
`The following sections illustrate the MPEG-4 functionalities described above, using the
`audiovisual scene depicted in Figure 1.
`
1.1 Coded representation of media objects
`MPEG-4 audiovisual scenes are composed of several media objects, organized in a hierarchical
`fashion. At the leaves of the hierarchy, we find primitive media objects, such as:
`• Still images (e.g. as a fixed background);
• Video objects (e.g. a talking person, without the background);
• Audio objects (e.g. the voice associated with that person, background music).
`
`MPEG-4 standardizes a number of such primitive media objects, capable of representing both
`natural and synthetic content types, which can be either 2- or 3-dimensional. In addition to the
`media objects mentioned above and shown in Figure 1, MPEG-4 defines the coded
`representation of objects such as:
`• Text and graphics;
`• Talking synthetic heads and associated text used to synthesize the speech and animate the
`head; animated bodies to go with the faces;
`• Synthetic sound.
`
A media object in its coded form consists of descriptive elements that allow handling the object
`in an audiovisual scene as well as of associated streaming data, if needed. It is important to
`note that in its coded form, each media object can be represented independent of its
`surroundings or background.
`The coded representation of media objects is as efficient as possible while taking into account
`the desired functionalities. Examples of such functionalities are error robustness, easy
extraction and editing of an object, or having an object available in a scalable form.
`
1.2 Composition of media objects
`Figure 1 explains the way in which an audiovisual scene in MPEG-4 is described as composed
`of individual objects. The figure contains compound media objects that group primitive media
`objects together. Primitive media objects correspond to leaves in the descriptive tree while
`compound media objects encompass entire sub-trees. As an example: the visual object
`corresponding to the talking person and the corresponding voice are tied together to form a new
`compound media object, containing both the aural and visual components of that talking
`person.
`
`Such grouping allows authors to construct complex scenes, and enables consumers to
`manipulate meaningful (sets of) objects.
`
More generally, MPEG-4 provides a standardized way to describe a scene, which makes it
possible, for example, to:
`• Place media objects anywhere in a given coordinate system;
`• Apply transforms to change the geometrical or acoustical appearance of a media object;
`• Group primitive media objects in order to form compound media objects;
`• Apply streamed data to media objects, in order to modify their attributes (e.g. a sound, a
`moving texture belonging to an object; animation parameters driving a synthetic face);
`• Change, interactively, the user’s viewing and listening points anywhere in the scene.
`
The scene description builds on several concepts from the Virtual Reality Modeling Language
(VRML), in terms of both its structure and the functionality of object composition nodes, and
extends them to fully enable the aforementioned features.
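The tree structure described in this section can be sketched in a few lines of Python. This is only an illustration of the idea, not BIFS syntax: the class name `MediaObject`, the `position` field and the object names are assumptions made for the example. Primitive media objects are the leaves, compound media objects group them, and moving a compound object moves everything beneath it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class MediaObject:
    """A node in the scene tree; a node without children is a primitive media object."""
    name: str
    position: Vec3 = (0.0, 0.0, 0.0)  # offset relative to the parent (hypothetical field)
    children: List["MediaObject"] = field(default_factory=list)

    def is_primitive(self) -> bool:
        return not self.children

    def leaves(self, origin: Vec3 = (0.0, 0.0, 0.0)) -> Iterator[Tuple[str, Vec3]]:
        """Yield (name, absolute position) for every primitive object below this node."""
        here = tuple(o + p for o, p in zip(origin, self.position))
        if self.is_primitive():
            yield self.name, here
        else:
            for child in self.children:
                yield from child.leaves(here)


# The talking person is a compound object made of a video object and its voice.
person = MediaObject("person", position=(1.0, 0.0, 2.0), children=[
    MediaObject("video: talking person"),
    MediaObject("audio: voice"),
])
scene = MediaObject("scene", children=[
    MediaObject("still image: 2D background"),
    person,
])

for name, pos in scene.leaves():
    print(name, pos)
```

Because the voice and the video object share one parent, changing `person.position` relocates both at once, which is the kind of manipulation of meaningful sets of objects that grouping is meant to enable.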
`
[Figure 1. Labels in the figure: audiovisual objects (voice, sprite, 2D background, 3D objects); audiovisual presentation; scene coordinate system (x, y, z); user events; multiplexed downstream control/data; multiplexed upstream control/data; audio compositor; video compositor; projection plane; hypothetical viewer; display; speaker; user input.]

Figure 1 - an example of an MPEG-4 Scene
`
`
`
1.3 Description and synchronization of streaming data for media objects
Media objects may need streaming data, which is conveyed in one or more elementary streams.
An object descriptor identifies all streams associated with one media object. This allows handling
hierarchically encoded data as well as the association of meta-information about the content
(called ‘object content information’) and the intellectual property rights associated with it.
Each stream itself is characterized by a set of descriptors for configuration information, e.g., to
determine the required decoder resources and the precision of encoded timing information.
Furthermore, the descriptors may carry hints about the Quality of Service (QoS) the stream requests for
transmission (e.g., maximum bit rate, bit error rate, priority, etc.).
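As a rough illustration of this relationship, the following Python sketch models an object descriptor that groups the elementary streams of one media object, each with its own configuration and optional QoS hints. All class names, field names and numeric values are assumptions chosen for the example; they are not the descriptor syntax defined by MPEG-4 Systems.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QoSHints:
    """Optional transmission hints, mirroring the examples named in the text."""
    max_bit_rate: Optional[int] = None       # bit/s
    bit_error_rate: Optional[float] = None
    priority: Optional[int] = None


@dataclass
class ElementaryStreamDescriptor:
    es_id: int                   # hypothetical stream identifier
    stream_type: str             # e.g. "visual" or "audio"
    timestamp_resolution: int    # ticks per second used for encoded timing
    qos: QoSHints = field(default_factory=QoSHints)


@dataclass
class ObjectDescriptor:
    od_id: int
    streams: List[ElementaryStreamDescriptor] = field(default_factory=list)

    def total_max_bit_rate(self) -> int:
        """Sum of the declared maximum bit rates; streams without a hint count as 0."""
        return sum(s.qos.max_bit_rate or 0 for s in self.streams)


# A hierarchically encoded video object: base layer plus one enhancement layer.
video_object = ObjectDescriptor(od_id=1, streams=[
    ElementaryStreamDescriptor(10, "visual", 90_000, QoSHints(max_bit_rate=64_000, priority=1)),
    ElementaryStreamDescriptor(11, "visual", 90_000, QoSHints(max_bit_rate=128_000, priority=2)),
])

print(video_object.total_max_bit_rate())
```

Keeping the hints per stream rather than per object reflects the point that each stream is characterized by its own set of descriptors.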
`
`Synchronization of elementary streams is achieved through time stamping of individual access
`units within elementary streams. The synchronization layer manages the identification of such
`access units and the time stamping. Independent of the media type, this layer allows
`identification of the type of access unit (e.g., video or audio frames, scene description
`commands) in elementary streams, recovery of the media object’s or scene description’s time
`base, and it enables synchronization among them. The syntax of this layer is configurable in a
`large number of ways, allowing use in a broad spectrum of systems.
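The role of time stamps can be illustrated with a small Python sketch that puts access units from two elementary streams on a common time line. It assumes, for illustration only, that each stream expresses its time stamps in ticks of its own clock and declares the clock resolution; the names `AccessUnit` and `presentation_order` and the resolutions 90000 and 48000 are hypothetical.

```python
import heapq
from fractions import Fraction
from typing import Iterable, Iterator, NamedTuple


class AccessUnit(NamedTuple):
    stream: str
    ticks: int        # time stamp in units of the stream's own clock
    resolution: int   # ticks per second for that stream

    @property
    def seconds(self) -> Fraction:
        # Exact conversion to a common time base, avoiding float rounding.
        return Fraction(self.ticks, self.resolution)


def presentation_order(*streams: Iterable[AccessUnit]) -> Iterator[AccessUnit]:
    """Merge streams (each already sorted by time) into one time-ordered sequence."""
    return heapq.merge(*streams, key=lambda au: au.seconds)


video = [AccessUnit("video", t, 90_000) for t in (0, 3600, 7200)]
audio = [AccessUnit("audio", t, 48_000) for t in (0, 1024, 2048, 3072)]

for au in presentation_order(video, audio):
    print(f"{float(au.seconds):.4f} s  {au.stream}")
```

The merge works regardless of media type because it only looks at time stamps, which is the property the synchronization layer relies on.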
`
1.4 Delivery of streaming data
`The synchronized delivery of streaming information from source to destination, exploiting
`different QoS as available from the network, is specified in terms of the synchronization layer
`and a delivery layer containing a two-layer multiplexer, as depicted in Figure 2.
The first multiplexing layer is managed according to the DMIF specification, part 6 of the
MPEG-4 standard (DMIF stands for Delivery Multimedia Integration Framework). This
multiplex may be embodied by the MPEG-defined FlexMux tool, which allows grouping of
Elementary Streams (ESs) with a low multiplexing overhead. Multiplexing at this layer may be
used, for example, to group ESs with similar QoS requirements, or to reduce the number of network
connections or the end-to-end delay.
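To make the idea of low-overhead multiplexing concrete, here is a minimal Python sketch that interleaves packets from several streams into one byte stream by prefixing each payload with a one-byte channel index and a one-byte length. This framing is an assumption made for illustration and should not be read as the normative FlexMux syntax; the function names `mux` and `demux` are likewise hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def mux(packets: Iterable[Tuple[int, bytes]]) -> bytes:
    """Interleave (channel, payload) pairs using a 2-byte header per packet."""
    out = bytearray()
    for channel, payload in packets:
        if not 0 <= channel <= 255 or len(payload) > 255:
            raise ValueError("channel and payload length must each fit in one byte")
        out += bytes([channel, len(payload)]) + payload
    return bytes(out)


def demux(stream: bytes) -> Dict[int, List[bytes]]:
    """Recover the per-channel payload lists from a multiplexed byte stream."""
    channels: Dict[int, List[bytes]] = {}
    pos = 0
    while pos < len(stream):
        if pos + 2 > len(stream):
            raise ValueError("truncated packet header")
        channel, length = stream[pos], stream[pos + 1]
        payload = stream[pos + 2 : pos + 2 + length]
        if len(payload) != length:
            raise ValueError("truncated packet payload")
        channels.setdefault(channel, []).append(payload)
        pos += 2 + length
    return channels


packets = [(0, b"video-au-1"), (1, b"audio-au-1"), (0, b"video-au-2")]
muxed = mux(packets)
print(len(muxed), demux(muxed))
```

With this framing the overhead is two bytes per packet, and several streams share a single connection, which corresponds to the motivation given above for multiplexing at this layer.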
`
`The “TransMux” (Transport Multiplexing) layer in Figure 2 models the layer that offers
`transport services matching the requested QoS. Only the interface to this layer is specified by
`MPEG-4 while the concrete mapping of the data packets and control signaling must be done in
`collaboration with the bodies that have jurisdiction over the respective transport protocol. Any
`suitable existing transport protocol stack such as (RTP)/UDP/IP, (AAL5)/ATM, or MPEG-2’s
`Transport Stream over a suitable link layer may become a specific TransMux instance. The
`choice is left to the end user/service provider, and allows MPEG-4 to be used in a wide variety
`of operation environments.
`
`
[Figure 2. Labels in the figure, from top to bottom: Elementary Streams; Elementary Stream Interface; Sync Layer (SL blocks); SL-Packetized Streams; DMIF Application Interface; Delivery Layer, containing the DMIF Layer (FlexMux blocks, FlexMux Channel, FlexMux Streams), the DMIF Network Interface and the TransMux Layer (not specified in MPEG-4; TransMux Channel); TransMux instances: File, Broadcast, Interactive, (RTP)/UDP/IP, (PES)/MPEG2 TS, AAL2/ATM, H223/PSTN, DAB Mux; TransMux Streams.]

Figure 2 - The MPEG-4 System Layer Model
`
`Use of the FlexMux multiplexing tool is optional and, as shown in Figure 2, this layer may be
`empty if the underlying TransMux instance provides all the required functionality. The
`synchronization layer, however, is always present.
`
`With regard to Figure 2, it is possible to:
• Identify access units, transport timestamps and clock reference information and identify
data loss.
• Optionally interleave data from different elementary streams into FlexMux streams.
• Convey control information to:
  • Indicate the required QoS for each elementary stream and FlexMux stream;
  • Translate such QoS requirements into actual network resources;
  • Associate elementary streams to media objects;
  • Convey the mapping of elementary streams to FlexMux and TransMux channels.
`Parts of the control functionalities are available only in conjunction with a transport control
`entity like the DMIF framework.
`
1.5 Interaction with media objects
`In general, the user observes a scene that is composed following the design of the scene’s
`author. Depending on the degree of freedom allowed by the author, however, the user has the
`possibility to interact with the scene. Operations a user may be allowed to perform include:
`• Change the viewing/listening point of the scene, e.g. by navigation through a scene;
`• Drag objects in the scene to a different position;
`• Trigger a cascade of events by clicking on a specific object, e.g. starting or stopping a
`video stream;
• Select the desired language when multiple language tracks are available.
`More complex kinds of behavior can also be triggered, e.g. a virtual phone rings, the user
`answers and a communication link is established.
`
1.6 Management and Identification of Intellectual Property
`It is important to have the possibility to identify intellectual property in MPEG-4 media
`objects. Therefore, MPEG has worked with representatives of different creative industries in
`the definition of syntax and tools to support this. A full elaboration of the requirements for the
identification of intellectual property can be found in ‘Management and Protection of
Intellectual Property in MPEG-4’, which is publicly available from the MPEG home page.
`
MPEG-4 incorporates the identification of intellectual property by storing unique identifiers,
which are issued by international numbering systems (e.g. ISAN, ISRC, etc.1). These numbers
can be applied to identify a current rights holder of a media object. Since not all content is
identified by such a number, MPEG-4 Version 1 offers the possibility to identify intellectual
property by a key-value pair (e.g.:»composer«/»John Smith«). Also, for those who want to
use systems that control access to intellectual property, MPEG-4 offers a standardized
interface that is tightly integrated into the Systems layer. With this interface, proprietary control
systems can be easily combined with the standardized part of the decoder.
`
`
1 ISAN: International Standard Audiovisual Number, ISRC: International Standard Recording Code
`
2 Versions in MPEG-4
`MPEG-4 Version 1 was approved by MPEG in December 1998; version 2 was frozen in
`December 1999. After these two major versions, more tools were added in subsequent
`amendments that could be qualified as versions, even though they are harder to recognize as
`such. Recognizing the versions is not too important, however; it is more important to
`distinguish Profiles. Existing tools and profiles from any version are never replaced in
`subsequent versions; technology is always added to MPEG-4 in the form of new profiles.
`Figure 3 below depicts the relationship between the versions. Version 2 is a backward
`compatible extension of Version 1, and version 3 is a backward compatible extension of
`Version 2 – and so on. The versions of all major parts of the MPEG-4 Standard (Systems,
`Audio, Video, DMIF) were synchronized; after that, the different parts took their own paths.
`
`
[Figure 3. Labels in the figure: MPEG-4 Versions; Version 1; Version 2; Version 3; Version N.]

Figure 3 - relation between MPEG-4 Versions
`
`
`
`
`
The Systems layer of later versions is backward compatible with all earlier versions. In
the areas of Systems, Audio and Visual, new versions add Profiles and do not change existing ones.
In fact, it is very important to note that existing systems will always remain compliant, because
Profiles will never be changed in retrospect, and neither will the Systems syntax, at least not in
a backward-incompatible way.
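The rule that versions only add Profiles can be expressed as a simple superset check. The sketch below is a toy model under stated assumptions: the profile names are placeholders rather than actual MPEG-4 Profiles, and the function name `is_backward_compatible` is hypothetical.

```python
from typing import FrozenSet, List

# Placeholder profile names; the actual Profiles are listed in section 5.
VERSIONS: List[FrozenSet[str]] = [
    frozenset({"Profile A", "Profile B"}),                            # Version 1
    frozenset({"Profile A", "Profile B", "Profile C"}),               # Version 2
    frozenset({"Profile A", "Profile B", "Profile C", "Profile D"}),  # Version 3
]


def is_backward_compatible(versions: List[FrozenSet[str]]) -> bool:
    """True if every version keeps all profiles of the version before it."""
    return all(older <= newer for older, newer in zip(versions, versions[1:]))


print(is_backward_compatible(VERSIONS))
```

A decoder built for a profile in one version therefore still finds that profile, unchanged, in every later version.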
`
3 Major Functionalities in MPEG-4
This section lists, in an itemized fashion, the major functionalities that the different parts
of the MPEG-4 Standard offer in the finalized MPEG-4 Version 1. Descriptions of the
functionalities can be found in the following sections.
`
`
3.1 Transport
`In principle, MPEG-4 does not define transport layers. In a number of cases, adaptation to a
`specific existing transport layer has been defined:
`• Transport over MPEG-2 Transport Stream (this is an amendment to MPEG-2 Systems)
`• Transport over IP (In cooperation with IETF, the Internet Engineering Task Force)
`
3.2 DMIF
DMIF, or Delivery Multimedia Integration Framework, is an interface between the application
and the transport that allows the MPEG-4 application developer to stop worrying about that
transport. A single application can run on different transport layers when supported by the
right DMIF instantiation.
`MPEG-4 DMIF supports the following functionalities:
`• A transparent MPEG-4 DMIF-application interface irrespective of whether the peer is a
`remote interactive peer, broadcast or local storage media.
`• Control of the establishment of FlexMux channels
`• Use of homogeneous networks between interactive peers: IP, ATM, mobile, PSTN,
`Narrowband ISDN.
`• Support for mobile networks, developed together with ITU-T
`• UserCommands with acknowledgment messages.
`• Management of MPEG-4 Sync Layer information
`
3.3 Systems
`As explained above, MPEG-4 defines a toolbox of advanced compression algorithms for audio
`and visual information. The data streams (Elementary Streams, ES) that result from the coding
`process can be transmitted or stored separately, and need to be composed so as to create the
`actual multimedia presentation at the receiver side.
`
The Systems part of MPEG-4 addresses the description of the relationship between the
`audio-visual components that constitute a scene. The relationship is described at two main
`levels.
`• The Binary Format for Scenes (BIFS) describes the spatio-temporal arrangements of the
`objects in the scene. Viewers may have the possibility of interacting with the objects, e.g.
`by rearranging them on the scene or by changing their own point of view in a 3D virtual
`environment. The scene description provides a rich set of nodes for 2-D and 3-D
`composition operators and graphics primitives.
• At a lower level, Object Descriptors (ODs) define the relationship between the Elementary
Streams pertinent to each object (e.g. the audio and the video stream of a participant in a
videoconference). ODs also provide additional information such as the URL needed to
access the Elementary Streams, the characteristics of the decoders needed to parse them,
intellectual property and others.
`Other issues addressed by MPEG-4 Systems:
• A standard file format supports the exchange and authoring of MPEG-4 content.
• Interactivity, including: client and server-based interaction; a general event model for
triggering events or routing user actions; general event handling and routing between
objects in the scene, upon user or scene triggered events.
• Java (MPEG-J) can be used to query the terminal and its environment support, and
there is also a Java application engine to code ’MPEGlets’.
• A tool for interleaving of multiple streams into a single stream, including timing
information (FlexMux tool).
• A tool for storing MPEG-4 data in a file (the MPEG-4 File Format, ‘MP4’).
• Interfaces to various aspects of the terminal and networks, in the form of Java APIs
(MPEG-J).
`• Transport layer independence. Mappings to relevant transport protocol stacks, like
`(RTP)/UDP/IP or MPEG-2 transport stream can be or are being defined jointly with the
`responsible standardization bodies.
`• Text representation with international language support, font and font style selection,
`timing and synchronization.
`• The initialization and continuous management of the receiving terminal’s buffers.
`• Timing identification, synchronization and recovery mechanisms.
`• Datasets covering identification of Intellectual Property Rights relating to media objects.
`
3.4 Audio
`MPEG-4 Audio facilitates a wide variety of applications which could range from intelligible
`speech to high quality multichannel audio, and from natural sounds to synthesized sounds. In
`particular, it supports the highly efficient representation of audio objects consisting of:
`
3.4.1 General Audio Signals
Support for coding general audio ranging from very low bitrates up to high quality is provided
by transform coding techniques. With this functionality, a wide range of bitrates and
bandwidths is covered. It starts at a bitrate of 6 kbit/s and a bandwidth below 4 kHz and
extends to broadcast quality audio from mono up to multichannel. High quality can be achieved
with low delays. Parametric Audio Coding allows sound manipulation at low speeds. Fine
Granularity Scalability (FGS) is also available, with a scalability resolution down to 1 kbit/s per channel.
`
3.4.2 Speech signals
Speech coding can be done using bitrates from 2 kbit/s up to 24 kbit/s using the speech coding
tools. Lower bitrates, such as an average of 1.2 kbit/s, are also possible when variable rate
coding is allowed. Low delay is possible for communications applications. When using the
HVXC tools, speed and pitch can be modified under user control during playback. If the CELP
tools are used, a change of the playback speed can be achieved by using an additional tool for
effects processing.
`
3.4.3 Synthetic Audio
MPEG-4 Structured Audio is a language to describe 'instruments' (little programs that generate
sound) and 'scores' (input that drives those objects). These objects are not necessarily musical
instruments; they are in essence mathematical formulae that could generate the sound of a
piano, that of falling water – or something 'unheard' in nature.
`
3.4.4 Synthesized Speech
Scalable TTS coders operate at bitrates from 200 bit/s to 1.2 kbit/s and take a text, or a text
with prosodic parameters (pitch contour, phoneme duration, and so on), as input to
generate intelligible synthetic speech.
`
3.5 Visual
`The MPEG-4 Visual standard allows the hybrid coding of natural (pixel based) images and
`video together with synthetic (computer generated) scenes. This enables, for example, the
`virtual presence of videoconferencing participants. To this end, the Visual standard comprises
`tools and algorithms supporting the coding of natural (pixel based) still images and video
`sequences as well as tools to support the compression of synthetic 2-D and 3-D graphic
`geometry parameters (i.e. compression of wire grid parameters, synthetic text).
`
The subsections below give an itemized overview of the functionalities that the tools and
algorithms of the MPEG-4 Visual standard support.
`
3.5.1 Formats Supported
The following formats and bitrates are supported by MPEG-4 Visual:
• Bitrates: typically between 5 kbit/s and more than 1 Gbit/s
• Formats: progressive as well as interlaced video
• Resolutions: typically from sub-QCIF to ’Studio’ resolutions (4k x 4k pixels)
`
3.5.2 Compression Efficiency
`• For all bit rates addressed, the algorithms are very efficient. This includes the compact
`coding of textures with a quality adjustable between "acceptable" for very high
`compression ratios up to "near lossless".
`• Efficient compression of textures for texture mapping on 2-D and 3-D meshes.
`• Random access of video to allow functionalities such as pause, fast forward and fast
`reverse of stored video.
`
3.5.3 Content-Based Functionalities
`• Content-based coding of images and video allows separate decoding and reconstruction of
`arbitrarily shaped video objects.
`• Random access of content in video sequences allows functionalities such as pause, fast
`forward and fast reverse of stored video objects.
`• Extended manipulation of content in video sequences allows functionalities such as
`warping of synthetic or natural text, textures, image and video overlays on reconstructed
`video content. An example is the mapping of text in front of a moving video object where
`the text moves coherently with the object.
`
3.5.4 Scalability of Textures, Images and Video
`• Complexity scalability in the encoder allows encoders of different complexity to generate
`valid and meaningful bitstreams for a given texture, image or video.
`• Complexit