INTERNATIONAL STANDARD

ISO/IEC
13818-2

First edition
1996-05-15

Information technology - Generic coding
of moving pictures and associated audio
information: Video

Technologies de l'information - Codage des images animées et du son
associé: Vidéo

Reference number
ISO/IEC 13818-2:1996(E)
`
CONTENTS

1 Scope
2 Normative references
3 Definitions
4 Abbreviations and symbols
    4.1 Arithmetic operators
    4.2 Logical operators
    4.3 Relational operators
    4.4 Bitwise operators
    4.5 Assignment
    4.6 Mnemonics
    4.7 Constants
5 Conventions
    5.1 Method of describing bitstream syntax
    5.2 Definition of functions
    5.3 Reserved, forbidden and marker_bit
    5.4 Arithmetic precision
6 Video bitstream syntax and semantics
    6.1 Structure of coded video data
    6.2 Video bitstream syntax
    6.3 Video bitstream semantics
7 The video decoding process
    7.1 Higher syntactic structures
    7.2 Variable length decoding
    7.3 Inverse scan
    7.4 Inverse quantisation
    7.5 Inverse DCT
    7.6 Motion compensation
    7.7 Spatial scalability
    7.8 SNR scalability
    7.9 Temporal scalability
    7.10 Data partitioning
    7.11 Hybrid scalability
    7.12 Output of the decoding process
8 Profiles and levels
    8.1 ISO/IEC 11172-2 compatibility
    8.2 Relationship between defined profiles
    8.3 Relationship between defined levels
    8.4 Scalable layers
    8.5 Parameter values for defined profiles, levels and layers
`
© ISO/IEC 1996

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or
utilized in any form or by any means, electronic or mechanical, including photocopying and
microfilm, without permission in writing from the publisher.
ISO/IEC Copyright Office • Case postale 56 • CH-1211 Genève 20 • Switzerland
Printed in Switzerland
`
Annex A - Discrete cosine transform
Annex B - Variable length code tables
    B.1 Macroblock addressing
    B.2 Macroblock type
    B.3 Macroblock pattern
    B.4 Motion vectors
    B.5 DCT coefficients
Annex C - Video buffering verifier
Annex D - Features supported by the algorithm
    D.1 Overview
    D.2 Video formats
    D.3 Picture quality
    D.4 Data rate control
    D.5 Low delay mode
    D.6 Random access/channel hopping
    D.7 Scalability
    D.8 Compatibility
    D.9 Differences between this Specification and ISO/IEC 11172-2
    D.10 Complexity
    D.11 Editing encoded bitstreams
    D.12 Trick modes
    D.13 Error resilience
    D.14 Concatenated sequences
Annex E - Profile and level restrictions
    E.1 Syntax element restrictions in profiles
    E.2 Permissible layer combinations
Annex F - Bibliography
`
`Foreword
`
ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) form the specialized system for
worldwide standardization. National bodies that are members of ISO or IEC
participate in the development of International Standards through technical
committees established by the respective organization to deal with particular fields
of technical activity. ISO and IEC technical committees collaborate in fields of
mutual interest. Other international organizations, governmental and
non-governmental, in liaison with ISO and IEC, also take part in the work.
`
`In the field of information technology, ISO and IEC have established a joint
`technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the
joint technical committee are circulated to national bodies for voting. Publication
`as an International Standard requires approval by at least 75 % of the national
`bodies casting a vote.
`
International Standard ISO/IEC 13818-2 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29,
Coding of audio, picture, multimedia and hypermedia information, in
collaboration with ITU-T. The identical text is published as ITU-T
Recommendation H.262.
`
ISO/IEC 13818 consists of the following parts, under the general title Information
technology - Generic coding of moving pictures and associated audio
information:

- Part 1: Systems
- Part 2: Video
- Part 3: Audio
- Part 4: Compliance testing
- Part 6: Extensions for DSM-CC
- Part 9: Extension for real time interface for systems decoders
`
Annexes A to C form an integral part of this part of ISO/IEC 13818. Annexes D to
`F are for information only.
`
`Introduction
`
Intro. 1 Purpose
`
`This Part of this Specification was developed in response to the growing need for a generic coding method of moving
`pictures and of associated sound for various applications such as digital storage media, television broadcasting and
`communication. The use of this Specification means that motion video can be manipulated as a form of computer data
and can be stored on various storage media, transmitted and received over existing and future networks and distributed
`on existing and future broadcasting channels.
`
Intro. 2 Application
`
`The applications of this Specification cover, but are not limited to, such areas as listed below:
`
BSS    Broadcasting Satellite Service (to the home)
CATV   Cable TV Distribution on optical networks, copper, etc.
CDAD   Cable Digital Audio Distribution
DSB    Digital Sound Broadcasting (terrestrial and satellite broadcasting)
DTTB   Digital Terrestrial Television Broadcasting
EC     Electronic Cinema
ENG    Electronic News Gathering (including SNG, Satellite News Gathering)
FSS    Fixed Satellite Service (e.g. to head ends)
HTT    Home Television Theatre
IPC    Interpersonal Communications (videoconferencing, videophone, etc.)
ISM    Interactive Storage Media (optical disks, etc.)
MMM    Multimedia Mailing
NCA    News and Current Affairs
NDB    Networked Database Services (via ATM, etc.)
RVS    Remote Video Surveillance
SSM    Serial Storage Media (digital VTR, etc.)
`
Intro. 3 Profiles and levels
`
`This Specification is intended to be generic in the sense that it serves a wide range of applications, bitrates, resolutions,
`qualities and services. Applications should cover, among other things, digital storage media, television broadcasting and
communications. In the course of creating this Specification, various requirements from typical applications have been
`considered, necessary algorithmic elements have been developed, and they have been integrated into a single syntax.
`Hence, this Specification will facilitate the bitstream interchange among different applications.
`
`Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets
of the syntax are also stipulated by means of “profile” and “level”. These and other related terms are formally defined in
`clause 3.
`
`A “profile” is a defined subset of the entire bitstream syntax that is defined by this Specification. Within the bounds
`imposed by the syntax of a given profile it is still possible to require a very large variation in the performance of
`encoders and decoders depending upon the values taken by parameters in the bitstream. For instance, it is possible to
specify frame sizes as large as (approximately) 2^14 samples wide by 2^14 lines high. It is currently neither practical nor
economic to implement a decoder capable of dealing with all possible frame sizes.
`
`In order to deal with this problem, “levels” are defined within each profile. A level is a defined set of constraints
`imposed on parameters in the bitstream. These constraints may be simple limits on numbers. Alternatively they may take
`the form of constraints on arithmetic combinations of the parameters (e.g. frame width multiplied by frame height
multiplied by frame rate).
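The two kinds of level constraint described above (simple limits and limits on an arithmetic combination of parameters) can be sketched as follows. This is a minimal illustration only: the `LevelLimits` class, the `violations` function and the numeric limits are hypothetical names and values chosen for the example, not values taken from this Specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Upper bounds a level places on bitstream parameters (illustrative)."""
    max_width: int         # samples per line
    max_height: int        # lines per frame
    max_frame_rate: float  # frames per second
    max_sample_rate: int   # samples per second (width x height x frame rate)


def violations(width, height, frame_rate, limits):
    """Return the names of the constraints that are exceeded (empty if none)."""
    problems = []
    # Simple limits on individual numbers.
    if width > limits.max_width:
        problems.append("frame width")
    if height > limits.max_height:
        problems.append("frame height")
    if frame_rate > limits.max_frame_rate:
        problems.append("frame rate")
    # Constraint on an arithmetic combination of the parameters.
    if width * height * frame_rate > limits.max_sample_rate:
        problems.append("sample rate")
    return problems


# Hypothetical limits, chosen only for illustration.
EXAMPLE_LEVEL = LevelLimits(720, 576, 30.0, 10_368_000)

print(violations(720, 576, 25.0, EXAMPLE_LEVEL))  # []
print(violations(720, 576, 30.0, EXAMPLE_LEVEL))  # ['sample rate']
```

The second call shows why the combined constraint matters: each parameter is within its own limit, yet their product exceeds the assumed sample-rate bound.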
`
`Bitstreams complying with this Specification use a common syntax. In order to achieve a subset of the complete syntax,
`flags and parameters are included in the bitstream that signal the presence or otherwise of syntactic elements that occur
`later in the bitstream. In order to specify constraints on the syntax (and hence define a profile) it is thus only necessary to
constrain the values of these flags and parameters that specify the presence of later syntactic elements.
`
Intro. 4 The scalable and the non-scalable syntax
`
The full syntax can be divided into two major categories: One is the non-scalable syntax, which is structured as a superset
of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable syntax is the extra compression tools
`for interlaced video signals. The second is the scalable syntax, the key property of which is to enable the reconstruction
`of useful video from pieces of a total bitstream. This is achieved by structuring the total bitstream in two or more layers,
`starting from a standalone base layer and adding a number of enhancement layers. The base layer can use the non-
`scalable syntax, or in some situations conform to the ISO/IEC 11172-2 syntax.
`
Intro. 4.1 Overview of the non-scalable syntax
`
`The coded representation defined in the non-scalable syntax achieves a high compression ratio while preserving good
`image quality. The algorithm is not lossless as the exact sample values are not preserved during coding. Obtaining good
`image quality at the bitrates of interest demands very high compression, which is not achievable with intra picture
coding alone. The need for random access, however, is best satisfied with pure intra picture coding. The choice of the
techniques is based on the need to balance a high image quality and compression ratio with the requirement to make
random access to the coded bitstream.
`
`A number of techniques are used to achieve high compression. The algorithm first uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used both for causal prediction of the current
picture from a previous picture, and for non-causal, interpolative prediction from past and future pictures. Motion
vectors are defined for each 16-sample by 16-line region of the picture. The prediction error is further compressed using
`the Discrete Cosine Transform (DCT) to remove spatial correlation before it is quantised in an irreversible process that
`discards the less important information. Finally, the motion vectors are combined with the quantised DCT information,
`and encoded using variable length codes.
`
`Intro. 4.1.1 Temporal processing
`
`Because of the conflicting requirements of random access and highly efficient compression, three main picture types are
`defined. Intra Coded Pictures (I-Pictures) are coded without reference to other pictures. They provide access points to
`the coded sequence where decoding can begin, but are coded with only moderate compression. Predictive Coded
`Pictures (P-Pictures) are coded more efficiently using motion compensated prediction from a past intra or predictive
`coded picture and are generally used as a reference for further prediction. Bidirectionally-predictive Coded Pictures
`(B-Pictures) provide the highest degree of compression but require both past and future reference pictures for motion
`compensation. Bidirectionally-predictive coded pictures are never used as references for prediction (except in the case
that the resulting picture is used as a reference in a spatially scalable enhancement layer). The organisation of the three
picture types in a sequence is very flexible. The choice is left to the encoder and will depend on the requirements of the
application. Figure Intro. 1 illustrates an example of the relationship among the three different picture types.
`
[Figure omitted; surviving label: Bidirectional Interpolation]

Figure Intro. 1 - Example of temporal picture structure
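Because a B-Picture needs both a past and a future reference picture, the future reference has to be available to the decoder before the B-Pictures that depend on it. The sketch below illustrates one conventional way of reordering pictures from display order into such a coded order. The function name `coded_order` and the example sequence are assumptions made for illustration; this section does not prescribe the procedure.

```python
def coded_order(display):
    """Reorder picture types from display order to a plausible coded order.

    Each B-picture is held back until the next I- or P-picture (its future
    reference) has been emitted.
    """
    out, pending_b = [], []
    for index, ptype in enumerate(display):
        label = f"{ptype}{index}"  # e.g. "B1" = B-picture at display position 1
        if ptype == "B":
            pending_b.append(label)
        else:
            out.append(label)      # the reference picture goes first
            out.extend(pending_b)  # then the B-pictures that depend on it
            pending_b = []
    out.extend(pending_b)  # trailing B-pictures without a later reference here
    return out


print(coded_order("IBBPBBP"))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```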
`
`Intro. 4.1.2 Coding interlaced video
`
Each frame of interlaced video consists of two fields which are separated by one field-period. The Specification allows
either the frame to be encoded as a picture or the two fields to be encoded as two pictures. Frame encoding or field
encoding can be adaptively selected on a frame-by-frame basis. Frame encoding is typically preferred when the video
scene contains significant detail with limited motion. Field encoding, in which the second field can be predicted from the
`first, works better when there is fast movement.
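The relationship between a frame and its two fields can be illustrated with a short sketch. It assumes that a frame is held as a list of lines and that one field takes the even-numbered lines and the other the odd-numbered lines. The function names, and the labels "first" and "second", are hypothetical and say nothing about which field comes first in time.

```python
def split_fields(frame):
    """Split a frame, given as a list of lines, into its two fields."""
    first = frame[0::2]   # lines 0, 2, 4, ...
    second = frame[1::2]  # lines 1, 3, 5, ...
    return first, second


def interleave_fields(first, second):
    """Rebuild the frame by interleaving the lines of the two fields."""
    frame = []
    for a, b in zip(first, second):
        frame.extend([a, b])
    return frame


frame = ["line0", "line1", "line2", "line3", "line4", "line5"]
first, second = split_fields(frame)
print(first)   # ['line0', 'line2', 'line4']
print(second)  # ['line1', 'line3', 'line5']
print(interleave_fields(first, second) == frame)  # True
```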
`
`Intro. 4.1.3 Motion representation — Macroblocks
`
`As in ISO/IEC 11172-2, the choice of 16 by 16 macroblocks for the motion-compensation unit is a result of the trade-off
between the coding gain provided by using motion information and the overhead needed to represent it. Each
macroblock can be temporally predicted in one of a number of different ways. For example, in frame encoding, the
prediction from the previous reference frame can itself be either frame-based or field-based. Depending on the type of
`the macroblock, motion vector information and other side information is encoded with the compressed prediction error
`in each macroblock. The motion vectors are encoded differentially with respect to the last encoded motion vectors using
`variable length codes. The maximum length of the motion vectors that may be represented can be programmed, on a
`picture-by-picture basis, so that the most demanding applications can be met without compromising the performance of
the system in more normal situations.
`
`It is the responsibility of the encoder to calculate appropriate motion vectors. This Specification does not specify how
`this should be done.
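The idea of coding motion vectors differentially with respect to the last encoded vector can be sketched as below. This is a simplified illustration under stated assumptions: the predictor starts at (0, 0), it is never reset, and the mapping of differences to variable length codes and the programmable vector range are left out. The function names are hypothetical.

```python
def to_differences(vectors, predictor=(0, 0)):
    """Replace each motion vector by its difference from the previous one."""
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - predictor[0], dy - predictor[1]))
        predictor = (dx, dy)  # the vector just coded predicts the next one
    return diffs


def from_differences(diffs, predictor=(0, 0)):
    """Invert to_differences by accumulating the differences."""
    vectors = []
    for ddx, ddy in diffs:
        predictor = (predictor[0] + ddx, predictor[1] + ddy)
        vectors.append(predictor)
    return vectors


vectors = [(4, -2), (5, -2), (5, -1), (3, 0)]
diffs = to_differences(vectors)
print(diffs)                             # [(4, -2), (1, 0), (0, 1), (-2, 1)]
print(from_differences(diffs) == vectors)  # True
```

Neighbouring macroblocks often move similarly, so the differences tend to be small values, which suit short variable length codes.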
`
`Intro. 4.1.4 Spatial redundancy reduction
`
`Both source pictures and prediction errors have high spatial redundancy. This Specification uses a block-based DCT
`method with visually weighted quantisation and run-length coding. After motion compensated prediction or
`interpolation, the resulting prediction error is split into 8 by 8 blocks. These are transformed into the DCT domain where
`they are weighted before being quantised. After quantisation many of the DCT coefficients are zero in value and so
`two-dimensional run-length and variable length coding is used to encode the remaining DCT coefficients efficiently.
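The chain described above (8 by 8 DCT, weighted quantisation, then run-length coding of the surviving coefficients) can be sketched in plain Python. The quantiser weights `1 + row + col`, the step size and the function names are assumptions made for illustration; the weighting matrices, scan and variable length codes of this Specification are not reproduced here.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Forward 8x8 DCT of a block given as a list of 8 rows of 8 samples."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2.0 / N) * c(u) * c(v) * total
    return out


def quantise(coeffs, step):
    """Weighted quantisation; the weights are hypothetical, coarser at
    higher frequencies."""
    return [[round(coeffs[row][col] / (step * (1 + row + col)))
             for col in range(N)] for row in range(N)]


def zigzag_order():
    """Conventional zigzag scan positions (row, col), low frequencies first."""
    cells = [(row, col) for row in range(N) for col in range(N)]
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def run_level(levels):
    """(zero run, level) pairs in scan order; trailing zeros are dropped."""
    pairs, run = [], 0
    for row, col in zigzag_order():
        if levels[row][col] == 0:
            run += 1
        else:
            pairs.append((run, levels[row][col]))
            run = 0
    return pairs


flat = [[100] * N for _ in range(N)]
print(run_level(quantise(dct_2d(flat), step=16)))  # [(0, 50)]
```

A block of constant value has only a DC coefficient, so all 63 remaining coefficients quantise to zero and the whole block reduces to a single (run, level) pair.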
`
`Intro. 4.1.5 Chrominance formats
`
In addition to the 4:2:0 format supported in ISO/IEC 11172-2 this Specification supports 4:2:2 and 4:4:4 chrominance
`formats.
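As a rough illustration of what the three chrominance formats imply, the sketch below computes the size of each chrominance component and the number of 8 by 8 blocks in a 16 by 16 macroblock. The subsampling factors are the conventional meaning of the format names and are an assumption here, since this section does not spell them out. The function names are hypothetical.

```python
# (horizontal, vertical) chrominance subsampling factors, assumed conventional.
SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def chroma_size(luma_width, luma_height, chroma_format):
    """Width and height of each chrominance component."""
    h, v = SUBSAMPLING[chroma_format]
    return luma_width // h, luma_height // v


def blocks_per_macroblock(chroma_format):
    """8x8 blocks in a 16x16 macroblock: 4 luminance plus 2 x chrominance."""
    h, v = SUBSAMPLING[chroma_format]
    chroma_blocks = (2 // h) * (2 // v)  # blocks per chrominance component
    return 4 + 2 * chroma_blocks


for fmt in SUBSAMPLING:
    print(fmt, chroma_size(720, 576, fmt), blocks_per_macroblock(fmt))
# 4:2:0 (360, 288) 6
# 4:2:2 (360, 576) 8
# 4:4:4 (720, 576) 12
```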
`
Intro. 4.2 Scalable extensions
`
The scalability tools in this Specification are designed to support applications beyond those supported by single layer
video. Among the noteworthy application areas addressed are video telecommunications, video on Asynchronous
Transfer Mode (ATM) networks, interworking of video standards, video service hierarchies with multiple spatial,
temporal and quality resolutions, HDTV with embedded TV, systems allowing migration to higher temporal resolution
HDTV, etc. Although a simple solution to scalable video is the simulcast technique, which is based on
transmission/storage of multiple independently coded reproductions of video, a more efficient alternative is scalable
video coding, in which the bandwidth allocated to a given reproduction of video can be partially re-utilised in coding of
the next reproduction of video. In scalable video coding, it is assumed that given a coded bitstream, decoders of various
complexities can decode and display appropriate reproductions of coded video. A scalable video encoder is likely to
have increased complexity when compared to a single layer encoder. However, this Recommendation | International
Standard provides several different forms of scalabilities that address non-overlapping applications with corresponding
complexities. The basic scalability tools offered are:
`
- data partitioning;
- SNR scalability;
- spatial scalability; and
- temporal scalability.
`
`Moreover, combinations of these basic scalability tools are also supported and are referred to as hybrid scalability. In the
`case of basic scalability, two layers of video referred to as the lower layer and the enhancement layer are allowed,
whereas in hybrid scalability up to three layers are supported. Tables Intro. 1 to Intro. 3 provide a few example
applications of various scalabilities.
`
Table Intro. 1 - Applications of SNR scalability

Base | Enhancement | Application
Recommendation ITU-R BT.601 | Same resolution and format as lower layer | Two quality service for Standard TV (SDTV)
High Definition | Same resolution and format as lower layer | Two quality service for HDTV
4:2:0 high definition | 4:2:2 chroma simulcast | Video production / distribution
`
Table Intro. 2 - Applications of spatial scalability

Base | Enhancement | Application
Interlace (30 Hz) | Interlace (30 Hz) | HDTV/SDTV scalability
Progressive (30 Hz) | Interlace (30 Hz) | ISO/IEC 11172-2/compatibility with this Specification
Interlace (30 Hz) | Progressive (60 Hz) | Migration to high resolution progressive HDTV
`
Table Intro. 3 - Applications of temporal scalability

Base | Enhancement | Higher | Application
Progressive (30 Hz) | Progressive (30 Hz) | Progressive (60 Hz) | Migration to high resolution progressive HDTV
Interlace (30 Hz) | Interlace (30 Hz) | Progressive (60 Hz) | Migration to high resolution progressive HDTV
`
Intro. 4.2.1 Spatial scalable extension
`
Spatial scalability is a tool intended for use in video applications involving telecommunications, interworking of video
standards, video database browsing, interworking of HDTV and TV, etc., i.e. video systems with the primary common
feature that a minimum of two layers of spatial resolution are necessary. Spatial scalability involves generating two
spatial resolution video layers from a single video source such that the lower layer is coded by itself to provide the basic
spatial resolution and the enhancement layer employs the spatially interpolated lower layer and carries the full spatial
resolution of the input video source. The lower and the enhancement layers may either both use the coding tools in this
Specification, or the lower layer may use the ISO/IEC 11172-2 Standard and the enhancement layer this Specification.
The latter case achieves a further advantage by facilitating interworking between video coding standards. Moreover,
spatial scalability offers flexibility in choice of video formats to be employed in each layer. An additional advantage of
spatial scalability is its ability to provide resilience to transmission errors as the more important data of the lower layer
can be sent over a channel with better error performance, while the less critical enhancement layer data can be sent over a
channel with poor error performance.
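The layering principle of spatial scalability can be illustrated on a one-dimensional row of samples. In this sketch the lower layer is a half-resolution version of the source, and the enhancement layer carries what the interpolated lower layer fails to predict. The averaging and sample-repetition filters, the function names and the sample values are assumptions made for illustration (the input length is assumed even), and neither layer is compressed here.

```python
def downsample(samples):
    """Halve the resolution by averaging neighbouring pairs (assumed filter)."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples), 2)]


def upsample(samples):
    """Double the resolution by repeating each sample (assumed interpolation)."""
    out = []
    for s in samples:
        out.extend([s, s])
    return out


source = [10, 12, 20, 24, 30, 30, 8, 4]  # made-up sample values

lower = downsample(source)    # what a lower-layer-only decoder would show
predicted = upsample(lower)   # spatially interpolated lower layer
enhancement = [s - p for s, p in zip(source, predicted)]
restored = [p + e for p, e in zip(predicted, enhancement)]

print(lower)               # [11, 22, 30, 6]
print(enhancement)         # [-1, 1, -2, 2, 0, 0, 2, -2]
print(restored == source)  # True
```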
`
Intro. 4.2.2 SNR scalable extension
`
SNR scalability is a tool intended for use in video applications involving telecommunications, video services with
multiple qualities, standard TV and HDTV, i.e. video systems with the primary common feature that a minimum of two
layers of video quality are necessary. SNR scalability involves generating two video layers of the same spatial resolution but
different video qualities from a single video source such that the lower layer is coded by itself to provide the basic video
quality and the enhancement layer is coded to enhance the lower layer. The enhancement layer when added back to the
lower layer regenerates a higher quality reproduction of the input video. The lower and the enhancement layers may
either both use this Specification, or the lower layer may use the ISO/IEC 11172-2 Standard and the enhancement
layer this Specification. An additional advantage of SNR scalability is its ability to provide a high degree of resilience to transmission errors
as the more important data of the lower layer can be sent over a channel with better error performance, while the less
critical enhancement layer data can be sent over a channel with poor error performance.
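The way an enhancement layer "added back" to the lower layer raises quality can be sketched with two quantisers of different coarseness. The sketch assumes that refinement is applied to a short list of coefficient values, with a coarse step for the lower layer and a finer step for the residual. The step sizes, values and function names are hypothetical.

```python
def quantise(values, step):
    """Map each value to the nearest multiple of step, expressed as a level."""
    return [round(v / step) for v in values]


def dequantise(levels, step):
    """Reconstruct approximate values from quantiser levels."""
    return [level * step for level in levels]


coeffs = [312, -47, 20, 5]   # made-up coefficient values
BASE_STEP, ENH_STEP = 32, 4  # coarse lower layer, finer enhancement layer

# Lower layer: coarse quantisation of the coefficients themselves.
base_recon = dequantise(quantise(coeffs, BASE_STEP), BASE_STEP)

# Enhancement layer: finer quantisation of what the lower layer missed.
residual = [c - b for c, b in zip(coeffs, base_recon)]
enh_recon = dequantise(quantise(residual, ENH_STEP), ENH_STEP)
enhanced = [b + e for b, e in zip(base_recon, enh_recon)]

print(base_recon)  # [320, -32, 32, 0]
print(enhanced)    # [312, -48, 20, 4]
```

A decoder that receives only the lower layer reconstructs `base_recon`; one that also receives the enhancement layer reconstructs `enhanced`, which is closer to the original values.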
`
`Intro. 4.2.3 Temporal scalable extension
`
Temporal scalability is a tool intended for use in a range of diverse video applications from telecommunications
to HDTV for which migration to higher temporal resolution systems from that of lower temporal resolution systems may
be necessary. In many cases, the lower temporal resolution video systems may be either the existing systems or the less
expensive early generation systems, with the motivation of introducing more sophisticated systems gradually. Temporal
scalability involves partitioning of video frames into layers, where the lower layer is coded by itself to provide the
basic temporal rate and the enhancement layer is coded with temporal prediction with respect to the lower layer. These
layers, when decoded and temporally multiplexed, yield the full temporal resolution of the video source. The lower temporal
resolution systems may only decode the lower layer to provide basic temporal resolution, whereas more sophisticated
systems of the future may decode both layers and provide high temporal resolution video while maintaining
interworking with earlier generation systems. An additional advantage of temporal scalability is its ability to provide
resilience to transmission errors as the more important data of the lower layer can be sent over a channel with better error
performance, while the less critical enhancement layer can be sent over a channel with poor error performance.
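The partitioning of frames into two temporal layers, and their multiplexing after decoding, can be sketched as follows. An alternating split is assumed purely for illustration, the frames are represented by labels, and the temporal prediction of the enhancement layer from the lower layer is not modelled. The function names are hypothetical.

```python
def partition(frames):
    """Assign alternate frames to the lower and enhancement layers (assumed split)."""
    lower = frames[0::2]
    enhancement = frames[1::2]
    return lower, enhancement


def multiplex(lower, enhancement):
    """Interleave decoded frames of both layers back into display order."""
    out = []
    for i, frame in enumerate(lower):
        out.append(frame)
        if i < len(enhancement):
            out.append(enhancement[i])
    return out


frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
lower, enhancement = partition(frames)
print(lower)                                    # ['f0', 'f2', 'f4']
print(multiplex(lower, enhancement) == frames)  # True
```

A system that decodes only `lower` shows half the frame rate; a system that decodes both layers and multiplexes them recovers the full temporal resolution.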
`
`Intro. 4.2.4 Data partitioning extension
`
Data partitioning is a tool intended for use when two channels are available for transmission and/or storage of a
video bitstream, as may be the case in ATM networks, terrestrial broadcast, magnetic media, etc. The bitstream is
partitioned between these channels such that more critical parts of the bitstream (such as headers, motion vectors, low
frequency DCT coefficients) are transmitted in the channel with the better error performance, and less critical data (such
as higher frequency DCT coefficients) is transmitted in the channel with poor error performance. Thus, degradation due to
channel errors is minimised since the critical parts of a bitstream are better protected. Data from neither channel may be
decoded on a decoder that is not intended for decoding data partitioned bitstreams.
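The split between critical and less critical data can be sketched for a single block whose coefficients are listed from low to high frequency. The dictionary layout, the `split_at` parameter and the function names are hypothetical. Substituting zeros when the low-priority part is lost is an illustrative choice for the example, not behaviour stated in this section.

```python
def split_block(side_info, coefficients, split_at):
    """Put side information and the first coefficients in the high-priority part."""
    high_priority = {"side_info": side_info,
                     "coefficients": coefficients[:split_at]}
    low_priority = {"coefficients": coefficients[split_at:]}
    return high_priority, low_priority


def merge_block(high_priority, low_priority, total):
    """Rebuild a block; missing low-priority coefficients become zeros."""
    coeffs = list(high_priority["coefficients"])
    if low_priority is not None:
        coeffs += low_priority["coefficients"]
    coeffs += [0] * (total - len(coeffs))
    return high_priority["side_info"], coeffs


coeffs = [50, -3, 2, 0, 1, 0, 0, -1]  # made-up values, low frequency first
hp, lp = split_block({"mv": (4, -2)}, coeffs, split_at=3)

print(merge_block(hp, lp, 8))    # ({'mv': (4, -2)}, [50, -3, 2, 0, 1, 0, 0, -1])
print(merge_block(hp, None, 8))  # ({'mv': (4, -2)}, [50, -3, 2, 0, 0, 0, 0, 0])
```

The second call mimics the loss of the less protected channel: the motion vector and low frequency coefficients survive, so only fine detail is degraded.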
`
`INTERNATIONAL STANDARD
`
`ITU-T RECOMMENDATION
`
`INFORMATION TECHNOLOGY -
`GENERIC CODING OF MOVING PICTURES AND
`ASSOCIATED AUDIO INFORMATION: VIDEO
`
1 Scope
`
`This Recommendation | International Standard specifies the coded representation of picture information for digital
`storage media and digital video communication and specifies the decoding process. The representation supports constant
`bitrate transmission, variable bitrate transmission, random access, channel hopping, scalable decoding, bitstream editing,
`as well as special functions such as fast forward playback, fast reverse playback, slow motion, pause and still pictures.
This Recommendation | International Standard is forward compatible with ISO/IEC 11172-2 and upward or downward
`compatible with EDTV, HDTV, SDTV formats.
`
`This Recommendation | International Standard is primarily applicable to digital storage media, video broadcast and
`communication. The storage media may be directly connected to the decoder, or via communications means such as
busses, LANs, or telecommunications links.
`
2 Normative references
`
The following Recommendations and International Standards contain provisions which, through reference in this text,
constitute provisions of this Recommendation | International Standard. At the time of publication, the editions indicated
were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on this
Recommendation | International Standard are encouraged to investigate the possibility of applying the most recent
edition of the Recommendations and Standards indicated below. Members of IEC and ISO maintain registers of
currently valid International Standards. The Telecommunication Standardization Bureau of the ITU maintains a list of
`currently valid ITU-T Recommendations.
`
- Recommendations and Reports of the CCIR, 1990, XVIIth Plenary Assembly, Dusseldorf 1990,
  Volume XI - Part 1 Broadcasting Service (Television) - Recommendation ITU-R BT.601-3 Encoding
  parameters of digital television for studios.

- CCIR Volume X and XI Part 3 - Recommendation ITU-R BR.648 Recording of audio signals.

- CCIR Volume X and XI Part 3 - Report ITU-R 955-2 Satellite sound broadcasting to vehicular, portable
  and fixed receivers in the range 500 - 3000 MHz.

- ISO/IEC 11172-1:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 1: Systems.

- ISO/IEC 11172-2:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 2: Video.

- ISO/IEC 11172-3:1993, Information technology - Coding of moving pictures and associated audio for
  digital storage media at up to about 1,5 Mbit/s - Part 3: Audio.

- IEEE Standard Specifications for the Implementations of 8 by 8 Inverse Discrete Cosine Transform, IEEE
  Std 1180-1990, December 6, 1990.

- IEC Publication 908:1987, Compact disc digital audio system.

- IEC Publication 461:1986, Time and control code for video tape recorders.

- ITU-T Recommendation H.261 (1993), Video codec for audiovisual services at p x 64 kbit/s.

- CCITT Recommendation T.81 (1992) (JPEG) ISO/IEC 10918-1:1994, Information technology - Digital
  compression and coding of continuous-tone still images - Requirements and guidelines.
`
`ITU-T Rec. H.262 (1995 E)
`
3 Definitions
`
`For the purposes of this Recommendation | International Standard, the following definitions apply.
`
3.1 AC coefficient: Any DCT coefficient for which the frequency in one or both dimensions is non-zero.

3.2 big picture: A coded picture that would cause VBV buffer underflow as defined in C.7. Big pictures can only
occur in sequences where low_delay is equal to 1. “Skipped picture” is a term that is sometimes used to describe the
same concept.

3.3 B-field picture: A field structure B-Picture.

3.4 B-frame picture: A frame structure B-Picture.

3.5 B-picture; bidirectionally predictive-coded picture: A picture that is coded using motion compensated
prediction from past and/or future reference fields or frames.

3.6 backward compatibility: A newer coding standard is backward compatible with an older coding standard if
decoders designed to operate with the older coding standard are able to continue to operate by decoding all or part of a
bitstream produced according to the newer coding standard.

3.7 backward motion vector: A motion vector that is used for motion compensation from a reference frame or
reference field at a later time in display order.

3.8 backward prediction: Prediction from the future reference frame (field).

3.9 base layer: First, independently decodable layer of a scalable hierarchy.

3.10 bitstream; stream: An ordered series of bits