INTERNATIONAL STANDARD
`
`ISO/IEC
`11172-2
`First edition
`1993-08-01
`
`
`
`Information technology — Coding of
`moving pictures and associated audio for
`digital storage media at up to about
`1,5 Mbit/s —
`
`Part 2:
`Video
`
Technologies de l'information — Codage de l'image animée et du son
associé pour les supports de stockage numérique jusqu'à environ
1,5 Mbit/s —
`
`Partie 2: Vidéo
`
`
`
`
`
`
`Reference number
`ISO/IEC 11172-2:1993(E)
`
`
Contents

Foreword
Introduction

Section 1: General
1.1 Scope
1.2 Normative references

Section 2: Technical elements
2.1 Definitions
2.2 Symbols and abbreviations
2.3 Method of describing bitstream syntax
2.4 Requirements

Annexes
A 8 by 8 Inverse discrete cosine transform
B Variable length code tables
C Video buffering verifier
D Guide to encoding video
E Bibliography
F List of patent holders
`
`© ISO/IEC 1993
All rights reserved. No part of this publication may be reproduced or utilized in any form or by
`any means, electronic or mechanical, including photocopying and microfilm, without
`permission in writing from the publisher.
`
ISO/IEC Copyright Office • Case Postale 56 • CH-1211 Genève 20 • Switzerland
`
`Printed in Switzerland.
`
`
`Foreword
`
ISO (the International Organization for Standardization) and IEC (the
International Electrotechnical Commission) form the specialized system for
worldwide standardization. National bodies that are members of ISO or
IEC participate in the development of International Standards through
technical committees established by the respective organization to deal
with particular fields of technical activity. ISO and IEC technical
committees collaborate in fields of mutual interest. Other international
organizations, governmental and non-governmental, in liaison with ISO and
IEC, also take part in the work.
`
In the field of information technology, ISO and IEC have established a joint
technical committee, ISO/IEC JTC 1. Draft International Standards adopted
by the joint technical committee are circulated to national bodies for
voting. Publication as an International Standard requires approval by at
least 75 % of the national bodies casting a vote.
`
International Standard ISO/IEC 11172-2 was prepared by Joint Technical
Committee ISO/IEC JTC 1, Information technology, Sub-Committee SC 29,
Coded representation of audio, picture, multimedia and hypermedia
information.
`
ISO/IEC 11172 consists of the following parts, under the general title
Information technology — Coding of moving pictures and associated audio
for digital storage media at up to about 1,5 Mbit/s:
`
`— Part 1: Systems
`
— Part 2: Video
`
`— Part 3: Audio
`
`— Part 4: Compliance testing
`
`Annexes A, B and C form an integral part of this part of ISO/IEC 11172.
`Annexes D, E and F are for information only.
`
`
`Introduction
`
Note -- Readers interested in an overview of the MPEG Video layer should read this Introduction and then
proceed to annex D, before returning to clauses 1 and 2.
`
0.1 Purpose
`
This part of ISO/IEC 11172 was developed in response to the growing need for a common format for
representing compressed video on various digital storage media such as CDs, DATs, Winchester disks and
optical drives. This part of ISO/IEC 11172 specifies a coded representation that can be used for
compressing video sequences to bitrates around 1,5 Mbit/s. The use of this part of ISO/IEC 11172 means
that motion video can be manipulated as a form of computer data and can be transmitted and received over
existing and future networks. The coded representation can be used with both 625-line and 525-line
television and provides flexibility for use with workstation and personal computer displays.
`
`This part of ISO/IEC 11172 was developed to operate principally from storage media offering a continuous
transfer rate of about 1,5 Mbit/s. Nevertheless it can be used more widely than this because the approach
`taken is generic.
`
`0.1.1 Coding parameters
`
The intention in developing this part of ISO/IEC 11172 has been to define a source coding algorithm with a
large degree of flexibility that can be used in many different applications. To achieve this goal, a number of
the parameters defining the characteristics of coded bitstreams and decoders are contained in the bitstream
itself. This allows, for example, the algorithm to be used for pictures with a variety of sizes and aspect
ratios and on channels or devices operating at a wide range of bitrates.
`
Because of the large range of the characteristics of bitstreams that can be represented by this part of ISO/IEC
11172, a sub-set of these coding parameters known as the "Constrained Parameters" has been defined. The
aim in defining the constrained parameters is to offer guidance about a widely useful range of parameters.
Conforming to this set of constraints is not a requirement of this part of ISO/IEC 11172. A flag in the
bitstream indicates whether or not it is a Constrained Parameters bitstream.
`
Summary of the Constrained Parameters:

Horizontal picture size           Less than or equal to 768 pels
Vertical picture size             Less than or equal to 576 lines
Picture area                      Less than or equal to 396 macroblocks
Pel rate                          Less than or equal to 396x25 macroblocks/s
Picture rate                      Less than or equal to 30 Hz
Motion vector range               Less than -64 to +63,5 pels (using half-pel vectors)
                                  [backward_f_code and forward_f_code <= 4]
Input buffer size (in VBV model)  Less than or equal to 327 680 bits
Bitrate                           Less than or equal to 1 856 000 bits/s (constant bitrate)
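The numeric limits in the summary above can be checked mechanically. The following is a minimal sketch, not part of the standard: the class and field names are hypothetical, the motion vector range is not checked, and the macroblock count assumes the picture is covered by whole 16 by 16 macroblocks.

```python
from dataclasses import dataclass


@dataclass
class CodingParameters:
    """Hypothetical container; field names are illustrative, not from the standard."""
    width_pels: int
    height_lines: int
    picture_rate_hz: float
    bitrate_bps: int
    vbv_buffer_bits: int


def macroblock_count(width_pels: int, height_lines: int) -> int:
    """Number of 16 by 16 macroblocks needed to cover the picture."""
    return ((width_pels + 15) // 16) * ((height_lines + 15) // 16)


def within_constrained_parameters(p: CodingParameters) -> bool:
    """Check the size, rate and buffer limits of the summary (vector range excluded)."""
    mbs = macroblock_count(p.width_pels, p.height_lines)
    return (
        p.width_pels <= 768
        and p.height_lines <= 576
        and mbs <= 396
        and mbs * p.picture_rate_hz <= 396 * 25
        and p.picture_rate_hz <= 30
        and p.vbv_buffer_bits <= 327_680
        and p.bitrate_bps <= 1_856_000
    )
```

For example, a 352 by 288 picture at 25 Hz has exactly 396 macroblocks and passes, while the same picture size at 30 Hz exceeds the 396x25 macroblocks/s limit.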
`
`0.2 Overview of the algorithm
`
The coded representation defined in this part of ISO/IEC 11172 achieves a high compression ratio while
preserving good picture quality. The algorithm is not lossless as the exact pel values are not preserved
during coding. The choice of the techniques is based on the need to balance a high picture quality and
compression ratio with the requirement to make random access to the coded bitstream. Obtaining good
picture quality at the bitrates of interest demands a very high compression ratio, which is not achievable
with intraframe coding alone. The need for random access, however, is best satisfied with pure intraframe
coding. This requires a careful balance between intra- and interframe coding and between recursive and
non-recursive temporal redundancy reduction.
`
`
A number of techniques are used to achieve a high compression ratio. The first, which is almost
independent from this part of ISO/IEC 11172, is to select an appropriate spatial resolution for the signal.
The algorithm then uses block-based motion compensation to reduce the temporal redundancy. Motion
compensation is used for causal prediction of the current picture from a previous picture, for non-causal
prediction of the current picture from a future picture, or for interpolative prediction from past and future
pictures. Motion vectors are defined for each 16-pel by 16-line region of the picture. The difference signal,
the prediction error, is further compressed using the discrete cosine transform (DCT) to remove spatial
correlation before it is quantized in an irreversible process that discards the less important information.
Finally, the motion vectors are combined with the DCT information, and coded using variable length codes.
`
`0.2.1 Temporal processing
`
Because of the conflicting requirements of random access and highly efficient compression, three main
picture types are defined. Intra-coded pictures (I-Pictures) are coded without reference to other pictures.
They provide access points to the coded sequence where decoding can begin, but are coded with only a
moderate compression ratio. Predictive coded pictures (P-Pictures) are coded more efficiently using motion
compensated prediction from a past intra or predictive coded picture and are generally used as a reference for
further prediction. Bidirectionally-predictive coded pictures (B-Pictures) provide the highest degree of
compression but require both past and future reference pictures for motion compensation.
Bidirectionally-predictive coded pictures are never used as references for prediction. The organisation of the
three picture types in a sequence is very flexible. The choice is left to the encoder and will depend on the
requirements of the application. Figure 1 illustrates the relationship between the three different picture types.
`
[Figure 1: surviving label "Bi-directional"]

Figure 1 -- Example of temporal picture structure
`
The fourth picture type defined in this part of ISO/IEC 11172, the D-picture, is provided to allow a simple,
but limited quality, fast-forward playback mode.
`
`0.2.2 Motion representation - macroblocks
`
The choice of 16 by 16 macroblocks for the motion-compensation unit is a result of the trade-off between
increasing the coding efficiency provided by using motion information and the overhead needed to store it.
Each macroblock can be one of a number of different types. For example, intra-coded,
forward-predictive-coded, backward-predictive-coded, and bidirectionally-predictive-coded macroblocks are
permitted in bidirectionally-predictive coded pictures. Depending on the type of the macroblock, motion
vector information and other side information are stored with the compressed prediction error signal in
each macroblock. The motion vectors are encoded differentially with respect to the last coded motion
vector, using variable-length codes. The maximum length of the vectors that may be represented can be
programmed, on a picture-by-picture basis, so that the most demanding applications can be met without
compromising the performance of the system in more normal situations.
`
It is the responsibility of the encoder to calculate appropriate motion vectors. This part of ISO/IEC 11172
`does not specify how this should be done.
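The differential coding of motion vectors described in this clause can be illustrated with a short sketch. It is a simplification under stated assumptions: the function names are hypothetical, vectors are plain integer pairs in half-pel units, the predictor starts at (0, 0), and the variable-length coding, range handling and predictor resets defined in the normative clauses are not modelled.

```python
from typing import List, Tuple

Vector = Tuple[int, int]  # (horizontal, vertical), assumed to be in half-pel units


def encode_differentially(vectors: List[Vector]) -> List[Vector]:
    """Replace each vector by its difference from the previously coded vector."""
    previous = (0, 0)  # assumed initial predictor
    deltas = []
    for vx, vy in vectors:
        deltas.append((vx - previous[0], vy - previous[1]))
        previous = (vx, vy)
    return deltas


def decode_differentially(deltas: List[Vector]) -> List[Vector]:
    """Rebuild the vectors by accumulating the differences."""
    previous = (0, 0)
    vectors = []
    for dx, dy in deltas:
        previous = (previous[0] + dx, previous[1] + dy)
        vectors.append(previous)
    return vectors
```

Neighbouring macroblocks often move together, so the differences tend to be small values, which is what makes short variable-length codes effective.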
`
`
`0.2.3 Spatial redundancy reduction
`
Both original pictures and prediction error signals have high spatial redundancy. This part of ISO/IEC
11172 uses a block-based DCT method with visually weighted quantization and run-length coding. Each 8
by 8 block of the original picture for intra-coded macroblocks or of the prediction error for predictive-coded
macroblocks is transformed into the DCT domain where it is scaled before being quantized. After
quantization many of the coefficients are zero in value and so two-dimensional run-length and variable
length coding is used to encode the remaining coefficients efficiently.
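To make the transform step concrete, here is a minimal floating-point sketch of an 8 by 8 forward and inverse DCT followed by a quantizer. The function names are hypothetical, and the quantizer uses a single uniform step size purely for illustration; the visually weighted quantization of this part of ISO/IEC 11172 is defined in the normative clauses and is not reproduced here.

```python
import math

N = 8


def _scale(k: int) -> float:
    """Normalization factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise."""
    return math.sqrt(0.5) if k == 0 else 1.0


def forward_dct(block):
    """8 by 8 forward DCT of block[y][x]; returns coeffs[v][u]."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * _scale(u) * _scale(v) * total
    return coeffs


def inverse_dct(coeffs):
    """8 by 8 inverse DCT of coeffs[v][u]; returns block[y][x]."""
    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):
                for u in range(N):
                    total += (_scale(u) * _scale(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            block[y][x] = 0.25 * total
    return block


def quantize(coeffs, step: float):
    """Illustrative uniform quantizer (an assumption, not the standard's matrices)."""
    return [[round(c / step) for c in row] for row in coeffs]
```

A flat block shows why quantized blocks are sparse: all of its energy lands in the single zero-frequency coefficient, and every other coefficient quantizes to zero.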
`
`0.3 Encoding
`
This part of ISO/IEC 11172 does not specify an encoding process. It specifies the syntax and semantics of
the bitstream and the signal processing in the decoder. As a result, many options are left open to encoders
to trade off cost and speed against picture quality and coding efficiency. This clause is a brief description of
the functions that need to be performed by an encoder. Figure 2 shows the main functional blocks.

[Figure 2: block diagram; surviving label "Source input pictures"]

where

DCT is discrete cosine transform
DCT⁻¹ is inverse discrete cosine transform
Q is quantization
Q⁻¹ is dequantization
VLC is variable length coding

Figure 2 -- Simplified video encoder block diagram
`
The input video signal must be digitized and represented as a luminance and two colour difference signals
(Y, Cb, Cr). This may be followed by preprocessing and format conversion to select an appropriate
window, resolution and input format. This part of ISO/IEC 11172 requires that the colour difference
signals (Cb and Cr) are subsampled with respect to the luminance by 2:1 in both vertical and horizontal
directions and are reformatted, if necessary, as a non-interlaced signal.
`
The encoder must choose which picture type to use for each picture. Having defined the picture types, the
encoder estimates motion vectors for each 16 by 16 macroblock in the picture. In P-Pictures one vector is
needed for each non-intra macroblock and in B-Pictures one or two vectors are needed.
`
If B-Pictures are used, some reordering of the picture sequence is necessary before encoding. Because
B-Pictures are coded using bidirectional motion compensated prediction, they can only be decoded after the
subsequent reference picture (an I or P-Picture) has been decoded. Therefore the pictures are reordered by the
encoder so that the pictures arrive at the decoder in the order for decoding. The correct display order is
recovered by the decoder.
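The reordering can be sketched as follows. This is an illustrative helper with a hypothetical name, assuming the sequence is given as a string of picture types in display order; B-pictures left at the end of the input have no later reference in this sketch and are simply emitted last.

```python
def display_to_coded_order(picture_types: str) -> list:
    """Return display-order indices rearranged into coded order.

    Each I- or P-picture is moved ahead of the B-pictures that precede it
    in display order, because those B-pictures depend on it.
    """
    coded = []
    pending_b = []
    for index, kind in enumerate(picture_types):
        if kind == "B":
            pending_b.append(index)      # wait for the next reference picture
        else:
            coded.append(index)          # reference picture goes first
            coded.extend(pending_b)      # then the B-pictures that needed it
            pending_b.clear()
    coded.extend(pending_b)              # trailing B-pictures (simplification)
    return coded
```

For the display sequence `IBBPBBP` (indices 0 to 6) the coded order is 0, 3, 1, 2, 6, 4, 5.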
`
The basic unit of coding within a picture is the macroblock. Within each picture, macroblocks are encoded
in sequence, left to right, top to bottom. Each macroblock consists of six 8 by 8 blocks: four blocks of
luminance, one block of Cb chrominance, and one block of Cr chrominance. See figure 3. Note that the
picture area covered by the four blocks of luminance is the same as the area covered by each of the
chrominance blocks. This is due to subsampling of the chrominance information to match the sensitivity of
the human visual system.
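As an illustration of this structure, the sketch below splits one macroblock into its six 8 by 8 blocks. The function names are hypothetical, and the ordering of the luminance blocks (top-left, top-right, bottom-left, bottom-right) is stated here as an assumption of the sketch rather than quoted from this clause.

```python
def split_luminance(luma16):
    """Split a 16 by 16 luminance array into four 8 by 8 blocks.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    """
    blocks = []
    for top in (0, 8):
        for left in (0, 8):
            blocks.append([row[left:left + 8] for row in luma16[top:top + 8]])
    return blocks


def macroblock_blocks(luma16, cb8, cr8):
    """Return the six 8 by 8 blocks of a macroblock: four Y, then Cb, then Cr."""
    return split_luminance(luma16) + [cb8, cr8]
```

Because the chrominance is subsampled 2:1 in both directions, a single 8 by 8 Cb block and a single 8 by 8 Cr block cover the same 16 by 16 picture area as the four luminance blocks.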
`
[Figure 3: blocks labelled Y, Cb and Cr]

Figure 3 -- Macroblock structure
`
Firstly, for a given macroblock, the coding mode is chosen. It depends on the picture type, the
effectiveness of motion compensated prediction in that local region, and the nature of the signal within the
block. Secondly, depending on the coding mode, a motion compensated prediction of the contents of the
block based on past and/or future reference pictures is formed. This prediction is subtracted from the actual
data in the current macroblock to form an error signal. Thirdly, this error signal is separated into 8 by 8
blocks (4 luminance and 2 chrominance blocks in each macroblock) and a discrete cosine transform is
performed on each block. Each resulting 8 by 8 block of DCT coefficients is quantized and the
two-dimensional block is scanned in a zig-zag order to convert it into a one-dimensional string of quantized
DCT coefficients. Fourthly, the side-information for the macroblock (mode, motion vectors etc) and the
quantized coefficient data are encoded. For maximum efficiency, a number of variable length code tables are
defined for the different data elements. Run-length coding is used for the quantized coefficient data.
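The zig-zag scan and the run-length step can be sketched together. The helper names are hypothetical; the sketch produces (run of zeros, level) pairs for the non-zero coefficients and leaves the trailing zeros implicit, and it does not model the variable length code tables or any separate treatment of the dc coefficient.

```python
def zigzag_order(n: int = 8):
    """Return (row, column) positions of an n by n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                    # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_level_pairs(block):
    """Convert a quantized block into (zero_run, level) pairs along the zig-zag scan."""
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs                                     # trailing zeros are implied by end of block
```

A block whose only non-zero values are 64 at (0, 0), -3 at (1, 0) and 2 at (0, 2) becomes the three pairs (0, 64), (1, -3), (2, 2) in place of 64 separate values.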
`
A consequence of using different picture types and variable length coding is that the overall data rate is
variable. In applications that involve a fixed-rate channel, a FIFO buffer may be used to match the encoder
output to the channel. The status of this buffer may be monitored to control the number of bits generated
by the encoder. Controlling the quantization process is the most direct way of controlling the bitrate. This
part of ISO/IEC 11172 specifies an abstract model of the buffering system (the Video Buffering Verifier) in
order to constrain the maximum variability in the number of bits that are used for a given picture. This
ensures that a bitstream can be decoded with a buffer of known size.
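The idea of a decoder buffer of known size can be illustrated with a simplified simulation. This is not the normative Video Buffering Verifier of annex C: it assumes a constant-rate channel, removes each coded picture from the buffer instantaneously once per picture period, and all names and parameters are hypothetical.

```python
def check_buffer(picture_bits, bitrate, picture_rate, buffer_size, initial_fullness):
    """Simulate a decoder input buffer and report the first violation, if any.

    Returns (True, None, None) when every picture fits, otherwise
    (False, picture_index, "underflow" or "overflow").
    """
    fullness = initial_fullness
    bits_per_period = bitrate / picture_rate   # bits delivered between two pictures
    for index, bits in enumerate(picture_bits):
        if bits > fullness:
            return (False, index, "underflow")  # picture not fully received in time
        fullness -= bits                        # decoder removes the whole picture
        fullness += bits_per_period             # channel keeps filling the buffer
        if fullness > buffer_size:
            return (False, index, "overflow")
    return (True, None, None)
```

With a 1 200 000 bits/s channel at 25 Hz, each picture period delivers 48 000 bits; a large I-picture drains the buffer and the smaller pictures that follow let it refill, which is the behaviour an encoder's rate control has to keep within bounds.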
`
At this stage, the coded representation of the picture has been generated. The final step in the encoder is to
regenerate I-Pictures and P-Pictures by decoding the data so that they can be used as reference pictures for
subsequent encoding. The quantized coefficients are dequantized and an inverse 8 by 8 DCT is performed on
each block. The prediction error signal produced is then added back to the prediction signal and limited to
the required range to give a decoded reference picture.
`
`0.4 Decoding
`
Decoding is the inverse of the encoding operation. It is considerably simpler than encoding as there is no
need to perform motion estimation and there are many fewer options. The decoding process is defined by
this part of ISO/IEC 11172. The description that follows is a very brief overview of one possible way of
decoding a bitstream. Other decoders with different architectures are possible. Figure 4 shows the main
functional blocks.
`
`
`
`
`
[Figure 4: block diagram; surviving labels "Motion Vectors", "Picture store and Predictor", "Reconstructed output pictures"]

where

DCT⁻¹ is inverse discrete cosine transform
Q⁻¹ is dequantization
MUX⁻¹ is demultiplexing
VLD is variable length decoding

Figure 4 -- Basic video decoder block diagram
`
For fixed-rate applications, the channel fills a FIFO buffer at a constant rate with the coded bitstream. The
decoder reads this buffer and decodes the data elements in the bitstream according to the defined syntax.
`
As the decoder reads the bitstream, it identifies the start of a coded picture and then the type of the picture.
It decodes each macroblock in the picture in turn. The macroblock type and the motion vectors, if present,
are used to construct a prediction of the current macroblock based on past and future reference pictures that
have been stored in the decoder. The coefficient data are decoded and dequantized. Each 8 by 8 block of
coefficient data is transformed by an inverse DCT (specified in annex A), and the result is added to the
prediction signal and limited to the defined range.
`
After all the macroblocks in the picture have been processed, the picture has been reconstructed. If it is an
I-picture or a P-picture it is a reference picture for subsequent pictures and is stored, replacing the oldest
stored reference picture. Before the pictures are displayed they may need to be re-ordered from the coded
order to their natural display order. After reordering, the pictures are available, in digital form, for
post-processing and display in any manner that the application chooses.
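One possible way to recover display order in a decoder is sketched below. It is an assumption-laden illustration with a hypothetical function name: B-pictures are output as soon as they are decoded, while each I- or P-picture is held back until the next reference picture arrives. D-pictures are not handled.

```python
def coded_to_display_order(coded):
    """Reorder (picture_type, label) pairs from coded order into display order."""
    display = []
    held_reference = None
    for kind, label in coded:
        if kind == "B":
            display.append(label)                 # B-pictures are shown immediately
        else:
            if held_reference is not None:
                display.append(held_reference)    # previous reference is now due
            held_reference = label                # hold the new reference back
    if held_reference is not None:
        display.append(held_reference)            # flush at end of sequence
    return display
```

Feeding it the coded sequence I0, P3, B1, B2, P6, B4, B5 yields the labels 0 to 6 in ascending order.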
`
0.5 Structure of the coded video bitstream
`
`This part of ISO/IEC 11172 specifies a syntax for a coded video bitstream. This syntax contains six layers,
`each of which either supports a signal processing or a system function:
`
`
`
`
`
Layers of the syntax       Function

Sequence layer             Random access unit: context
Group of pictures layer    Random access unit: video
Picture layer              Primary coding unit
Slice layer                Resynchronization unit
Macroblock layer           Motion compensation unit
Block layer                DCT unit
`
`
0.6 Features supported by the algorithm
`
Applications using compressed video on digital storage media need to be able to perform a number of
operations in addition to normal forward playback of the sequence. The coded bitstream has been designed
to support a number of these operations.
`
`
`0.6.1 Random access
`
Random access is an essential feature for video on a storage medium. Random access requires that any
picture can be decoded in a limited amount of time. It implies the existence of access points in the
bitstream - that is, segments of information that are identifiable and can be decoded without reference to
other segments of data. A spacing of two random access points (Intra-Pictures) per second can be achieved
without significant loss of picture quality.
`
`0.6.2 Fast search
`
Depending on the storage medium, it is possible to scan the access points in a coded bitstream (with the
help of an application-specific directory or other knowledge beyond the scope of this part of ISO/IEC
11172) to obtain a fast-forward and fast-reverse playback effect.
`
`0.6.3 Reverse playback
`
Some applications may require the video signal to be played in reverse order. This can be achieved in a
decoder by using memory to store entire groups of pictures after they have been decoded before being
displayed in reverse order. An encoder can make this feature easier by reducing the length of groups of
pictures.
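A rough estimate shows why shorter groups of pictures help. The sketch below assumes 8 bits per sample and chrominance subsampled 2:1 in both directions, and it counts only the decoded pictures of one group; the function names are hypothetical and a real decoder would need further working memory.

```python
def decoded_picture_bytes(width_pels: int, height_lines: int) -> int:
    """Bytes for one decoded picture: full-size Y plus quarter-size Cb and Cr."""
    luminance = width_pels * height_lines
    chrominance = 2 * (width_pels // 2) * (height_lines // 2)
    return luminance + chrominance


def reverse_playback_bytes(width_pels: int, height_lines: int, pictures_per_group: int) -> int:
    """Memory needed to hold one whole decoded group of pictures."""
    return decoded_picture_bytes(width_pels, height_lines) * pictures_per_group
```

For 352 by 288 pictures, a group of 12 pictures needs 1 824 768 bytes, while a group of 6 needs half of that, 912 384 bytes.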
`
`0.6.4 Error robustness
`
Most digital storage media and communication channels are not error-free. Appropriate channel coding
schemes should be used and are beyond the scope of this part of ISO/IEC 11172. Nevertheless the
compression scheme defined in this part of ISO/IEC 11172 is robust to residual errors. The slice structure
allows a decoder to recover after a data error and to resynchronize its decoding. Therefore, bit errors in the
compressed data will cause errors in the decoded pictures to be limited in area. Decoders may be able to use
concealment strategies to disguise these errors.
`
`0.6.5 Editing
`
`There is a conflict between the requirement for high coding efficiency and easy editing. The coding structure
`and syntax have not been designed with the primary aim of simplifying editing at any picture. Nevertheless
a number of features have been included that enable editing of coded data.
`
`
`Information technology — Coding of moving
`pictures and associated audio for digital storage
`media at up to about 1,5 Mbit/s —
`
`Part 2:
`Video
`
`Section 1: General
`
1.1 Scope
`
This part of ISO/IEC 11172 specifies the coded representation of video for digital storage media and
specifies the decoding process. The representation supports normal speed forward playback, as well as
special functions such as random access, fast forward playback, fast reverse playback, normal speed reverse
playback, pause and still pictures. This part of ISO/IEC 11172 is compatible with standard 525- and
625-line television formats, and it provides flexibility for use with personal computer and workstation displays.
`
ISO/IEC 11172 is primarily applicable to digital storage media supporting a continuous transfer rate up to
about 1,5 Mbit/s, such as Compact Disc, Digital Audio Tape, and magnetic hard disks. Nevertheless it can
be used more widely than this because of the generic approach taken. The storage media may be directly
connected to the decoder, or via communications means such as busses, LANs, or telecommunications
links. This part of ISO/IEC 11172 is intended for non-interlaced video formats having approximately 288
lines of 352 pels and picture rates around 24 Hz to 30 Hz.
`
`1.2 Normative references
`
The following International Standards contain provisions which, through reference in this text, constitute
provisions of this part of ISO/IEC 11172. At the time of publication, the editions indicated were valid.
All standards are subject to revision, and parties to agreements based on this part of ISO/IEC 11172 are
encouraged to investigate the possibility of applying the most recent editions of the standards indicated
below. Members of IEC and ISO maintain registers of currently valid International Standards.
`
ISO/IEC 11172-1:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 1: Systems.

ISO/IEC 11172-3:1993 Information technology - Coding of moving pictures and associated audio for digital
storage media at up to about 1,5 Mbit/s - Part 3: Audio.

CCIR Recommendation 601-2 Encoding parameters of digital television for studios.

CCIR Report 624-4 Characteristics of systems for monochrome and colour television.

CCIR Recommendation 648 Recording of audio signals.

CCIR Report 955-2 Sound broadcasting by satellite for portable and mobile receivers, including Annex IV
Summary description of Advanced Digital System II.

CCITT Recommendation J.17 Pre-emphasis used on Sound-Programme Circuits.
`
`
IEEE Draft Standard P1180/D2 1990 Specification for the implementation of 8 x 8 inverse discrete cosine
transform.
`
`IEC publication 908:1987 CD Digital Audio System.
`
`
`Section 2: Technical elements
`
2.1 Definitions
`
`For the purposes of ISO/IEC 11172, the following definitions apply. If specific to a part, this is noted in
`square brackets.
`
`2.1.1 ac coefficient [video]: Any DCT coefficient for which the frequency in one or both dimensions
`is non-zero.
`
2.1.2 access unit [system]: In the case of compressed audio an access unit is an audio access unit. In
the case of compressed video an access unit is the coded representation of a picture.
`
`2.1.3 adaptive segmentation [audio]: A subdivision of the digital representation of an audio signal
`in variable segments of time.
`
`2.1.4 adaptive bit allocation [audio]: The assignment of bits to subbands in a time and frequency
`varying fashion according to a psychoacoustic model.
`
`2.1.5 adaptive noise allocation [audio]: The assignment of coding noise to frequency bands in a
`time and frequency varying fashion according to a psychoacoustic model.
`
`2.1.6 alias [audio]: Mirrored signal component resulting from sub-Nyquist sampling.
`
`2.1.7 analysis filterbank [audio]: Filterbank in the encoder that transforms a broadband PCM audio
`signal into a set of subsampled subband samples.
`
2.1.8 audio access unit [audio]: For Layers I and II an audio access unit is defined as the smallest
part of the encoded bitstream which can be decoded by itself, where decoded means "fully reconstructed
sound". For Layer III an audio access unit is part of the bitstream that is decodable with the use of
previously acquired main information.
`
`2.1.9 audio buffer [audio]: A buffer in the system target decoder for storage of compressed audio data.
`
`2.1.10 audio sequence [audio]: A non-interrupted series of audio frames in which the following
`parameters are not changed:
- ID
- Layer
- Sampling Frequency
- For Layer I and II: Bitrate index
`
`2.1.11 backward motion vector [video]: A motion vector that is used for motion compensation
`from a reference picture at a later time in display order.
`
`2.1.12 Bark [audio]: Unit of critical band rate. The Bark scale is a non-linear mapping of the frequency
`scale over the audio range closely corresponding with the frequency selectivity of the human ear across the
`band.
`
`2.1.13 bidirectionally predictive-coded picture; B-picture [video]: A picture that is coded
`using motion compensated prediction from a past and/or future reference picture.
`
`2.1.14 bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the
`input of a decoder.
`
`2.1.15 block companding [audio]: Normalizing of the digital representation of an audio signal
`within a certain time period.
`
2.1.16 block [video]: An 8-row by 8-column orthogonal block of pels.
`
`2.1.17 bound [audio]: The lowest subband in which intensity stereo coding is used.
`
`
`2.1.18 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8-bits
from the first bit in the stream.
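As a small illustration of this definition, the sketch below tests a bit position counted from zero at the first bit of the stream. The function names are hypothetical helpers, not syntax elements of ISO/IEC 11172.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """True when the position is a multiple of 8 bits from the first bit (position 0)."""
    return bit_position % 8 == 0


def bits_to_next_alignment(bit_position: int) -> int:
    """Number of bits to skip to reach the next byte-aligned position (0 if aligned)."""
    return (-bit_position) % 8
```

For instance, position 13 is not byte-aligned and is 3 bits short of the aligned position 16.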
`
`2.1.19 byte: Sequence of 8-bits.
`
`2.1.20 channel: A digital medium that stores or transports an ISO/IEC 11172 stream.
`
2.1.21 channel [audio]: The left and right channels of a stereo signal.
`
`2.1.22 chrominance (component) [video]: A matrix, block or single pel representing one of the
`two colour difference signals related to the primary colours in the manner defined in CCIR Rec 601. The
`symbols used for the colour difference signals are Cr and Cb.
`
`2.1.23 coded audio bitstream [audio]: A coded representation of an audio signal as specified in
`ISO/IEC 11172-3.
`
`2.1.24 coded video bitstream [video]: A coded representation of a series of one or more pictures as
`specified in this part of ISO/IEC 11172.
`
`2.1.25 coded order [video]: The order in which the pictures are stored and decoded. This order is not
`necessarily the same as the display order.
`
`2.1.26 coded representation: A data element as represented in its encoded form.
`
2.1.27 coding parameters [video]: The set of user-definable parameters that characterize a coded video
`bitstream. Bitstreams are characterised by coding parameters. Decoders are characterised by the bitstreams
`that they are capable of decoding.
`
2.1.28 component [video]: A matrix, block or single pel from one of the three matrices (luminance
and two chrominance) that make up a picture.
`
2.1.29 compression: Reduction in the number of bits used to represent an item of data.
`
`2.1.30 constant bitrate coded video [video]: A compressed video bitstream with a constant
`average bitrate.
`
`2.1.31 constant bitrate: Operation where the bitrate is constant from start to finish of the compressed
`bitstream.
`
2.1.32 constrained parameters [video]: The values of the set of coding parameters defined in
2.4.3.2.
`
`2.1.33 constrained system parameter stream (CSPS) [system]: An ISO/IEC 11172
multiplexed stream for which the constraints defined in 2.4.6 of ISO/IEC 11172-1 apply.
`
`2.1.34 CRC: Cyclic redundancy code.
`
2.1.35 critical band rate [audio]: Psychoacoustic function of frequency. At a given audible
frequency it is proportional to the number of critical bands below that frequency. The units of the critical
band rate scale are Barks.
`
`2.1.36 critical band [audio]: Psychoacoustic measure in the spectral domain which corresponds to the
frequency selectivity of the human ear. This selectivity is expressed in Bark.
`
`2.1.37 data element: An item of data as represented before encoding and after decoding.
`
2.1.38 dc-coefficient [video]: The DCT coefficient for which the frequency is zero in both
`dimensions.
`
`
`2.1.39 dc-coded picture; D-picture [video]: A picture that is coded using only information from
itself. Of the DCT coefficients in the coded representation, only the dc-coefficients are present.
`
2.1.40 DCT coefficient: The amplitude of a specific cosine basis function.
`
`2.1.41 decoded stream: The decoded reconstruction of a compressed bitstream.
`
2.1.42 decoder input buffer [video]: The first-in first-out (FIFO) buffer specified in the video
buffering verifier.
`
`2.1.43 decoder input rate [video]: The data rate specified in the video buffering verifier and encoded
`in the coded video bitstream.
`
2.1.44 decoder: An embodiment of a decoding process.
`
`2.1.45 decoding (process): The process defined in ISO/IEC 11172 that reads an input coded bitstream
`and produces decoded pictures or audio samples.
`
`2.1.46 decoding time-stamp; DTS [system]: A field that may be present in a packet header that
`indicates the time that an access unit is decoded in the system target decoder.
`
`2.1.47 de-emphasis [audio]: Filtering applied to an audio signal after storage or transmission to undo
`a linear distortion due to emphasis.
`
2.1.48 dequantization [video]: The process of rescaling the quantized DCT coefficients after their
`representation in the bitstream has been decoded and before they are presented to the inverse DCT.
`
2.1.49 digital storage media; DSM: A digital storage or transmission device or system.
`
`2.1.50 discrete cosine transform; DCT [video]: Either the forward discrete cosine transform or the
inverse discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation. The
inverse DCT is defined in annex A.
`
`2.1.51 display order [video]: The order in which the decoded pictures should be displayed. Normally
`this is the same order in which they were presented at the input of the encoder.
`
`2.1.52 dual channel mode [audio]: A mode, where two audio channels with independent programme
contents (e.g. bilingual) are encoded within one bitstream. The coding process is the same as for the stereo
`mode.
`
2.1.53 editing: The process by which one or more compressed bitstreams are manipulated to produce a
new compressed bitstream. Conforming edited bitstreams must meet the requirements defined in this part of
`ISO/IEC 11172.
`
2.1.54 elementary stream [system]: A generic term for one of the coded video, coded audio or other
`coded bitstreams.
`
`2.1.55 emphasis [audio]: Filtering applied to an audio signal before storage or transmission to
`improve the signal-to-noise ratio at high frequencies.
`
`2.1.56 encoder: An embodiment of an encoding process.
`
`2.1.57 encoding (process): A process, not specified in ISO/IEC 11172, that reads a stream of in