`
`
`Comcast, Ex. 1043
`
`
`
`
`The MPEG
`Handbook
`MPEG-1, MPEG-2, MPEG-4
`
`John Watkinson
`
`Focal Press
`OXFORD AUCKLAND BOSTON JOHANNESBURG MELBOURNE NEW DELHI
`
`
`
`
`Focal Press
`An imprint of Butterworth-Heinemann
Linacre House, Jordan Hill, Oxford OX2 8DP
`225 Wildwood Avenue, Woburn, MA 01801-2041
`A division of Reed Educational and Professional Publishing Ltd
`
A member of the Reed Elsevier plc group
`
`First published 2001
`
`© John Watkinson 2001
`
`All rights reserved. No part of this publication may be reproduced in any
`material form (including photocopying or storing in any medium by
`electronic means and whether or not transiently or incidentally to some
`other use of this publication) without the written permission of the
`copyright holder except in accordance with the provisions of the Copyright,
`Designs and Patents Act 1988 or under the terms of a licence issued by the
`Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London,
England W1P 0LP. Applications for the copyright holder's written
permission to reproduce any part of this publication should be addressed
to the publishers.
`
`British Library Cataloguing in Publication Data
`A catalogue record for this book is available from the British Library
`
`Library of Congress Cataloguing in Publication Data
`A catalogue record for this book is available from the Library of Congress
`
`For information on all Focal Press publications visit
`our website at www.focalpress.com
`
ISBN 0 240 51656 7
`
`Composition by Genesis Typesetting, Rochester, Kent
`Printed and bound in Great Britain
`
FOR EVERY TITLE THAT WE PUBLISH, BUTTERWORTH-HEINEMANN
`WILL PAY FOR BTCV TO PLANT AND CARE FOR A TREE.
`
Contents

Preface
Acknowledgements

Chapter 1 Introduction to compression
1.1 What is MPEG?
1.2 Why compression is necessary
1.3 MPEG-1, 2 and 4 contrasted
1.4 Some applications of compression
1.5 Lossless and perceptive coding
1.6 Compression principles
1.7 Video compression
1.7.1 Intra-coded compression
1.7.2 Inter-coded compression
1.7.3 Introduction to motion compensation
1.7.4 Film-originated video compression
1.8 Introduction to MPEG-1
1.9 MPEG-2: Profiles and Levels
1.10 Introduction to MPEG-4
1.11 Audio compression
1.11.1 Sub-band coding
1.11.2 Transform coding
1.11.3 Predictive coding
1.12 MPEG bitstreams
1.13 Drawbacks of compression
`
`
`
1.14 Compression pre-processing
1.15 Some guidelines
References

Chapter 2 Fundamentals
2.1 What is an audio signal?
2.2 What is a video signal?
2.3 Types of video
2.4 What is a digital signal?
2.5 Sampling
2.6 Reconstruction
2.7 Aperture effect
2.8 Choice of audio sampling rate
2.9 Video sampling structures
2.10 The phase-locked loop
2.11 Quantizing
2.12 Quantizing error
2.13 Dither
2.14 Introduction to digital processing
2.15 Logic elements
2.16 Storage elements
2.17 Binary coding
2.18 Gain control
2.19 Floating-point coding
2.20 Multiplexing principles
2.21 Packets
2.22 Statistical multiplexing
2.23 Timebase correction
References

Chapter 3 Processing for compression
3.1 Introduction
3.2 Transforms
3.3 Convolution
3.4 FIR and IIR filters
3.5 FIR filters
3.6 Interpolation
3.7 Downsampling filters
3.8 The quadrature mirror filter
3.9 Filtering for video noise reduction
3.10 Warping
`
3.11 Transforms and duality
3.12 The Fourier transform
3.13 The discrete cosine transform (DCT)
3.14 The wavelet transform
3.15 The importance of motion compensation
3.16 Motion-estimation techniques
3.16.1 Block matching
3.16.2 Gradient matching
3.16.3 Phase correlation
3.17 Motion-compensated displays
3.18 Camera-shake compensation
3.19 Motion-compensated de-interlacing
3.20 Compression and requantizing
References

Chapter 4 Audio compression
4.1 Introduction
4.2 The deciBel
4.3 Audio level metering
4.4 The ear
4.5 The cochlea
4.6 Level and loudness
4.7 Frequency discrimination
4.8 Critical bands
4.9 Beats
4.10 Codec level calibration
4.11 Quality measurement
4.12 The limits
4.13 Compression applications
4.14 Audio compression tools
4.15 Sub-band coding
4.16 Audio compression formats
4.17 MPEG audio compression
4.18 MPEG Layer I audio coding
4.19 MPEG Layer II audio coding
4.20 MPEG Layer III audio coding
4.21 MPEG-2 AAC - advanced audio coding
4.22 Dolby AC-3
4.23 MPEG-4 Audio
4.24 MPEG-4 AAC
4.25 Compression in stereo and surround sound
References
`
`
`
Chapter 5 MPEG video compression
5.1 The eye
5.2 Dynamic resolution
5.3 Contrast
5.4 Colour vision
5.5 Colour difference signals
5.6 Progressive or interlaced scan?
5.7 Spatial and temporal redundancy in MPEG
5.8 I and P coding
5.9 Bidirectional coding
5.10 Coding applications
5.11 Intra-coding
5.12 Intra-coding in MPEG-1 and MPEG-2
5.13 A bidirectional coder
5.14 Slices
5.15 Handling interlaced pictures
5.16 MPEG-1 and MPEG-2 coders
5.17 The elementary stream
5.18 An MPEG-2 decoder
5.19 MPEG-4
5.20 Video objects
5.21 Texture coding
5.22 Shape coding
5.23 Padding
5.24 Video object coding
5.25 Two-dimensional mesh coding
5.26 Sprites
5.27 Wavelet-based compression
5.28 Three-dimensional mesh coding
5.29 Animation
5.30 Scaleability
5.31 Coding artifacts
5.32 MPEG and concatenation
References

Chapter 6 Program and transport streams
6.1 Introduction
6.2 Packets and time stamps
6.3 Transport streams
6.4 Clock references
6.5 Program Specific Information (PSI)
`
`
`
6.6 Multiplexing
6.7 Remultiplexing
Reference

Chapter 7 MPEG applications
7.1 Introduction
7.2 Video phones
7.3 Digital television broadcasting
7.4 The DVB receiver
7.5 CD-Video and DVD
7.6 Personal video recorders
7.7 Networks
7.8 FireWire
7.9 Broadband networks and ATM
7.10 ATM AALs
References

Index
`
`
`
`Preface
`
`This book completely revises the earlier book entitled MPEG-2. It is an
`interesting insight into the rate at which this technology progresses that
`this book was in preparation only a year after MPEG-2 was first
`published. The impetus for the revision is, of course, MPEG-4 which is
`comprehensively covered here. The opportunity has also been taken to
improve a number of explanations and to add a chapter on applications
of MPEG.
The approach of the book has not changed in the slightest. Compression
is a specialist subject with its own library of specialist terminology
`which is generally accompanied by a substantial amount of mathematics.
`I have always argued that mathematics is only a form of shorthand, itself
`a compression technique! Mathematics describes but does not explain,
`whereas this book explains and then describes.
`A chapter of fundamentals is included to make the main chapters
`easier to follow. Also included are some guidelines which have been
`found practically useful in getting the best out of compression systems.
`The reader who has endured this book will be in a good position to
`tackle the MPEG standards documents themselves, although these are
`not for the faint-hearted, especially the MPEG-4 documents which are
`huge and impenetrable. One wonders what they will come up with
`next!
`
`
`
`
`Acknowledgements
`
`Information for this book has come from a number of sources to whom I
`am indebted. The publications of the ISO, AES and SMPTE provided
`essential reference material. Thanks also to the following for lengthy
`discussions and debates: Peter de With, Steve Lyman, Bruce Devlin, Mike
`Knee, Peter Kraniauskas and Tom MacMahon. The assistance of
`MicroSoft Corp. and Tektronix Inc. is also appreciated. Special thanks to
`Mikael Reichel.
`
`
`
`
Chapter 1 Introduction to compression

1.1 What is MPEG?
`
MPEG is actually an acronym for the Moving Picture Experts Group,
which was formed by the ISO (International Organization for Standardization) to
set standards for audio and video compression and transmission.
`Compression is summarized in Figure 1.1. It will be seen in (a) that the
`data rate is reduced at source by the compressor. The compressed data are
`then passed through a communication channel and returned to the original
`rate by the expander. The ratio between the source data rate and the channel
`data rate is called the compression factor. The term coding gain is also used.
`Sometimes a compressor and expander in series are referred to as a
`compander. The compressor may equally well be referred to as a coder and
`the expander a decoder in which case the tandem pair may be called a
`codec.
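
The definition of compression factor given above can be illustrated with a minimal sketch. The function name and the two data rates are assumptions chosen purely for illustration; they are not taken from any MPEG document.

```python
def compression_factor(source_rate_bps: float, channel_rate_bps: float) -> float:
    """Return the ratio of the source data rate to the channel data rate."""
    if source_rate_bps <= 0 or channel_rate_bps <= 0:
        raise ValueError("data rates must be positive")
    return source_rate_bps / channel_rate_bps


# Hypothetical figures: a 270 Mbit/s source carried in a 6 Mbit/s channel.
source_rate = 270e6
channel_rate = 6e6
factor = compression_factor(source_rate, channel_rate)
print(f"compression factor = {factor:.0f}:1")
```

The expander is assumed to return the data to the original rate, so the same ratio describes both ends of the channel.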
`Where the encoder is more complex than the decoder, the system is said
to be asymmetrical. Figure 1.1(b) shows that MPEG works in this way. The
`encoder needs to be algorithmic or adaptive whereas the decoder is 'dumb'
`and carries out fixed actions. This is advantageous in applications such as
`broadcasting where the number of expensive complex encoders is small
`but the number of simple inexpensive decoders is large. In point-to-point
`applications the advantage of asymmetrical coding is not so great.
`The approach of the ISO to standardization in MPEG is novel because
`it is not the encoder which is standardized. Figure 1.2(a) shows that
`instead the way in which a decoder shall interpret the bitstream is
`defined. A decoder which can successfully interpret the bitstream is said
to be compliant. Figure 1.2(b) shows that the advantage of standardizing
the decoder is that over time encoding algorithms can improve yet
`compliant decoders will continue to function with them.
`
Figure 1.1 In (a) a compression system consists of a compressor or coder, a transmission
channel and a matching expander or decoder. The combination of coder and decoder is
known as a codec. (b) MPEG is asymmetrical since the encoder is much more complex
than the decoder.
`
`It should be noted that a compliant decoder must correctly be able to
`interpret every allowable bitstream, whereas an encoder which produces
`a restricted subset of the possible codes can still be compliant.
`The MPEG standards give very little information regarding the
`structure and operation of the encoder. Provided the bitstream is
`compliant, any coder construction will meet the standard, although some
`designs will give better picture quality than others. Encoder construction
`is not revealed in the bitstream and manufacturers can supply encoders
`using algorithms which are proprietary and their details do not need to
`be published. A useful result is that there can be competition between
`different encoder designs which means that better designs can evolve.
`The user will have greater choice because different levels of cost and
`complexity can exist in a range of coders yet a compliant decoder will
`operate with them all.
`MPEG is, however, much more than a compression scheme as it also
`standardizes the protocol and syntax under which it is possible to
`combine or multiplex audio data with video data to produce a digital
`equivalent of a television program. Many such programs can be
`
combined in a single multiplex and MPEG defines the way in which such
multiplexes can be created and transported. The definitions include the
metadata which decoders require to demultiplex correctly and which
users will need to locate programs of interest.
As with all video systems there is a requirement for synchronizing or
genlocking and this is particularly complex when a multiplex is
assembled from many signals which are not necessarily synchronized to
one another.

Figure 1.2 (a) MPEG defines the protocol of the bitstream between encoder and
decoder. The decoder is defined by implication, the encoder is left very much to the
designer. (b) This approach allows future encoders of better performance to remain
compatible with existing decoders. (c) This approach also allows an encoder to produce
a standard bitstream while its technical operation remains a commercial secret.
`
1.2 Why compression is necessary
`
Compression, bit rate reduction, data reduction and source coding are all
terms which mean basically the same thing in this context. In essence the
same (or nearly the same) information is carried using a smaller quantity
or rate of data. It should be pointed out that in audio, compression
traditionally means a process in which the dynamic range of the sound is
reduced. In the context of MPEG the same word means that the bit rate
is reduced, ideally leaving the dynamics of the signal unchanged.
Provided the context is clear, the two meanings can co-exist without a
great deal of confusion.
`There are several reasons why compression techniques are popular:
`
`(a) Compression extends the playing time of a given storage device.
`(b) Compression allows miniaturization. With fewer data to store, the
`same playing time is obtained with smaller hardware. This is useful
`in ENG (electronic news gathering) and consumer devices.
`(c) Tolerances can be relaxed. With fewer data to record, storage density
`can be reduced making equipment which is more resistant to adverse
`environments and which requires less maintenance.
(d) In transmission systems, compression allows a reduction in bandwidth
which will generally result in a reduction in cost. This may
make possible a service which would be impracticable without it.
`(e) If a given bandwidth is available to an uncompressed signal,
`compression allows faster than real-time transmission in the same
`bandwidth.
`(f) If a given bandwidth is available, compression allows a better-quality
`signal in the same bandwidth.
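
Points (a) and (e) in the list above are simple arithmetic, sketched below. The function names, the storage capacity and all the data rates are hypothetical values chosen for illustration only.

```python
def playing_time_s(capacity_bits: float, source_rate_bps: float,
                   factor: float = 1.0) -> float:
    """Playing time of a store when the source is compressed by `factor`."""
    return capacity_bits * factor / source_rate_bps


def transfer_time_s(duration_s: float, source_rate_bps: float,
                    channel_rate_bps: float, factor: float = 1.0) -> float:
    """Time needed to send `duration_s` seconds of material down a channel."""
    compressed_bits = duration_s * source_rate_bps / factor
    return compressed_bits / channel_rate_bps


# (a) Hypothetical 8 Gbit store and 100 Mbit/s source.
print(playing_time_s(8e9, 100e6))        # uncompressed playing time
print(playing_time_s(8e9, 100e6, 10))    # with a compression factor of 10

# (e) One minute of material sent through a channel which could have
# carried the uncompressed signal in real time.
print(transfer_time_s(60, 100e6, 100e6, 4))
```

With these assumed numbers the playing time rises in proportion to the compression factor, and one minute of material compressed by a factor of four crosses the channel in a quarter of real time.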
`
1.3 MPEG-1, 2 and 4 contrasted
`
The first compression standard for audio and video was MPEG-1 [1,2].
Although many applications have been found, MPEG-1 was basically
designed to allow moving pictures and sound to be encoded into the bit
rate of an audio Compact Disc. The resultant Video-CD was quite
successful but has now been superseded by DVD. In order to meet the
low bit rate requirement, MPEG-1 downsampled the images heavily as well as
using picture rates of only 24-30 Hz and the resulting quality was
moderate.
The subsequent MPEG-2 standard was considerably broader in scope
and of wider appeal [3]. For example, MPEG-2 supports interlace and HD
whereas MPEG-1 did not. MPEG-2 has become very important because it
has been chosen as the compression scheme for both DVB (digital video
broadcasting) and DVD (digital video disk). Developments in standardizing
scaleable and multi-resolution compression which would have
become MPEG-3 were ready by the time MPEG-2 was ready to be
standardized and so this work was incorporated into MPEG-2, and as a
result there is no MPEG-3 standard.
MPEG-4 [4] uses further coding tools with additional complexity to
achieve higher compression factors than MPEG-2. In addition to more
efficient coding of video, MPEG-4 moves closer to computer graphics
applications. In the more complex Profiles, the MPEG-4 decoder
effectively becomes a rendering processor and the compressed bitstream
describes three-dimensional shapes and surface texture. It is to be
expected that MPEG-4 will become as important to Internet and wireless
delivery as MPEG-2 has become in DVD and DVB.
`
1.4 Some applications of compression
`
`The applications of audio and video compression are limitless and the
`ISO has done well to provide standards which are appropriate to the
`wide range of possible compression products.
`MPEG coding embraces video pictures from the tiny screen of a
`videophone to the high-definition images needed for electronic cinema.
`Audio coding stretches from speech-grade mono to multichannel
`surround sound.
`Figure 1.3 shows the use of a codec with a recorder. The playing time
`of the medium is extended in proportion to the compression factor. In the
`case of tapes, the access time is improved because the length of tape
`needed for a given recording is reduced and so it can be rewound more
`quickly. In the case of DVD (digital video disk aka digital versatile disk)
`the challenge was to store an entire movie on one 12 cm disk. The storage
`density available with today's optical disk technology is such that
`consumer recording of conventional uncompressed video would be out
`of the question.
In communications, the cost of data links is often roughly proportional
to the data rate and so there is simple economic pressure to use a high
compression factor. However, it should be borne in mind that implementing
the codec also has a cost which rises with compression factor and so
a degree of compromise will be inevitable.

Figure 1.3 Compression can be used around a recording medium. The storage capacity
may be increased or the access time reduced according to the application.

In the case of video-on-demand, technology exists to convey full
bandwidth video to the home, but to do so for a single individual at the
moment would be prohibitively expensive. Without compression, HDTV
(high-definition television) requires too much bandwidth. With compression,
HDTV can be transmitted to the home in a similar bandwidth to an
existing analog SDTV channel. Compression does not make video-on-demand
or HDTV possible; it makes them economically viable.
In workstations designed for the editing of audio and/or video, the
source material is stored on hard disks for rapid access. Whilst top-grade
systems may function without compression, many systems use compression
to offset the high cost of disk storage. In some systems a compressed
version of the top-grade material may also be stored for browsing
purposes.
When a workstation is used for off-line editing, a high compression
factor can be used and artifacts will be visible in the picture. This is of no
consequence as the picture is only seen by the editor who uses it to make
an EDL (edit decision list) which is no more than a list of actions and the
timecodes at which they occur. The original uncompressed material is
then conformed to the EDL to obtain a high-quality edited work. When on-line
editing is being performed, the output of the workstation is the
finished product and clearly a lower compression factor will have to be
used.
Perhaps it is in broadcasting where the use of compression will
have its greatest impact. There is only one electromagnetic spectrum and
pressure from other services such as cellular telephones makes efficient
use of bandwidth mandatory. Analog television broadcasting is an old
technology and makes very inefficient use of bandwidth. Its replacement
by a compressed digital transmission is inevitable for the practical reason
that the bandwidth is needed elsewhere.
`Fortunately in broadcasting there is a mass market for decoders and
`these can be implemented as low-cost integrated circuits. Fewer encoders
`are needed and so it is less important if these are expensive. Whilst the
`cost of digital storage goes down year on year, the cost of the
`electromagnetic spectrum goes up. Consequently in the future the
`pressure to use compression in recording will ease whereas the pressure
`to use it in radio communications will increase.
`
1.5 Lossless and perceptive coding
`
Although there are many different coding techniques, all of them fall into
one or other of these categories. In lossless coding, the data from the
expander are identical bit-for-bit with the original source data. The
so-called 'stacker' programs which increase the apparent capacity of disk
drives in personal computers use lossless codecs. Clearly with computer
programs the corruption of a single bit can be catastrophic. Lossless
coding is generally restricted to compression factors of around 2:1.
`It is important to appreciate that a lossless coder cannot guarantee a
`particular compression factor and the communications link or recorder
`used with it must be able to function with the variable output data rate.
`Source data which result in poor compression factors on a given codec are
`described as difficult. It should be pointed out that the difficulty is often
`a function of the codec. In other words data which one codec finds
`difficult may not be found difficult by another. Lossless codecs can be
`included in bit-error-rate testing schemes. It is also possible to cascade or
`concatenate lossless codecs without any special precautions.
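
The behaviour of a lossless codec described above can be demonstrated with a short sketch. Python's standard `zlib` module is used here purely as a convenient stand-in for a lossless codec; it is not the algorithm of any particular 'stacker' product, and the two test inputs are invented for illustration.

```python
import random
import zlib


def lossless_round_trip(data: bytes):
    """Compress and expand `data`; report bit-identity and compression factor."""
    packed = zlib.compress(data, 9)
    restored = zlib.decompress(packed)
    return restored == data, len(data) / len(packed)


rng = random.Random(0)  # fixed seed so the run is repeatable
inputs = {
    "predictable": b"MPEG " * 2000,                              # highly repetitive
    "noise-like": bytes(rng.getrandbits(8) for _ in range(10000)),  # 'difficult'
}

results = {name: lossless_round_trip(data) for name, data in inputs.items()}
for name, (identical, factor) in results.items():
    print(f"{name}: bit-identical={identical}, compression factor={factor:.2f}:1")
```

Both inputs are returned bit-for-bit, but the compression factor achieved depends entirely on the material, which is why the channel or recorder must cope with a variable output data rate.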
`Higher compression factors are only possible with lossy coding in
`which data from the expander are not identical bit-for-bit with the source
`data and as a result comparing the input with the output is bound to reveal
`differences. Lossy codecs are not suitable for computer data, but are used
`in MPEG as they allow greater compression factors than lossless codecs.
`Successful lossy codecs are those in which the errors are arranged so that a
`human viewer or listener finds them subjectively difficult to detect. Thus
`lossy codecs must be based on an understanding of psycho-acoustic and
`psycho-visual perception and are often called perceptive codes.
`In perceptive coding, the greater the compression factor required, the
`more accurately must the human senses be modelled. Perceptive coders
`can be forced to operate at a fixed compression factor. This is convenient
`for practical transmission applications where a fixed data rate is easier to
`handle than a variable rate. The result of a fixed compression factor is that
`the subjective quality can vary with the 'difficulty' of the input material.
`Perceptive codecs should not be concatenated indiscriminately especially
`if they use different algorithms. As the reconstructed signal from a
`perceptive codec is not bit-for-bit accurate, clearly such a codec cannot be
`included in any bit error rate testing system as the coding differences
`would be indistinguishable from real errors.
`Although the adoption of digital techniques is recent, compression
`itself is as old as television. Figure 1.4 shows some of the compression
`techniques used in traditional television systems.
Most video signals employ a non-linear relationship between brightness
and the signal voltage which is known as gamma. Gamma is a
perceptive coding technique which depends on the human sensitivity to
video noise being a function of the brightness. The use of gamma allows
the same subjective noise level with an eight-bit system as would be
achieved with a fourteen-bit linear system.
One of the oldest techniques is interlace, which has been used in analog
television from the very beginning as a primitive way of reducing
bandwidth. As will be seen in Chapter 5, interlace is not without its
problems, particularly in motion rendering. MPEG-2 supports interlace
simply because legacy interlaced signals exist and there is a requirement
to compress them. This should not be taken to mean that it is a good
idea.

Figure 1.4 Compression is as old as television. (a) Interlace is a primitive way of
halving the bandwidth. (b) Colour difference working invisibly reduces colour
resolution. (c) Composite video transmits colour in the same bandwidth as monochrome.

The generation of colour difference signals from RGB in video
represents an application of perceptive coding. The human visual system
(HVS) sees no change in quality although the bandwidth of the colour
difference signals is reduced. This is because human perception of detail
in colour changes is much less than in brightness changes. This approach
is sensibly retained in MPEG.
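
The saving offered by colour difference working can be sketched numerically. The luminance weights below are the familiar standard-definition values (0.299, 0.587, 0.114) and the 2:1 horizontal reduction of the colour difference signals is an assumed example; neither is specified in this section, and the picture size is hypothetical.

```python
def rgb_to_colour_difference(r: float, g: float, b: float):
    """Return luminance and two colour difference values (B-Y, R-Y)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # assumed luminance weights
    return y, b - y, r - y


def samples_per_picture(width: int, height: int,
                        chroma_h: int = 1, chroma_v: int = 1) -> int:
    """Count samples when the colour difference signals are subsampled."""
    luma = width * height
    chroma = 2 * (width // chroma_h) * (height // chroma_v)
    return luma + chroma


# A neutral (grey or white) pixel has no colour difference content.
y, b_y, r_y = rgb_to_colour_difference(1.0, 1.0, 1.0)
print(round(y, 6), round(b_y, 6), round(r_y, 6))

full = samples_per_picture(720, 576)                 # full-bandwidth colour
reduced = samples_per_picture(720, 576, chroma_h=2)  # colour halved horizontally
print(full, reduced, full / reduced)
```

Under these assumptions the sample count falls by a factor of 1.5 while the luminance detail, to which the eye is most sensitive, is untouched.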
`Composite video systems such as PAL, NTSC and SECAM are all
`analog compression schemes which embed a subcarrier in the luminance
`signal so that colour pictures are available in the same bandwidth as
`monochrome. In comparison with a linear-light progressive scan RGB
`picture, gamma-coded interlaced composite video has a compression
`factor of about 10:1.
`In a sense MPEG-2 can be considered to be a modern digital equivalent
`of analog composite video as it has most of the same attributes. For
`example, the eight-field sequence of the PAL subcarrier which makes
`editing difficult has its equivalent in the GOP (group of pictures) of
`MPEG.
`
1.6 Compression principles
`
In a PCM digital system the bit rate is the product of the sampling rate
and the number of bits in each sample and this is generally constant.
Nevertheless the information rate of a real signal varies. In all real signals,
part of the signal is obvious from what has gone before or what may come
later and a suitable receiver can predict that part so that only the true
information actually has to be sent. If the characteristics of a predicting
receiver are known, the transmitter can omit parts of the message in the
knowledge that the receiver has the ability to re-create it. Thus all
encoders must contain a model of the decoder.
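
The idea that the transmitter sends only what the receiver cannot predict can be sketched with a very simple predictor. This is an illustrative assumption, not an MPEG tool: the prediction is merely the previous sample, and the sample values are invented.

```python
def predict(history):
    """Shared predictor: assume the next sample equals the previous one."""
    return history[-1] if history else 0


def encode(samples):
    """Send only the difference between each sample and its prediction."""
    history, residuals = [], []
    for s in samples:
        residuals.append(s - predict(history))
        history.append(s)  # the encoder tracks what the decoder will hold
    return residuals


def decode(residuals):
    """Re-create the samples by adding each residual to the same prediction."""
    history = []
    for r in residuals:
        history.append(predict(history) + r)
    return history


samples = [100, 101, 103, 104, 104, 105, 107, 108]
residuals = encode(samples)
print(residuals)                     # [100, 1, 2, 1, 0, 1, 2, 1]
print(decode(residuals) == samples)  # True
```

Because `encode` calls the same `predict` function as `decode`, the encoder literally contains a model of the decoder. After the first value the residuals are small numbers which need fewer bits than the original samples.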
`One definition of information is that it is the unpredictable or
`surprising element of data. Newspapers are a good example of
`information because they only mention items which are surprising.
`Newspapers never carry items about individuals who have not been
`involved in an accident as this is the normal case. Consequently the
`phrase 'no news is good news' is remarkably true because if an
`information channel exists but nothing has been sent then it is most likely
`that nothing remarkable has happened.
`The unpredictability of the punch line is a useful measure of how funny a
`joke is. Often the build-up paints a certain picture in the listener's
`imagination, which the punch line destroys utterly. One of the author's
favourites is the one about the newly married couple who didn't know the
difference between putty and petroleum jelly: their windows fell out.
`The difference between the information rate and the overall bit rate is
`known as the redundancy. Compression systems are designed to
`eliminate as much of that redundancy as practicable or perhaps
`affordable. One way in which this can be done is to exploit statistical
`predictability in signals. The information content or entropy of a sample is
`a function of how different it is from the predicted value. Most signals
`have some degree of predictability. A sine wave is highly predictable,
`because all cycles look the same. According to Shannon's theory, any
`signal which is totally predictable carries no information. In the case of
`the sine wave this is clear because it represents a single frequency and so
`has no bandwidth.
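
Entropy and redundancy can be estimated with a short sketch. It makes a deliberate simplification: entropy is measured from the histogram of sample values alone, ignoring any dependence between successive samples, so it is only a rough stand-in for the true information content. The sampling rate, wordlength and data are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_sample(samples) -> float:
    """Histogram (zeroth-order) entropy estimate in bits per sample."""
    counts = Counter(samples)
    total = len(samples)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def redundancy_bps(sampling_rate_hz: float, bits_per_sample: int,
                   entropy_bits: float) -> float:
    """Overall PCM bit rate minus the estimated information rate."""
    bit_rate = sampling_rate_hz * bits_per_sample
    information_rate = sampling_rate_hz * entropy_bits
    return bit_rate - information_rate


constant = [7] * 8             # totally predictable
four_level = [0, 1, 2, 3] * 4  # four equally likely values

print(entropy_bits_per_sample(constant))     # 0.0
print(entropy_bits_per_sample(four_level))   # 2.0
print(redundancy_bps(48000, 16, 2.0))        # 672000.0
```

The totally predictable input carries no information by this measure, whereas a 16-bit channel carrying only two bits of information per sample is mostly redundancy.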
At the opposite extreme a signal such as noise is completely
unpredictable and as a result all codecs find noise difficult. The most
efficient way of coding noise is PCM. A codec which is designed using the
statistics of real material should not be tested with random noise because
it is not a representative test. Moreover, a codec which performs well with
clean source material may perform badly with source material containing
superimposed noise. Most practical compression units require some form
of pre-processing before the compression stage proper and appropriate
noise reduction should be incorporated into the pre-processing if noisy
signals are anticipated. It will also be necessary to restrict the degree of
compression applied to noisy signals.

Figure 1.5 (a) A perfect coder removes only the redundancy from the input signal and
results in subjectively lossless coding. If the remaining entropy is beyond the capacity of
the channel some of it must be lost and the codec will then be lossy. An imperfect coder
will also be lossy as it fails to keep all entropy. (b) As the compression factor rises, the
complexity must also rise to maintain quality. (c) High compression factors also tend to
increase latency or delay through the system.

All real signals fall part-way between the extremes of total predictability
and total unpredictability or noisiness. If the bandwidth (set by the
sampling rate) and the dynamic range (set by the wordlength) of the
transmission system are used to delineate an area, this sets a limit on the
information capacity of the system. Figure 1.5