H.264 and MPEG-4 Video Compression
Video Coding for Next-generation Multimedia

Iain E. G. Richardson
The Robert Gordon University, Aberdeen, UK

Copyright © 2003 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester,
West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777

Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording,
scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988
or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham
Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher.
Requests to the Publisher should be addressed to the Permissions Department, John Wiley &
Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed
to permreq@wiley.co.uk, or faxed to (+44) 1243 770620.

This publication is designed to provide accurate and authoritative information in regard to
the subject matter covered. It is sold on the understanding that the Publisher is not engaged
in rendering professional services. If professional advice or other expert assistance is
required, the services of a competent professional should be sought.

Other Wiley Editorial Offices

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appears
in print may not be available in electronic books.

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

ISBN 0-470-84837-5

Typeset in 10/12pt Times roman by TechBooks, New Delhi, India
Printed and bound in Great Britain by Antony Rowe, Chippenham, Wiltshire
This book is printed on acid-free paper responsibly manufactured from sustainable forestry
in which at least two trees are planted for each one used for paper production.

To Phyllis

Contents

About the Author
Foreword
Preface
Glossary

1 Introduction
1.1 The Scene
1.2 Video Compression
1.3 MPEG-4 and H.264
1.4 This Book
1.5 References

2 Video Formats and Quality
2.1 Introduction
2.2 Natural Video Scenes
2.3 Capture
2.3.1 Spatial Sampling
2.3.2 Temporal Sampling
2.3.3 Frames and Fields
2.4 Colour Spaces
2.4.1 RGB
2.4.2 YCbCr
2.4.3 YCbCr Sampling Formats
2.5 Video Formats
2.6 Quality
2.6.1 Subjective Quality Measurement
2.6.2 Objective Quality Measurement
2.7 Conclusions
2.8 References

3 Video Coding Concepts
3.1 Introduction
3.2 Video CODEC
3.3 Temporal Model
3.3.1 Prediction from the Previous Video Frame
3.3.2 Changes due to Motion
3.3.3 Block-based Motion Estimation and Compensation
3.3.4 Motion Compensated Prediction of a Macroblock
3.3.5 Motion Compensation Block Size
3.3.6 Sub-pixel Motion Compensation
3.3.7 Region-based Motion Compensation
3.4 Image model
3.4.1 Predictive Image Coding
3.4.2 Transform Coding
3.4.3 Quantisation
3.4.4 Reordering and Zero Encoding
3.5 Entropy Coder
3.5.1 Predictive Coding
3.5.2 Variable-length Coding
3.5.3 Arithmetic Coding
3.6 The Hybrid DPCM/DCT Video CODEC Model
3.7 Conclusions
3.8 References

4 The MPEG-4 and H.264 Standards
4.1 Introduction
4.2 Developing the Standards
4.2.1 ISO MPEG
4.2.2 ITU-T VCEG
4.2.3 JVT
4.2.4 Development History
4.2.5 Deciding the Content of the Standards
4.3 Using the Standards
4.3.1 What the Standards Cover
4.3.2 Decoding the Standards
4.3.3 Conforming to the Standards
4.4 Overview of MPEG-4 Visual/Part 2
4.5 Overview of H.264 / MPEG-4 Part 10
4.6 Comparison of MPEG-4 Visual and H.264
4.7 Related Standards
4.7.1 JPEG and JPEG2000
4.7.2 MPEG-1 and MPEG-2
4.7.3 H.261 and H.263
4.7.4 Other Parts of MPEG-4
4.8 Conclusions
4.9 References

5 MPEG-4 Visual
5.1 Introduction
5.2 Overview of MPEG-4 Visual (Natural Video Coding)
5.2.1 Features
5.2.2 Tools, Objects, Profiles and Levels
5.2.3 Video Objects
5.3 Coding Rectangular Frames
5.3.1 Input and Output Video Format
5.3.2 The Simple Profile
5.3.3 The Advanced Simple Profile
5.3.4 The Advanced Real Time Simple Profile
5.4 Coding Arbitrary-shaped Regions
5.4.1 The Core Profile
5.4.2 The Main Profile
5.4.3 The Advanced Coding Efficiency Profile
5.4.4 The N-bit Profile
5.5 Scalable Video Coding
5.5.1 Spatial Scalability
5.5.2 Temporal Scalability
5.5.3 Fine Granular Scalability
5.5.4 The Simple Scalable Profile
5.5.5 The Core Scalable Profile
5.5.6 The Fine Granular Scalability Profile
5.6 Texture Coding
5.6.1 The Scalable Texture Profile
5.6.2 The Advanced Scalable Texture Profile
5.7 Coding Studio-quality Video
5.7.1 The Simple Studio Profile
5.7.2 The Core Studio Profile
5.8 Coding Synthetic Visual Scenes
5.8.1 Animated 2D and 3D Mesh Coding
5.8.2 Face and Body Animation
5.9 Conclusions
5.10 References

6 H.264/MPEG-4 Part 10
6.1 Introduction
6.1.1 Terminology
6.2 The H.264 CODEC
6.3 H.264 structure
6.3.1 Profiles and Levels
6.3.2 Video Format
6.3.3 Coded Data Format
6.3.4 Reference Pictures
6.3.5 Slices
6.3.6 Macroblocks
6.4 The Baseline Profile
6.4.1 Overview
6.4.2 Reference Picture Management
6.4.3 Slices
6.4.4 Macroblock Prediction
6.4.5 Inter Prediction
6.4.6 Intra Prediction
6.4.7 Deblocking Filter
6.4.8 Transform and Quantisation
6.4.9 4 × 4 Luma DC Coefficient Transform and Quantisation (16 × 16 Intra-mode Only)
6.4.10 2 × 2 Chroma DC Coefficient Transform and Quantisation
6.4.11 The Complete Transform, Quantisation, Rescaling and Inverse Transform Process
6.4.12 Reordering
6.4.13 Entropy Coding
6.5 The Main Profile
6.5.1 B Slices
6.5.2 Weighted Prediction
6.5.3 Interlaced Video
6.5.4 Context-based Adaptive Binary Arithmetic Coding (CABAC)
6.6 The Extended Profile
6.6.1 SP and SI slices
6.6.2 Data Partitioned Slices
6.7 Transport of H.264
6.8 Conclusions
6.9 References

7 Design and Performance
7.1 Introduction
7.2 Functional Design
7.2.1 Segmentation
7.2.2 Motion Estimation
7.2.3 DCT/IDCT
7.2.4 Wavelet Transform
7.2.5 Quantise/Rescale
7.2.6 Entropy Coding
7.3 Input and Output
7.3.1 Interfacing
7.3.2 Pre-processing
7.3.3 Post-processing
7.4 Performance
7.4.1 Criteria
7.4.2 Subjective Performance
7.4.3 Rate–distortion Performance
7.4.4 Computational Performance
7.4.5 Performance Optimisation
7.5 Rate control
7.6 Transport and Storage
7.6.1 Transport Mechanisms
7.6.2 File Formats
7.6.3 Coding and Transport Issues
7.7 Conclusions
7.8 References

8 Applications and Directions
8.1 Introduction
8.2 Applications
8.3 Platforms
8.4 Choosing a CODEC
8.5 Commercial issues
8.5.1 Open Standards?
8.5.2 Licensing MPEG-4 Visual and H.264
8.5.3 Capturing the Market
8.6 Future Directions
8.7 Conclusions
8.8 References

Bibliography

Index

About the Author

Iain Richardson is a lecturer and researcher at The Robert Gordon University, Aberdeen,
Scotland. He was awarded the degrees of MEng (Heriot-Watt University) and PhD (The
Robert Gordon University) in 1990 and 1999 respectively. He has been actively involved in
research and development of video compression systems since 1993 and is the author of over
40 journal and conference papers and two previous books. He leads the Image Communication
Technology Research Group at The Robert Gordon University and advises a number of
companies on video compression technology issues.

Foreword

Work on the emerging “Advanced Video Coding” standard now known as ITU-T Recommendation
H.264 and as ISO/IEC 14496 (MPEG-4) Part 10 has dominated the video coding
standardization community for roughly the past three years. The work has been stimulating,
intense, dynamic, and all consuming for those of us most deeply involved in its design. The
time has arrived to see what has been accomplished.

Although not a direct participant, Dr Richardson was able to develop a high-quality,
up-to-date, introductory description and analysis of the new standard. The timeliness of this
book is remarkable, as the standard itself has only just been completed.

The new H.264/AVC standard is designed to provide a technical solution appropriate
for a broad range of applications, including:

• Broadcast over cable, satellite, cable modem, DSL, terrestrial.
• Interactive or serial storage on optical and magnetic storage devices, DVD, etc.
• Conversational services over ISDN, Ethernet, LAN, DSL, wireless and mobile networks, modems.
• Video-on-demand or multimedia streaming services over cable modem, DSL, ISDN, LAN, wireless networks.
• Multimedia messaging services over DSL, ISDN.

The range of bit rates and picture sizes supported by H.264/AVC is correspondingly broad,
addressing video coding capabilities ranging from very low bit rate, low frame rate, “postage
stamp” resolution video for mobile and dial-up devices, through to entertainment-quality
standard-definition television services, HDTV, and beyond. A flexible system interface for the
coded video is specified to enable the adaptation of video content for use over this full variety
of network and channel-type environments. However, at the same time, the technical design
is highly focused on providing the two limited goals of high coding efficiency and robustness
to network environments for conventional rectangular-picture camera-view video content.
Some potentially-interesting (but currently non-mainstream) features were deliberately left out
(at least from the first version of the standard) because of that focus (such as support of
arbitrarily-shaped video objects, some forms of bit rate scalability, 4:2:2 and 4:4:4 chroma
formats, and color sampling accuracies exceeding eight bits per color component).

In the work on the new H.264/AVC standard, a number of relatively new technical
developments have been adopted. For increased coding efficiency, these include improved
prediction design aspects as follows:

• Variable block-size motion compensation with small block sizes,
• Quarter-sample accuracy for motion compensation,
• Motion vectors over picture boundaries,
• Multiple reference picture motion compensation,
• Decoupling of referencing order from display order,
• Decoupling of picture representation methods from the ability to use a picture for reference,
• Weighted prediction,
• Improved “skipped” and “direct” motion inference,
• Directional spatial prediction for intra coding, and
• In-the-loop deblocking filtering.

In addition to improved prediction methods, other aspects of the design were also enhanced
for improved coding efficiency, including:

• Small block-size transform,
• Hierarchical block transform,
• Short word-length transform,
• Exact-match transform,
• Arithmetic entropy coding, and
• Context-adaptive entropy coding.

And for robustness to data errors/losses and flexibility for operation over a variety of network
environments, some key design aspects include:

• Parameter set structure,
• NAL unit syntax structure,
• Flexible slice size,
• Flexible macroblock ordering,
• Arbitrary slice ordering,
• Redundant pictures,
• Data partitioning, and
• SP/SI synchronization switching pictures.

Prior to the H.264/AVC project, the big recent video coding activity was the MPEG-4 Part 2
(Visual) coding standard. That specification introduced a new degree of creativity and
flexibility to the capabilities of the representation of digital visual content, especially with its
coding of video “objects”, its scalability features, extended N-bit sample precision and 4:4:4
color format capabilities, and its handling of synthetic visual scenes. It introduced a number
of design variations (called “profiles” and currently numbering 19 in all) for a wide variety
of applications. The H.264/AVC project (with only 3 profiles) returns to the narrower and
more traditional focus on efficient compression of generic camera-shot rectangular video
pictures with robustness to network losses – making no attempt to cover the ambitious breadth of
MPEG-4 Visual. MPEG-4 Visual, while not quite as “hot off the press”, establishes a landmark
in recent technology development, and its capabilities are yet to be fully explored.

Most people first learn about a standard in publications other than the standard itself.
My personal belief is that if you want to know about a standard, you should also obtain a
copy of it, read it, and refer to that document alone as the ultimate authority on its content,
its boundaries, and its capabilities. No tutorial or overview presentation will provide all of the
insights that can be obtained from careful analysis of the standard itself.

At the same time, no standardized specification document (at least for video coding) can
be a complete substitute for a good technical book on the subject. Standards specifications are
written primarily to be precise, consistent, complete, and correct and not to be particularly
readable. Standards tend to leave out information that is not absolutely necessary to comply
with them. Many people find it surprising, for example, that video coding standards say almost
nothing about how an encoder works or how one should be designed. In fact an encoder is
essentially allowed to do anything that produces bits that can be correctly decoded, regardless
of what picture quality comes out of that decoding process. People, however, can usually only
understand the principles of video coding if they think from the perspective of the encoder, and
nearly all textbooks (including this one) approach the subject from the encoding perspective.
A good book, such as this one, will tell you why a design is the way it is and how to make
use of that design, while a good standard may only tell you exactly what it is and abruptly
(deliberately) stop right there.

In the case of H.264/AVC or MPEG-4 Visual, it is highly advisable for those new to the
subject to read some introductory overviews such as this one, and even to get a copy of an
older and simpler standard such as H.261 or MPEG-1 and try to understand that first. The
principles of digital video codec design are not too complicated, and haven’t really changed
much over the years – but those basic principles have been wrapped in layer-upon-layer of
technical enhancements to the point that the simple and straightforward concepts that lie at
their core can become obscured. The entire H.261 specification was only 25 pages long, and
only 17 of those pages were actually required to fully specify the technology that now lies at
the heart of all subsequent video coding standards. In contrast, the H.264/AVC and MPEG-4
Visual specifications are more than 250 and 500 pages long, respectively, with a high
density of technical detail (despite completely leaving out key information such as how to
encode video using their formats). They each contain areas that are difficult even for experts
to fully comprehend and appreciate.

Dr Richardson’s book is not a completely exhaustive treatment of the subject. However,
his approach is highly informative and provides a good initial understanding of the key
concepts, and it is conceptually superior to (and in some aspects more objective than) other
published treatments of video coding. This and the remarkable timeliness of the subject
matter make this book a strong contribution to the technical literature of our community.

Gary J. Sullivan

Biography of Gary J. Sullivan, PhD

Gary J. Sullivan is the chairman of the Joint Video Team (JVT) for the development of the latest
international video coding standard known as H.264/AVC, which was recently completed as a
joint project between the ITU-T video coding experts group (VCEG) and the ISO/IEC moving
picture experts group (MPEG).

He is also the Rapporteur of Advanced Video Coding in the ITU-T, where he has
led VCEG (ITU-T Q.6/SG16) for about seven years. He is also the ITU-T video liaison
representative to MPEG and served as MPEG’s (ISO/IEC JTC1/SC29/WG11) video
chairman from March of 2001 to May of 2002.

He is currently a program manager of video standards and technologies in the eHome A/V
platforms group of Microsoft Corporation. At Microsoft he designed and remains active in
the extension of the DirectX® Video Acceleration API/DDI feature of the Microsoft Windows®
operating system platform.

Preface

With the widespread adoption of technologies such as digital television, Internet streaming
video and DVD-Video, video compression has become an essential component of broadcast
and entertainment media. The success of digital TV and DVD-Video is based upon the
10-year-old MPEG-2 standard, a technology that has proved its effectiveness but is now
looking distinctly old-fashioned. It is clear that the time is right to replace MPEG-2 video
compression with a more effective and efficient technology that can take advantage of recent
progress in processing power. For some time there has been a running debate about which
technology should take up MPEG-2’s mantle. The leading contenders are the International
Standards known as MPEG-4 Visual and H.264.

This book aims to provide a clear, practical and unbiased guide to these two standards
to enable developers, engineers, researchers and students to understand and apply them
effectively. Video and image compression is a complex and extensive subject and this book keeps
an unapologetically limited focus, concentrating on the standards themselves (and in the case
of MPEG-4 Visual, on the elements of the standard that support coding of ‘real world’ video
material) and on video coding concepts that directly underpin the standards. The book takes an
application-based approach and places particular emphasis on tools and features that are
helpful in practical applications, in order to provide practical and useful assistance to developers
and adopters of these standards.

I am grateful to a number of people who helped to shape the content of this book. I
received many helpful comments and requests from readers of my book Video Codec Design.
Particular thanks are due to Gary Sullivan for taking the time to provide helpful and detailed
comments, corrections and advice and for kindly agreeing to write a Foreword; to Harvey
Hanna (Impact Labs Inc), Yafan Zhao (The Robert Gordon University) and Aitor Garay for
reading and commenting on sections of this book during its development; to members of the
Joint Video Team for clarifying many of the details of H.264; to the editorial team at John
Wiley & Sons (and especially to the ever-helpful, patient and supportive Kathryn Sharples);
to Phyllis for her constant support; and finally to Freya and Hugh for patiently waiting for the
long-promised trip to Storybook Glen!

I very much hope that you will find this book enjoyable, readable and above all useful.
Further resources and links are available at my website, http://www.vcodex.com/. I always
appreciate feedback, comments and suggestions from readers and you will find contact details
at this website.

Iain Richardson

Glossary

4:2:0 (sampling)          Sampling method: chrominance components have half the horizontal and vertical resolution of luminance component
4:2:2 (sampling)          Sampling method: chrominance components have half the horizontal resolution of luminance component
4:4:4 (sampling)          Sampling method: chrominance components have same resolution as luminance component
arithmetic coding         Coding method to reduce redundancy
artefact                  Visual distortion in an image
ASO                       Arbitrary Slice Order, in which slices may be coded out of raster sequence
BAB                       Binary Alpha Block, indicates the boundaries of a region (MPEG-4 Visual)
BAP                       Body Animation Parameters
Block                     Region of macroblock (8 × 8 or 4 × 4) for transform purposes
block matching            Motion estimation carried out on rectangular picture areas
blocking                  Square or rectangular distortion areas in an image
B-picture (slice)         Coded picture (slice) predicted using bidirectional motion compensation
CABAC                     Context-based Adaptive Binary Arithmetic Coding
CAE                       Context-based Arithmetic Encoding
CAVLC                     Context Adaptive Variable Length Coding
chrominance               Colour difference component
CIF                       Common Intermediate Format, a colour image format
CODEC                     COder / DECoder pair
colour space              Method of representing colour images
DCT                       Discrete Cosine Transform
Direct prediction         A coding mode in which no motion vector is transmitted
DPCM                      Differential Pulse Code Modulation
DSCQS                     Double Stimulus Continuous Quality Scale, a scale and method for subjective quality measurement
DWT                       Discrete Wavelet Transform
entropy coding            Coding method to reduce redundancy
error concealment         Post-processing of a decoded image to remove or reduce visible error effects
Exp-Golomb                Exponential Golomb variable length codes
FAP                       Facial Animation Parameters
FBA                       Face and Body Animation
FGS                       Fine Granular Scalability
field                     Odd- or even-numbered lines from an interlaced video sequence
flowgraph                 Pictorial representation of a transform algorithm (or the algorithm itself)
FMO                       Flexible Macroblock Order, in which macroblocks may be coded out of raster sequence
Full Search               A motion estimation algorithm
GMC                       Global Motion Compensation, motion compensation applied to a complete coded object (MPEG-4 Visual)
GOP                       Group Of Pictures, a set of coded video images
H.261                     A video coding standard
H.263                     A video coding standard
H.264                     A video coding standard
HDTV                      High Definition Television
Huffman coding            Coding method to reduce redundancy
HVS                       Human Visual System, the system by which humans perceive and interpret visual images
hybrid (CODEC)            CODEC model featuring motion compensation and transform
IEC                       International Electrotechnical Commission, a standards body
Inter (coding)            Coding of video frames using temporal prediction or compensation
interlaced (video)        Video data represented as a series of fields
intra (coding)            Coding of video frames without temporal prediction
I-picture (slice)         Picture (or slice) coded without reference to any other frame
ISO                       International Organization for Standardization, a standards body
ITU                       International Telecommunication Union, a standards body
JPEG                      Joint Photographic Experts Group, a committee of ISO (also an image coding standard)
JPEG2000                  An image coding standard
latency                   Delay through a communication system
Level                     A set of conformance parameters (applied to a Profile)
loop filter               Spatial filter placed within encoding or decoding feedback loop
Macroblock                Region of frame coded as a unit (usually 16 × 16 pixels in the original frame)
Macroblock partition      Region of macroblock with its own motion vector (H.264)
Macroblock sub-partition  Region of macroblock with its own motion vector (H.264)
media processor           Processor with features specific to multimedia coding and processing
motion compensation       Prediction of a video frame with modelling of motion
motion estimation         Estimation of relative motion between two or more video frames
motion vector             Vector indicating a displaced block or region to be used for motion compensation
MPEG                      Moving Picture Experts Group, a committee of ISO/IEC
MPEG-1                    A multimedia coding standard
MPEG-2                    A multimedia coding standard
MPEG-4                    A multimedia coding standard
NAL                       Network Abstraction Layer
objective quality         Visual quality measured by algorithm(s)
OBMC                      Overlapped Block Motion Compensation
Picture (coded)           Coded (compressed) video frame
P-picture (slice)         Coded picture (or slice) using motion-compensated prediction from one reference frame
profile                   A set of functional capabilities (of a video CODEC)
progressive (video)       Video data represented as a series of complete frames
PSNR                      Peak Signal to Noise Ratio, an objective quality measure
QCIF                      Quarter Common Intermediate Format
quantise                  Reduce the precision of a scalar or vector quantity
rate control              Control of bit rate of encoded video signal
rate–distortion           Measure of CODEC performance (distortion at a range of coded bit rates)
RBSP                      Raw Byte Sequence Payload
RGB                       Red/Green/Blue colour space
ringing (artefacts)       ‘Ripple’-like artefacts around sharp edges in a decoded image
RTP                       Real-time Transport Protocol, a transport protocol for real-time data
RVLC                      Reversible Variable Length Code
scalable coding           Coding a signal into a number of layers
SI slice                  Intra-coded slice used for switching between coded bitstreams (H.264)
slice                     A region of a coded picture
SNHC                      Synthetic Natural Hybrid Coding
SP slice                  Inter-coded slice used for switching between coded bitstreams (H.264)
sprite                    Texture region that may be incorporated in a series of decoded frames (MPEG-4 Visual)
statistical redundancy    Redundancy due to the statistical distribution of data
studio quality            Lossless or near-lossless video quality
subjective quality        Visual quality as perceived by human observer(s)
subjective redundancy     Redundancy due to components of the data that are subjectively insignificant
sub-pixel (motion compensation)  Motion-compensated prediction from a reference area that may be formed by interpolating between integer-valued pixel positions
test model                A software model and document that describe a reference implementation of a video coding standard
Texture                   Image or residual data
Tree-structured motion compensation  Motion compensation featuring a flexible hierarchy of partition sizes (H.264)
TSS                       Three Step Search, a motion estimation algorithm
VCEG                      Video Coding Experts Group, a committee of ITU
VCL                       Video Coding Layer
video packet              Coded unit suitable for packetisation
VLC                       Variable Length Code
VLD                       Variable Length Decoder
VLE                       Variable Length Encoder
VLSI                      Very Large Scale Integrated circuit
VO                        Video Object
VOP                       Video Object Plane
VQEG                      Video Quality Experts Group
Weighted prediction       Motion compensation in which the prediction samples from two references are scaled
YCbCr                     Luminance, Blue chrominance, Red chrominance colour space
YUV                       A colour space (see YCbCr)

`1 I
`
`ntroduction
`
`1.1 THE SCENE
`
`Scene 1: Your avatar (a realistic 3D model with your appearance and voice) walks through
`a sophisticated virtual world populated by other avatars, product advertisements and video
`walls. On one virtual video screen is a news broadcast from your favourite channel; you want
`to see more about the current financial situation and so you interact with the broadcast and
`pull up the latest stock market figures. On another screen you call up a videoconference link
`with three friends. The video images of the other participants, neatly segmented from their
`backgrounds, are presented against yet another virtual backdrop.
`Scene 2: Your new 3G vidphone rings; you flip the lid open and answer the call. The face
`of your friend appears on the screen and you greet each other. Each sees a small, clear image
`of the other on the phone’s screen, without any of the obvious ‘blockiness’ of older-model
`video phones. After the call has ended, you call up a live video feed from a football match. The
`quality of the basic-rate stream isn’t too great and you switch seamlessly to the higher-quality
`(but more expensive) ‘premium’ stream. For a brief moment the radio signal starts to break
`up but all you notice is a slight, temporary distortion in the video picture.
`These two scenarios illustrate different visions of the next generation of multimedia
`applications. The first is a vision of MPEG-4 Visual: a rich, interactive on-line world bring-
`ing together synthetic, natural, video, image, 2D and 3D ‘objects’. The second is a vision
`of H.264/AVC: highly efficient and reliable video communications, supporting two-way,
`‘streaming’ and broadcast applications and robust to channel transmission problems. The
`two standards, each with their advantages and disadvantages and each with their supporters
`and critics, are contenders in the race to provide video compression for next-generation comm-
`unication applications.
`Turn on the television and surf through tens or hundreds of digital channels. Play your
`favourite movies on the DVD player and breathe a sigh of relief that you can throw out your
`antiquated VHS tapes. Tune in to a foreign TV news broadcast on the web (still just a postage-
`stamp video window but the choice and reliability of video streams is growing all the time).
`Chat to your friends and family by PC videophone. These activities are now commonplace and
`unremarkable, demonstrating that digital video is well on the way to becoming a ubiquitous
`
`H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia.
`C(cid:1) 2003 John Wiley & Sons, Ltd. ISBN: 0-470-84837-5
`Iain E. G. Richardson.
`
`

`

`•2
`
`INTRODUCTION
`
`and essential component of the entertainment, computing, broadcasting and communications
`industries.
`Pervasive, seamless, high-quality digital video has been the goal of companies, re-
`searchers and standards bodies over the last two decades. In some areas (for example broadcast
`television and consumer video storage), digital video has clearly captured the market, whilst
`in others (videoconferencing, video email, mobile video), market success is perhaps still too
`early to judge. However, there is no doubt that digital video is a globally important industry
`which will continue to pervade businesses, networks and homes. The continuous evolution of
`the digital video industry is being driven by commercial and technical forces. The commercial
`drive comes from the huge revenue potential of persuading consumers and businesses (a) to
`replace analogue technology and older digital technology with new, efficient, high-quality
`digital video products and (b) to adopt new communication and entertainment products that
`have been made possible by the move to digital video. The technical drive comes from
`continuing improvements in processing performance, the availability of higher-capacity storage
`and transmission mechanisms and research and development of video and image processing
`technology.
`Getting digital video from its source (a camera or a stored clip) to its destination (a
`display) involves a chain of components or processes. Key to this chain are the processes of
`compression (encoding) and decompression (decoding), in which bandwidth-intensive ‘raw’
`digital video is reduced to a manageable size for transmission or storage, then reconstructed for
`display. Getting the compression and decompression processes ‘right’ can give a significant
`technical and commercial edge to a product, by providing better image quality, greater
`reliability and/or more flexibility than competing solutions. There is therefore a keen interest in the
`continuing development and improvement of video compression and decompression methods
`and systems. The interested parties include entertainment, communication and broadcasting
`companies, software and hardware developers, researchers and holders of potentially lucrative
`patents on new compression algorithms.
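To give a rough sense of why 'raw' digital video is described as bandwidth-intensive, the following minimal sketch estimates the uncompressed bit rate of a video source and the compression ratio needed to fit a given channel. The frame size (720 × 576), frame rate (25 frames per second), 4:2:0 chroma sampling, 8 bits per sample and the 4 Mbit/s channel are illustrative assumptions, not figures taken from the text, and the function names are hypothetical.

```python
def raw_bitrate_bps(width, height, fps, bits_per_sample=8,
                    chroma_samples_per_luma=0.5):
    """Return the bit rate (bits per second) of uncompressed video.

    chroma_samples_per_luma=0.5 models 4:2:0 sampling: two chroma
    planes, each holding a quarter as many samples as the luma plane.
    All default values are illustrative assumptions.
    """
    samples_per_frame = width * height * (1 + chroma_samples_per_luma)
    return samples_per_frame * bits_per_sample * fps


def compression_ratio_needed(raw_bps, channel_bps):
    """Return the factor by which the raw stream must shrink to fit the channel."""
    return raw_bps / channel_bps


if __name__ == "__main__":
    # Assumed source format: 720 x 576 luma samples, 25 frames per second.
    raw = raw_bitrate_bps(720, 576, 25)
    print(f"raw bit rate: {raw / 1e6:.1f} Mbit/s")

    # Assumed channel capacity of 4 Mbit/s, chosen only for illustration.
    ratio = compression_ratio_needed(raw, 4e6)
    print(f"compression ratio needed: {ratio:.1f}:1")
```

Under these assumptions the raw stream runs at roughly 124 Mbit/s, so the encoder would have to reduce it by a factor of about 31 before it fits the assumed channel; the decoder then reconstructs the frames for display.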
`The early successes in the digital video industry (notably broadcast digital television
`and DVD-Video) were underpinned by international standard ISO/IEC 13818 [1], popularly
`known as ‘MPEG-2’ (after the working group that developed the standard, the Moving Picture
`Experts Group). Anticipation of a need for better compression tools has led to the development
`of two further standards for video compression, known as ISO/IEC 14496 Part 2 (‘MPEG-4
`Visual’) [2] and ITU-T Recommendation H.264/ISO/IEC 14496 Part 10 (‘H.264’) [3].
`MPEG-4 Visual and H.264 share the same ancestry and some common features (they both draw on
`well-proven techniques from earlier standards) but have not
