throbber
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/247345079
`
`Real-time video compression
`
`Article · January 1997
`
`CITATIONS
`13
`
`2 authors, including:
`
`Raymond Westwater
`ZPEG, Inc
`33 PUBLICATIONS   272 CITATIONS  
`
`SEE PROFILE
`
`READS
`966
`
`Some of the authors of this publication are also working on these related projects:
`
`ZPEG Video Technology View project
`
`All content following this page was uploaded by Raymond Westwater on 15 April 2016.
`
`The user has requested enhancement of the downloaded file.
`
`IPR2018-01413
`Sony EX1012 Page 1
`
`

`

`Page iii
`
`Real-Time Video Compression
`
`Techniques and Algorithms
`
`by
`
`Raymond Westwater
`Borko Furht
`
`Florida Atlantic University
`
`
`
`IPR2018-01413
`Sony EX1012 Page 2
`
`

`

`Page iv
`
`Distributors for North America:
`Kluwer Academic Publishers
`101 Philip Drive
`Assinippi Park
`Norwell, Massachusetts 02061 USA
`
`Distributors for all other countries:
`Kluwer Academic Publishers Group
`Distribution Centre
`Post Office Box 322
`3300 AH Dordrecht, THE NETHERLANDS
`
`Library of Congress Cataloging-in-Publication Data
`
`A C.I.P. Catalogue record for this book is available
`from the Library of Congress.
`
`Copyright © 1997 by Kluwer Academic Publishers
`
`All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or
`transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without
`the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive,
`Assinippi Park, Norwell, Massachusetts 02061
`
`Printed on acid-free paper.
`
`Printed in the United States of America
`
`
`
`IPR2018-01413
`Sony EX1012 Page 3
`
`

`

`Contents
`
`Preface
`
`1. The Problem of Video Compression
`
`1.1 Overview of Video Compression Techniques
`
`1.2 Applications of Compressed Video
`
`1.3 Image and Video Formats
`
`1.4 Overview of the Book
`
`2. The MPEG Video Compression Standard
`
`2.1 MPEG Encoder and Decoder
`
`2.2 MPEG Data Stream
`
`3. The H.261/H.263 Compression Standard for Video Telecommunications
`
`3.1 Picture Formats for H.261/H.263 Video Codecs
`
`3.2 H.261/H.263 Video Encoder
`
`3.3 H.261/H.263 Video Decoder
`
`4. The XYZ Video Compression Algorithm
`
`4.1 XYZ Compression Algorithm
`
`4.2 XYZ Decompression Algorithm
`
`5. The Discrete Cosine Transform
`
`5.1 Behavior of the DCT
`
`5.2 Fast One-dimensional DCT Algorithms
`
`5.3 Two-dimensional DCT Algorithms
`
`5.4 Inverse DCT Algorithms
`
`
`
`Page v
`
`vii
`
`1
`
`3
`
`6
`
`8
`
`12
`
`15
`
`15
`
`18
`
`23
`
`24
`
`25
`
`28
`
`29
`
`29
`
`32
`
`37
`
`37
`
`40
`
`47
`
`50
`
`IPR2018-01413
`Sony EX1012 Page 4
`
`

`

`5.5 Three-dimensional DCT Algorithms
`
`6. Quantization
`
`6.1 Defining an Invariant Measure of Error
`
`6.2 Calculation of Transform Variances
`
`6.3 Generating Quantizer Factors
`
`6.4 Adding Human Visual Factors
`
`7. Entropy Coding
`
`7.1 Huffman Coding
`
`7.2 Use of Entropy Coding in JPEG and MPEG
`
`7.3 Adaptive Huffman Coding
`
`8. VLSI Architectures of the XYZ Video Codec
`
`8.1 Complexity of the Video Compression Algorithms
`
`8.2 From Algorithms to VLSI Architectures
`
`8.3 Classification of Video Codec VLSI Architectures
`
`8.4 Implementation of the XYZ Video Compression Algorithm
`
`8.5 Adaptive XYZ Codec Using Mesh Architecture
`
`8.6 XYZ Codec Based on Fast 3D DCT Coprocessor
`
`9. Experimental Results Using XYZ Compression
`
`9.1 PC Implementation
`
`9.2 MasPar Implementation
`
`9.3 Non-adaptive XYZ Compression
`
`10. Conclusion
`
`Bibliography
`
`Index
`
`
`
`Page vi
`
`51
`
`57
`
`58
`
`62
`
`65
`
`67
`
`73
`
`73
`
`76
`
`78
`
`83
`
`83
`
`86
`
`87
`
`90
`
`103
`
`111
`
`123
`
`124
`
`138
`
`144
`
`151
`
`155
`
`163
`
`IPR2018-01413
`Sony EX1012 Page 5
`
`

`

`Page vii
`
`Preface
`
`This book is on real-time video compression. Specifically, the book introduces the XYZ video
`compression technique, that operates in three dimensions, eliminating the overhead of motion
`estimation. First, video compression standards, MPEG and H.261/H.263, are described. They both
`use asymmetric compression algorithms, based on motion estimation. Their encoders are much more
`complex than decoders. The XYZ technique uses a symmetric algorithm, based on the Three-
`Dimensional Discrete Cosine Transform (3D-DCT). 3D-DCT was originally suggested for
`compression about twenty years ago, however at that time the computational complexity of the
`algorithm was to high, it required large buffer memory, and was not as effective as motion
`estimation. We have resurrected the 3D-DCT based video compression algorithm by developing
`several enhancements to the original algorithm. These enhancements made the algorithm feasible for
`real-time video compression in applications such as video-on-demand, interactive multimedia, and
`videoconferencing. The demonstrated results, presented in the book, suggest that the XYZ video
`compression technique is not only a fast algorithm, but also provides superior compression ratios and
`high quality of the video compared to existing standard techniques, such as MPEG and H.261/H.263.
`The elegance of the XYZ technique is in its simplicity, which leads to inexpensive VLSI
`implementation of a XYZ codec.
`
`We would like to thank Jim Prince for conducting experiments in developing visually weighted
`quantizers for the XYZ algorithm, as well as a number of students from Florida Atlantic University,
`who participated in these experiments. We also want to thank Drs. Roy Levow, K. Genesan, and
`Matthew Evett, professors from Florida Atlantic University, Dr. Steve Rosenbaum from Cylex
`Systems, and Joshua Greenberg for constructive discussions during this project.
`
`RAYMOND WESTWATER AND BORKO FURHT
`BOCA RATON, JULY 1996.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 6
`
`

`

`Page 1
`
`1—
`The Problem of Video Compression
`
`The problem of real-time video compression is a difficult and important one, and has inspired a great
`deal of research activity. This body of knowledge has been, to a substantial degree, embodied into the
`MPEG and H.261/H263 motion video standards. However, some important questions remain
`unexplored. This book describes one possible alternative to these standards that has superior
`compression characteristics while requiring far less computational power for its full implementation.
`
`Since about 1989, moving digital video images have been integrated with programs. The difficulty in
`implementing moving digital video is the tremendous bandwidth required for the encoding of video
`data. For example, a quarter screen image (320 x 240 pixels) playing on an RGB video screen at full
`speed of 30 frames per second (fps) requires storage and transmission of 6.9 million bytes per second.
`This data rate is simply prohibitive, and so means of compressing digital video suitable for real-time
`playback are a necessary step for the widespread introduction of digital motion video applications.
`
`Many digital video compression algorithms have been developed and implemented. The compression
`ratios of these algorithms varies according to the subjective acceptable level of error, the definition of
`the word compression, and who is making the claim. Table 1.1 summarizes video compression
`algorithms, their typical compression ratios reported in the literature, and their characteristics.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 7
`
`

`

`Table 1.1 Overview of video compression algorithms.
`
`Compression Algorithm
`
`Typical
`Compression
`Ratio
`
`Characteristics
`
`Page 2
`
`Intel RTV/Indeo
`
`Intel PLV
`
`IBM Photomotion
`
`Motion JPEG
`
`Fractals
`
`Wavelets
`
`H.261/H263
`
`MPEG
`
`3:1
`
`12:1
`
`3:1
`
`10:1
`
`10:1
`
`20:1
`
`50:1
`
`30:1
`
`A 128X240 data stream is interpolated to 256X240. Color is
`subsampled 4:1. A simple 16 bit codebook is used without error
`correction. Frame differencing is used.
`
`A native 256X240 stream is encoded using vector quantization
`and motion compensation. Compression requires specialized
`equipment.
`
`An optimal 8-bit color palette is determined, and run-length
`encoding and frame differencing are used.
`
`Uses 2-D DCT to encode individual frames. Gives good real-time
`results with inexpensive but special-purpose equipment. This
`technique supports random-access since no frame differencing is
`used.
`
`Fractals compress natural scenes well, but require tremendous
`computing power.
`
`2-D and 3-D wavelets have been used in the compression of
`motion video. Wavelet compression is low enough in complexity
`to compress entire images, and therefore does not suffer from the
`boundary artifacts seen in DCT-based techniques.
`
`Real-time compression and decompression algorithm for video
`telecommunications. It is based on 2-D DCT with simple motion
`estimation between frames.
`
`Uses 2-D DCT with motion estimation and interpolation between
`frames. The MPEG standard is difficult and expensive to
`compress, but plays back in real-time with inexpensive equipment.
`
`An ideal video compression technique should have the following characteristics:
`
`• Will produce levels of compression rivaling MPEG without objectionable artifacts.
`
`• Can be played back in real time with inexpensive hardware support.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 8
`
`

`

`Page 3
`
`• Can degrade easily under network overload or on a slow platform.
`
`• Can be compressed in real time with inexpensive hardware support.
`
`1.1—
`Overview of Video Compression Techniques
`
`The JPEG still picture compression standard has been extremely successful, having been
`implemented on virtually all platforms. This standard is fairly simple to implement, is not
`computationally complex, and gets 10:1 to 15:1 compression ratios without significant visual
`artifacts. This standard is based upon entropy encoding of quantized coefficients of the discrete
`cosine transformation of 8x8 blocks of pixel data.
`
`Figure 1.1 shows the block diagram of both the JPEG compression and decompression algorithms. A
`single frame is subdivided into 8x8 blocks, each of which is independently processed. Each block is
`transformed into DCT space, resulting in an 8x8 block of DCT coefficients. These coefficients are
`then quantized by integer division by constants. The quantizing constant for each DCT coefficient is
`chosen to produce minimal visual artifacts, while maximally reducing the representational entropy of
`the coefficients. The quantized coefficients are then entropy coded into a compressed data stream.
`The reduced entropy of the quantized coefficients is reflected in the higher compression ratio of the
`data.
`
`The Motion JPEG (M-JPEG) uses the JPEG compression for each frame. It provides random access
`to individual frames, however the compression ratios are too low (same as in JPEG), because the
`technique does not take advantage of the similarities between adjacent frames.
`
`The MPEG moving compression standard is an attempt to extend DCT-based compression into
`moving pictures. MPEG encodes frames by estimating the motion difference between the frames, and
`encoding the differences into roughly JPEG format. Unfortunately, motion estimation is
`computationally complex, requires specialized equipment to encode, and adds considerable
`complexity to the algorithm. Figure 1.2 illustrates the MPEG compression algorithm for predictive
`frames.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 9
`
`

`

`Page 4
`
`Figure 1.1
`JPEG compression and decompression algorithms.
`
`One of the most promising new technologies is wavelet-based compression [VK95]. Figure 1.3
`illustrates a simple wavelet transform: subband decomposition. The image as a whole is subdivided
`into frequency subbands, which are then individually quantized. One of the most attractive features of
`this system is that it is applied to the image as a whole, thereby avoiding the edge artifacts associated
`with the block-based DCT compression schemes.
`
`The wavelet transform can be applied to the time dimension as well. Experience has shown that this
`decomposition does not give as good compression results as motion compensation. As there are no
`other compression algorithms capable of such high compression ratios, MPEG is considered the
`existing ''state-of-the-art".
`
`The XYZ algorithm is a natural extension of the research that has been done in video compression.
`Much work has been done in the development of transform-based motion video compression
`algorithms, and in the development of quantizing factors based on the sensitivity of the human eye to
`various artifacts of compression.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 10
`
`

`

`Page 5
`
`Figure 1.2
`MPEG compression algorithm for predictive frames. MPEG
`adds motion estimation to the JPEG model.
`
`Figure 1.3
`Octave-band or wavelet decomposition of a still
`image into unequal subbands.
`
`XYZ compression is an alternative extension of DCT encoding to moving pictures. Sequences of
`eight frames are collected into a three-dimensional block to which a three-dimensional DCT will be
`applied. The transformed data is then quantized.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 11
`
`

`

`These quantizing constants are demonstrated to cause artifacts which are minimally visible. The resulting
`data stream is then entropy coded. This process strongly resembles the JPEG encoding process, as
`illustrated in Figure 1.4.
`
`Page 6
`
`Figure 1.4
`XYZ compression algorithm.
`
`This algorithm is built upon a considerable body of published work. The three-dimensional DCT has been
`used to encode errors after motion estimation has been performed [RP77], and true three-dimensional
`DCT-based compression algorithms have been developed where the quantizers were based upon
`minimization of introduced mean square error [NA77]. These algorithms have fallen into disfavor because
`they were considered to require excessive computation, required too much buffer memory, and were not as
`effective as motion estimation. This book refutes these arguments.
`
`Work in visibility of artifacts produced by quantization has also been done [CR90]. Visibility of two-
`dimensional quantization artifacts has been thoroughly explored for the DCT transforms space. The XYZ
`algorithm extends this work to quantization of three-dimensional DCT coefficients.
`
`1.2—
`Applications of Compressed Video
`
`Video compression techniques made feasible a number of applications. Four distinct applications of the
`compressed video can be summarized as: (a) consumer broadcast television, (b) consumer playback, (c)
`desktop video, and (d) videoconferencing.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 12
`
`

`

`Page 7
`
`Consumer broadcast television, which includes digital video delivery to homes, typically requires a small number of high-quality compressors and a
`large number of low-cost decompressors. Expected compression ratio is about 50:1.
`
`Consumer playback applications, such as CD-ROM libraries and interactive games, also require a small number of compressors and a large number
`of low-cost decompressors. The required compression ratio is about 100:1.
`
`Desktop video, which includes systems for authoring and editing video presentations, is a symmetrical application requiring the same number of
`encoders and decoders. The expected compression ratio is in the range from 5:1 to 50:1.
`
`Videconferencing applications also require the same number of encoders and decoders, and the expected compression ratio is about 100:1.
`
`Application
`
`Bandwidth
`
`Standard
`
`Size
`
`Frame Rate [frames/sec]
`
`Table 1.2 Applications of the compressed video and current video compression standards.
`
`Analog Videophone
`
`Low Bitrate Video
`Conferencing
`
`Basic Video Telephony
`
`Video Conferencing
`
`Interactive Multimedia
`
`Digital TV - NTSC
`
`High Definition Television
`
`5-10 Kbps
`
`26-64 Kbps
`
`64-128 Kbps
`
`>= 384 Kbps
`
`1-2 Mbps
`
`3-10 Mbps
`
`15-80 Mbps
`
`none
`
`H.263
`
`H.261
`
`H.261
`
`MPEG-1
`
`MPEG-2
`
`MPEG-2
`
`170x128
`
`128x96
`176x144
`
`176x144
`352x288
`
`352x288
`
`352x240
`
`720x480
`
`1200x800
`
`2-5
`
`15-30
`
`10-20
`
`15-30
`
`15-30
`
`30
`
`30-60
`
`Table 1.2 summarizes applications of the compressed video, by specifying current standards used in various applications, the required bandwidth,
`and typical frame sizes and frame rates.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 13
`
`

`

`Page 8
`
`1.3—
`Image and Video Formats
`
`A digital image represents a two-dimensional array of samples, where each sample is called a pixel.
`Precision determines how many levels of intensity can be represented, and is expressed as the number
`of bits/sample. According to precision, images can be classified into: (a) binary images, represented
`by 1 bit/sample, (b) computer graphics, represented by 4 bits/sample, (c) grayscale images,
`represented by 8 bits/sample, and color images, represented with 16, 24 or more bits/sample.
`
`According to the trichromatic theory, the sensation of color is produced by selectively exciting three
`classes of receptors in the eye. In a RGB color representation system, shown in Figure 1.5, a color is
`produced by adding three primary colors: red, green, and blue (RGB). The straight line, where
`R=G=B, specifies the gray values ranging from black to white.
`
`Figure 1.5
`The RGB representation of color images.
`
`Another representation of color images, YUV representation, describes luminance and chrominance
`components of an image. The luminance component provides a grayscale version of the image, while
`two chrominance components give additional information that converts the grayscale image to a color
`image. The YUV representation is more natural for image and video compression. The exact
`transformation from RGB to YUV representation, specified by the CCIR 601 standard, is given by
`the following equations:
`
`
`
`IPR2018-01413
`Sony EX1012 Page 14
`
`

`

`Page 9
`
`where Y is the luminance component, and U and V are two chrominance components.
`
`An approximate RGB to YUV transformation is given as:
`
`This transformation has a nice feature that, when R+G+B, then Y=R=G=B, and U=V=0. In this case,
`the image is a grayscale image.
`
`Color conversion from RGB to YUV requires several multiplications, which can be computationally
`expensive. An approximation, proposed in [W+94], can be calculated by performing bit shifts and
`adds instead multiplication operations. This approximation is defines as:
`
`
`
`IPR2018-01413
`Sony EX1012 Page 15
`
`

`

`This approximation also gives a simplified YUV to RGB transformation, expressed by:
`
`Page 10
`
`Another color format, referred to as YCbCr format, is intensively used for image compression. In
`YCbCr format, Y is the same as in a YUV system, however U and V components are scaled and zero-
`shifted to produce Cb and Cr, respectively, as follows:
`
`In this way, chrominance components Cb and Cr are always in the range [0,1].
`
`Computer Video Formats
`
`Resolutions of an image system refers to its capability to reproduce fine detail. Higher resolution
`requires more complex imaging systems to represent these images in real time. In computer systems,
`resolution is characterized with number of pixels. Table 1.3 summarizes popular computer video
`formats, and related storage requirements.
`
`Television Formats
`
`In television systems, resolution refers to the number of line pairs resolved on the face of the display
`screen, expressed in cycles per picture height, or cycles per picture width. For example, the NTSC
`broadcast system in North America and Japan, denoted as 525/59.94, has about 483 picture lines.
`
`The HDTV system will approximately double the number of lines of current broadcast television at
`approximately the same field rate. For example, a 1050x960 HDTV system will have 960 total lines.
`Spatial and temporal characteristics of conventional television systems (such as NTSC, SECAM, and
`PAL), and high-
`
`
`
`IPR2018-01413
`Sony EX1012 Page 16
`
`

`

`definition TV systems (HDTV) are presented in Tables 1.4 and 1.5, respectively [BF91].
`
`Table 1.3 Characteristics of various computer video formats.
`
`Resolution (pixels)
`
`Colors (bits)
`
`Computer Video
`Format
`
`320x200
`
`640x350
`
`640x480
`
`4 (2 bits)
`
`16 (4 bits)
`
`256 (8 bits)
`
`1024x768
`
`256 (8 bits)
`
`CGA - Color
`Graphics Adapter
`
`EGA - Enhanced
`Graphics Adapter
`
`VGA - Video
`Graphics Adapter
`
`88514/A Display
`Adapter Mode
`
`XGA - Extended
`Graphics Array (a)
`
`XGA - Extended
`Graphics Array (b)
`
`SVGA - Super VGA
`
`Page 11
`
`Storage
`Capacity
`Per Image
`
`128,000 bits=
`16 KB
`
`896,000 bits=
`112 KB
`
`2,457,600 bits=
`307.2 KB
`
`6,291,456 bits=
`786.432 KB
`
`6,291,456 bits=
`786.432 KB
`
`6,291,456 bits
`=786.432 KB
`
`65,000 (24 bits)
`
`256 (8 bits)
`
`640x480
`
`1024x768
`
`1024x768
`
`65,000 (24 bits)
`
`2.36 MB
`
`Table 1.4 Spatial characteristics of television systems [BF91].
`
`System
`
`Total Lines
`
`Active
`Lines
`
`Vertical
`Resolution
`
`Optimal
`Viewing
`Distance [m]
`
`Aspect
`Ratio
`
`Horizontal
`Resolution
`
`HDTV
`USA
`
`1050
`
`HDTV Europe
`
`1250
`
`NTSC
`
`PAL
`
`SECAM
`
`525
`
`625
`
`625
`
`960
`
`1000
`
`484
`
`575
`
`575
`
`675
`
`700
`
`242
`
`290
`
`290
`
`2.5
`
`2.4
`
`7.0
`
`6.0
`
`6.0
`
`16/9
`
`16/9
`
`4/3
`
`4/3
`
`4/3
`
`600
`
`700
`
`330
`
`425
`
`465
`
`Total
`Picture
`Elements
`
`720,000
`
`870,000
`
`106,000
`
`165,000
`
`180,000
`
`
`
`IPR2018-01413
`Sony EX1012 Page 17
`
`

`

`Page 12
`
`Table 1.5 Temporal characteristics of television systems [BF91].
`
`System
`
`Total Channel
`Width [MHz]
`
`Video Baseband
`Y [MHz]
`
`Video Baseband
`R-Y [MHz]
`
`Video Baseband
`B-Y [MHz]
`
`Scanning Rate
`Camera [Hz]
`
`Scanning Rate
`HDTV Display
`[Hz]
`
`Scanning Rate
`Convent.
`Display [Hz]
`
`HDTV USA
`
`HDTV Europe
`
`NTSC
`
`PAL
`
`SECAM
`
`9.0
`
`12.0
`
`6.0
`
`8.0
`
`8.0
`
`10.0
`
`14.0
`
`4.2
`
`5.5
`
`6.0
`
`5.0
`
`7.0
`
`1.0
`
`1.8
`
`2.0
`
`5.0
`
`7.0
`
`0.6
`
`1.8
`
`2.0
`
`59.94
`
`50
`
`59.94
`
`50
`
`50
`
`59.94
`
`100
`
`NA
`
`NA
`
`NA
`
`59.94
`
`50
`
`59.94
`
`50
`
`50
`
`1.4—
`Overview of the Book
`
`This book is divided into ten chapters:
`
`1. Video compression. This current chapter introduces the problem of compressing motion video, illustrates the motivation for the 3-D
`solution chosen in the book, and briefly describes the proposed solution. Image and video formats are introduced as well.
`
`2. MPEG. This chapter describes the MPEG compression standard. Important contributions in the field and related work are emphasized.
`
`3. H.261/H.263. This chapter describes the compression standard for video telecommunications.
`
`4. XYZ compression. The XYZ video compression algorithm is described in detail in this chapter. Both encoder and decoder are presented, as
`well as an example of compressing 8x8x8 video block.
`
`5. 3-D DCT. The theory of the Discrete Cosine Transform is developed and extended to three dimensions. A fast 3-D algorithm is developed.
`
`6. Quantization. Discussion is presented on the issues of determining optimal quantizers using various error criteria. A model of Human
`Visual System is used to develop factors that weigh the DCT coefficients according to their relative visibility.
`
`7. Entropy coding. A method for encoding the quantized coefficients is developed based on the stochastic behavior of the pixel data.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 18
`
`

`

`Page 13
`
`8. VLSI architectures for XYZ codec. Issues concerning real-time implementation of the XYZ
`compression algorithm are analyzed including the complexity of the algorithm and mapping the
`algorithm into various VLSI architectures.
`
`9. Results. Obtained results of an implementation of the XYZ compression algorithm are presented.
`
`10. Conclusion. Summary of contributions are outlined, emphasizing the real-time features of the
`compression algorithm, visual quality, and compression ratio. Directions for future research are given
`as well.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 19
`
`

`

`|PR2018—01413
`
`Sony EX1012 Page 20
`
`IPR2018-01413
`Sony EX1012 Page 20
`
`

`

`Page 15
`
`2—
`The MPEG Video Compression Standard
`
`The Motion Picture Experts' Group was assembled by the International Standards Organization (ISO)
`to establish standards for the compression, encoding, and decompression of motion video. MPEG-1
`[IS92b] is a standard supporting compression of image resolutions of approximately 352x288 at 30
`fps into a data stream of 1.5 Mbps. This data rate is suitable for pressing onto CD-ROM. The MPEG-
`2 standard [IS93b] supports compression of broadcast television (704x576 at 30 fps) and HDTV
`(1920x1152 at 60 fps) of up to 60 Mpixels/sec (appx. 700 Mb) at compression ratios of roughly three
`times those expected of moving JPEG [IS92a] (playback rates of up to 80 Mbps).
`
`The MPEG standard specifies the functional organization of a decoder. The data stream is cached in a
`buffer to reduce the effect of jitter in delivery and decode, and is demultiplexed into a video stream,
`an audio stream, and additional user-defined streams. The video stream is decoded into a ''video
`sequence" composed of the sequence header and groups of pictures.
`
`2.1—
`MPEG Encoder and Decoder
`
`The specification of the MPEG encoder defines many compression options. While all of these options
`must be supported by the decoder, the selection of which options to support in compression is left to
`the discretion of the implementer. An MPEG encoder may choose compression options balancing the
`need for high compression
`
`
`
`IPR2018-01413
`Sony EX1012 Page 21
`
`

`

`Page 16
`
`ratios against the complexity of motion compensation or adaptive quantization calculations.
`Decisions will be affected by such factors as:
`
`• A need for real-time compression. MPEG algorithms are complex, and there may not be sufficient
`time to implement exotic options on a particular platform.
`
`• A need for high compression ratios. For highest possible compression ratios at highest possible
`quality, every available option must be exercised.
`
`• A need for insensitivity to transmission error. MPEG-2 supports recovery from transmission errors.
`Some error recovery mechanisms are implemented by the encoder.
`
`• Fast algorithms. Development of fast algorithms may make compression options available that
`would otherwise be impractical.
`
`• Availability of specialized hardware. Dedicated hardware may increase the performance of the
`encoder to the point that additional compression options can be considered.
`
`In the MPEG standard, frames in a sequence are coded using three different algorithms, as illustrated
`in Figure 2.1.
`
`Figure 2.1
`Types of frames in the MPEG standard.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 22
`
`

`

`Page 17
`
`I frames (intra frames) are self-contained and coded using a DCT-based technique similar to JPEG. I
`frames are used as random access points in MPEG streams, and they give the lowest compression
`ratios within MPEG.
`
`P frames (predicted frames) are coded using forward predictive coding, where the actual frame is
`coded with reference to a pervious frame (I or P). This process is similar to H.261/H.263 predictive
`coding, except the previous frame is not always the closest previous frames, as in H.261/H.263
`coding. The compression ratio of P frames is significantly higher than of I frames.
`
`B frames (bidirectional or interpolated frames) are coded using two reference frames, a past and a
`future frame (which can be I or P frames). Bidirectional, or interpolated coding provides the highest
`amount of compression [Fur95b].
`
`I, P, and B frames are described in more detail in Section 8.2. Note that in Figure 2.1, the first three B
`frames (2,3, and 4) are bidirectionally coded using the past frame I (frame 1), and the future frame P
`(frame 5). Therefore, the decoding order will differ from the encoding order. The P frame 5 must be
`decoded before B frames 2,3, and 4, and I frame 9 before B frames 6,7, and 8. If the MPEG sequence
`is transmitted over the network, the actual transmission order should be {1,5,2,,3,4,8,6,7,8}.
`
`The MPEG application determines a sequence of I, P, and B frames. If there is a need for fast random
`access, the best resolution would be achieved by coding the whole sequence as I frames (MPEG
`becomes identical to Motion JPEG). However, the highest compression ratio can be achieved by
`incorporating a large number of B frames.
`
`The block diagram of the MPEG encoder is given in Figure 2.2, while the MPEG decoder is shown in
`Figure 2.3.
`
`I frames are created similarly to JPEG encoded pictures, while P and B frames are encoded in terms
`of previous and future frames. The motion vector is estimated, and the difference between the
`predicted and actual blocks (error terms) are calculated. The error terms are then DCT encoded and
`the entropy encoder is used to produce the compact code.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 23
`
`

`

`Page 18
`
`Figure 2.2
`The block diagram of the MPEG encoder.
`
`2.2—
`MPEG Data Stream
`
`The MPEG specification defines a "video sequence" composed of a video sequence header and many Group-
`Of-Pictures (GOP), as illustrated in Figure 2.4. The video sequence header defines the video format, picture
`dimensions, aspect ratio, frame rate, and delivered data rate. Supported video formats include CCIR601,
`HDTV(16:9), and VGA. Supported chroma formats include "4:2:0" (YUV) and
`
`
`
`IPR2018-01413
`Sony EX1012 Page 24
`
`

`

`"4:4:4" (RGB). A suggested buffer size for the video sequence is also specified, a number intended to buffer
`jitter caused by differences in decode time.
`
`Page 19
`
`Figure 2.3
`The block diagram of the MPEG decoder.
`
`A GOP contains pictures that may be encoded into one of three supported compression formats. The GOP
`header contains a starting time for the group, and can therefore be used as a point of random access. Each frame
`within the GOP is numbered, and its number coupled with the GOP start time and the playback frame rate
`determines its playback time. Each picture is subdivided into "slices" and then into "macroblocks". A
`macroblock is composed of four 8x8 blocks of luminance data, and typically two 8x8 blocks of chrominance
`data, one Cr and one Cb.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 25
`
`

`

`Page 20
`
`Figure 2.4
`MPEG data stream.
`
`I Picture Format
`
`The I (Intraframe) picture format substantially corresponds to the JPEG format. These pictures are
`encoded by transformation into DCT space, quantization of the resultant coefficients, and entropy
`coding of the result. Transformation into DCT space is performed by an 8x8 DCT. Quantization is
`performed by reference to a user-loadable quantization table modified by a scale factor. This
`mechanism supports adaptive quantization at the cost of additional complexity - although 30%
`improvement in compression is claimed [PM93].
`
`After quantization, the resulting coefficients are reordered in zig-zag order, run-length coded,
`variable-length coded, and entropy coded. The resulting data stream should show roughly JPEG
`levels of compression.
`
`P Picture Format
`
`The P (Predicted) picture format introduces the concept of motion compensation. Each macroblock is
`coded with a vector that predicts its value from an earlier I or P
`
`
`
`IPR2018-01413
`Sony EX1012 Page 26
`
`

`

`Page 21
`
`frame. The decoding process copies the contents of the macroblock-sized data at the address
`referenced by the vector into the macroblock of the P frame currently being decoded. Five bits of
`resolution are reserved for the magnitude of the vector in each of the x and y directions, meaning that
`1024 possible data blocks may be referenced by the predicted macroblock. However, eight possible
`magnitude ranges may be assigned to those five bits, meaning as many as 8192 macroblocks might
`have to be evaluated to exhaustively determine the best vector. Each evaluation might require testing
`as many as 384 pixels, and a further complexity is seen in performing fractional interpolation of
`pixels (vector motions as small as 1/2 pixel are supported). Finally, the difference between the
`prediction and the macroblock to be compressed may be encoded in like fashion to I frame encoding
`above.
`
`B Picture Format
`
`The B (Bidirectional prediction) picture format is calculated with two vectors. A backwards vector
`references a macroblock-sized region in the previous I or P frame, the forward vector references a
`macroblock-sized region in the next I or P frame. For this reason, I and P frames are placed in the
`coded stream before any B frames that reference them.
`
`The macroblock-sized regions referenced by the motion compensation vectors are averaged to
`produce the motion estimate for the macroblock being decoded. As with P frames, the error between
`the prediction and the frame being encoded is compressed and placed in the bitstream. The error
`factor is decompressed and added to the prediction to form the B frame macroblock.
`
`Many demanding technical issues are raised by the MPEG specification. These include fast
`algorithms for the DCT, fast algorithms for motion vector estimation, algorithms for adaptive
`quantization, and decompression in environments that allow some errors.
`
`
`
`IPR2018-01413
`Sony EX1012 Page 27
`
`

`

`|PR2018—01413
`
`Sony EX1012 Page 28
`
`IPR2018-01413
`Sony EX1012 Page 28
`
`

`

`Page 23
`
`3—
`The H.261/H.263 Compression Standard for Video Telecommunications
`
`ITU has developed a video conferencing standard H.324 at very low bitrate for the General Switched
`Telephone Network (GSTN) and mobile radio [IT95a, IT95b, IT93]]. The H.324 is a
`recommendation for real-time vo

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket