Comcast - Exhibit 1019, page 1
`
Academic Press Series in
Communications, Networking, and Multimedia

EDITOR-IN-CHIEF

Jerry D. Gibson
Southern Methodist University

This series has been established to bring together a variety of publications that represent the latest in cutting-edge research, theory, and applications of modern communication systems. All traditional and modern aspects of communications as well as all methods of computer communications are to be included. The series will include professional handbooks, books on communication methods and standards, and research books for engineers and managers in the world-wide communications industry.
`
`
`
This book is printed on acid-free paper.

Copyright © 2000 by Academic Press

All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Requests for permission to make copies of any part of the work should be mailed to the following address: Permissions Department, Harcourt, Inc., 6277 Sea Harbor Drive, Orlando, Florida, 32887-6777.

Explicit permission from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press article in another scientific or research publication provided that the material has not been credited to another source and that full credit to the Academic Press article is given.

ACADEMIC PRESS
A Harcourt Science and Technology Company
525 B Street, Suite 1900, San Diego, CA 92101-4495, USA
http://www.academicpress.com

Academic Press
Harcourt Place, 32 Jamestown Road, London, NW1 7BY, UK
http://www.hbuk.co.uk/ap/

Library of Congress Catalog Number: 99-69120

ISBN: 0-12-119790-5

Printed in Canada

00 01 02 03 04 05 FR 9 8 7 6 5 4 3 2 1
`
`
6.1

Basic Concepts and
Techniques of Video Coding
and the H.261 Standard

Barnett
University of Texas

1 Introduction                                               555
2 Introduction to Video Compression                          556
3 Video Compression Application Requirements                 558
4 Digital Video Signals and Formats                          560
  4.1 Sampling of Analog Video Signals • 4.2 Digital Video Formats
5 Video Compression Techniques                               563
  5.1 Entropy and Predictive Coding • 5.2 Block Transform Coding: The Discrete Cosine
  Transform • 5.3 Quantization • 5.4 Motion Compensation and Estimation
6 Video Encoding Standards and H.261                         569
  6.1 The H.261 Video Encoder
7 Closing Remarks                                            573
References                                                   573
`
1 Introduction

The subject of video coding is of fundamental importance to many areas in engineering and the sciences. Video engineering is quickly becoming a largely digital discipline. The digital transmission of television signals via satellites is commonplace, and widespread HDTV terrestrial transmission is slated to begin in 1999. Video compression is an absolute requirement for the growth and success of the low-bandwidth transmission of digital video signals. Video encoding is being used wherever digital video communications, storage, processing, acquisition, and reproduction occur. The transmission of high-bandwidth multimedia information over high-speed computer networks is a central problem in the design of Quality of Service (QoS) for digital transmission providers. The Motion Pictures Expert Group (MPEG) has already finalized two video coding standards, MPEG-1 and MPEG-2, that define methods for the transmission of digital video information for multimedia and television formats. MPEG-4 is currently addressing the transmission of very low bitrate video. MPEG-7 is addressing the standardization of video storage and retrieval services (Chapters 9.1 and 9.2 discuss video storage and retrieval). A central aspect of each of the MPEG standards is the video encoding and decoding algorithms that make digital video applications practical. The MPEG standards are discussed in Chapters 6.4 and 6.5.
Video compression not only reduces the storage requirements or transmission bandwidth of digital video applications, but it also affects many system performance tradeoffs. The design and selection of a video encoder therefore is not only based on its ability to compress information. Issues such as bitrate versus distortion criteria, algorithm complexity, transmission channel characteristics, algorithm symmetry versus asymmetry, video source statistics, fixed versus variable rate coding, and standards compatibility should be considered in order to make good encoder design decisions.

The growth of digital video applications and technology in the past few years has been explosive, and video compression is playing a central role in this success. Yet, the video coding discipline is relatively young and certainly will evolve and change significantly over the next few years. Research in video coding has great vitality and the body of work is significant. It is apparent that this relevant and important topic will have an immense effect on the future of digital video technologies.
`
`
Handbook of Image and Video Processing

2 Introduction to Video Compression
`
Video or visual communications require significant amounts of information transmission. Video compression, as considered here, involves the bitrate reduction of a digital video signal carrying visual information. Traditional video-based compression, like other information compression techniques, focuses on eliminating the redundant elements of the signal. The degree to which the encoder reduces the bitrate is called its coding efficiency; equivalently, its inverse is termed the compression ratio:

coding efficiency = (compression ratio)^(-1) = encoded bitrate / decoded bitrate.    (1)
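As a quick numerical illustration of Eq. (1), the relation can be checked with a short script; the source and encoded bitrates below are hypothetical figures, not values from the text:

```python
# Illustration of Eq. (1): coding efficiency is the inverse of the
# compression ratio. The 249 Mbps source and 6 Mbps channel are
# hypothetical example figures.
def compression_ratio(source_bitrate, encoded_bitrate):
    """Ratio of the original (decoded) bitrate to the encoded bitrate."""
    return source_bitrate / encoded_bitrate

def coding_efficiency(source_bitrate, encoded_bitrate):
    """Eq. (1): encoded bitrate / decoded bitrate."""
    return encoded_bitrate / source_bitrate

src, enc = 249e6, 6e6  # bits per second
print(compression_ratio(src, enc))  # 41.5
print(coding_efficiency(src, enc))  # ~0.024
```

The two quantities are reciprocals, so a higher compression ratio means a smaller (more aggressive) coding efficiency value.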
`
Compression can be a lossless or lossy operation. Because of the immense volume of video information, lossy operations are mainly used for video compression. The loss of information or distortion measure is usually evaluated with the mean square error (MSE), mean absolute error (MAE) criteria, or peak signal-to-noise ratio (PSNR):

    MSE = (1/MN) Σ_{i=1}^{M} Σ_{j=1}^{N} [I(i, j) − Î(i, j)]²,

    MAE = (1/MN) Σ_{i=1}^{M} Σ_{j=1}^{N} |I(i, j) − Î(i, j)|,          (2)

    PSNR = 20 log₁₀ [(2ⁿ − 1) / MSE^{1/2}],

for an image I and its reconstructed image Î, with pixel indices 1 ≤ i ≤ M and 1 ≤ j ≤ N, image size M × N pixels, and n bits per pixel.

The MSE, MAE, and PSNR as described here are global measures and do not necessarily give a good indication of the reconstructed image quality. In the end, the human observer determines the quality of the reconstructed image and video. The concept of distortion versus coding efficiency is one of the most fundamental tradeoffs in the technical evaluation of video encoders. The topic of quality assessment of compressed images and video is discussed in Section 8.2.
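The three measures in Eq. (2) can be sketched directly with NumPy; the 8-bit default and the one-gray-level example below are assumptions for illustration:

```python
import numpy as np

def mse(I, I_rec):
    """Mean square error between image I and reconstruction I_rec."""
    return np.mean((np.asarray(I, float) - np.asarray(I_rec, float)) ** 2)

def mae(I, I_rec):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(I, float) - np.asarray(I_rec, float)))

def psnr(I, I_rec, n=8):
    """Eq. (2): PSNR in dB for n-bit pixels, with peak value 2**n - 1."""
    return 20 * np.log10((2 ** n - 1) / np.sqrt(mse(I, I_rec)))

# A reconstruction off by one gray level everywhere: MSE = MAE = 1,
# PSNR = 20 log10(255) ~ 48.13 dB.
I = np.full((8, 8), 100.0)
print(mse(I, I + 1), mae(I, I + 1), round(psnr(I, I + 1), 2))
```

Note that PSNR depends only on the error statistics, which is exactly why it is a "global" measure that may not track perceived quality.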
Video signals contain information in three dimensions. These dimensions are modeled as spatial and temporal dimensions in video encoding. Digital video compression methods seek to minimize information redundancy independently in each dimension. The major international video compression standards (MPEG-1, MPEG-2, H.261) use this approach. Figure 1 schematically depicts a generalized video compression system that implements the spatial and temporal encoding of a digital image sequence. Each image in the sequence I_k is defined as in Eq. (1). The spatial encoder operates on image blocks, typically on the order of 8 × 8 pixels each. The temporal encoder generally operates on 16 × 16 pixel image blocks. The system is designed for two modes of operation, the intraframe mode and the interframe mode.

The single-layer feedback structure of this generalized model is representative of the encoders that are recommended by the International Standards Organization (ISO) and International Telecommunications Union (ITU) video coding standards MPEG-1, MPEG-2/H.262, and H.261 [1-3]. The feedback loop is used in the interframe mode of operation and generates a prediction error between the blocks of the current frame and the current prediction frame. The prediction is generated by the motion compensator. The motion estimation unit creates motion vectors for each 16 × 16 block. The motion vectors and previously reconstructed frame are fed to the motion compensator to create the prediction.
`
FIGURE 1 Generalized video compression system. The spatial operator T and quantizer Q feed a variable length coder (VLC); the feedback loop (inverse quantizer Q⁻¹, inverse spatial operator T⁻¹, delayed frame memory, motion estimation, and motion compensation) is open in intraframe mode and closed in interframe mode. The output is either an encoded intraframe subblock or an encoded interframe prediction error plus VLC-encoded motion vectors MV_k.
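The arithmetic of the interframe path in Fig. 1 can be sketched as follows; the identity spatial operator, the zero-motion prediction, and the uniform quantizer step are simplifying stand-ins for this sketch, not the operators any standard actually specifies:

```python
import numpy as np

def interframe_step(I_k, I_rec_prev, step=8.0):
    """One pass of the Fig. 1 feedback loop with simplifying stand-ins:
    the spatial operator T is the identity and the quantizer is a
    uniform scalar quantizer with the given step size."""
    P_k = I_rec_prev                      # prediction (zero-motion stand-in)
    E_k = I_k - P_k                       # prediction error
    E_rec = np.round(E_k / step) * step   # quantize/dequantize (lossy)
    I_rec = P_k + E_rec                   # what the decoder reconstructs
    return E_rec, I_rec

prev = np.zeros((4, 4))
cur = np.full((4, 4), 13.0)
E_rec, I_rec = interframe_step(cur, prev)
print(I_rec[0, 0])  # 16.0: the error 13 quantized to the nearest multiple of 8
```

The point of the loop is visible even in this toy form: the encoder predicts from the *reconstructed* previous frame, so encoder and decoder stay in lockstep despite the lossy quantizer.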
`
`
The intraframe mode spatially encodes an entire current frame on a periodic basis, e.g., every 15 frames, to ensure that systematic errors do not continuously propagate. The intraframe mode can also be used to spatially encode a block whenever the interframe encoding mode cannot meet its performance threshold. The intraframe versus interframe mode selection algorithm is not included in this diagram. It is responsible for controlling the selection of the encoding functions, data flows, and output data streams for each mode.

The intraframe encoding mode does not receive any input from the feedback loop. I_k is spatially encoded and losslessly encoded by the variable length coder (VLC), forming Ĩ_k, which is transmitted to the decoder. The receiver decodes Ĩ_k, producing the reconstructed image subblock Î_k. During the interframe encoding mode, the current frame prediction P_k is subtracted from the current frame input I_k to form the current prediction error E_k. The prediction error is then spatially and VLC encoded to form Ẽ_k, and it is transmitted along with the VLC-encoded motion vectors MV_k. The decoder can reconstruct the current frame Î_k by using the previously reconstructed frame Î_{k−1} (stored in the decoder), the current frame motion vectors, and the prediction error. The motion vectors MV_k operate on Î_{k−1} to generate the current prediction frame P_k. The encoded prediction error is decoded to produce the reconstructed prediction error Ê_k. The prediction error is added to the prediction to form the current frame Î_k. The functional elements of the generalized model are described here in detail.
`
1. Spatial operator: this element is generally a unitary two-dimensional linear transform, but in principle it can be any unitary operator that can distribute most of the signal energy into a small number of coefficients, i.e., decorrelate the signal data. Spatial transformations are successively applied to small image blocks in order to take advantage of the high degree of data correlation in adjacent image pixels. The most widely used spatial operator for image and video coding is the discrete cosine transform (DCT). It is applied to 8 × 8 pixel image blocks and is well suited for image transformations because it uses real computations with fast implementations, provides excellent decorrelation of signal components, and avoids generation of spurious components between the edges of adjacent image blocks.

2. Quantizer: the spatial or transform operator is applied to the input in order to arrange the signal into a more suitable format for subsequent lossy and lossless coding operations. The quantizer operates on the transform-generated coefficients. This is a lossy operation that can result in a significant reduction in the bitrate. The quantization method used in this kind of video encoder is usually scalar and nonuniform. The scalar quantizer simplifies the complexity of the operation as compared to vector quantization (VQ). The nonuniform quantization interval is sized according to the distribution of the transform coefficients in order to minimize the bitrate and the distortion created by the quantization process. Alternatively, the quantization interval size can be adjusted based on the performance of the Human Visual System (HVS). The Joint Pictures Expert Group (JPEG) standard includes two (luminance and color difference) HVS sensitivity weighted quantization matrices in its "Examples and Guidelines" annex. JPEG coding is discussed in Sections 5.5 and 5.6.
3. Variable length coding: the lossless VLC is used to exploit the "symbolic" redundancy contained in each block of transform coefficients. This step is termed "entropy coding" to designate that the encoder is designed to minimize the source entropy. The VLC is applied to a serial bit stream that is generated by scanning the transform coefficient block. The scanning pattern should be chosen with the objective of maximizing the performance of the VLC. The MPEG encoder, for instance, describes a zigzag scanning pattern that is intended to maximize transform zero coefficient run lengths. The H.261 VLC is designed to encode these run lengths by using a variable length Huffman code.
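A toy version of the intraframe path described in items 1-3 (8 × 8 DCT, uniform quantization, zigzag scan) might look like the following; the single quantizer step stands in for the HVS-weighted matrices, and the final Huffman/VLC stage is omitted:

```python
import numpy as np

def dct_matrix(N=8):
    """Orthonormal DCT-II basis, so that coeffs = C @ block @ C.T."""
    C = np.array([[np.cos((2 * j + 1) * i * np.pi / (2 * N))
                   for j in range(N)] for i in range(N)]) * np.sqrt(2.0 / N)
    C[0, :] /= np.sqrt(2.0)
    return C

def zigzag_order(N=8):
    """Zigzag scan order: groups coefficients so zero runs are long."""
    return sorted(((i, j) for i in range(N) for j in range(N)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def encode_block(block, q=16.0):
    """DCT, uniform quantization, and zigzag scan of one 8x8 block."""
    C = dct_matrix()
    coeffs = C @ (np.asarray(block, float) - 128.0) @ C.T  # level shift
    quant = np.round(coeffs / q).astype(int)
    return [quant[i, j] for i, j in zigzag_order()]

# A flat block yields one DC coefficient and 63 zeros: ideal input for
# run-length coding after the zigzag scan.
symbols = encode_block(np.full((8, 8), 160.0))
print(symbols[:4], symbols.count(0))  # [16, 0, 0, 0] 63
```

Because the DCT matrix is orthonormal, the inverse operator in the feedback loop is simply its transpose applied on both sides.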
`
The feedback loop sequentially reconstructs the encoded spatial and prediction error frames and stores the results in order to create a current prediction. The elements required to do this are the inverse quantizer, inverse spatial operator, delayed frame memory, motion estimator, and motion compensator.

1. Inverse operators: the inverse operators Q⁻¹ and T⁻¹ are applied to the encoded current frame Ĩ_k or the current prediction error Ẽ_k in order to reconstruct and store the frame for the motion estimator and motion compensator to generate the next prediction frame.
2. Delayed frame memory: both current and previous frames must be available to the motion estimator and motion compensator to generate a prediction frame. The number of previous frames stored in memory can vary based upon the requirements of the encoding algorithm. MPEG-1 defines a B frame that is a bidirectional encoding that requires that motion prediction be performed in both the forward and backward directions. This necessitates storage of multiple frames in memory.

3. Motion estimation: the temporal encoding aspect of this system relies on the assumption that rigid body motion is responsible for the differences between two or more successive frames. The objective of the motion estimator is to estimate the rigid body motion between two frames. The motion estimator operates on all current frame 16 × 16 image blocks and generates the pixel displacement or motion vector for each block. The technique used to generate motion vectors is called block-matching motion estimation and is discussed further in Section 5.4. The method uses the current frame I_k and the previous reconstructed frame
Î_{k−1} as input. Each block in the previous frame is assumed to have a displacement that can be found by searching for it in the current frame. The search is usually constrained to be within a reasonable neighborhood so as to minimize the complexity of the operation. Search matching is usually based on a minimum MSE or MAE criterion. When a match is found, the pixel displacement is used to encode the particular block. If a search does not meet a minimum MSE or MAE threshold criterion, the motion compensator will indicate that the current block is to be spatially encoded by using the intraframe mode.

4. Motion compensation: the motion compensator makes use of the current frame motion estimates MV_k and the previously reconstructed frame Î_{k−1} to generate the current frame prediction P_k. The current frame prediction is constructed by placing the previous frame blocks into the current frame according to the motion estimate pixel displacement. The motion compensator then decides which blocks will be encoded as prediction error blocks using motion vectors and which blocks will only be spatially encoded.
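Full-search block matching with the MAE criterion, as described in item 3, can be sketched as follows; the 16 × 16 block size comes from the text, while the ±7-pixel search window is an assumed example value:

```python
import numpy as np

def block_match(cur, prev_rec, bi, bj, B=16, search=7):
    """Find the motion vector for the BxB current-frame block at
    (bi, bj) by exhaustively testing displacements within +/-search
    pixels in the previously reconstructed frame, minimizing the MAE."""
    block = np.asarray(cur, float)[bi:bi + B, bj:bj + B]
    best_err, best_mv = np.inf, (0, 0)
    H, W = prev_rec.shape
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            i, j = bi + di, bj + dj
            if i < 0 or j < 0 or i + B > H or j + B > W:
                continue  # candidate window falls outside the frame
            cand = np.asarray(prev_rec, float)[i:i + B, j:j + B]
            err = np.mean(np.abs(block - cand))
            if err < best_err:
                best_err, best_mv = err, (di, dj)
    return best_mv, best_err

# A frame shifted by (2, 3) pixels is matched exactly at mv = (-2, -3).
rng = np.random.default_rng(0)
prev = rng.random((32, 32))
cur = np.roll(prev, (2, 3), axis=(0, 1))
print(block_match(cur, prev, 12, 12))
```

The exhaustive double loop makes the quadratic cost of full search plain, which is why practical encoders use constrained or hierarchical search strategies.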
`
The generalized model does not address some video compression system details such as the bit-stream syntax (which supports different application requirements), or the specifics of the encoding algorithms. These issues are dependent upon the video compression system design.

Alternative video encoding models have also been researched. Three-dimensional (3-D) video information can be compressed directly using VQ or 3-D wavelet encoding models. VQ encodes a 3-D block of pixels as a codebook index that denotes its "closest or nearest neighbor" in the minimum squared or absolute error sense. However, the VQ codebook size grows on the order of the number of possible inputs. Searching the codebook space for the nearest neighbor is generally very computationally complex, but structured search techniques can provide good bitrates, quality, and computational performance. Tree-structured VQ (TSVQ) [13] reduces the search complexity from codebook size N to log N, with a corresponding loss in average distortion. The simplicity of the VQ decoder (it only requires a table lookup for the transmitted codebook index) and its bitrate-distortion performance make it an attractive alternative for specialized applications. The complexity of the codebook search generally limits the use of VQ in real-time applications. Vector quantizers have also been proposed for interframe, variable bitrate, and subband video compression methods [4].
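The nearest-neighbor encode and table-lookup decode that give VQ its asymmetric complexity can be sketched as follows; the tiny two-dimensional codebook is a hypothetical example:

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Exhaustive nearest-neighbor search in the minimum squared
    error sense: each input vector becomes a codebook index. The
    cost grows with codebook size, the encoder's bottleneck."""
    dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def vq_decode(indices, codebook):
    """Decoding is only a table lookup, hence the simple VQ decoder."""
    return codebook[indices]

codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
data = np.array([[1.0, 2.0], [9.0, 8.0], [2.0, 9.0]])
idx = vq_encode(data, codebook)
print(idx, vq_decode(idx, codebook)[0])
```

The asymmetry is visible directly: `vq_encode` compares every input against every codeword, while `vq_decode` is a single array indexing operation.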
Three-dimensional wavelet encoding is a topic of recent interest. This video encoding method is based on the discrete wavelet transform methods discussed in Section 5.4. The wavelet transform is a relatively new transform that decomposes a signal into a multiresolution representation. The multiresolution decomposition makes the wavelet transform an excellent signal analysis tool because signal characteristics can be viewed in a variety of time-frequency scales. The wavelet transform is implemented in practice by the use of multiresolution or subband filterbanks. The wavelet filterbank is well suited for video encoding because of its ability to adapt to the multiresolution characteristics of video signals. Wavelet transform encodings are naturally hierarchical in their time-frequency representation and easily adaptable for progressive transmission [6]. They have also been shown to possess excellent bitrate-distortion characteristics.

Direct three-dimensional video compression systems suffer from a major drawback for real-time encoding and transmission. In order to encode a sequence of images in one operation, the sequence must be buffered. This introduces a buffering and computational delay that can be very noticeable in the case of interactive video communications.

Video compression techniques treating visual information in accordance with HVS models have recently been introduced. These methods are termed "second-generation" or object-based methods, and attempt to achieve very large compression ratios by imitating the operations of the HVS. The HVS model can also be incorporated into more traditional video compression techniques by reflecting visual perception into various aspects of the coding algorithm. HVS weightings have been designed for the DCT AC coefficients quantizer used in the MPEG encoder. A discussion of these techniques can be found in Chapter 6.3.

Digital video compression is currently enjoying tremendous growth, partially because of the great advances in VLSI, ASIC, and microcomputer technology in the past decade. The real-time nature of video communications necessitates the use of general purpose and specialized high-performance hardware devices. In the near future, advances in design and manufacturing technologies will create hardware devices that will allow greater adaptability, interactivity, and interoperability of video applications. These advances will challenge future video compression technology to support format-free implementations.
`
3 Video Compression Application Requirements

A wide variety of digital video applications currently exist. They range from simple low-resolution and low-bandwidth applications (multimedia, PicturePhone) to very high-resolution and high-bandwidth (HDTV) demands. This section will present requirements of current and future digital video applications and the demands they place on the video compression system.

As a way to demonstrate the importance of video compression, the transmission of digital video television signals is presented. The bandwidth required by a digital television signal is approximately one-half the number of picture elements (pixels) displayed per second. The analog pixel size in the vertical dimension is the distance between scanning lines, and in the horizontal dimension it is the distance the scanning spot moves during ½ cycle
`
of the highest video signal transmission frequency. The bandwidth is given by Eq. (3):
`
    B_W = (cycles/frame)(FR)
        = (cycles/line)(N_L)(FR)
        = (0.5)(aspect ratio)(FR)(N_L)(R_H)/0.84
        ≈ (0.8)(FR)(N_L)(R_H),                                    (3)

where

    B_W = system bandwidth,
    FR  = number of frames transmitted per second (fps),
    N_L = number of scanning lines per frame,
    R_H = horizontal resolution (lines), proportional to pixel resolution.
`
The National Television Systems Committee (NTSC) aspect ratio is 4/3, the constant 0.5 is the ratio of the number of cycles to the number of lines, and the factor 0.84 is the fraction of the horizontal scanning interval that is devoted to signal transmission. The NTSC transmission standard used for television broadcasts in the United States has the following parameter values: FR = 29.97 fps, N_L = 525 lines, and R_H = 340 lines. This yields a video system bandwidth B_W of 4.2 MHz for the NTSC broadcast system. In order to transmit a color digital video signal, the digital pixel format must be defined. The digital color pixel is made up of three components: one luminance (Y) component occupying 8 bits, and two color difference components (U and V) each requiring 8 bits. The NTSC picture frame has 720 × 480 × 2 total luminance and color pixels. In order to transmit this information for an NTSC broadcast system at 29.97 frames/s, the following bandwidth is required:

    B_W = ½ bitrate = ½ (29.97 fps) × (24 bits/pixel) × (720 × 480 × 2 pixels/frame) ≈ 249 MHz.
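The NTSC numbers quoted above check out arithmetically:

```python
# Numerical check of Eq. (3) and the digital-bandwidth example above.
def analog_bandwidth(fr, n_lines, r_h, aspect=4 / 3, active=0.84):
    """Eq. (3): Bw = 0.5 * aspect * FR * NL * RH / 0.84, in Hz."""
    return 0.5 * aspect * fr * n_lines * r_h / active

ntsc = analog_bandwidth(29.97, 525, 340)            # ~4.2e6 Hz
digital = 0.5 * 29.97 * 24 * (720 * 480 * 2)        # ~249e6 Hz
print(round(ntsc / 1e6, 1), round(digital / 1e6))   # 4.2 249
print(round(digital / ntsc), round(digital / 6e6))  # 59 41
```

The last line reproduces the two ratios cited in the text: roughly 59 times the 4.2-MHz system bandwidth and roughly 41 times the 6-MHz channel bandwidth.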
`
This represents an increase of ~59 times the available system bandwidth, and ~41 times the full transmission channel bandwidth (6 MHz) for current NTSC signals. HDTV picture resolution requires up to three times more raw bandwidth than this example! (Two transmission channels totaling 12 MHz are allocated for terrestrial HDTV transmissions.) It is clear from this example that terrestrial television broadcast systems will require digital transmission and digital video compression to achieve the overall bitrate reduction and image quality requirements for HDTV signals.

This example not only points out the significant bandwidth requirements for digital video information, but also indirectly brings up the issue of digital video quality requirements. The tradeoff between bitrate and quality or distortion is a fundamental issue facing the design of video compression systems. To this end, it is important to fully characterize an application's video communications requirements before designing or selecting an appropriate video compression system. Factors that should be considered in the design and selection of a video compression system include the following items.
`
1. Video characteristics: video parameters such as the dynamic range, source statistics, pixel resolution, and noise content can affect the performance of the compression system.

2. Transmission requirements: transmission bitrate requirements determine the power of the compression system. Very high transmission bandwidth, storage capacity, or quality requirements may necessitate lossless compression. Conversely, extremely low bitrate requirements may dictate compression systems that trade off image quality for a large compression ratio. Progressive transmission is a key issue for selection of the compression system. It is generally used when the transmission bandwidth exceeds the compressed video bandwidth. Progressive coding refers to a multiresolution, hierarchical, or subband encoding of the video information. It allows for transmission and reconstruction of each resolution independently from low to high resolution. In addition, channel errors affect system performance and the quality of the reconstructed video. Channel errors can affect the bit stream randomly or in burst fashion. The channel error characteristics can have different effects on different encoders, and they can range from local to global anomalies. In general, transmission error correction codes (ECC) are used to mitigate the effect of channel errors, but awareness and knowledge of this issue is important.
3. Compression system characteristics and performance: the nature of video applications makes many demands on the video compression system. Interactive video applications such as videoconferencing demand that the video compression systems have symmetric capabilities. That is, each participant in the interactive video session must have the same video encoding and decoding capabilities, and the system performance requirements must be met by both the encoder and decoder. In contrast, television broadcast video has significantly greater performance requirements at the transmitter because it has the responsibility of providing real-time high quality compressed video that meets the transmission channel capacity. Digital video system implementation requirements can vary significantly. Desktop televideo conferencing can be implemented by using software encoding and decoding, or it may require specialized hardware and transmission capabilities to provide a high-quality performance. The characteristics of the application will dictate the suitability of the video compression algorithm for particular system implementations.
`
`
`
The importance of the encoder and system implementation decision cannot be overstated; system architectures and performance capabilities are changing at a rapid pace, and the choice of the best solution requires careful analysis of all possible system and encoder alternatives.

4. Rate-distortion requirements: the rate-distortion requirement is a basic consideration in the selection of the video encoder. The video encoder must be able to provide the bitrate(s) and video fidelity (or range of video fidelity) required by the application. Otherwise, any aspect of the system may not meet specifications. For example, if the bitrate specification is exceeded in order to support a lower MSE, a larger than expected transmission error rate may cause a catastrophic system failure.

5. Standards requirements: video encoder compatibility with existing and future standards is an important consideration if the digital video system is required to interoperate with existing or future systems. A good example is that of a desktop videoconferencing application supporting a number of legacy video compression standards. This results in requiring support of the older video encoding standards on new equipment designed for a newer, incompatible standard. Videoconferencing equipment not supporting the old standards would be less capable, or incapable, of working in environments supporting older standards.
`
These factors are displayed in Table 1 to demonstrate video compression system requirements for some common video communications applications. The video compression system designer at a minimum should consider these factors in making a determination about the choice of video encoding algorithms and technology to implement.
`
TABLE 1 Digital video application requirements
4 Digital Video Signals and Formats

Video compression techniques make use of signal models in order to be able to utilize the body of digital signal analysis/processing theory and techniques that have been developed over the past fifty or so years. The design of a video compression system, as represented by the generalized model introduced in Section 2, requires a knowledge of the signal characteristics and the digital processes that are used to create the digital video signal. It is also highly desirable to understand video display systems and the behavior of the HVS.

4.1 Sampling of Analog Video Signals

Digital video information is generated by sampling the intensity of the original continuous analog video signal I(x, y, t) in three dimensions. The spatial component of the video signal is sampled in the horizontal and vertical dimensions (x, y), and the temporal component is sampled in the time dimension (t). This generates a series of digital images or image sequence I(i, j, k). Video signals that contain colorized information are usually decomposed into three parameters (YCrCb, YUV, RGB) whose intensities are likewise sampled in three dimensions. The sampling process inherently quantizes the video signal due to the digital word precision used to represent the intensity values. Therefore the original analog signal can never be reproduced exactly, but for all intents and purposes, a high-quality digital video representation can be reproduced with arbitrary closeness to the original analog video signal. The topic of video sampling and interpolation is discussed in Chapter 7.2.

An important result of sampling theory is the Nyquist sampling theorem. This theorem defines the conditions under which
`
Application                        Bitrate Req.        Distortion Req.             Transmission Req.           Computational Req.       Standards Req.
Network video on demand            1.5 Mbps, 10 Mbps   High to medium distortion   Internet, 100-Mbps LAN      Software decoder         MPEG-1, MPEG-2, MPEG-7
Video phone                        64 Kbps             High distortion             ISDN p x 64                 H.261 encoder/decoder    H.261
Desktop multimedia video CDROM     1.5 Mbps            High to medium distortion   PC channel                  MPEG-1 decoder           MPEG-1
Desktop LAN videoconference        10 Mbps             Medium distortion           Fast Ethernet, 100 Mbps     Hardware decoders        MPEG-2, H.261
Desktop WAN videoconference        1.5 Mbps            High distortion             Ethernet                    Hardware decoders        MPEG-1, MPEG-4, H.263
Desktop dial-up videoconference    64 Kbps             Very high distortion        POTS and Internet           Software decoder         MPEG-4, H.263
Digital satellite television       10 Mbps             Low distortion              Fixed-service satellites    MPEG-2 decoder           MPEG-2
HDTV                               20 Mbps             Low distortion              12-MHz terrestrial link     MPEG-2 decoder           MPEG-2
DVD                                20 Mbps             Low distortion              PC channel                  MPEG-2 decoder           MPEG-2
`
`
FIGURE 2 Nyquist sampling theorem, with magnitudes of the Fourier spectra for (a) the input l; (b) the sampled input l_s, with f_s > 2 f_B; (c) the sampled input l_s, with f_s < 2 f_B.
`
sampled analog signals can be "perfectly" reconstructed. If these conditions are not met, the resulting digital signal will contain aliased components which introduce artifacts into the reconstruction. The Nyquist conditions are depicted graphically for the one-dimensional case in Fig. 2.

The one-dimensional signal l is sampled at rate f_s. It is bandlimited (as are all real-world signals) in the frequency domain with an upper frequency bound of f_B. According to the Nyquist sampling theorem, if a bandlimited signal is sampled, the resulting Fourier spectrum is made up of the original signal spectrum |L| plus replicates of the original spectrum spaced at integer multiples of the sampling frequency f_s. Diagram (a) in Fig. 2 depicts the magnitude |L| of the Fourier spectrum for l. The magnitude |L_s| of the Fourier spectrum for the sampled signal l_s is shown for two cases. Diagram (b) presents the case in which the original signal l can be reconstructed by recovering the central spectral island. Diagram (c) disp
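The aliasing in diagram (c) can be demonstrated numerically: a tone above f_s/2 folds back to |f_0 − f_s|. The single-tone signal below is an assumed example, not one from the text:

```python
import numpy as np

fs, f0, n = 100.0, 70.0, 1000       # sample rate (Hz), tone (Hz), samples
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f0 * t)      # 70 Hz tone sampled at 100 Hz: fs < 2*f0
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n, 1 / fs)
peak = freqs[spectrum.argmax()]
print(round(peak, 1))  # 30.0: the 70 Hz tone aliases to |70 - 100| = 30 Hz
```

With f_s raised above 140 Hz the same script would locate the peak at the true 70 Hz, matching the recoverable case of diagram (b).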