Copyright 1999 Academic Press. This material will be published in the Image and Video Processing Handbook.
MPEG-1 AND MPEG-2 Video Standards

Supavadee Aramvith and Ming-Ting Sun

Information Processing Laboratory, Department of Electrical Engineering, Box 352500
University of Washington, Seattle, Washington 98195-2500
{supava,sun}@ee.washington.edu
1. MPEG-1 Video Coding Standard

1.1 Introduction

1.1.1 Background and structure of MPEG-1 standards activities

The development of digital video technology in the 1980s has made it possible to use digital video compression in various kinds of applications. The effort to develop standards for coded representation of moving pictures, audio, and their combination is carried out in the Moving Picture Experts Group (MPEG). MPEG is a group formed under the auspices of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It operates in the framework of the Joint ISO/IEC Technical Committee 1 (JTC 1) on Information Technology, and is formally Working Group 11 (WG11) of Sub-Committee 29 (SC29). The premise is to set the standard for coding moving pictures and the associated audio for digital storage media at about 1.5 Mbit/s, so that a movie can be compressed and stored on a CD-ROM (Compact Disc - Read Only Memory). The resultant standard is the international standard for moving picture compression, ISO/IEC 11172 or MPEG-1 (Moving Picture Experts Group - Phase 1). The MPEG-1 standard consists of 5 parts: Systems (11172-1), Video (11172-2), Audio (11172-3), Conformance Testing (11172-4), and Software Simulation (11172-5). In this chapter, we will focus only on the video part.

The activity of the MPEG committee started in 1988 based on the work of ISO JPEG (Joint Photographic Experts Group) [1] and CCITT Recommendation H.261: "Video Codec for Audiovisual Services at px64 kbits/s" [2]. Thus, the MPEG-1 standard has much in common with the JPEG and H.261 standards. The MPEG development methodology was similar to that of H.261 and was divided into three phases: Requirements, Competition, and Convergence [3]. The purpose of the Requirements phase is to set the focus of the effort precisely and to determine the rules for the Competition phase. The documents of this phase are a "Proposal Package Description" [4] and a test methodology [5]. The next step is the Competition phase, in which the goal is to obtain state-of-the-art technology from the best of academic and industrial research. The criteria are based on the technical merits, the trade-off between video quality and the cost of implementing the ideas, and the subjective test [5]. After the Competition phase, various ideas and techniques
`
DISH 1034
Sling TV v. Realtime
IPR2018-01342
are integrated into one solution in the Convergence phase. The solution results in a document called the simulation model. The simulation model implements, in some sort of programming language, the operation of a reference encoder and a decoder. The simulation model is used to carry out simulations to optimize the performance of the coding scheme [6]. A series of fully documented experiments called core experiments are then carried out. The MPEG committee reached Committee Draft (CD) status in September 1990, and the Committee Draft (CD 11172) was approved in December 1991. International Standard (IS) 11172 for the first three parts was established in November 1992. The IS for the last two parts was finalized in November 1994.
`
1.1.2 MPEG-1 target applications and requirements

The MPEG standard is a generic standard, which means that it is not limited to a particular application. A variety of digital storage media applications of MPEG-1 have been proposed based on the assumption that acceptable video and audio quality can be obtained for a total bandwidth of about 1.5 Mbits/s. Typical storage media for these applications include CD-ROM, DAT (Digital Audio Tape), Winchester-type computer disks, and writable optical disks. The target applications are asymmetric applications, where the compression process is performed once and the decompression process is required often. Examples of asymmetric applications include video CD, video on demand, and video games. In these asymmetric applications, the encoding delay is not a concern. The encoders are needed only in small quantities, while the decoders are needed in large volumes. Thus, the encoder complexity is not a concern, while the decoder complexity needs to be low in order to result in low-cost decoders.

The requirements for compressed video in digital storage media mandate several important features of the MPEG-1 compression algorithm. The important features include normal playback, frame-based random access and editing of video, reverse playback, fast forward/reverse play, encoding of high-resolution still frames, robustness to uncorrectable errors, etc. The applications also require MPEG-1 to support flexible picture sizes and frame rates. Another requirement is that the encoding process can be performed at reasonable speed using existing hardware technologies, and that the decoder can be implemented using a small number of chips at low cost.

Since the MPEG-1 video coding algorithm is based heavily on H.261, in the following sections we will focus only on those aspects that are different from H.261.
`
1.2 MPEG-1 Video Coding vs. H.261

1.2.1 Bi-directional motion compensated prediction
`
In H.261, only the previous video frame is used as the reference frame for the motion compensated prediction (forward prediction). MPEG-1 allows the future frame to be used as the reference frame for the motion compensated prediction (backward prediction), which can provide better prediction. For example, as shown in Figure 1, if there are moving objects, and if only the forward prediction is used, there will be uncovered areas (such as the block behind the car in Frame N) for which we may not be able to find a good matching block from the previous reference picture (Frame N-1). On the other hand, the backward prediction can properly predict these uncovered areas, since they are available in the future reference picture, i.e. Frame N+1 in this example. Also shown in the figure, if there are objects moving into the picture (the airplane in the figure), these new objects cannot be predicted from the previous picture, but can be predicted from the future picture.
`
[Figure: three consecutive frames, labeled Frame N-1, Frame N, and Frame N+1]

Figure 1: A video sequence showing the benefits of bi-directional prediction.
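The averaging of a past and a future reference block described above can be sketched in a few lines. This is a minimal illustration, not the normative decoder arithmetic; it only shows how averaging the two references cancels opposite-signed noise:

```python
def bidirectional_prediction(past_block, future_block):
    """Bi-directional prediction for an averaged macroblock: the rounded
    integer average of the forward (past-reference) and backward
    (future-reference) motion-compensated blocks."""
    return [[(p + f + 1) // 2 for p, f in zip(prow, frow)]
            for prow, frow in zip(past_block, future_block)]

# A flat area whose true value is 100, seen with +4 noise in the past
# reference and -4 noise in the future reference.
past = [[104] * 16 for _ in range(16)]
future = [[96] * 16 for _ in range(16)]
pred = bidirectional_prediction(past, future)   # noise cancels: every sample is 100
```

For an uncovered area, the encoder would instead choose pure backward prediction, since only the future reference contains the revealed background.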
`
1.2.2 Motion compensated prediction with half-pixel accuracy

The motion estimation in H.261 is restricted to only integer-pixel accuracy. However, a moving object often moves to a position which is not on the pixel grid but between the pixels. MPEG-1 allows half-pixel-accuracy motion vectors. By estimating the displacement at a finer resolution, we can expect improved prediction and, thus, better performance than motion estimation with integer-pixel accuracy. As shown in Figure 2, since there is no pixel value at the half-pixel locations, interpolation is required to produce the pixel values at the half-pixel positions. Bi-linear interpolation is used in MPEG-1 for its simplicity. As in H.261, the motion estimation is performed only on luminance blocks. The resulting motion vector is halved (divided by 2) and applied to the chrominance blocks, since the chrominance resolution is half that of the luminance. This reduces the computation but may not necessarily be optimal. Motion vectors are differentially encoded with respect to the motion vector in the preceding adjacent macroblock. The reason is that the motion vectors of adjacent regions are highly correlated, as it is quite common to have relatively uniform motion over areas of the picture.
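The bilinear interpolation at half-pixel positions can be sketched as follows. This is an illustrative helper, assuming coordinates are expressed on a doubled (half-pixel) grid; the rounding shown is the usual rounded-average form, not a line-by-line transcription of the standard:

```python
def half_pel_value(frame, y2, x2):
    """Value at half-pel position (y2/2, x2/2), with y2 and x2 given on the
    doubled (half-pixel) grid. Even coordinates fall on the integer grid;
    odd ones are bilinearly interpolated, with rounding, from the nearest
    integer-grid pixels."""
    y0, x0 = y2 // 2, x2 // 2
    y1, x1 = y0 + y2 % 2, x0 + x2 % 2
    # Sum of the (possibly repeated) four nearest integer-grid neighbours;
    # (s + 2) // 4 equals the rounded 1-, 2-, or 4-point average.
    s = frame[y0][x0] + frame[y0][x1] + frame[y1][x0] + frame[y1][x1]
    return (s + 2) // 4

frame = [[10, 20],
         [30, 40]]
on_grid = half_pel_value(frame, 0, 0)   # 10 (no interpolation needed)
horiz = half_pel_value(frame, 0, 1)     # halfway between 10 and 20 -> 15
centre = half_pel_value(frame, 1, 1)    # centre of all four pixels -> 25
```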
`
1.3 MPEG-1 video structure

1.3.1 Source Input Format (SIF)
`
[Figure: integer-pixel grid and half-pixel grid; x: pixel values on the integer-pixel grid, o: interpolated pixel values on the half-pixel grid, obtained by bilinear interpolation from the pixel values on the integer-pixel grid]

Figure 2: Half-pixel motion estimation.
`
The typical MPEG-1 input format is the Source Input Format (SIF). SIF was derived from CCIR601, a worldwide standard for the digital TV studio. CCIR601 specifies the Y Cb Cr color coordinate, where Y is the luminance component (black and white information), and Cb and Cr are two color difference signals (chrominance components). A luminance sampling frequency of 13.5 MHz was adopted. There are several Y Cb Cr sampling formats, such as 4:4:4, 4:2:2, 4:1:1, and 4:2:0. In 4:4:4, the sampling rates for Y, Cb, and Cr are the same. In 4:2:2, the sampling rates of Cb and Cr are half of that of Y. In 4:1:1 and 4:2:0, the sampling rates of Cb and Cr are one quarter of that of Y. The positions of the Cb Cr samples for 4:4:4, 4:2:2, 4:1:1, and 4:2:0 are shown in Figure 3.
`
[Figure: sampling grids; x: luminance samples, o: chrominance samples]

Figure 3: Luminance and chrominance samples in (a) 4:4:4 format (b) 4:2:2 format (c) 4:1:1 format (d) 4:2:0 format.
`
Converting an analog TV signal to digital video with the 13.5 MHz sampling rate of CCIR601 results in 720 active pixels per line (576 active lines for PAL and 480 active lines for NTSC). This results in a 720x480 resolution for NTSC and a 720x576 resolution for PAL. With 4:2:2, the uncompressed bit-rate for transmitting CCIR601 at 30 frames/s is then about 166 Mbits/s. Since it is difficult to compress a CCIR601 video to 1.5 Mb/s with good video quality, in MPEG-1 the source video resolution is typically decimated to a quarter of the CCIR601 resolution by filtering and sub-sampling. The resultant format is called the Source Input Format (SIF), which has a 360x240 resolution for NTSC and a 360x288 resolution for PAL. Since in the video coding algorithm the block size of 16x16 is used for motion compensated prediction, the number of pixels in both the horizontal and the vertical dimensions should be a multiple of 16. Thus, the four left-most and four right-most pixels are discarded to give a 352x240 resolution for NTSC systems (30 frames/s) and a 352x288 resolution for PAL systems (25 frames/s). The chrominance signals have half of the above resolutions in both the horizontal and vertical dimensions (4:2:0; 176x120 for NTSC and 176x144 for PAL). The uncompressed bit-rate for SIF (NTSC) at 30 frames/s is about 30.4 Mbits/s.
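The two bit-rate figures quoted above can be reproduced with a short calculation, assuming 8 bits per sample and counting one luminance plane plus two chrominance planes per frame:

```python
def raw_bitrate(width, height, chroma_w, chroma_h, fps, bits_per_sample=8):
    """Uncompressed bit-rate in bits/s: one luma plane plus two chroma
    planes per frame, each sample using bits_per_sample bits."""
    samples_per_frame = width * height + 2 * chroma_w * chroma_h
    return samples_per_frame * fps * bits_per_sample

# CCIR601 NTSC in 4:2:2: chroma planes are half-width, full-height.
ccir601 = raw_bitrate(720, 480, 360, 480, 30)   # 165,888,000 (~166 Mbit/s)

# SIF NTSC in 4:2:0: chroma planes are half-width, half-height.
sif = raw_bitrate(352, 240, 176, 120, 30)       # 30,412,800 (~30.4 Mbit/s)
```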
`
1.3.2 Group Of Pictures (GOP) and I-B-P Pictures

In MPEG, each video sequence is divided into one or more groups of pictures (GOPs). There are four types of pictures defined in MPEG-1: I-, P-, B-, and D-pictures, of which the first three are shown in Figure 4. Each GOP is composed of one or more pictures; one of these pictures must be an I-picture. Usually, the spacing between two anchor frames (I- or P-pictures) is referred to as M, and the spacing between two successive I-pictures is referred to as N. In Figure 4, M=3 and N=9.
`
[Figure: two consecutive groups of pictures in display order]

Figure 4: MPEG Group Of Pictures.
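The relationship between M, N, and the picture-type pattern can be sketched as follows. This is a simple illustration of one common GOP layout (anchor every M pictures, I-picture every N pictures), not the only layout the syntax permits:

```python
def gop_pattern(m, n):
    """Picture-type pattern of one GOP in display order.

    m: spacing between anchor pictures (I or P); n: spacing between
    successive I-pictures. m=1 would mean no B-pictures at all.
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append('I')          # each GOP starts with its I-picture
        elif i % m == 0:
            types.append('P')          # anchors at every m-th position
        else:
            types.append('B')          # B-pictures fill the gaps
    return ''.join(types)

pattern = gop_pattern(3, 9)            # 'IBBPBBPBB', matching Figure 4
```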
`
I-pictures (Intra-coded pictures) are coded independently, with no reference to other pictures. I-pictures provide random access points in the compressed video data, since the I-pictures can be decoded independently without referencing other pictures. With I-pictures, an MPEG bit-stream is more editable. Also, error propagation due to transmission errors in previous pictures will be terminated by an I-picture, since the I-picture does not reference the previous pictures. Since I-pictures use only transform coding without motion compensated predictive coding, they provide only moderate compression.

P-pictures (Predictive-coded pictures) are coded using the forward motion-compensated prediction, similar to that in H.261, from the preceding I- or P-picture. P-pictures provide more compression than the I-pictures by virtue of motion-compensated prediction. They also serve as references for B-pictures and future P-pictures. Transmission errors in the I-pictures and P-pictures can propagate to the succeeding pictures, since the I-pictures and P-pictures are used to predict the succeeding pictures.
`
B-pictures (Bi-directional-coded pictures) allow macroblocks to be coded using bi-directional motion-compensated prediction from both the past and future reference I- or P-pictures. In the B-pictures, each bi-directional motion-compensated macroblock can have two motion vectors: a forward motion vector, which references a best matching block in the previous I- or P-picture, and a backward motion vector, which references a best matching block in the next I- or P-picture, as shown in Figure 5. The motion compensated prediction can be formed by the average of the two referenced motion compensated blocks. By averaging between the past and the future reference blocks, the effect of noise can be decreased. B-pictures provide the best compression compared to I- and P-pictures. I- and P-pictures are used as reference pictures for predicting B-pictures. To keep the structure simple, and since there is no apparent advantage in using B-pictures to predict other B-pictures, the B-pictures are not used as reference pictures. Hence, B-pictures do not propagate errors.
`
[Figure: a current B-picture with a forward motion vector pointing to a best matching macroblock in the past reference picture and a backward motion vector pointing to a best matching macroblock in the future reference picture]

Figure 5: Bi-directional motion estimation.
`
D-pictures (DC-pictures) are low-resolution pictures obtained by decoding only the DC coefficient of the Discrete Cosine Transform coefficients of each macroblock. They are not used in combination with I-, P-, or B-pictures. D-pictures are rarely used, but are defined to allow fast searches on sequential digital storage media.

The trade-off of having frequent B-pictures is that it decreases the correlation between the previous I- or P-picture and the next reference P- or I-picture. It also causes coding delay and increases the encoder complexity. With the example shown in Figure 4 and Figure 6, at the encoder, if the order of the incoming pictures is 1, 2, 3, 4, 5, 6, 7, ..., the order of coding the pictures at the encoder will be 1, 4, 2, 3, 7, 5, 6, .... At the decoder, the order of the decoded pictures will also be 1, 4, 2, 3, 7, 5, 6, .... However, the display order after the decoder should be 1, 2, 3, 4, 5, 6, 7, .... Thus, frame memories have to be used to put the pictures in the correct order. This picture reordering causes delay. The computation of bi-directional motion vectors and the picture-reordering frame memories increase the encoder complexity.
`
In Figure 6, two types of GOPs are shown. GOP1 can be decoded without referencing other GOPs; it is called a Closed GOP. In GOP2, to decode the 8th and 9th B-pictures, the 7th P-picture in GOP1 is needed. GOP2 is called an Open GOP, which means the decoding of this GOP needs to reference other GOPs.
`
Encoder Input:
1I 2B 3B 4P 5B 6B 7P 8B 9B | 10I 11B 12B 13P 14B 15B 16P
|----- GOP1 (CLOSED) ------|------- GOP2 (OPEN) -------|

Decoder Input:
1I 4P 2B 3B 7P 5B 6B | 10I 8B 9B 13P 11B 12B 16P 14B 15B

Figure 6: Frame reordering.
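The reordering from display order into coding/transmission order can be sketched as follows. The rule is that each B-picture can only be coded after the next anchor (I or P) it references, which is why display order 1, 2, 3, 4, ... becomes coding order 1, 4, 2, 3, 7, 5, 6, .... This is an illustrative sketch over a single run of pictures, ignoring GOP boundaries:

```python
def coding_order(display_types):
    """Reorder pictures from display order into coding order.

    display_types is a string such as 'IBBPBBP', one letter per picture
    in display order. B-pictures are buffered until the next anchor
    (I or P) has been emitted, since they reference it.
    """
    out, pending_b = [], []
    for num, ptype in enumerate(display_types, start=1):
        if ptype == 'B':
            pending_b.append(num)      # wait for the future anchor
        else:
            out.append(num)            # anchor goes first...
            out.extend(pending_b)      # ...then the B-pictures it anchors
            pending_b = []
    out.extend(pending_b)              # any trailing B-pictures
    return out

order = coding_order('IBBPBBP')        # [1, 4, 2, 3, 7, 5, 6], as in Figure 6
```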
`
1.3.3 Slice, Macroblock, and Block structures

An MPEG picture consists of slices. A slice consists of a contiguous sequence of macroblocks in a raster scan order (from left to right and from top to bottom). In an MPEG coded bit-stream, each slice starts with a slice-header, which is a clear-codeword (a clear-codeword is a unique bit-pattern which can be identified without decoding the variable-length codes in the bit-stream). Due to the clear-codeword slice-header, slices are the lowest-level units which can be accessed in an MPEG coded bit-stream without decoding the variable-length codes. Slices are important in the handling of channel errors. If a bit-stream contains a bit-error, the error may cause error propagation due to the variable-length coding. The decoder can regain synchronization at the start of the next slice. Having more slices in a bit-stream allows better error-termination, but the overhead will increase.
`
A macroblock consists of a 16x16 block of luminance samples and two 8x8 blocks of corresponding chrominance samples, as shown in Figure 7. A macroblock thus consists of four 8x8 Y-blocks, one 8x8 Cb block, and one 8x8 Cr block. Each coded macroblock contains motion-compensated prediction information (coded motion vectors and the prediction errors). There are four types of macroblocks: intra, forward-predicted, backward-predicted, and averaged macroblocks. The motion information consists of one motion vector for forward- and backward-predicted macroblocks, and two motion vectors for bi-directionally-predicted (or averaged) macroblocks. P-pictures can have intra- and forward-predicted macroblocks. B-pictures can have all four types of macroblocks. The first and last macroblocks in a slice must always be coded. A macroblock is designated as a skipped macroblock when its motion vector is zero and all the quantized DCT coefficients are zero. Skipped macroblocks are not allowed in I-pictures. Non-intra coded macroblocks in P- and B-pictures can be skipped. For a skipped macroblock, the decoder just copies the macroblock from the previous picture.
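The skipped-macroblock condition described above can be written as a small predicate. This is a simplified sketch of the decision, ignoring the first/last-macroblock-in-a-slice restriction and other mode details:

```python
def is_skippable(picture_type, motion_vector, quantized_blocks):
    """A macroblock may be skipped only in P- and B-pictures, when its
    motion vector is zero and every quantized DCT coefficient in its six
    8x8 blocks is zero (the decoder then copies the macroblock from the
    previous picture)."""
    if picture_type == 'I':
        return False                   # skipping is not allowed in I-pictures
    return motion_vector == (0, 0) and all(
        c == 0 for block in quantized_blocks for c in block)

# Six all-zero blocks: four Y, one Cb, one Cr.
zero_blocks = [[0] * 64 for _ in range(6)]
skip_p = is_skippable('P', (0, 0), zero_blocks)   # True
skip_i = is_skippable('I', (0, 0), zero_blocks)   # False
```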
`
[Figure: a picture divided into slices (SLICE 1, SLICE 2, ...) for the Y luminance, Cb chrominance, and Cr chrominance components]

Figure 7: Macroblock and slice structures.
`
1.4 Summary of the major differences between MPEG-1 video and H.261

As compared to H.261, MPEG-1 video differs in the following aspects:

- MPEG-1 uses bi-directional motion compensated predictive coding with half-pixel accuracy, while H.261 has no bi-directional prediction (B-pictures) and its motion vectors are always in integer-pixel accuracy.
- MPEG-1 supports a maximum motion vector range of -512 to +511.5 pixels for half-pixel motion vectors and -1024 to +1023 for integer-pixel motion vectors, while H.261 has a maximum range of only ±15 pixels.
- MPEG-1 uses visually weighted quantization based on the fact that the human eye is more sensitive to quantization errors related to low spatial frequencies than to high spatial frequencies. MPEG-1 defines a default 64-element quantization matrix, but also allows custom matrices appropriate for different applications. H.261 has only one quantizer for the intra DC coefficient and 31 quantizers for all other coefficients.
- H.261 only specifies two source formats: CIF (Common Intermediate Format, 352x288 pixels) and QCIF (Quarter CIF, 176x144 pixels). In MPEG-1, the typical source format is SIF (352x240 for NTSC, and 352x288 for PAL). However, the users can specify other formats. The picture size can be as large as 4k x 4k pixels. Certain parameters in the bit-streams are left flexible, such as the number of lines per picture (less than 4096), the number of pels per line (less than 4096), the picture rate (24, 25, and 30 frames/s), and fourteen choices of pel aspect ratios.
`
- In MPEG-1, I-, P-, and B-pictures are organized as a flexible Group Of Pictures (GOP).
- MPEG-1 uses a flexible slice structure instead of the Group Of Blocks (GOB) defined in H.261.
- MPEG-1 has D-pictures to allow the fast-search option.
- In order to allow cost-effective implementation of user terminals, MPEG-1 defines a Constrained Parameter Set which lays down specific constraints, as listed in Table 1.
`
Table 1: MPEG-1 Constrained Parameter Set.

- Horizontal size <= 720 pels
- Vertical size <= 576 pels
- Total number of Macroblocks/picture <= 396
- Total number of Macroblocks/second <= 396x25 = 330x30
- Picture rate <= 30 frames/second
- Bit rate <= 1.86 Mbits/second
- Decoder buffer <= 376832 bits
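The bounds of Table 1 can be checked with a small function. This is an illustrative sketch: the macroblock count is derived from the picture size (one macroblock per 16x16 pels), and the bit-rate bound is taken as 1.86 Mbit/s exactly as listed in the table:

```python
def in_constrained_parameter_set(width, height, frame_rate, bit_rate, vbv_bits):
    """True if the parameters satisfy the MPEG-1 constrained bounds of
    Table 1. Sizes in pels, frame_rate in frames/s, bit_rate in bits/s,
    vbv_bits = decoder buffer size in bits."""
    mb_per_picture = ((width + 15) // 16) * ((height + 15) // 16)
    return (width <= 720 and height <= 576
            and mb_per_picture <= 396
            and mb_per_picture * frame_rate <= 396 * 25   # = 330 * 30 = 9900
            and frame_rate <= 30
            and bit_rate <= 1_860_000                      # 1.86 Mbit/s, per Table 1
            and vbv_bits <= 376_832)

# SIF NTSC at 30 frames/s and 1.5 Mbit/s fits the constrained set;
# full CCIR601 PAL resolution does not (1620 macroblocks per picture).
ok = in_constrained_parameter_set(352, 240, 30, 1_500_000, 327_680)
too_big = in_constrained_parameter_set(720, 576, 25, 1_500_000, 327_680)
```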
`
1.5 Simulation Model

Similar to H.261, MPEG-1 specifies only the syntax and the decoder. Many detailed coding options, such as the rate-control strategy, the quantization decision levels, the motion estimation schemes, and the coding modes for each macroblock, are not specified. This allows future technology improvement and product differentiation. In order to have a reference MPEG-1 video quality, Simulation Models were developed in MPEG-1. A simulation model contains a specific reference implementation of the MPEG-1 encoder and decoder, including all the details which are not specified in the standard. The final version of the MPEG-1 simulation model is "Simulation Model 3" (SM3) [7]. In SM3, the motion estimation technique uses one forward and/or one backward motion vector per macroblock with half-pixel accuracy. A two-step search scheme is used, consisting of a full search in the range of +/- 7 pixels with integer-pixel precision, followed by a search in the 8 neighboring half-pixel positions. The decision of the coding mode for each macroblock (whether or not it will use motion compensated prediction and intra/inter coding), the quantizer decision levels, and the rate-control algorithm are all specified.
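The two-step search can be sketched as follows. This is an illustrative SM3-style search, not the SM3 reference code: it uses a SAD cost, a small 8x8 block for brevity, and bilinear up-sampling of the whole reference to form the half-pixel grid:

```python
import random

def sad(ref, cur, ry, rx, cy, cx, bs):
    """Sum of absolute differences between a bs x bs reference block at
    (ry, rx) and the current block at (cy, cx)."""
    return sum(abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
               for i in range(bs) for j in range(bs))

def upsample2(frame):
    """Bilinearly up-sample a frame by 2 in each dimension (rounded),
    giving the half-pixel grid of Figure 2."""
    h, w = len(frame), len(frame[0])
    up = [[0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            y1, x1 = y0 + y % 2, x0 + x % 2
            s = frame[y0][x0] + frame[y0][x1] + frame[y1][x0] + frame[y1][x1]
            up[y][x] = (s + 2) // 4
    return up

def two_step_search(ref, cur, by, bx, bs=8, r=7):
    """Step 1: integer-pel full search over [-r, +r]. Step 2: refine by
    testing the 8 neighbouring half-pel positions. Returns the motion
    vector in half-pel units and its SAD."""
    h, w = len(ref), len(ref[0])
    best, best_mv = None, (0, 0)
    for dy in range(-r, r + 1):                       # integer full search
        for dx in range(-r, r + 1):
            y, x = by + dy, bx + dx
            if 0 <= y <= h - bs and 0 <= x <= w - bs:
                cost = sad(ref, cur, y, x, by, bx, bs)
                if best is None or cost < best:
                    best, best_mv = cost, (dy, dx)
    up = upsample2(ref)
    base = (2 * best_mv[0], 2 * best_mv[1])           # best vector, half-pel units
    best_h = base
    for hy in (-1, 0, 1):                             # 8 half-pel neighbours
        for hx in (-1, 0, 1):
            if hy == hx == 0:
                continue
            Y, X = 2 * by + base[0] + hy, 2 * bx + base[1] + hx
            if 0 <= Y <= 2 * h - 2 * bs and 0 <= X <= 2 * w - 2 * bs:
                cost = sum(abs(up[Y + 2 * i][X + 2 * j] - cur[by + i][bx + j])
                           for i in range(bs) for j in range(bs))
                if cost < best:
                    best, best_h = cost, (base[0] + hy, base[1] + hx)
    return best_h, best

random.seed(1)
ref = [[random.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [row[:] for row in ref]
for i in range(8):                 # current block at (8, 8) is the reference
    for j in range(8):             # content displaced from (10, 11)
        cur[8 + i][8 + j] = ref[10 + i][11 + j]
mv, cost = two_step_search(ref, cur, 8, 8)
```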
`
1.6 MPEG-1 video bit-stream structures

As shown in Figure 8, there are 6 layers in the MPEG-1 video bit-stream: the video sequence, group of pictures, picture, slice, macroblock, and block layers.
- The video sequence layer consists of a sequence header, one or more groups of pictures, and an end-of-sequence code. It contains the settings of the following parameters: the picture size (horizontal and vertical sizes), pel aspect ratio, picture rate, bit-rate, the minimum decoder buffer size (video buffer verifier size), the constrained parameters flag (this flag is set only when the picture size, picture rate, decoder buffer size, bit rate, and motion parameters satisfy the bounds in Table 1), the control for the loading of 64 eight-bit values for the intra and non-intra quantization tables, and the user data.

- The GOP layer consists of a set of pictures that are in a continuous display order. It contains the settings of the following parameters: the time code, which gives the hours-minutes-seconds time interval from the start of the sequence; the closed GOP flag, which indicates whether the decoding operation needs pictures from the previous GOP for motion compensation; the broken link flag, which indicates whether the previous GOP can be used to decode the current GOP; and the user data.

- The picture layer acts as a primary coding unit. It contains the settings of the following parameters: the temporal reference, which is the picture number in the sequence and is used to determine the display order; the picture type (I/P/B/D); the decoder buffer initial occupancy, which gives the number of bits that must be in the compressed video buffer before the idealized decoder model defined by MPEG decodes the picture (it is used to prevent decoder buffer overflow and underflow); the forward motion vector resolution and range for P- and B-pictures; the backward motion vector resolution and range for B-pictures; and the user data.

- The slice layer acts as a resynchronization unit. It contains the slice vertical position, where the slice starts, and the quantizer scale that is used in the coding of the current slice.

- The macroblock layer acts as a motion compensation unit. It contains the settings of the following parameters: the optional stuffing bits, the macroblock address increment, the macroblock type, the quantizer scale, the motion vector, and the Coded Block Pattern, which defines the coding patterns of the 6 blocks in the macroblock.

- The block layer is the lowest layer of the video sequence and consists of coded 8x8 DCT coefficients. When a macroblock is encoded in the intra mode, the DC coefficient is encoded similarly to that in JPEG (the DC coefficient of the current macroblock is predicted from the DC coefficient of the previous macroblock). At the beginning of each slice, the predictions for the DC coefficients of the luminance and chrominance blocks are reset to 1024. The differential DC values are categorized according to their absolute values, and the category information is encoded using a VLC (Variable-Length Code). The category information indicates the number of additional bits following the VLC to represent the prediction residual. The AC coefficients are encoded similarly to those in H.261, using a VLC to represent the zero-run-length and the value of the non-zero coefficient. When a macroblock is encoded in non-intra modes, both the DC and AC coefficients are encoded similarly to those in H.261.
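The intra DC coding described above (differential prediction with a slice-start reset to 1024, then a size category that selects the VLC) can be sketched as follows. This is an illustrative model of the category computation only; the actual VLC tables and bit-packing are omitted:

```python
def dc_size_category(diff):
    """JPEG-style size category of a differential DC value: the number of
    additional bits needed, i.e. category k covers |diff| in
    [2**(k-1), 2**k - 1], with category 0 meaning diff == 0."""
    size, a = 0, abs(diff)
    while a:
        size += 1
        a >>= 1
    return size

def code_intra_dc(dc_values, reset=1024):
    """Differentially code the intra DC coefficients of one slice.

    The predictor is reset to `reset` (1024 in MPEG-1) at the start of
    the slice; each DC value is represented as (category, residual),
    where the category would select the VLC in a real encoder.
    """
    pred, out = reset, []
    for dc in dc_values:
        diff = dc - pred
        out.append((dc_size_category(diff), diff))
        pred = dc                      # predictor is the previous DC value
    return out

codes = code_intra_dc([1024, 1032, 1028])
# diffs are 0, +8, -4 -> categories 0, 4, 3
```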
`
Above the video sequence layer, there is a system layer in which the video sequence is packetized. The video and audio bit-streams are then multiplexed into an integrated data stream. These are defined in the Systems part.

1.7 Summary

MPEG-1 is mainly for storage media applications. Due to the use of B-pictures, it may result in a long end-to-end delay. The MPEG-1 encoder is much more expensive than the decoder due to the large search range, the half-pixel accuracy in motion estimation, and the use of bi-directional motion estimation. The MPEG-1 syntax can support a variety of frame-rates and formats for various storage media applications. Similar to other video coding standards, MPEG-1 does not specify every coding option (motion estimation, rate-control, coding modes, quantization, pre-processing, post-processing, etc.). This allows continuing technology improvement and product differentiation.
`
[Figure: a video sequence contains a sequence header and GOPs; each GOP contains a GOP header and pictures; each picture contains a picture header and slices; each slice contains a slice header and macroblocks; each macroblock contains a macroblock header and blocks 0-5; each block contains the differential DC coefficient, the AC coefficients, and an End-Of-Block code]

Figure 8: MPEG-1 bit-stream syntax layers.
`
2. MPEG-2 video coding standard

2.1 Introduction

2.1.1 Background and structure of MPEG-2 standards activities

The MPEG-2 standard represents the continuing effort of the MPEG committee to develop generic video and audio coding standards after their development of MPEG-1. The idea of this second phase of MPEG work came from the fact that MPEG-1 is optimized for applications at about 1.5 Mb/s with input source in SIF, which is a relatively low-resolution progressive format. Many higher-quality, higher-bit-rate applications require a higher-resolution digital video source such as CCIR601, which is an interlaced format. New techniques can be developed to code the interlaced video better.
`
The MPEG-2 committee started working in late 1990, after the completion of the technical work of MPEG-1. The competitive tests of video algorithms were held in November 1991, followed by the collaborative phase. The Committee Draft (CD) for the video part was achieved in November 1993. The MPEG-2 standard (ISO/IEC 13818) [8] currently consists of 9 parts. The first five parts are organized in the same fashion as MPEG-1: systems, video, audio, conformance testing, and the simulation software technical report. The first three parts of MPEG-2 reached International Standard (IS) status in November 1994. Parts 4 and 5 were approved in March 1996. Part 6 of the MPEG-2 standard specifies a full set of Digital Storage Media Control Commands (DSM-CC). Part 7 is the specification of non-backward-compatible audio. Part 8 was originally planned to be the coding of 10-bit video but was discontinued. Part 9 is the specification of a Real-time Interface (RTI) to Transport Stream decoders, which may be utilized for adaptation to all appropriate networks carrying MPEG-2 Transport Streams. Part 6 and Part 9 were approved as International Standards in July 1996. Like the MPEG-1 standard, the MPEG-2 video coding standard specifies only the bit-stream syntax and the semantics of the decoding process. Many encoding options were left unspecified to encourage continuing technology improvement and product differentiation.

MPEG-3, which was originally intended for HDTV (High Definition digital Television) at higher bit-rates, was merged with MPEG-2. Hence there is no MPEG-3. The MPEG-2 video coding standard (ISO/IEC 13818-2) was also adopted by ITU-T as ITU-T Recommendation H.262 [9].
`
2.1.2 Target applications and requirements

MPEG-2 is primarily targeted at coding high-quality video at 4-15 Mb/s for Video On Demand (VOD), digital broadcast television, and digital storage media such as DVD (Digital Versatile Disc). It is also used for coding HDTV (High-Definition TV), cable/satellite digital TV, video services over various networks, 2-way communications, and other high-quality digital video applications. The requirements from MPEG-2 applications mandate several important features of the compression algorithm.

Regarding picture quality, MPEG-2 needs to be able to provide good NTSC-quality video at a bit-rate of about 4-6 Mbits/s and transparent NTSC-quality video at a bit-rate of about 8-10 Mbits/s. It also needs to provide the capability of random access and quick channel-switching by means of inserting I-pictures periodically. The MPEG-2 syntax also needs to support trick modes, e.g. fast forward and fast reverse play, as in MPEG-1. A low-delay mode is specified for delay-sensitive visual communications applications. MPEG-2 has scalable coding modes in order to support multiple grades of video quality, video formats, and frame-rates for various applications. Error resilience options include intra motion vectors, data partitioning, and scalable coding. Compatibility between the existing and the new standard coders is another prominent feature provided by MPEG-2. For example, MPEG-2 decoders should be able to decode MPEG-1 bit-streams. If scalable coding is used, the base layer of MPEG-2 signals can be decoded by an MPEG-1 decoder. Finally, it should allow reasonable-complexity encoders and low-cost decoders to be built with mature technology. Since
`
MPEG-2 video is based heavily on MPEG-1, in the following sections we will focus only on those features which are different from MPEG-1 video.
`
2.2 MPEG-2 Profiles and Levels

The MPEG-2 standard is designed to cover a wide range of applications. However, features needed for some applications may not be needed for other applications. If we put all the features into one single standard, it may result in an overly expensive system for many applications. It is desirable for an application to implement only the necessary features to lower the cost of the system. To meet this need, MPEG-2 classified the groups of features for important applications into Profiles. A Profile is defined as a specific subset of the MPEG-2 bit-stream syntax and functionality to support a class of applications (e.g. low-delay video conferencing applications, or storage media applications). Within each Profile, Levels are defined to support applications which have different quality requirements (e.g. diff
