US007298782B2

(12) United States Patent
Kuriakin et al.

(10) Patent No.: US 7,298,782 B2
(45) Date of Patent: Nov. 20, 2007

(54) METHOD AND APPARATUS FOR IMPROVED MEMORY MANAGEMENT OF VIDEO IMAGES

(75) Inventors: Valery Kuriakin, Nizhny Novgorod (RU); Alexander Knyazev, Nizhny Novgorod (RU); Roman Belenov, Nizhny Novgorod (RU); Yen-Kuang Chen, Franklin Park, NJ (US)

(73) Assignee: Intel Corporation, Santa Clara, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 893 days.

(21) Appl. No.: 10/355,704

(22) Filed: Jan. 31, 2003

(65) Prior Publication Data
US 2003/0151610 A1    Aug. 14, 2003

Related U.S. Application Data
(62) Division of application No. 09/607,825, filed on Jun. 30, 2000, now Pat. No. 6,961,063.

(51) Int. Cl.
H04N 7/12    (2006.01)
(52) U.S. Cl. ...................... 375/240.26
(58) Field of Classification Search ................ 348/420, 348/413, 402, 421, 416, 400, 699, 405.1, 348/419.1, 424, 441, 425, 453, 449; 382/107, 382/235, 234, 238, 244; 386/109, 111; 375/240.03, 375/240.21, 240.26, 240.25
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,561,780 A * 10/1996 Glew et al. ................. 711/126
5,907,500 A    5/1999 Nadehara
6,078,690 A    6/2000 Yamada et al.
6,208,350 B1   3/2001 Herrera
6,259,741 B1   7/2001 Chen et al.
6,326,984 B1* 12/2001 Chow et al. ................ 715/764

OTHER PUBLICATIONS

Bilas, Angelos et al., Real Time Parallel MPEG-2 Decoding in Software, Princeton Univ., Depts. of Comp. Sci. and Elect. Eng., pp. 1-14, IPPS Apr. 1997.
Coelho, R. and Hawash, M., DirectX, RDX, RSX and MMX Technology, Chapter 22.10 Speed Up Graphics Writes with Write Combining, Addison-Wesley, Reading, MA, 1998, pp. 369-371.

Primary Examiner—Tung Vo
Assistant Examiner—Behrooz Senfi
(74) Attorney, Agent, or Firm—Blakely, Sokoloff, Taylor & Zafman LLP

(57) ABSTRACT

A novel storage format enabling a method for improved memory management of video images is described. The method includes receiving an image consisting of a plurality of color components. Once received, the plurality of color components is converted to a mixed format of planar format and packed format. The mixed packed format is implemented by storing one or more of the plurality of color components in a planar format and storing one or more of the plurality of color components in a packed format. A method for writing out video images is also described utilizing a write combining (WC) frame buffer. The decoding method motion compensates groups of macroblocks in order to eliminate partial writes from the WC frame buffer.

6 Claims, 15 Drawing Sheets

[Front-page drawing: VIDEO DECODER 240, with bit stream input, motion compensation block, and conversion block.]

Unified Patents, LLC v. Elects. & Telecomm. Res. Inst., et al.
Ex. 1027, p. 1

* cited by examiner
[Sheet 1 of 15: FIG. 1A, pixel data stream 100 in YUY2 packed format; FIG. 1B, YUV data stored in planar format.]
[Sheet 2 of 15: FIG. 2A, YUV 4:2:0: Luma (Y) 120, 352x240; Chroma (U) 122, 176x120; Chroma (V) 124, 176x120. FIG. 2B, YUV 4:2:2: Luma (Y) 130, 352x240; Chroma (U) 132, 176x240; Chroma (V) 134, 176x240. FIG. 2C, YUV 4:4:4: Luma (Y) 140, 352x240; Chroma (U) 142, 352x240; Chroma (V) 144, 352x240.]
[Sheet 6 of 15: FIGS. 6A and 6B, motion compensation of decoded blocks: previous frame (reference), bit-stream 242, motion vector identifying 8x8 reference blocks; reference numerals 312, 320, 330, 340.]
[Sheet 7 of 15: FIG. 7A, MPEG encoding and decoding 400. Encoder path: uncompressed video, motion estimation, DCT, quantization, run-length encoding, Huffman encoding, compressed video. Decoder path: compressed video, Huffman decoding, run-length decoding, inverse quantization, inverse DCT, motion compensation, uncompressed video.]
[Sheet 8 of 15: FIG. 7B, video encoder 450, including motion estimation.]
[Sheet 10 of 15: FIG. 9, block-after-block decoding 520: VLD 522A-522D, IQ/IDCT 524A-524D, M.C. 526A-526D.]
[Sheet 14 of 15: FIG. 16, method 700 flowchart: receive a portion of an encoded bit stream representing an encoded block; variable length decode the encoded block to generate a quantized block; perform IQ on the quantized block to generate a frequency spectrum; perform IDCT on the quantized block using the frequency spectrum to generate a decoded block; once a plurality of macroblocks is decoded, motion compensate the plurality of macroblocks to generate a plurality of MC macroblocks; repeat while additional encoded blocks remain.]
METHOD AND APPARATUS FOR IMPROVED MEMORY MANAGEMENT OF VIDEO IMAGES

This application is a divisional of U.S. patent application Ser. No. 09/607,825, filed Jun. 30, 2000, now U.S. Pat. No. 6,961,063.

FIELD OF THE INVENTION

The present invention relates to video images and, in particular, to a novel storage format for enabling improved memory management of video images.

BACKGROUND OF THE INVENTION
In accordance with the NTSC (National Television Standards Committee) and PAL (Phase Alternating Line) standards, video images are presented in the YUV color space. The Y signal represents a luminance value while the U and V signals represent color difference or chrominance values. YUV video image data may be transmitted in packed format or planar format. In packed format, all the data for a given set of pixels of the video image is transmitted before any data for another set of pixels is transmitted. As a result, in packed format, YUV data is interleaved in the transmitted pixel data stream 100, as depicted in FIG. 1A (YUY2 packed format). In planar format, Y, U and V data values are stored in separate Y, U and V memory areas (planes) in system memory 110, as depicted in FIG. 1B.

FIGS. 2A, 2B and 2C are diagrams illustrating three different formats for representing video images in the YUV color space. A video image frame may consist of three rectangular matrices representing the luminance Y and the two chrominance values U and V. Y matrices 120, 130 and 140 have an even number of rows and columns. In YUV 4:2:0 color space format, chrominance component matrices 122 and 124 may be one half the size of Y matrix 120 in the horizontal and vertical directions, as depicted in FIG. 2A. In YUV 4:2:2 format, chrominance component matrices 132 and 134 may be one half the size of Y matrix 130 in the horizontal direction and the same size in the vertical direction, as depicted in FIG. 2B. Finally, in YUV 4:4:4 format, chrominance component matrices 142 and 144 may be the same size as Y matrix 140 in the horizontal and vertical directions, as depicted in FIG. 2C.

To store video data efficiently, conventional digital video systems contain a data compressor that compresses the video image data using compression techniques. Many conventional compression techniques are based on compressing the video image data by processing the different pixel components separately. For example, in accordance with Motion Picture Experts Group (MPEG) or International Telecommunications Union (ITU) video compression standards, a YUV-data compressor may encode the Y data independently of encoding U data and encoding V data. Such a compressor preferably receives video data in planar format, in which the Y, U, and V data for multiple pixels are separated and grouped together in three distinct data streams of Y only, U only and V only data, as described above (FIG. 1B).

Although planar format provides significant advantages for data compression, several disadvantages arise when storing or processing data received in planar format. For example, a video decoder that receives video image data in YUV planar format requires three pointers to the Y, U and V component values. For basic DVD (Digital Versatile Disk) and HDTV (High Definition Television) mode, each macroblock has three blocks of pixels: Y: 16x16, U: 8x8 and V: 8x8. In addition, the U and V components are located in different memory locations. In terms of code size, three blocks of code are required for conventional motion compensation of the video image data. Moreover, a separate memory page usually must be opened for each YUV component.

In terms of cache efficiency, for YUV video in the 4:2:0 format (FIG. 2A), the useful area (in a cache line) for the Y-component is about sixteen bytes per cache line. For the U and V components, the useful area is eight bytes per line per color component. Therefore, two rows in a macroblock potentially occupy four cache lines, since the U and V components are vertically and horizontally sub-sampled in 4:2:0 format (two Y cache lines, one U cache line, and one V cache line). For YUV video in 4:2:2 format (FIG. 2B), six cache lines are required (two Y cache lines, two U cache lines, and two V cache lines). Although YUY2 packed format (FIG. 1A), as described above, uses only two cache lines and could be used to overcome this cache inefficiency problem, conventional motion compensation of data in YUY2 format is inefficient.

Therefore, there remains a need to overcome the limitations in the above-described existing art, which is satisfied by the inventive structure and method described hereinafter.
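The cache-line counts quoted above can be checked with a short sketch. It assumes, as the discussion implies, 32-byte cache lines and an image stride large enough that every image row begins a new cache line; the function name and row-span representation are ours, for illustration only.

```python
CACHE_LINE = 32  # bytes per cache line, as assumed in the discussion above

def lines_touched(rows):
    """Count cache lines touched by a list of (bytes_per_row, n_rows) spans,
    assuming each row starts a new cache line (rows are a stride apart)."""
    return sum(n_rows * -(-bytes_per_row // CACHE_LINE)  # ceil division
               for bytes_per_row, n_rows in rows)

# Two rows of one 16x16 macroblock:
planar_420 = lines_touched([(16, 2), (8, 1), (8, 1)])  # Y, U, V planes
planar_422 = lines_touched([(16, 2), (8, 2), (8, 2)])  # chroma is full height
packed_yuy2 = lines_touched([(32, 2)])                 # YUYV... interleaved rows
```

Evaluating these reproduces the figures in the text: four cache lines for 4:2:0 planar, six for 4:2:2 planar, and two for YUY2 packed.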
BRIEF DESCRIPTION OF THE DRAWINGS

Additional features and advantages of the invention will be more apparent from the following detailed description and appended claims when taken in conjunction with the drawings, in which:

FIG. 1A illustrates a video image data stream in YUY2 packed format as known in the art.

FIG. 1B illustrates video image data stored in YV12 pure planar format, as known in the art.

FIGS. 2A-C illustrate the YUV 4:2:0 color space format, the YUV 4:2:2 color space format and the YUV 4:4:4 color space format, as known in the art.

FIG. 3 illustrates a block diagram of a conventional computer system as known in the art.

FIG. 4 is a block diagram illustrating a video decoder in accordance with one embodiment of the present invention.

FIGS. 5A and 5B illustrate YUV color space storage formats as known in the art.

FIG. 5C illustrates a block diagram of a mixed storage format according to an exemplary embodiment of the present invention.

FIGS. 6A-6B illustrate motion compensation of decoded blocks according to a further embodiment of the present invention.

FIG. 7A is a block diagram illustrating steps for encoding and decoding MPEG image data as known in the art.

FIG. 7B is a block diagram illustrating a video encoder in accordance with a further embodiment of the invention.

FIG. 8 is a block diagram illustrating a write combining (WC) buffer as known in the art.

FIG. 9 illustrates method steps for decoding an encoded MPEG video bit stream as known in the art.

FIG. 10 illustrates method steps for decoding an encoded MPEG video bit stream in accordance with the novel motion compensation technique as taught by the present invention.

FIG. 11A illustrates a stride of an image frame represented in the mixed storage format as taught by the present invention.

FIG. 11B is a block diagram illustrating a write combining (WC) buffer including a cache line containing a Y-component and a UV component represented in the mixed storage format as taught by the present invention.

FIG. 12 illustrates method steps for improved memory management of video images utilizing a mixed storage format according to an embodiment of the present invention.

FIG. 13 illustrates additional method steps for improved memory management of video images utilizing a mixed storage format according to a further embodiment of the present invention.

FIG. 14 illustrates additional method steps for improved memory management of video images utilizing a mixed storage format according to a further embodiment of the present invention.

FIG. 15 illustrates additional method steps for improved memory management of video images utilizing a mixed storage format according to a further embodiment of the present invention.

FIG. 16 illustrates method steps for decoding an encoded bit stream according to an embodiment of the present invention.

FIG. 17 illustrates additional method steps for decoding an encoded bit stream according to a further embodiment of the present invention.

FIG. 18 illustrates additional method steps for decoding an encoded bit stream utilizing a mixed storage format according to a further embodiment of the present invention.
DETAILED DESCRIPTION

System Architecture

The present invention overcomes the problems in the existing art described above by providing a novel storage format enabling a method for improved memory management of video images. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, one having ordinary skill in the art should recognize that the invention may be practiced without these specific details. In addition, the following description provides examples, and the accompanying drawings show various examples for the purposes of illustration. However, these examples should not be construed in a limiting sense as they are merely intended to provide examples of the present invention rather than to provide an exhaustive list of all possible implementations of the present invention. In some instances, well-known structures, devices, and techniques have not been shown in detail to avoid obscuring the present invention.
Referring to FIG. 3, a block diagram illustrating major components of a computer system 200 in which the inventive storage format may be implemented is now described. The computer system 200 includes a display controller 220. The display controller 220 is, for example, a Video Graphics Adapter (VGA), Super VGA (SVGA) or the like. Display controller 220 generates pixel data for display 290, which is, for example, a CRT, flat panel display or the like. The pixel data is generated at a rate characteristic of the refresh of display 290 (e.g., 60 Hz, 72 Hz, 75 Hz or the like) and the horizontal and vertical resolution of a display image (e.g., 640x480 pixels, 1024x768 pixels, 800x600 pixels or the like). Display controller 220 may generate a continuous stream of pixel data at the characteristic rate of display 290.

Display controller 220 is also provided with a display memory 222, which stores pixel data in text, graphics, or video modes for output to display 290. Host CPU 210 is coupled to display controller 220 through bus 270 and updates the content of display memory 222 when a display image for display 290 is altered. Bus 270 may comprise, for example, a PCI bus or the like. System memory 280 may be coupled to Host CPU 210 for storing data. Hardware video decoder 230 is provided to decode video data such as, for example, MPEG video data. MPEG video data is received from an MPEG video data source (e.g., CD-ROM or the like). Alternatively, the video decoder 230 is implemented as, for example, a conventional software decoder 282 stored in the system memory 280. As such, one of ordinary skill in the art will recognize that the teaching of the present invention may be implemented in either software or hardware video decoders. Once decoded, the decoded video data is outputted to system memory 280 or directly to display memory 222.
Referring to FIG. 4, the components of a video decoder 240 according to a first embodiment of the present invention are further described. The video decoder 240 may be utilized as the hardware decoder 230 or software decoder 282 within the computer system 200. MPEG data received from an MPEG data source may be decoded and decompressed as follows. The video decoder 240 receives an MPEG bit stream 242 at a Variable Length Decoding (VLD) block 244. The VLD block 244 decodes the MPEG bit stream 242 and generates a quantized block 246 that is transferred to an Inverse Quantization (IQ) block 266. The IQ block 266 performs inverse quantization on the quantized block 246 to generate a frequency spectrum 268 for the quantized block. An Inverse Discrete Cosine Transform (IDCT) block 250 performs inverse discrete cosine transformation of the quantized block 246 using the frequency spectrum 268 to generate a decoded block 252 that is transferred to the motion compensation block (MCB) 254. Motion compensation is performed by the MCB 254 to recreate the MPEG data 256. Finally, color conversion block 262 converts the MPEG data 256 into the Red, Green, Blue (RGB) color space in order to generate pictures 264.
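The data flow just described can be sketched as a toy pipeline. The stage bodies below are placeholders that only record the order of operations (no real MPEG arithmetic); the block numbers in the comments refer to FIG. 4.

```python
# Toy data-flow sketch of the decoder 240 in FIG. 4; stages only log ordering.
trace = []

def vld(bit_stream):                # VLD block 244: entropy-decode one block
    trace.append("VLD"); return bit_stream

def inverse_quantize(block):        # IQ block 266: rescale the coefficients
    trace.append("IQ"); return block

def idct(block):                    # IDCT block 250: back to the pixel domain
    trace.append("IDCT"); return block

def motion_compensate(block, ref):  # MCB 254: add the displaced reference block
    trace.append("MC"); return block

def yuv_to_rgb(data):               # color conversion block 262
    trace.append("CC"); return data

def decode_picture(bit_stream, reference_frame):
    quantized = vld(bit_stream)
    spectrum = inverse_quantize(quantized)
    decoded = idct(spectrum)
    mpeg_data = motion_compensate(decoded, reference_frame)
    return yuv_to_rgb(mpeg_data)

decode_picture(b"...", None)
# trace now reads ["VLD", "IQ", "IDCT", "MC", "CC"]
```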
Conventional MPEG decoders, such as hardware video decoder 230 or software video decoder 282, decode a compressed MPEG bit stream into a storage format depending on the particular compression format used to encode the MPEG bit stream. For the reasons described above, YUV planar format is the preferred format for compression of MPEG images within conventional MPEG decoders. Consequently, the decoded block 252 outputted by the IDCT block 250 as well as the MPEG data 256 outputted by the MCB 254 are generated in YUV planar format within conventional MPEG decoders. Unfortunately, YUV planar format is an inefficient format during motion compensation of the decoded block 252.
Accordingly, FIG. 5C depicts a novel mixed storage format 300 described by the present invention that is utilized by the video decoder 240. Careful review of FIGS. 5A-5C illustrates that Y component values are stored in a planar array 300A while the U and V components are interleaved in a packed array 300B. Using the mixed storage format 300, the decoded block 252 received from the IDCT block 250 is converted from planar format (FIG. 5B) to the mixed storage format 300. Storage of reference frames 260 and MPEG data 256 in the mixed storage format 300 optimizes motion compensation of the decoded block 252 as depicted in FIGS. 6A and 6B.
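A minimal sketch of this planar-to-mixed conversion is given below, assuming 4:2:0 chroma and a UVUV byte order for the packed array 300B; the byte order and the helper name are our assumptions, since the figure itself does not fix them.

```python
import numpy as np

def planar_to_mixed(y, u, v):
    """Convert YUV 4:2:0 planar planes (FIG. 5B) to the mixed storage format
    300: Y kept as a planar array (300A), U and V interleaved into a single
    packed UV array (300B). Illustrative sketch only."""
    uv = np.empty((u.shape[0], u.shape[1] * 2), dtype=u.dtype)
    uv[:, 0::2] = u  # U samples land in even columns
    uv[:, 1::2] = v  # V samples land in odd columns
    return y, uv

# The 320x280 frame of FIG. 6A has 160x140 chroma planes:
y = np.zeros((280, 320), dtype=np.uint8)
u = np.full((140, 160), 1, dtype=np.uint8)
v = np.full((140, 160), 2, dtype=np.uint8)
y_plane, uv_plane = planar_to_mixed(y, u, v)
# uv_plane has the 320x140 shape of the UV-component 330
```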
FIG. 6A depicts a previous frame 260 that is stored in a reference frames block 258. The previous frame 260 includes a 320x280 pixel Y-component 310, a 160x140 pixel U-component 312 and a 160x140 pixel V-component 314, represented in planar format 304 (FIG. 5B). FIG. 6B depicts a portion of the video decoder 240. Together FIGS.
6A and 6B depict the combination of the decoded block 252 (252A, 252B, and 252C) and corresponding YUV components, as determined by a motion vector (V) 248, of the previous frame 260 during motion compensation. During the decoding process, the VLD block 244 generates the motion vector (V) 248, which corresponds to the decoded block 252 generated by the IDCT block 250. Referring again to FIGS. 6A and 6B, the motion vector (V) 248 identifies Y-block 316, U-block 318 and V-block 320 of the previous frame 260. In order to motion compensate the decoded block 252, Y data 252A is combined with Y-block 316, U data is combined with U-block 318 and V data is combined with V-block 320 in order to generate MPEG data 256 (322A, 322B and 322C).

However, if the previous frame 260 and the decoded block 252 are stored in the mixed storage format 300 (FIG. 5C), the motion compensation process is streamlined. Referring again to FIG. 6A, the previous frame can be stored in the mixed storage format 300 (FIG. 5C) as indicated by the 320x140 UV-component 330, such that a UV-block 335 is formed. Referring again to FIG. 6B, the decoded block can be stored in the mixed storage format 300 (FIG. 5C) as indicated by UV data 340. The UV-block 335 is then combined with the UV data 340 in order to generate UV MPEG data 350.
Careful review of FIGS. 6A and 6B illustrates the advantages of using the mixed packed format 300 (FIG. 5C) during motion compensation. Motion compensation using the planar format 304 (FIG. 5B) requires (1) opening three memory pages, (2) using three pointers, and (3) performing three operations for each YUV component to motion compensate the decoded block 252. In contrast, motion compensation using the mixed storage format 300 (FIG. 5C) requires (1) opening two memory pages, (2) using two pointers, and (3) performing two operations for each Y and UV component to motion compensate the decoded block 252. In addition, referring again to FIGS. 6A and 6B, storage of YUV data in cache memory (not shown) requires thirty-two cache lines in planar format 304 (FIG. 5B) (sixteen cache lines for the Y-component, eight cache lines for each U and V component). In contrast, storage of the Y and UV components in the mixed storage format 300 (FIG. 5C) requires twenty-four cache lines (sixteen cache lines for the Y-component and eight cache lines for the UV component).
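The two-pointer motion compensation can be sketched with NumPy arrays standing in for memory pages. The halving of only the vertical chroma offset reflects the fact that a row of interleaved UV bytes spans the same width as a Y row; the helper and its signature are ours, not the patent's.

```python
import numpy as np

def mc_mixed(decoded_y, decoded_uv, ref_y, ref_uv, mv):
    """Motion compensate one block in mixed format 300: one base pointer for
    the Y plane and one for the packed UV plane, i.e. two add operations
    instead of the three needed with separate U and V planes."""
    dy, dx = mv                           # motion vector in luma pixels
    h, w = decoded_y.shape
    uh, uw = decoded_uv.shape
    y_out = decoded_y + ref_y[dy:dy + h, dx:dx + w]                 # pointer 1: Y
    # UV rows are half-height but full-width (U and V interleaved), so only
    # the vertical offset is halved; the horizontal byte offset is unchanged.
    uv_out = decoded_uv + ref_uv[dy // 2:dy // 2 + uh, dx:dx + uw]  # pointer 2: UV
    return y_out, uv_out

# A 4x4 luma residue with a (2, 2) motion vector into an 8x8 reference:
ref_y = np.arange(64).reshape(8, 8)
ref_uv = np.arange(32).reshape(4, 8)
y_out, uv_out = mc_mixed(np.zeros((4, 4), int), np.zeros((2, 4), int),
                         ref_y, ref_uv, (2, 2))
```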
Moreover, benefits from using the mixed storage format 300 (FIG. 5C) are not limited to video decoders. FIG. 7A depicts the steps for encoding and decoding video images 400. Careful review of FIG. 7A illustrates that the encoding 402 and decoding 420 of video images essentially mirror each other. Consequently, referring to FIG. 7B, the mixed storage format may be used in a video encoder 450 during motion estimation 452, motion compensation 454 and storage of reference frames 456. For the reasons described above, use of the mixed storage format 300 (FIG. 5C) will reduce hardware costs of the video encoder (fewer pointers) and speed up the encoding process (fewer operations and memory accesses). In addition, cache line efficiency results from using the mixed storage format 300 (FIG. 5C), as described above.

Although use of the mixed storage format 300 (FIG. 5C) provides improved cache line efficiency, there remains a need to improve access speed to graphics memory. One technique for overcoming this problem is using a Write Combining (WC) buffer in order to accelerate writes to the video frame buffer, as depicted in FIG. 8. FIG. 8 depicts a memory address space 500 including a WC region 502. The WC buffer 512 utilizes the fact that thirty-two byte burst writes are faster than individual byte or word writes, since burst writes consume less bandwidth from the system bus. Hence, applications can write thirty-two bytes of data to the WC frame buffer 512 before burst writing the data to its final destination 508.

Nonetheless, not all applications take advantage of WC buffers. One problem associated with WC buffers is that WC buffers generally contain only four or six entries in their WC region 502. Consequently, any memory store to an address that is not included in the current WC buffer 512 will flush out one or more entries in the WC buffer 512. As a result, partial writes will occur and reduce system performance. However, by writing data sequentially and consecutively into the WC region 502 of the system memory 500, partial memory writes to the WC region 502 are eliminated. Referring again to FIG. 8, if we write the first pixel 504 in line one of an image, then the first pixel 506 in line two of the image, it is very likely that the WC buffer 512 (holding only one byte) will be flushed out. This occurs because we are writing to the WC region 502 in a vertical manner, such that the second write does not map to the same entry of the WC buffer 512. In contrast, when we write to the WC region 502 in a sequential and consecutive manner, the first thirty-two pixels 508 of line one of the image may be written to the WC buffer 512 before the first thirty-two bytes of pixels 508 are burst written to their final destination in the WC region 502. Once we completely write the thirty-second pixel byte 508 in the WC buffer 512, the entire thirty-two bytes of pixels 508 can be burst written to their final destination.
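The vertical-versus-sequential contrast above can be simulated under simplifying assumptions: a single 32-byte WC entry, one-byte stores, and a flush counted as "partial" when fewer than 32 bytes of the entry are valid. The image stride of 320 bytes is an assumption for the example, not a value from the patent.

```python
ENTRY = 32  # bytes per write-combining buffer entry

def flushes(addresses):
    """Return (full_bursts, partial_writes) for a sequence of 1-byte stores."""
    full = partial = 0
    cur_entry, valid = None, 0
    for a in addresses:
        entry = a // ENTRY
        if entry != cur_entry:
            if cur_entry is not None:      # a new address evicts the old entry
                if valid == ENTRY: full += 1
                else: partial += 1
            cur_entry, valid = entry, 0
        valid += 1
    if cur_entry is not None:              # final eviction
        if valid == ENTRY: full += 1
        else: partial += 1
    return full, partial

STRIDE = 320  # assumed image width in bytes
vertical = flushes([line * STRIDE for line in range(4)])  # first pixel of 4 lines
sequential = flushes(list(range(32)))                     # 32 bytes of line one
# vertical: every store lands in a new entry, so 4 partial writes result;
# sequential: one full 32-byte burst, no partial writes
```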
In MPEG video decoding, it is important to reduce partial writes during motion compensation and during frame output to graphics memory. A novel method for re-ordering of operations during motion compensation of video frames in order to reduce partial writes is now described with reference to FIG. 9 and FIG. 10.

FIG. 9 is a block diagram depicting steps for decoding an MPEG bit stream. Instead of motion compensating block after block, we propose motion compensation of blocks in groups of four blocks (as shown in FIG. 10). MPEG video bit streams are generally decoded as follows: (1) VLD the motion vector and the DCT coefficients of a block 522; (2) IQ and IDCT of the DCT coefficients of the block 524; and (3) MC the residue of the block with the displaced reference block 526. One problem with this approach is that it causes many partial writes.

Generally, after a block is motion compensated, the resulting block is written back to the frame buffer. Assuming the video image is stored in a linear fashion and the width of the video is larger than the size of an entry in the WC buffer, the resulting macroblock is written to the frame as follows. As the block is written back to the frame buffer line after line, each eight-byte write starts a new cache line. Thus, after storing four lines, the application is forcing the WC buffer to flush some of its entries. That is, partial writes (16 bytes out of 32 bytes) occur.

However, by motion compensating four blocks together, partial writes are eliminated. That is, step one (522) and step two (524) are repeated four times before step three (526) is performed, as depicted in FIG. 10. Furthermore, instead of writing out the second line of a first block after the first line of the first block, the first line of a second block is written out. This is because the first line of the first block and the first line of the second block belong to the same cache lines. Consequently, the WC buffer 512 can easily combine the write operations in a burst operation. However, those skilled in the art will recognize that various numbers of blocks may be chosen as the plurality of blocks, such that the number of blocks is dependent on the line size of the write combining buffer.

The real advantage of writing the data out in a multiple of four blocks comes from using the mixed storage format 300 (FIG. 5C). In this format, the Y components of a video image have the same stride as the UV components of the video image. Referring to FIG. 11A, an image frame 550 is depicted utilizing the mixed storage format 300. The image 550 includes a 320x280 pixel Y-component 552 and a 320x140 pixel UV-component 554. As such, a Y-block 556 has an equal stride 560 to a UV-block 558. As a result, whenever we finish writing to a full cache line 572 (thirty-two bytes) of a WC buffer 570 (FIG. 11B), we generate a "full write" of Y-components 574 (thirty-two bytes) and corresponding UV components 576 (thirty-two bytes). We only need four blocks to guarantee a "full write" of the cache line. That is, we don't need four extra blocks to guarantee a "full write" of the cache line. This property provides a distinct advantage over the previous pure planar YV12 format. Procedural method steps for implementing the inventive mixed storage format 300 (FIG. 5C) and a modified method for decoding an encoded bit stream are now described.
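The reordered write-out can be sketched as follows, assuming four 8-byte-wide blocks sitting side by side in one row of the frame and a 32-byte cache line; the 320-byte frame stride is our example value. Writing line j of all four motion-compensated blocks before line j+1 makes every store cover one full cache line, instead of four 8-byte partial writes.

```python
BLOCK_W, BLOCK_H, N_BLOCKS = 8, 8, 4  # four 8x8 blocks, side by side
STRIDE = 320                          # assumed frame stride in bytes

def write_order_grouped():
    """Yield (byte_offset, length) stores for four side-by-side MC blocks,
    written line by line across the whole group (FIG. 10 ordering)."""
    for line in range(BLOCK_H):
        # the four 8-byte segments of this line are contiguous in the frame
        # buffer, so they combine into a single 32-byte store
        yield (line * STRIDE, BLOCK_W * N_BLOCKS)

stores = list(write_order_grouped())
# every store is exactly one full 32-byte cache line, eight stores in all
```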
Operation

are written to a U-plane 304B of the planar arrays 304 and the V component is written to a V-plane 304C of the planar arrays 304. The planar format is, for example, chosen as one of YV12 planar format, YUV12 planar format, YUV16 planar format, or YUV9 planar format. In addition, color components are presented in a color space chosen as, for example, one of a YUV color space, a YCrCb color space, a YIQ color space, or an RGB color space.
Referring now to FIG. 16, a method 700 is depicted for decoding an encoded bit stream, for example, in the video decoder 240 as depicted in FIG. 4. At step 704, a portion of the encoded bit stream is received representing an encoded block. Alternatively, a quantized block 246 may be received. At step 706, the encoded block is variable length decoded (VLD) to generate a quantized block. When the quantized block is received at step 704, step 706 is not performed. Those skilled in the art will appreciate that the encoded block may be decoded in various ways and remain within the scope of this invention. At step 708, inverse quantization (IQ) is performed on the quantized block to generate a frequency spectrum for the quantized block. At step 710, inverse discrete cosine transformation (IDCT) of the quantized block is performed using the frequency spectrum to generate a decoded block. At step 712, steps 704 through 710 are repeated for a plurality of encoded blocks. As a result, a plurality of decoded blocks, representing a plurality of macroblocks, are formed. At step 714, the plurality of macroblocks are motion compensated as a group in order to generate a plurality of motion compensated (MC) macroblocks. Finally, at step 740, steps 704 through 714 are repeated for each encoded block represented by the encoded bit stream. The encoded bit stream is, for example, an encoded MPEG video bit stream.
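The loop structure of method 700 can be sketched with placeholder stage functions (no real MPEG arithmetic); the step numbers in the comments refer to FIG. 16, and the group size of four anticipates the four-block plurality described for FIG. 17.

```python
GROUP = 4  # blocks decoded before the group is motion compensated

def vld(encoded):                  # step 706: variable length decode
    return encoded

def iq(quantized):                 # step 708: inverse quantization -> spectrum
    return quantized

def idct(quantized, spectrum):     # step 710: inverse DCT -> decoded block
    return quantized

def decode_bit_stream(encoded_blocks):
    """Run steps 704-710 block after block; motion compensate each full
    group of GROUP decoded blocks together (steps 712-714)."""
    mc_calls = []
    group = []
    for encoded in encoded_blocks:     # step 704: receive an encoded block
        quantized = vld(encoded)
        spectrum = iq(quantized)
        group.append(idct(quantized, spectrum))
        if len(group) == GROUP:        # step 712: plurality decoded?
            mc_calls.append(list(group))  # step 714: MC as a group
            group = []
    return mc_calls

calls = decode_bit_stream(range(8))
# eight encoded blocks yield two motion-compensation calls of four blocks each
```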
FIG. 17 depicts additional method steps 716 for motion compensating the plurality of macroblocks of step 714. At step 718, four blocks are used as the plurality of blocks. Finally, at step 720, pixel data of the four MC blocks is written as a group and in a sequential manner to a frame buffer, such that prior to being burst written to the frame buffer, the pixel data is
