`
`
`
`[19]
`United States Patent
`
`
`
`6,031,937
`[11] Patent Number:
`
`Graffagnino Feb. 29, 2000 [45] Date of Patent:
`
`
`
`
`
`
`
`
`
`
`
`
`USOO6031937A
`
`
`
`[54] METHOD AND APPARATUS FOR VIDEO
`
`
`
`
`
`COMPRESSION USING BLOCK AND
`
`
`
`
`WAVELET TECHNIQUES
`
`
`
`
`
`[75]
`
`
`
`
`
`Inventor: Peter N. Grafi'agnino, San Francisco,
`Calif.
`
`
`
`
`
`
`
`
`
`
`[73] Assignee: NeXT Software, Inc., Redwood City,
`Calif.
`
`
`
`
`
`
`
`
`[21] Appl. No.: 08/247,006
`
`
`
`
`[22]
`Filed:
`May 19, 1994
`
`
`
`
`
`
`
`[51]
`Int. Cl.7 ................................ G06K 9/36; H04N 7/12
`[52] US. Cl.
`............................................. 382/236; 348/396
`
`
`
`
`
`
`[58] Field of Search ..................................... 382/232—253,
`
`
`
`
`
`382/162, 166, 219—220, 221, 272; 348/384—405;
`
`
`
`
`
`358/426—433, 261.1; 341/67
`
`
`
`
`[56]
`
`
`
`References Cited
`
`
`
`
`PUBLICATIONS
`
`
`
`
`
`
`
`
`
`
`Adobe Systems, Inc., “JPEG Technical Specification, Revi-
`
`
`
`
`
`
`
`sion 9” May 4, 1991, PostScript® Developer Support
`
`Group.
`
`
`
`
`
`Primary Examiner—Jon Chang
`Assistant Examiner—Jayanti K. Patel
`
`
`
`
`
`
`
`
`Attorney, Agent, or Firm—Hecker & Harriman
`
`
`
`ABSTRACT
`
`
`
`[57]
`
`
`
`
`
`
`
`
`A method and apparatus are disclosed for symmetrically
`
`
`
`
`
`
`compressing and decompressing video information in real
`
`
`
`
`
`
`
`
`
`techniques.
`time by coupling block and wavelet
`In the
`
`
`
`
`
`
`
`compression pipeline,
`the image is divided into blocks
`
`
`
`
`
`
`
`
`comprising 2k><2k pixels (in the preferred embodiment, k=1).
`
`
`
`
`
`
`
`
`The average color of each block is computed. The system
`
`
`
`
`
`
`
`
`computes an average luminance for each block and differ-
`
`
`
`
`
`
`
`
`ential luminances of each pixel of the plurality of pixels of
`
`
`
`
`
`
`
`
`each block. A first plurality of frequency details of each
`
`
`
`
`
`
`
`block are determined by Haar transforming the differential
`
`
`
`
`
`
`
`luminances. The system computes an average color differ-
`
`
`
`
`
`
`
`
`
`ence between each block and the preceding block, and
`
`
`
`
`
`
`
`
`
`quantizes the average color difference and the first plurality
`
`
`
`
`
`
`
`
`of frequency details using Lloyd-Max quantization. In an
`
`
`
`
`
`
`
`
`alternate embodiment, skip codes are generated for blocks
`
`
`
`
`
`
`
`
`having the same quantized average color difference and
`
`
`
`
`
`
`
`second plurality of frequency details. The quantized average
`
`
`
`
`
`
`
`
`color difference and a second plurality of frequency details
`
`
`
`
`
`
`
`
`are encoded using variable length codes. The system
`
`
`
`
`
`
`employs lookup tables to decompress the compressed image
`
`
`
`
`
`
`
`
`and to format output pixels. The output of the compression
`
`
`
`
`
`
`
`pipeline containing variable length codes is decoded into
`
`
`
`
`
`
`
`
`fixed-length codes, which are then decoded using a first
`
`
`
`
`
`
`
`lookup table into three device-independent components that
`
`
`
`
`
`
`
`
`represent each block. The three components index a second
`
`
`
`
`
`
`
`lookup table containing precomputed RGB values that
`
`
`
`
`
`
`include precomputed display dependent formatting to pro-
`
`
`
`
`
`
`
`
`duce the output image. In the alternate embodiment, skip
`
`
`
`
`
`
`
`
`codes contained in the output of the variable length decoder
`are decoded.
`
`
`
`
`
`
`46 Claims, 8 Drawing Sheets
`
`
`
`
`
`212
`
`
`
`214
`
`
`
`216
`
`
`
`218
`
`
`
`DIVIDE IMAGE
`
`
`INTO 2AK X 2AK
`
`
`BLOCKS
`
`
`COMPUTE
`
`AVERAGE
`
`RGB VALUE
`
`
`COMPUTE
`
`AVERAGE Y
`
`AND AY
`
`VALUES
`
`
`HAAR
`
`TRANSFORM
`0F AY
`
`VALUES
`
`
`
`
`
`
`COMPUTE
`
`AVERAGE RGB
`
`DIFFERENCES
`
`
`QUANTIZE
`
`USING
`
`LLOYD-MAX
`
`
`
`VARIABLE
`
`LENGTH
`
`CODING
`
`
`Page 1 of 18
`
`GOOGLE EXHIBIT 1009
`
`GOOGLE EXHIBIT 1009
`
`Page 1 of 18
`
`
`
`
`tnetaPS”U
`
`
`
`2b.e
`
`F5
`
`m:
`
`
`
`
`
`
`
`
`
`
`
`
`
`0mm?m.20%025:
`9,vszéo
`
`0x35298
`
`
`
`5&852E;Emmoma:05ESE206E248
`
`IwwwwEEmz<E38$302czzwfi/«mmam
`
`mzazoomzsmmug.
`
`%
`
`01
`
`6,031,937
`
`
`
`mSEDSE
`tEMNEézoa$28
`
`s.FGE
`
`w:
`
`._.m<mOEm
`
`NoVJ
`
`.5n_z_
`
`mug):
`
`Page 2 of 18
`
`Page 2 of 18
`
`
`
`
`
`PS”U
`
`t
`
`
`
`b.eF
`
`.01).:
`
`mm.
`
`%hS
`
`8M
`
`6,031,937
`
`
`
`w.mum8N
`
`
`
`391%,$230E228
`
`
`
`I._.OZm:025:mwmm0<mm><
`
`
`
`
`
`02500X<_2.Q>O._._mmozmmwuia
`
`
`C\l
`
`<2
`
`LL
`
`
`
`
`
`2m8m2<E>mo<mm><mosh:VSxE02
`mmfiw,Wangm3;mom28a
`$2:£2.28E228$32.mesa
`
`
`
`moamwawa¢5Na
`
`
`
`
`
`Page 3 of 18
`
`Page 3 of 18
`
`
`
`
`
`
`US. Patent
`
`
`
`Feb. 29, 2000
`
`
`
`
`Sheet 3 0f8
`
`6,031,937
`
`
`
`
` LAPLACIAN
`
`DISTRIBUTION
`
`
`
`
`
`
`RECONSTRUCTION
`LEVELS
`
`
`
`
`35$?“
`
`
`
`FIG. 3
`
`Page 4 of 18
`
`Page 4 of 18
`
`
`
`
`US. Patent
`
`
`
`
`
`Feb. 29, 2000
`
`
`
`
`Sheet 4 0f8
`
`6,031,937
`
`
`
`
`Mllllllllllllllll,
`
` FIG.4
`
`
`
`
`
`a:
`
`o“
`
`E
`
`8
`
`
`F
`Z
`
`I
`
`s
`
`4
`
`
`9
`
`<1
`
`
`E
`
`
`g
`
`
`Q
`
`
`
`Q I
`
`
`v
`
`
`<r
`
`
`
`
`
`
`
`
`
`
`
`_.J
`
`
`N = Lug
`
`-'N = >_<<r
`
`$0 = o.
`—v _
`
`—._/_—_——
`
`&
`= CO—_ _,
`
`
`
`
`
`
`
`
`
`éé %/
`gg
`
`Page 5 of 18
`
`
`
`US. Patent
`
`Feb. 29, 2000
`
`Sheet 5 0f8
`
`6,031,937
`
` FIG.5
`
`
` x
`
`E
`N
`I
`
`g
`(‘0
`:1:
`
`‘o
`<2-
`:I:
`
`’2'»>
`<6
`m
`E»>
`(U
`(D
`
`a
`m
`V
`
`T—
`L0
`
`53
`
`S.
`Lo
`
`1..
`Lo
`
`:0
`g
`:
`go
`%%%%%
`
`
`lllllllllllllllllll‘
`
`
`
`
`
`Page 6 of 18
`
`
`
`US. Patent
`
`Feb. 29, 2000
`
`Sheet 6 0f8
`
`6,031,937
`
`
`
`
`
`
`
`
`
`
`
` FIG.6
`
`
`’71)>
`(U
`no
`<1(1)>
`(U
`
`o<E
`
`D
`(>6
`0:
`51,
`
`cu:
`I
`
`03>
`I
`
`I E
`
`I
`
`:5
`
`1—
`
`Lo
`
`E
`
`1—
`
`Ln
`
`S!
`
`co
`
`N
`
`""
`
`%%%%
`
`lllllllllllllllllll“
`
`
`
`Page 7 of 18
`
`
`
`
`
`US. Patent
`
`
`
`Feb. 29, 2000
`
`
`
`
`Sheet 7 0f8
`
`6,031,937
`
`
`
`
`
`OC
`
`
`O
`
`[\
`
`LIJ
`
`C)
`
`
`
`
`FRAME
`
`2x2BLOCK 706
`
`
`
`2x2BLOCK
`
`OO Ex(
`
`
`/3
`
`
`
`
`FIG.7
`
`Page 8 of 18
`
`Page 8 of 18
`
`
`
`
`US. Patent
`
`
`
`Feb. 29, 2000
`
`Sheet 8 0f 8
`
`6,031,937
`
`
`
`w.OE
`
`._.mm_>zoo
`
`._.m._m_><>>0mo__2
`
`mhzszmEOo
`
`mom9.
`
`
`
`._._m-m_vmoOOmo
`
`OHZ_mooo
`
`._.m_.m_><>>omo_§
`
`wHZmZOmEOO
`
`12m
`
`mooo
`
`$835283_
`
`
`
`m4m<_m<>
`
`
`
`moooIhosz
`
`m0mmm¢m§0©mo
`
`9%
`
`0mm
`
`omw
`
`05
`
`m m
`
`m
`
`
`
`Page 9 of 18
`
`Page 9 of 18
`
`
`
`
`
`
`
`6,031,937
`
`
`
`
`
`
`
`10
`
`
`1
`METHOD AND APPARATUS FOR VIDEO
`
`
`
`
`COMPRESSION USING BLOCK AND
`
`
`
`
`WAVELET TECHNIQUES
`
`BACKGROUND OF THE INVENTION
`
`
`1. Field of the Invention
`
`
`
`
`
`
`
`
`
`
`
`The present invention relates to the field of data compres-
`sion.
`
`
`
`
`2. Background Art
`
`
`
`
`
`
`Compression is a scheme for reducing the amount of
`
`
`
`
`
`
`
`information required to represent data. Data compression
`
`
`
`
`
`
`
`
`
`schemes are used, for example, to reduce the size of a data
`
`
`
`
`
`
`
`
`
`file so that it can be stored in a smaller memory space. Data
`
`
`
`
`
`
`
`
`compression may also be used to compress data prior to its
`
`
`
`
`
`
`
`
`
`transmission from one site to another, reducing the amount
`
`
`
`
`
`
`
`
`
`
`of time required to transmit the data. To access the com-
`
`
`
`
`
`
`
`
`
`pressed data, it is first decompressed into its original form.
`
`
`
`
`
`A compressor/decompressor (codec) is typically used to
`
`
`
`
`
`
`
`perform the compression and decompression of data. One
`
`
`
`
`
`
`
`measure of the performance or efficiency of a codec is its
`
`
`
`
`
`
`
`“compression ratio”. Compression ratio refers to the ratio of
`
`
`
`
`
`
`
`number of bits of uncompressed data to the number of bits
`
`
`
`
`
`
`
`
`
`of compressed data. Compression ratios may be 2:1, 3:1, 4:1
`etc.
`
`
`
`
`
`
`
`
`Data compression may also be required when the input/
`
`
`
`
`
`
`
`
`
`output rate of a particular data receiver is less than the data
`
`
`
`
`
`
`
`
`
`rate of the transmitted data. This can occur when providing
`
`
`
`
`
`
`
`
`video data to computer systems. Video data of frame size
`
`
`
`
`
`
`320x240 is provided at rates approaching 7 megabytes per
`
`
`
`
`
`
`
`
`
`second. This rate is greater than the rate of commonly used
`
`
`
`
`
`
`I/O subsystems of personal computers. Some representative
`
`
`
`
`
`
`rates of common I/O subsystems found on personal com-
`
`
`
`puters (PC) are:
`35
`
`15
`
`20
`
`25
`
`30
`
`
`
`
`
`
`
`Serial Communications
`
`ISDN
`
`Ethernet/CD-ROM
`SCSI Disk
`
`
`
`
`
`
`
`1—2 kilobytes/sec;
`
`
`8—16 kilobytes/sec;
`
`
`150—300 kilobytes/sec;
`
`
`0.5—2 megabytes/sec.
`
`
`
`
`
`
`
`
`
`
`
`Another measure of video codec compression ratio is the
`
`
`
`
`
`
`average compressed bits-per-pixel. This measure is useful in
`
`
`
`
`
`
`describing video compression because different conventions
`
`
`
`
`
`
`
`
`
`are used for calculating the size of uncompressed video, i.e.,
`
`
`
`
`
`
`
`
`some use 24 bits-per-pixel RGB and others use 4:2:2 sub-
`
`
`
`
`
`
`
`
`sampled YUV (16-bits per pixel). The averaging accounts
`
`
`
`
`
`
`for potentially different strategies employed for frames in a
`
`
`
`
`
`
`
`sequence. The bandwidth requirements for a sequence of
`
`
`
`
`
`
`frames is calculated by multiplying the average compressed
`
`
`
`
`
`
`
`
`bits-per-pixel and the number of frames per second, and
`
`
`
`
`
`
`
`dividing the resulting product by the number of pixels in
`each encoded frame.
`
`
`
`
`
`
`
`
`
`
`
`Nearly all video compression techniques are lossy, i.e.,
`
`
`
`
`
`
`information is inevitably discarded in the compression pro-
`cess. A measure of quality is how much this information is
`
`
`
`
`
`
`
`noticed by a human observer. However,
`there is not a
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`consistent, objective model of human perception that can be
`
`
`
`
`
`
`
`applied. A simple, concrete, quality metric that is frequently
`
`
`
`
`
`
`
`used is the Mean-Squared-Error (MSE) that measures the
`
`
`
`
`
`
`
`error on a per-pixel basis from the uncompressed original.
`
`
`
`
`
`Most compression algorithms are computationally
`
`
`
`
`
`
`
`
`complex, which limit their application since very complex
`
`
`
`
`
`
`
`algorithms often require expensive hardware to assist in the
`
`
`
`
`
`compression. A useful number to measure computational
`
`
`
`
`
`complexity of software-based compression algorithms is
`
`40
`
`
`
`45
`
`
`
`50
`
`
`
`55
`
`
`
`60
`
`
`
`65
`
`
`
`Page 10 of 18
`
`
`
`
`2
`
`
`
`
`
`MIPS per megapixels/sec, i.e., essentially instructions/pixel.
`
`
`
`
`
`
`
`
`For example, an algorithm just capable of compressing
`
`
`
`
`
`
`
`320x240 pixels per frame at 30 frames per second on a 40
`
`
`
`
`
`
`MIPS machine has a computational complexity of 40,000,
`
`
`000/(320X240X30)El7 instructions/pixel.
`
`
`
`
`
`
`
`
`Symmetry refers to the ratio of the computational com-
`
`
`
`
`
`
`plexity of compression to that of decompression. Codec’s
`
`
`
`
`
`
`
`are frequently designed with a greater computational load on
`
`
`
`
`
`
`
`
`
`the compressor than the decompressor, i.e., they are asym-
`
`
`
`
`
`
`
`
`metric. While this may be a reasonable strategy for “create-
`
`
`
`
`
`
`
`
`once, play-many” video sequences, it limits the range of
`
`
`
`
`
`
`
`applications for the codecs. Asymmetric compression tech-
`
`
`
`
`
`
`
`
`niques are not suitable for teleconferencing, for example,
`
`
`
`
`
`
`since teleconferencing requires essentially real-time pro-
`
`
`
`
`
`
`cessing and substantially equivalent compression and
`
`
`decompression rates.
`
`
`
`
`
`Block Transform Coding Example (JPEG)
`
`
`
`
`
`
`
`
`
`In the prior art, a class of image compressors called Block
`
`
`
`
`
`
`Transform Coding (BTC) is used. This is a fundamentally
`
`
`
`
`
`
`symmetric,
`image-compression technique that
`is used in
`
`
`
`
`
`
`(MPEG) and (JPEG) compression algorithms. In BTC, an
`
`
`
`
`
`
`
`
`
`image is divided into small blocks, the blocks are trans-
`
`
`
`
`
`
`
`formed using an invertible, two dimensional (2-D) math-
`
`
`
`
`
`
`
`ematical transform, the transformed image is quantized, and
`
`
`
`
`
`
`
`the quantized result is losslessly compressed. This process
`
`
`
`
`
`
`
`
`
`forms the core of JPEG and MPEG compression, which use
`
`
`
`
`
`
`
`8x8 blocks and a Discrete Cosine Transform (DCT) to
`
`
`
`
`perform the 2-D transform.
`
`
`
`
`
`FIG. 1 is a diagram illustrating computational blocks of a
`
`
`
`
`
`
`
`
`prior art system for performing JPEG still-image, compres-
`
`
`
`
`
`
`
`
`sion. Input image 102 is provided to the color-space con-
`
`
`
`
`
`
`
`
`
`version and subsampling block 110. The output of the
`
`
`
`
`
`
`
`color-space conversion and subsampling block 110 is pro-
`
`
`
`
`
`
`
`
`
`
`vided to block 112 for dividing each image plane into 8x8
`
`
`
`
`
`
`
`
`blocks. The output of block 114 is provided to the Discrete
`Cosine Transform block 114. Block 114 provides DC terms
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`116 to quantization block 120, which quantizes the DC terms
`
`
`
`
`
`
`
`116 using differential pulse code modulation (DPCM).
`Block 114 provides AC terms 118 to block 122, which scalar
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`quantizes the AC terms 118 by frequency. The outputs of
`blocks 120 and 122 are provided to the Huffman block 124,
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`which compresses the quantized values using variable length
`
`
`
`
`codes to provide output 126.
`
`
`
`
`
`
`
`Digital images 102 are typically stored in an RGB format,
`
`
`
`
`
`
`
`
`
`where each pixel is represented as a tuple of red (R), green
`
`
`
`
`
`
`
`
`
`(G), and blue (B) samples. While RGB format is suited
`
`
`
`
`
`
`
`
`
`
`towards most digital color input and output devices, it is not
`
`
`
`
`
`
`
`
`particularly efficient for the human visual system, or natural
`scenes. For example, in natural scenes the R, G, and B
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`components of colors are highly correlated because most
`natural colors are very close to shades of gray, where
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`R=G=B (i.e., saturated colors are rare). In other words, with
`
`
`
`
`
`
`
`respect to information coding, the correlation between RGB
`signals means that there is redundant information stored in
`
`
`
`
`
`
`
`the R, G, and B channels. To account for this redundant
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`information, color-space conversion and subsampling block
`110 transforms the colors of input image 102 into a color
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`space with an explicit brightness, or luminance, dimension
`
`
`
`
`
`
`
`
`prior to compression. More bits are typically used to pre-
`
`
`
`
`
`
`
`
`
`cisely specify the brightness while relatively fewer bits are
`used to specify the chrominance.
`
`
`
`
`
`
`
`
`
`
`
`
`Broadcast television (TV) uses YUV color space to better
`utilize the bandwidth of TV’s. The YUV color space is
`
`
`
`
`
`
`
`
`
`essentially a rotation of the RGB basis vectors so that the
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`luminance axis (Y) of YUV color space is aligned with the
`
`Page 10 of 18
`
`
`
`
`3
`
`
`
`
`
`
`
`
`
`gray diagonal of RGB color space, which extends from RGB
`
`
`
`
`
`
`
`
`
`
`coordinates (0, 0, 0) to (1, 1, 1). The transformation for
`
`
`
`
`
`
`
`converting RGB color values to YUV space is expressed by
`
`
`Equation (1):
`
`0.061
`0.315
`0.161
`Y
`
`
`
`0.234
`U = —0.079 —0.l55
`
`
`
`0.330 —0.227 —0.053
`V
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`R
`
`B
`
`
`
`
`
`
`
`.
`
`(l)
`
`
`
`10
`
`15
`
`Reduction of redundant information can be achieved using
`
`
`
`
`
`
`
`
`
`
`
`
`
`the YUV color-space representation obtained using Equation
`
`
`
`
`
`
`
`
`(1). The human eye is much less sensitive to spatial detail in
`the U and V channels than it is in the Y channel because
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`receptors in the eye for brightness (Y) are more numerous
`
`
`
`
`
`
`
`
`
`than those for chrominance (U, V). Using this fact, the U and
`
`
`
`
`
`
`
`V components can be sampled at a lower resolution. In JPEG
`
`
`
`
`
`
`compression, the U and V components are frequently sub-
`
`
`
`
`
`
`sampled by a factor of 2 in both x- and y-directions. For
`
`
`
`
`
`
`
`example, four Y samples and one sample each of U and V
`20
`
`
`
`
`
`
`
`
`
`are produced for each 2x2 block of an input image. For 8-bit
`
`
`
`
`
`
`
`
`samples per channel, this effectively produces a 2:1 com-
`
`
`
`
`
`
`
`pression factor. Thus, color-space conversion and subsam-
`
`
`
`
`
`
`
`
`
`
`pling block 110 converts an input image 102 from RGB
`
`
`
`
`
`
`
`
`color space to YUV color space using the transformation of
`25
`
`
`
`
`
`
`
`
`
`Equation (1) and subsamples the input image 102 to reduce
`redundant information.
`
`
`
`
`
`
`
`
`
`
`
`Once block 110 converts the input image 102 to YUV
`
`
`
`
`
`
`
`
`
`color space and subsamples the U and V planes, the prior art
`
`
`
`
`
`
`
`
`JPEG system of FIG. 1 treats the resulting three image
`
`
`
`
`
`
`
`
`planes (Y, U, and V) independently and codes them as three
`
`
`
`
`
`separate 1-channel images. Subsampling of U and V values
`
`
`
`
`
`
`
`reduces the amount of computation performed here as well.
`
`
`
`
`
`
`
`
`
`For each of the resulting YUV image planes, block 112 of
`
`
`
`
`
`
`FIG. 1 segments the image output by color-space conversion
`and subsampling block 110 into fixed-size tiles, or blocks. In
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`JPEG compression, the image is divided into blocks of 8x8
`pixels for a number of reasons. Many transforms have
`
`
`
`
`
`
`
`
`
`
`
`
`non-linear, computational complexity that is alleviated by
`
`
`
`
`
`
`
`
`small block sizes. For example, the computational complex-
`
`
`
`
`
`
`
`ity of a Discrete Cosine Transform (DCT), described below,
`
`
`
`
`
`
`
`is O(nlog(n)). Therefore,
`transforming small, fixed-sized
`
`
`
`
`
`
`
`blocks allows the overall compression algorithm to remain
`
`
`
`
`
`
`
`
`approximately linear in image size. The relatively small
`
`
`
`
`
`
`
`
`blocks localize compression artifacts in an image, i.e., the
`45
`artifacts from a block that is particularly difficult to com-
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`press do not ripple throughout the image. Finally, small,
`
`
`
`
`
`
`
`fixed block sizes facilitate easier, hardwired optimization.
`
`
`
`
`
`
`
`
`Once the image is segmented into 8x8 blocks, a spatial
`
`
`
`
`
`
`
`
`transform is performed on each block. In the prior art JPEG
`
`
`
`
`
`
`
`system of FIG. 1, block 116 performs a Discrete Cosine
`
`
`
`
`
`
`
`
`Transform on each block of the three image planes provided
`by block 112. The DCT of block 114 is lossless resulting in
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`64 frequency values for each block. The first value produced
`
`
`
`
`
`
`
`
`by block 114 is a DC term 116 that is essentially the average
`55
`YUV value of an 8x8 block. The remaining values are AC
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`terms 118 that represent edges in the x- and y-directions. The
`
`
`
`
`
`
`
`
`transform “sorts” the block into detail components. Eight-
`
`
`
`
`
`
`
`
`by-eight blocks of an image plane that are relatively smooth
`
`
`
`
`
`
`
`
`
`
`have large values for the DC term 116 and lower frequency
`
`
`
`
`
`
`
`
`
`AC terms 118 and relatively little energy in the higher
`
`
`
`
`
`
`
`
`frequency AC terms 118. Blocks with strong vertical detail
`
`
`
`
`
`
`
`have considerable energy in the horizontal frequencies and
`
`
`
`
`
`comparatively little in the vertical.
`Once block 114 produces DC term 116 and AC terms 118,
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`DPCM quantization block 120 and scalar quantization block
`
`
`
`
`
`
`
`
`
`122 quantize the resulting frequency terms 116 and 118,
`
`
`
`
`
`
`respectively. The DC term 116 is processed separately. It is
`
`30
`
`35
`
`40
`
`50
`
`60
`
`65
`
`Page 11 of18
`
`6,031,937
`
`
`
`
`4
`
`
`
`
`
`
`
`
`
`not quantized directly, but rather its difference from the DC
`
`
`
`
`
`
`
`
`term of the previous block is quantized by block 120 using
`
`
`
`
`
`
`Differential Pulse Code Modulation coding, or DPCM. In
`
`
`
`
`
`
`
`Block Transform Coding, differential pulse code modulation
`
`
`
`
`
`
`
`of the DC term 116 takes advantage of block-to-block color
`
`
`
`
`
`
`
`correlations and maintains higher precision for the DC term
`
`
`
`
`
`
`
`
`116. The low frequencies of AC terms 118 are quantized
`
`
`
`
`
`
`
`
`
`finely by block 122, since much of the image energy is
`
`
`
`
`
`
`
`contained there, and the higher frequencies of AC terms 118
`
`
`
`
`
`
`
`
`
`are quantized more coarsely by block 122 using scalar
`
`quantization.
`
`
`
`
`
`
`
`
`In JPEG, variable-length coding block 124 encodes the
`
`
`
`
`
`
`
`entropy of DC term 116 and AC terms 118 after quantization
`
`
`
`
`
`
`
`
`
`by blocks 120 and 122, respectively. The quantized DCT
`
`
`
`
`
`
`
`
`coefficients 116 and 118 are losslessly compressed using a
`
`
`
`
`
`variable-length, Huffman-like code. The quantized DC term
`
`
`
`
`
`
`
`
`
`116 is coded individually with a code that is short for small
`
`
`
`
`
`
`
`
`differences and longer for large differences between block
`
`
`
`
`
`
`
`
`
`values. The sixty-three AC terms 118 are coded into a
`
`
`
`
`
`
`
`continuous bitstream, scanned in zig-zag order, with special
`
`
`
`
`
`
`
`
`
`run-length codes referring to runs of zero. The special
`
`
`
`
`
`
`treatment of zero-valued AC codes 118 is important because
`
`
`
`
`
`
`
`
`
`little of the image energy is located in the higher frequency
`
`
`
`
`
`
`
`
`
`terms of the DCT performed by block 114, and thus there is
`
`
`
`
`
`
`
`
`a high probability that many of the high frequency AC terms
`118 are zero.
`
`
`
`
`
`
`
`
`
`
`The prior art JPEG compression has several disadvan-
`
`
`
`
`
`
`
`tages. While the JPEG techniques provides high compres-
`
`
`
`
`
`
`
`
`sion ratios for still-images,
`is not suitable for many
`it
`
`
`
`
`
`real-time software-based video applications. JPEG is not
`
`
`
`
`
`
`
`capable of providing 320x240><24 fps (or 1.8 Mps) using
`
`
`
`
`
`generally available PC’s due to the computational complex-
`
`
`
`
`
`
`ity. Because JPEG is a still-image standard, it cannot provide
`
`
`
`
`
`
`video rate compression with moderate compression using
`
`
`
`
`
`software. Instead, special hardware is required to provide
`
`
`
`
`
`
`
`
`JPEG compression at video rates that can support the above
`
`
`
`
`
`
`
`rate of 1.8 Mps. This is due to the computational complexity
`of performing a Discrete Cosine Transform on an 8x8 block.
`
`
`
`
`
`
`
`
`
`
`
`MPEG compression provides video compression. While
`MPEG has same basic format as JPEG, it is an asymmetric
`
`
`
`
`
`
`
`
`
`
`
`
`compression method using special hardware that requires
`
`
`
`
`
`significantly greater compression time than decompression
`time, and is therefore unsuitable for providing real-time,
`
`
`
`
`
`
`
`
`
`
`
`symmetric video compression and decompression.
`SUMMARY OF THE INVENTION
`
`
`
`
`
`
`
`
`
`
`The present invention provides a method and apparatus
`
`
`
`
`
`
`for symmetrically compressing and decompressing video
`information in real time by coupling block and wavelet
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`techniques. The present invention performs a wavelet trans-
`form on small blocks of an image and encodes the wavelet
`
`
`
`
`
`
`
`
`transformed blocks. The preferred embodiment of the
`
`
`
`
`
`
`present invention utilizes a block-oriented Haar wavelet
`
`
`
`
`
`
`transform on 2-by-2 pixel blocks and is useful in a wide
`
`
`
`
`
`
`
`
`
`
`
`
`variety of video coding applications.
`
`
`
`
`
`
`
`
`In the compression pipeline, the image is divided into a
`
`
`
`
`
`
`
`
`plurality of blocks, where each block of pixels comprises
`
`
`
`
`
`
`
`
`2k><2k pixels. In the preferred embodiment of the present
`
`
`
`
`
`
`
`
`invention, k is equal to one. The average color of each block
`
`
`
`
`
`
`
`of the plurality of blocks is computed. The present invention
`
`
`
`
`
`
`computes an average luminance of each block dependent on
`the average color of each block and a differential luminance
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`of each pixel of the plurality of pixels of each block. A first
`
`
`
`
`
`
`
`
`plurality of frequency details of each block are determined
`
`
`
`
`
`
`
`
`by Haar transforming the differential luminance of each
`
`
`
`
`
`
`
`
`
`
`
`pixel of the plurality of pixels of each block. The first
`
`
`
`
`
`
`
`plurality of frequency details comprises an average term, a
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Page 11 of 18
`
`
`
`6,031,937
`
`5
`
`
`
`
`
`
`
`
`
`horizontal term, a vertical term, and a diagonal term. The
`
`
`
`
`
`
`
`invention computes an average color difference
`present
`
`
`
`
`
`
`
`
`between each block and the block that immediately precedes
`
`
`
`
`
`
`
`
`
`
`it, and then quantizes the average color difference and the
`
`
`
`
`
`
`
`
`first plurality of frequency details. The average color differ-
`
`
`
`
`
`
`
`
`ence and the first plurality of frequency details are quantized
`
`
`
`
`
`
`using Lloyd-Max quantization, which is dependent on a
`variance and a number of reconstruction levels.
`In an
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`alternate embodiment of the present invention, skip codes
`
`
`
`
`
`
`
`are generated when the quantized average color difference
`
`
`
`
`
`
`
`
`
`
`and the second plurality of frequency details of the block
`
`
`
`
`
`
`match those of the corresponding block in a previous frame.
`
`
`
`
`
`
`
`
`The quantized average color difference and a second plu-
`
`
`
`
`
`
`
`
`rality of frequency details are encoded using variable length
`
`
`
`
`
`
`
`
`codes; the second plurality of frequency details is less than
`
`
`
`
`
`
`
`
`
`or equal to the first plurality of frequency details. The second
`
`
`
`
`
`
`
`
`plurality of frequency details comprises the horizontal term
`
`
`
`
`
`
`
`
`
`and the vertical term. In the preferred embodiment of the
`
`
`
`
`
`
`
`
`present
`invention,
`the quantized average color and the
`
`
`
`
`
`
`
`
`second plurality of frequency details are encoded using
`
`
`Huffman coding.
`
`
`
`
`
`
`
`The present invention employs lookup tables to decom-
`
`
`
`
`
`
`
`
`press video information and to format output pixels. The
`
`
`
`
`
`
`
`output of the compression pipeline containing variable
`
`
`
`
`
`
`
`
`length codes is first decoded into fixed-length codes. The
`
`
`
`
`
`
`
`
`fixed-length codes are then decoded into five device-
`
`
`
`
`
`
`
`independent components that represent a 2x2 block using a
`
`
`
`
`
`
`
`
`
`first lookup table. The five components hCode, vCode, and
`
`
`
`
`
`
`
`
`
`a set of three compVals (RGB, described below) are pro-
`
`
`
`
`
`
`
`vided as indices to a second lookup table containing pre-
`
`
`
`
`
`
`computed values of R, G, and B components. The R, G, and
`
`
`
`
`
`
`
`
`B components of the second lookup table include precom-
`
`
`
`
`
`
`
`puted display dependent formatting to produce the output
`
`
`
`
`
`
`image. In an alternate embodiment, skip codes contained in
`the output of the variable length decoder are decoded. Thus,
`
`
`
`
`
`
`
`
`the operations of reconstruction, inverse Haar transform,
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`clamping, and dithering are reduced to a few table lookups.
`
`
`
`
`
`
`
`
`The per-pixel operation count is only 5—6 operations per
`
`pixel.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`
`
`
`
`
`
`
`
`FIG. 1 is a diagram illustrating a prior system implement-
`
`
`
`ing JPEG compression;
`
`
`
`
`
`FIG. 2 is a diagram illustrating the compression pipeline
`
`
`
`
`of the present invention;
`
`
`
`
`FIG. 3 is a diagram illustrating a Laplacian distribution
`
`
`
`
`according to the present invention;
`
`
`
`
`
`FIG. 4 is a diagram illustrating the seven components
`
`
`
`
`
`
`
`representing a 2x2 block that are produced by blocks 214
`
`
`
`
`
`
`
`and 216 of FIG. 2 in the compression pipeline of the present
`invention;
`
`
`
`
`
`
`FIG. 5 is a diagram illustrating the six components
`
`
`
`
`
`
`
`representing a 2x2 block that are produced by blocks 214,
`
`
`
`
`
`
`
`
`216, and 218 in the compression pipeline of the present
`invention;
`
`
`
`
`
`
`
`FIG. 6 is a diagram illustrating the five components
`
`
`
`
`
`
`
`
`representing a 2x2 block that are produced by blocks 214,
`
`
`
`
`
`
`
`
`
`
`216, 218, 220 and 222 in the compression pipeline of the
`
`
`present invention;
`
`
`
`
`
`FIG. 7 is a diagram illustrating four microwavelet blocks
`
`
`
`
`
`
`
`of an image frame including a skip code; and
`
`
`
`
`
`FIG. 8 is a diagram illustrating the decompression pipe-
`
`
`
`
`
`line of the present invention.
`DETAILED DESCRIPTION OF THE
`
`
`INVENTION
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`The present invention provides a method and apparatus
`
`
`
`
`
`
`for compressing video information using microwavelets. In
`
`
`
`Page 12 of 18
`
`
`
`
`6
`
`
`
`
`
`
`
`the following description, numerous specific details, such as
`
`
`
`
`
`
`
`
`
`block sizes, color spaces, etc., are described in detail to
`
`
`
`
`
`
`
`provide a more thorough description of this invention. It will
`
`
`
`
`
`
`
`
`
`
`be apparent, however,
`to one skilled in the art, that the
`
`
`
`
`
`
`
`invention may be practiced without these specific details. In
`other
`instances, well known features have not been
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`described in detail so as not to unnecessarily obscure the
`
`
`present invention.
`
`
`
`
`
`
`The present
`invention symmetrically compresses and
`
`
`
`
`
`decompresses video information in real-time by effectively
`
`
`
`
`
`
`
`coupling block techniques with wavelet techniques. The
`
`
`
`
`
`
`
`present invention performs a wavelet transform on small
`blocks of an image and encodes the wavelet transformed
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`blocks in a highly efficient manner. Thus, the present inven-
`
`
`
`
`
`tion is a real-time, symmetric compressor/decompressor
`scheme that utilizes a block-oriented Haar wavelet transform
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`on 2-by-2 pixel blocks, in the preferred embodiment, which
`
`
`
`
`
`
`provides desired performance and compression ratios.
`
`
`
`
`
`
`The video compression scheme of the present invention is
`
`
`
`
`
`
`
`a high performance, moderate bit-rate, video compression
`
`
`
`
`
`
`
`
`technique that offers significant advantages over prior art
`
`
`
`
`
`
`software compression technologies and is useful in a wide
`
`
`
`
`
`
`
`
`variety of video coding applications. Unlike many other
`
`
`
`
`
`
`
`prior art video-compression technologies that are software-
`
`
`
`
`
`
`
`
`based, the compressed video of the present invention can be
`
`
`
`
`
`
`
`compressed and decompressed in real time using commonly
`
`
`
`
`
`
`available processing means used in personal computers
`
`(PC).
`
`
`
`
`
`
`The present invention provides symmetrical compression
`
`
`
`
`
`
`
`
`and decompression that are on the same order of magnitude
`
`
`
`
`
`
`in computational complexity with modest compression
`
`
`
`
`
`
`
`
`
`rates. It provides compression ratios of 1.5—2.5 bits per
`
`
`
`
`
`
`
`
`
`pixel. Further, the present invention plays back video infor-
`
`
`
`
`
`
`
`
`
`mation at 320x240><24 fps (or 1.8 Mps) using PC’s and
`
`
`
`
`
`
`
`
`provides high quality video. The advantages of the present
`
`
`
`
`
`
`
`invention make it suitable for a wide range of applications.
`
`
`
`
`
`
`Since the technique is symmetrical, applications such as
`
`
`
`
`
`
`
`teleconferencing are enabled. Further, it provides the advan-
`
`
`
`
`
`
`
`
`tages of asymmetric software approaches with respect to
`
`decompression.
`
`
`
`
`
`BTC Approach of the Present Invention
`
`
`
`
`
`
`
`The basic approach of the present invention is to provide
`
`
`
`
`
`
`
`
`an improved coding approach based on the Block Transform
`
`
`
`
`
`
`
`Coding so that real-time software compression and decom-
`
`
`
`
`
`
`
`
`pression are feasible. To meet performance goals, the present
`
`
`
`
`
`
`
`
`invention processes each pixel of an image using less than
`
`
`
`
`
`
`
`
`
`20 operations per pixel.
`In order to provide real-time
`
`
`
`
`compression/decompression, a YUV transform as taught in
`
`
`
`
`
`
`
`
`
`
`
`the prior art is not performed. For playback, the prior art
`
`
`
`
`
`
`YUV-to-RGB conversion requires five multiplications and
`
`
`
`
`
`
`
`four additions, not including output formatting (dithering) as
`well as memory loads and stores. Thus, the conversion uses
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`essentially half of the computational budget of 20 operations
`
`
`per pixel.
`
`
`
`
`
`Another consideration affecting decoding time is output
`
`
`
`
`
`
`
`formatting. The target playback platforms may have various
`
`
`
`
`
`
`
`display formats: 24-bit RGB, 15-bit RGB, 8-bit grayscale,
`
`
`
`
`
`
`
`etc. For example, a common color display used is 12-bit
`
`
`
`
`
`
`
`RGB. To provide suitable image quality, the present inven-
`
`
`
`
`
`
`
`tion dithers the device independent compressed video infor-
`
`
`
`
`
`
`mation. That is, the compressed data of the present invention
`
`
`
`
`
`
`
`is not dependent on a particular displ