`Grafagnino
`
`54 METHOD AND APPARATUS FOR VIDEO
`COMPRESSION USING BLOCK AND
`WAVELETTECHNIQUES
`75 Inventor: Peter N. Graffagnino, San Francisco,
`Calif.
`
`73 Assignee: NeXT Software, Inc., Redwood City,
`Calif.
`
`21 Appl. No.: 08/247,006
`22 Filed:
`May 19, 1994
`51
`Int. Cl." ................................ G06K9/36; H04N 7/12
`52 U.S. Cl. ............................................. 382/236; 348/396
`58 Field of Search ..................................... 382/232-253,
`382/162, 166,219–220, 221, 272; 348/384-405;
`358/426-433, 261.1, 341/67
`
`56)
`
`References Cited
`
`PUBLICATIONS
`
`Adobe Systems, Inc., “JPEG Technical Specification, Revi
`sion 9” May 4, 1991, PostScript(R Developer Support
`Group.
`
`Primary Examiner Jon Chang
`Assistant Examiner Jayanti K. Patel
`Attorney, Agent, or Firm-Hecker & Harriman
`
`USOO603.1937A
`Patent Number:
`11
`(45) Date of Patent:
`
`6,031,937
`Feb. 29, 2000
`
`ABSTRACT
`57
`A method and apparatus are disclosed for Symmetrically
`compressing and decompressing Video information in real
`time by coupling block and wavelet techniques. In the
`compression pipeline, the image is divided into blockS
`comprising 2x2 pixels (in the preferred embodiment, k=1).
`The average color of each block is computed. The System
`computes an average luminance for each block and differ
`ential luminances of each pixel of the plurality of pixels of
`each block. A first plurality of frequency details of each
`block are determined by Haar transforming the differential
`luminances. The System computes an average color differ
`ence between each block and the preceding block, and
`quantizes the average color difference and the first plurality
`of frequency details using Lloyd-Max quantization. In an
`alternate embodiment, Skip codes are generated for blockS
`having the same quantized average color difference and
`Second plurality of frequency details. The quantized average
`color difference and a Second plurality of frequency details
`are encoded using variable length codes. The System
`employs lookup tables to decompress the compressed image
`and to format output pixels. The output of the compression
`pipeline containing variable length codes is decoded into
`fixed-length codes, which are then decoded using a first
`lookup table into three device-independent components that
`represent each block. The three components indeX a Second
`lookup table containing precomputed RGB values that
`include precomputed display dependent formatting to pro
`duce the output image. In the alternate embodiment, skip
`codes contained in the output of the variable length decoder
`are decoded.
`
`46 Claims, 8 Drawing Sheets
`
`
`
`212
`
`214
`
`216
`
`218
`
`DIVIDE IMAGE
`NTO 2AKX2NK
`BLOCKS
`
`COMPUTE
`AVERAGE
`RGBWALUE
`
`COMPUTE
`AVERAGEY
`ANDAY
`VALUES
`
`HAAR
`TRANSFORM
`OFAY
`VALUES
`
`COMPUTE
`AVERAGERGB
`DIFFERENCES
`
`OUANTIZE
`USING
`LLOYD-MAX
`
`WARIABLE
`LENGTH
`CODING
`
`IPR2018-01413
`Sony EX1007 Page 1
`
`
`
`U.S. Patent
`US. Patent
`
`our2;
`
`
`
`
`
`
`
`
`
`
`
`
`eF
`
`2
`
`
`0mm?m.20%025:
`9,vszéo
`b.@
`
`0x35298
`
`
`
`5&852E;Emmoma:05ESE206E248
`
`IwwwwEEmz<E38$302czzwfi/«mmam
`
`mzazoomzsmmug.
`
`%
`
`f
`
`6,031,937
`6,031,937
`
`mSEDSE
`tEMNEézoa$28
`
`8$Po:
`
`._.m<mOEm
`
`NoVJ
`
`.5n_z_
`
`mug):
`
`|PR2018—O1413
`
`Sony EX1007 Page 2
`
`IPR2018-01413
`Sony EX1007 Page 2
`
`
`
`
`U.S. Patent
`
`6,031,937
`
`SETTWA
`
`SETTWA
`
`
`
`AV HOET TWA 85)}-SX|OOTE
`
`CN
`CD
`L
`
`
`
`
`
`IPR2018-01413
`Sony EX1007 Page 3
`
`
`
`U.S. Patent
`
`Feb. 29, 2000
`
`Sheet 3 of 8
`
`6,031,937
`
`
`
`LAPLACIAN
`DISTRIBUTION
`
`M
`B539 N
`
`V RECONSTRUCTION
`
`LEVELS
`
`FIG. 3
`
`IPR2018-01413
`Sony EX1007 Page 4
`
`
`
`U.S. Patent
`
`Feb. 29, 2000
`
`Sheet 4 of 8
`
`6,031,937
`
`|
`
`
`
`s
`
`s
`<
`
`g?
`<
`
`s
`<1
`
`S? t
`
`r
`
`s
`
`N
`-
`E
`E ><
`E-9 Ol
`- E-1
`
`n
`CD
`C
`3
`
`S.
`
`S?
`
`w
`
`
`
`
`
`O
`-
`-
`
`E- co -- -
`>g
`
`W
`
`"
`
`IPR2018-01413
`Sony EX1007 Page 5
`
`
`
`US. Patent
`
`Feb. 29, 2000
`
`Sheet 5 0f8
`
`6,031,937
`
` FIG.5
`
`
` x
`
`E
`N
`I
`
`g
`(‘0
`:1:
`
`‘o
`<2-
`:I:
`
`’2'»>
`<6
`m
`E»>
`(U
`(D
`
`a
`m
`V
`
`T—
`L0
`
`53
`
`S.
`Lo
`
`1..
`Lo
`
`:0
`g
`:
`go
`%%%%%
`
`
`lllllllllllllllllll‘
`
`
`
`
`
`IPR2018-01413
`Sony EX1007 Page 6
`
`
`
`US. Patent
`
`Feb. 29, 2000
`
`Sheet 6 0f8
`
`6,031,937
`
`
`
`
`
`
`
`
`
`
`
` FIG.6
`
`
`’71)>
`(U
`no
`<1(1)>
`(U
`
`o<E
`
`D
`(>6
`0:
`51,
`
`cu:
`I
`
`03>
`I
`
`I E
`
`I
`
`:5
`
`1—
`
`Lo
`
`E
`
`1—
`
`Ln
`
`S!
`
`co
`
`N
`
`""
`
`%%%%
`
`lllllllllllllllllll“
`
`
`
`IPR2018-01413
`Sony EX1007 Page 7
`
`
`
`U.S. Patent
`
`Feb. 29, 2000
`
`Sheet 7 of 8
`
`6,031,937
`
`
`
`:
`
`O
`Crd
`N
`l
`O
`O
`
`N1
`f
`
`S
`
`s
`
`IPR2018-01413
`Sony EX1007 Page 8
`
`
`
`6,031,937
`6,031,937
`
`U.S. Patent
`US. Patent
`
`b.eF
`
`2
`
`00f000whS
`
`w.OE
`
`._.m._m><>>0mo__2mEm>zoo
`
`
`
`,0%m).0mm
`
`mhzszmEOo
`
`mom9.
`
`
`
`._._m-m_vmoOOmo
`
`OHZ_mooo
`
`._.m_.m_><>>omo_§
`
`wHZmZOmEOO
`
`omw
`
`12m
`
`mooo
`
`$835283_
`
`
`
`m4m<_m<>
`
`
`
`moooIhosz
`
`m0mmm¢m§0©mo
`
`05
`
`m m
`
`m
`
`
`
`|PR2018—O1413
`
`Sony EX1007 Page 9
`
`IPR2018-01413
`Sony EX1007 Page 9
`
`
`
`
`
`
`
`6,031,937
`
`1
`METHOD AND APPARATUS FOR VIDEO
`COMPRESSION USING BLOCK AND
`WAVELETTECHNIQUES
`
`BACKGROUND OF THE INVENTION
`
`2
`MIPS per megapixels/Sec, i.e., eSSentially instructions/pixel.
`For example, an algorithm just capable of compressing
`320x240 pixels per frame at 30 frames per second on a 40
`MIPS machine has a computational complexity of 40,000,
`000/(320x240x30)s 17 instructions/pixel.
`Symmetry refers to the ratio of the computational com
`plexity of compression to that of decompression. Codecs
`are frequently designed with a greater computational load on
`the compressor than the decompressor, i.e., they are asym
`metric. While this may be a reasonable strategy for “create
`once, play-many video Sequences, it limits the range of
`applications for the codecs. ASymmetric compression tech
`niques are not Suitable for teleconferencing, for example,
`Since teleconferencing requires essentially real-time pro
`cessing and Substantially equivalent compression and
`decompression rates.
`Block Transform Coding Example (JPEG)
`In the prior art, a class of image compressors called Block
`Transform Coding (BTC) is used. This is a fundamentally
`Symmetric, image-compression technique that is used in
`(MPEG) and (JPEG) compression algorithms. In BTC, an
`image is divided into Small blocks, the blocks are trans
`formed using an invertible, two dimensional (2-D) math
`ematical transform, the transformed image is quantized, and
`the quantized result is losslessly compressed. This process
`forms the core of JPEG and MPEG compression, which use
`8x8 blocks and a Discrete Cosine Transform (DCT) to
`perform the 2-D transform.
`FIG. 1 is a diagram illustrating computational blocks of a
`prior art System for performing JPEG still-image, compres
`Sion. Input image 102 is provided to the color-Space con
`version and subsampling block 110. The output of the
`color-Space conversion and Subsampling block 110 is pro
`vided to block 112 for dividing each image plane into 8x8
`blocks. The output of block 114 is provided to the Discrete
`Cosine Transform block 114. Block 114 provides DC terms
`116 to quantization block 120, which quantizes the DC terms
`116 using differential pulse code modulation (DPCM).
`Block 114 provides AC terms 118 to block 122, which scalar
`quantizes the AC terms 118 by frequency. The outputs of
`blocks 120 and 122 are provided to the Huffman block 124,
`which compresses the quantized values using variable length
`codes to provide output 126.
`Digital images 102 are typically stored in an RGB format,
`where each pixel is represented as a tuple of red (R), green
`(G), and blue (B) samples. While RGB format is suited
`towards most digital color input and output devices, it is not
`particularly efficient for the human visual System, or natural
`Scenes. For example, in natural Scenes the R, G, and B
`components of colors are highly correlated because most
`natural colors are very close to shades of gray, where
`R=G=B (i.e., saturated colors are rare). In other words, with
`respect to information coding, the correlation between RGB
`Signals means that there is redundant information Stored in
`the R, G, and B channels. To account for this redundant
`information, color-space conversion and Subsampling block
`110 transforms the colors of input image 102 into a color
`Space with an explicit brightness, or luminance, dimension
`prior to compression. More bits are typically used to pre
`cisely specify the brightness while relatively fewer bits are
`used to specify the chrominance.
`Broadcast television (TV) uses YUV color space to better
`utilize the bandwidth of TV's. The YUV color space is
`essentially a rotation of the RGB basis vectors so that the
`luminance axis (Y) of YUV color space is aligned with the
`
`1. Field of the Invention
`The present invention relates to the field of data compres
`Sion.
`2. Background Art
`Compression is a Scheme for reducing the amount of
`information required to represent data. Data compression
`Schemes are used, for example, to reduce the Size of a data
`file So that it can be Stored in a Smaller memory Space. Data
`compression may also be used to compress data prior to its
`transmission from one site to another, reducing the amount
`of time required to transmit the data. To access the com
`pressed data, it is first decompressed into its original form.
`A compressor/decompressor (codec) is typically used to
`perform the compression and decompression of data. One
`measure of the performance or efficiency of a codec is its
`“compression ratio'. Compression ratio refers to the ratio of
`number of bits of uncompressed data to the number of bits
`of compressed data. Compression ratioS may be 2:1, 3:1, 4:1
`etc.
`Data compression may also be required when the input/
`output rate of a particular data receiver is less than the data
`rate of the transmitted data. This can occur when providing
`Video data to computer Systems. Video data of frame size
`320x240 is provided at rates approaching 7 megabytes per
`Second. This rate is greater than the rate of commonly used
`I/O Subsystems of personal computers. Some representative
`rates of common I/O Subsystems found on personal com
`puters (PC) are:
`
`15
`
`25
`
`35
`
`Serial Communications
`ISDN
`Ethernet?CD-ROM
`SCSI Disk
`
`1-2 kilobytes/sec;
`8-16 kilobytes/sec;
`150-300 kilobytes/sec;
`0.5–2 megabytes/sec.
`
`Another measure of Video codec compression ratio is the
`average compressed bits-per-pixel. This measure is useful in
`describing Video compression because different conventions
`are used for calculating the Size of uncompressed Video, i.e.,
`Some use 24 bits-per-pixel RGB and others use 4:2:2 Sub
`Sampled YUV (16-bits per pixel). The averaging accounts
`for potentially different Strategies employed for frames in a
`Sequence. The bandwidth requirements for a Sequence of
`frames is calculated by multiplying the average compressed
`bits-per-pixel and the number of frames per Second, and
`dividing the resulting product by the number of pixels in
`each encoded frame.
`Nearly all video compression techniques are lossy, i.e.,
`information is inevitably discarded in the compression pro
`ceSS. A measure of quality is how much this information is
`noticed by a human observer. However, there is not a
`consistent, objective model of human perception that can be
`applied. A simple, concrete, quality metric that is frequently
`used is the Mean-Squared-Error (MSE) that measures the
`error on a per-pixel basis from the uncompressed original.
`Most compression algorithms are computationally
`complex, which limit their application Since very complex
`algorithms often require expensive hardware to assist in the
`compression. A useful number to measure computational
`complexity of Software-based compression algorithms is
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`IPR2018-01413
`Sony EX1007 Page 10
`
`
`
`6,031,937
`
`3
`gray diagonal of RGB color space, which extends from RGB
`coordinates (0, 0, 0) to (1, 1, 1). The transformation for
`converting RGB color values to YUV space is expressed by
`Equation (1):
`
`0.061, R
`0.315
`0.161
`Y
`0.234
`U --O.O79 -0.155
`V
`0.330 -0.227 -0.053 B
`
`(1)
`
`4
`not quantized directly, but rather its difference from the DC
`term of the previous block is quantized by block 120 using
`Differential Pulse Code Modulation coding, or DPCM. In
`Block Transform Coding, differential pulse code modulation
`of the DC term 116 takes advantage of block-to-block color
`correlations and maintains higher precision for the DC term
`116. The low frequencies of AC terms 118 are quantized
`finely by block 122, Since much of the image energy is
`contained there, and the higher frequencies of AC terms 118
`are quantized more coarsely by block 122 using Scalar
`quantization.
`In JPEG, variable-length coding block 124 encodes the
`entropy of DC term 116 and AC terms 118 after quantization
`by blocks 120 and 122, respectively. The quantized DCT
`coefficients 116 and 118 are losslessly compressed using a
`variable-length, Huffman-like code. The quantized DC term
`116 is coded individually with a code that is short for small
`differences and longer for large differences between block
`values. The sixty-three AC terms 118 are coded into a
`continuous bitstream, Scanned in Zig-Zag order, with Special
`run-length codes referring to runs of Zero. The Special
`treatment of zero-valued AC codes 118 is important because
`little of the image energy is located in the higher frequency
`terms of the DCT performed by block 114, and thus there is
`a high probability that many of the high frequency AC terms
`118 are zero.
`The prior art JPEG compression has several disadvan
`tages. While the JPEG techniques provides high compres
`Sion ratioS for Still-images, it is not Suitable for many
`real-time software-based video applications. JPEG is not
`capable of providing 320x240x24 fps (or 1.8 Mps) using
`generally available PCS due to the computational complex
`ity. Because JPEG is a still-image Standard, it cannot provide
`Video rate compression with moderate compression using
`Software. Instead, Special hardware is required to provide
`JPEG compression at Video rates that can Support the above
`rate of 1.8 Mps. This is due to the computational complexity
`of performing a Discrete Cosine Transform on an 8x8 block.
`MPEG compression provides video compression. While
`MPEG has same basic format as JPEG, it is an asymmetric
`compression method using Special hardware that requires
`Significantly greater compression time than decompression
`time, and is therefore unsuitable for providing real-time,
`Symmetric Video compression and decompression.
`SUMMARY OF THE INVENTION
`The present invention provides a method and apparatus
`for Symmetrically compressing and decompressing video
`information in real time by coupling block and wavelet
`techniques. The present invention performs a wavelet trans
`form on Small blocks of an image and encodes the wavelet
`transformed blocks. The preferred embodiment of the
`present invention utilizes a block-oriented Haar wavelet
`transform on 2-by-2 pixel blocks and is useful in a wide
`variety of Video coding applications.
`In the compression pipeline, the image is divided into a
`plurality of blocks, where each block of pixels comprises
`2x2 pixels. In the preferred embodiment of the present
`invention, k is equal to one. The average color of each block
`of the plurality of blocks is computed. The present invention
`computes an average luminance of each block dependent on
`the average color of each block and a differential luminance
`of each pixel of the plurality of pixels of each block. A first
`plurality of frequency details of each block are determined
`by Haar transforming the differential luminance of each
`pixel of the plurality of pixels of each block. The first
`plurality of frequency details comprises an average term, a
`
`15
`
`25
`
`35
`
`40
`
`50
`
`Reduction of redundant information can be achieved using
`the YUV color-Space representation obtained using Equation
`(1). The human eye is much less Sensitive to spatial detail in
`the U and V channels than it is in the Y channel because
`receptors in the eye for brightness (Y) are more numerous
`than those for chrominance (U,V). Using this fact, the U and
`V components can be sampled at a lower resolution. In JPEG
`compression, the U and V components are frequently Sub
`sampled by a factor of 2 in both X- and y-directions. For
`example, four Y Samples and one Sample each of U and V
`are produced for each 2x2 block of an input image. For 8-bit
`Samples per channel, this effectively produces a 2:1 com
`pression factor. Thus, color-space conversion and SubSam
`pling block 110 converts an input image 102 from RGB
`color Space to YUV color Space using the transformation of
`Equation (1) and Subsamples the input image 102 to reduce
`redundant information.
`Once block 110 converts the input image 102 to YUV
`color Space and Subsamples the U and V planes, the prior art
`JPEG system of FIG. 1 treats the resulting three image
`planes (Y, U, and V) independently and codes them as three
`Separate 1-channel images. Subsampling of U and V values
`reduces the amount of computation performed here as well.
`For each of the resulting YUV image planes, block 112 of
`FIG. 1 Segments the image output by color-Space conversion
`and subsampling block 110 into fixed-size tiles, or blocks. In
`JPEG compression, the image is divided into blocks of 8x8
`pixels for a number of reasons. Many transforms have
`non-linear, computational complexity that is alleviated by
`Small block sizes. For example, the computational complex
`ity of a Discrete Cosine Transform (DCT), described below,
`is O(n log(n)). Therefore, transforming Small, fixed-sized
`blocks allows the overall compression algorithm to remain
`approximately linear in image size. The relatively Small
`blockS localize compression artifacts in an image, i.e., the
`artifacts from a block that is particularly difficult to com
`45
`preSS do not ripple throughout the image. Finally, Small,
`fixed block sizes facilitate easier, hardwired optimization.
`Once the image is Segmented into 8x8 blocks, a Spatial
`transform is performed on each block. In the prior art JPEG
`system of FIG. 1, block 116 performs a Discrete Cosine
`Transform on each block of the three image planes provided
`by block 112. The DCT of block 114 is lossless resulting in
`64 frequency values for each block. The first value produced
`by block 114 is a DC term 116 that is essentially the average
`YUV value of an 8x8 block. The remaining values are AC
`55
`terms 118 that represent edges in the X- and y-directions. The
`transform "sorts” the block into detail components. Eight
`by-eight blocks of an image plane that are relatively Smooth
`have large values for the DC term 116 and lower frequency
`AC terms 118 and relatively little energy in the higher
`frequency AC terms 118. Blocks with strong vertical detail
`have considerable energy in the horizontal frequencies and
`comparatively little in the vertical.
`Once block 114 produces DC term 116 and AC terms 118,
`DPCM quantization block 120 and scalar quantization block
`122 quantize the resulting frequency terms 116 and 118,
`respectively. The DC term 116 is processed separately. It is
`
`60
`
`65
`
`IPR2018-01413
`Sony EX1007 Page 11
`
`
`
`S
`horizontal term, a vertical term, and a diagonal term. The
`present invention computes an average color difference
`between each block and the block that immediately precedes
`it, and then quantizes the average color difference and the
`first plurality of frequency details. The average color differ
`ence and the first plurality of frequency details are quantized
`using Lloyd-Max quantization, which is dependent on a
`variance and a number of reconstruction levels. In an
`alternate embodiment of the present invention, Skip codes
`are generated when the quantized average color difference
`and the Second plurality of frequency details of the block
`match those of the corresponding block in a previous frame.
`The quantized average color difference and a Second plu
`rality of frequency details are encoded using variable length
`codes; the Second plurality of frequency details is less than
`or equal to the first plurality of frequency details. The Second
`plurality of frequency details comprises the horizontal term
`and the vertical term. In the preferred embodiment of the
`present invention, the quantized average color and the
`Second plurality of frequency details are encoded using
`Huffman coding.
`The present invention employs lookup tables to decom
`preSS Video information and to format output pixels. The
`output of the compression pipeline containing variable
`length codes is first decoded into fixed-length codes. The
`fixed-length codes are then decoded into five device
`independent components that represent a 2x2 block using a
`first lookup table. The five components hCode, vCode, and
`a set of three compVals (RGB, described below) are pro
`Vided as indices to a Second lookup table containing pre
`computed values of R, G, and B components. The R, G, and
`B components of the Second lookup table include precom
`puted display dependent formatting to produce the output
`image. In an alternate embodiment, Skip codes contained in
`the output of the variable length decoder are decoded. Thus,
`the operations of reconstruction, inverse Haar transform,
`clamping, and dithering are reduced to a few table lookups.
`The per-pixel operation count is only 5-6 operations per
`pixel.
`BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1 is a diagram illustrating a prior System implement
`ing JPEG compression;
`FIG. 2 is a diagram illustrating the compression pipeline
`of the present invention;
`FIG. 3 is a diagram illustrating a Laplacian distribution
`according to the present invention;
`FIG. 4 is a diagram illustrating the Seven components
`representing a 2x2 block that are produced by blockS 214
`and 216 of FIG. 2 in the compression pipeline of the present
`invention;
`FIG. 5 is a diagram illustrating the Six components
`representing a 2x2 block that are produced by blockS 214,
`216, and 218 in the compression pipeline of the present
`invention;
`FIG. 6 is a diagram illustrating the five components
`representing a 2x2 block that are produced by blockS 214,
`216, 218, 220 and 222 in the compression pipeline of the
`present invention;
`FIG. 7 is a diagram illustrating four microwavelet blocks
`of an image frame including a skip code; and
`FIG. 8 is a diagram illustrating the decompression pipe
`line of the present invention.
`DETAILED DESCRIPTION OF THE
`INVENTION
`The present invention provides a method and apparatus
`for compressing Video information using microwavelets. In
`
`15
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6,031,937
`
`6
`the following description, numerous Specific details, Such as
`block sizes, color Spaces, etc., are described in detail to
`provide a more thorough description of this invention. It will
`be apparent, however, to one skilled in the art, that the
`invention may be practiced without these specific details. In
`other instances, well known features have not been
`described in detail So as not to unnecessarily obscure the
`present invention.
`The present invention Symmetrically compresses and
`decompresses Video information in real-time by effectively
`coupling block techniques with wavelet techniques. The
`present invention performs a wavelet transform on Small
`blocks of an image and encodes the wavelet transformed
`blocks in a highly efficient manner. Thus, the present inven
`tion is a real-time, Symmetric compressor/decompressor
`Scheme that utilizes a block-oriented Haar wavelet transform
`on 2-by-2 pixel blocks, in the preferred embodiment, which
`provides desired performance and compression ratios.
`The Video compression Scheme of the present invention is
`a high performance, moderate bit-rate, Video compression
`technique that offerS Significant advantages over prior art
`Software compression technologies and is useful in a wide
`variety of Video coding applications. Unlike many other
`prior art Video-compression technologies that are Software
`based, the compressed Video of the present invention can be
`compressed and decompressed in real time using commonly
`available processing means used in personal computers
`(PC).
`The present invention provides Symmetrical compression
`and decompression that are on the same order of magnitude
`in computational complexity with modest compression
`rates. It provides compression ratioS of 1.5-2.5 bits per
`pixel. Further, the present invention plays back Video infor
`mation at 320x240x24 fps (or 1.8 Mps) using PC's and
`provides high quality Video. The advantages of the present
`invention make it Suitable for a wide range of applications.
`Since the technique is Symmetrical, applications Such as
`teleconferencing are enabled. Further, it provides the advan
`tages of asymmetric Software approaches with respect to
`decompression.
`BTC Approach of the Present Invention
`The basic approach of the present invention is to provide
`an improved coding approach based on the Block Transform
`Coding So that real-time Software compression and decom
`pression are feasible. To meet performance goals, the present
`invention processes each pixel of an image using less than
`20 operations per pixel. In order to provide real-time
`compression/decompression, a YUV transform as taught in
`the prior art is not performed. For playback, the prior art
`YUV-to-RGB conversion requires five multiplications and
`four additions, not including output formatting (dithering) as
`well as memory loads and Stores. Thus, the conversion uses
`essentially half of the computational budget of 20 operations
`per pixel.
`Another consideration affecting decoding time is output
`formatting. The target playback platforms may have various
`display formats: 24-bit RGB, 15-bit RGB, 8-bit grayscale,
`etc. For example, a common color display used is 12-bit
`RGB. To provide Suitable image quality, the present inven
`tion dithers the device independent compressed Video infor
`mation. That is, the compressed data of the present invention
`is not dependent on a particular display format. Thus, even
`an optimized 24-bit to 12-bit RGB dithering conversion
`typically requires 10 operations per pixel. It is accordingly
`apparent that color-Space conversion and dithering can con
`
`IPR2018-01413
`Sony EX1007 Page 12
`
`
`
`7
`Sume the entire computational budget. Therefore, as
`described below, the present provides precomputed output
`formatting incorporated in lookup tables used to decompress
`encoded Video information.
`FIG. 2 is a diagram illustrating the compression pipeline
`of the present invention for compressing video information
`using wavelets. An RGB image 210 is provided as input to
`block 212, which divides the image into 2'x2' blocks. In the
`preferred embodiment, k is equal to one, i.e., 2x2 blocks are
`used. The 2x2 blocks are output to block 214, which
`computes the average RGB value (R. B., and G) of
`each 2x2 block. The output of block 214 is coupled to the
`input of block 216, which computes the average luminance
`value Y
`of each 2x2 block and the differential lumi
`nances AY for each pixel of a block, described below.
`The output of block 216 is coupled to the input of Haar
`transformation block 218, which performs a Haar transform
`of the AY values of each 2x2 block. The output of Haar
`transform block 218 is coupled to the input of block 220 for
`computing the differences (AR, AB, and AG)
`between the average RGB values of the present 2x2 block
`and the average RGB values of the previous 2x2 block.
`The output of block 220 is coupled to the input of
`Lloyd-Max quantization block 222, which quantizes the
`25
`Haar transform values H2, H3, and H4dia (described
`below) and the RGB differences AR, AB, and AG.
`The output of Lloyd-Max quantization block 222 is coupled
`to the input of variable-length coding block 224. In the
`preferred embodiment of the present invention, Huffman
`coding is implemented to perform variable length encoding
`of the six data H2,H3), H4, AR, AB, and AG
`representing each 2x2 block to produce output 226.
`In FIG. 2, skip code block 223 may be inserted between
`the output of block 222 and variable-length coding block
`224. The skip code block 223 is indicated by an asterisk
`beside the numeral 223 and the block itself is dashed instead
`of Solid. In a first embodiment for producing Symmetrically
`compressed Still-images, the output of Lloyd-Max quanti
`Zation block 222 is coupled to the input of variable length
`coding block 224 to produce output 226, while in a Second
`embodiment, skip code block 226 is inserted between the
`two blocks to provide temporal compression of Video infor
`mation using Skip codes, described below.
`The present invention works on image blocks of 2x2
`pixels. Block 212 parses the input image 210 into 2x2
`blocks of pixels. For each 2x2 block, block 214 computes a
`full-precision, average (DC) RGB value, i.e., R, G, and
`B, to represent its color value. In block 216, unlike the
`prior art, the present invention does not use a full YUV-to
`RGB conversion in block 216. Instead, block 216 uses a
`modified YUV-to-RGB technique that retains the compres
`Sion advantages of treating luminance and chrominance
`information differently. As described below, lookup tables
`and a Small block size are used, thereby allowing output
`formatting to be precomputed in lookup tables and not
`calculated for each pixel.
`FIG. 4 is a diagram illustrating the Seven components
`representing a 2x2 block that are produced by blockS 214
`and 216 in the compression pipeline of the present invention.
`The 2x2 block 410 comprising pixel1 401, pixel2 402,
`pixel3 403, and pixel4404 is produced by block 212 from
`input image 210. The 2x2 block is provided to block 214,
`which produces DC RGB value 420. While DC RGB value
`420 is represented as a single block in FIG. 4, it should be
`apparent that DC RGB value 420 comprises three
`components, i.e., an R, G, and B component. The
`
`8
`2x2 block 410 and DC RGB value 420 are provided to block
`216, which computes Y value and the AY's. In FIG. 4,
`AY1430, AY2 431, AY3 432, and AY4 433 represent the
`differential luminances between each particular luminance
`value of a pixel and the average luminance value Y. The
`respective differential luminances are indicated in the dia
`gram by a corresponding lighter colored block within the
`larger block. Thus, the differential luminance AY1 is indi
`cated by lightened block 430 in the upper left hand portion
`of the block (illustrated below DC RGB block 420) corre
`sponding to the position of pixel1401 of 2x2 block 410. The
`AY2431, AY3 432, and AY4433 values are illustrated in the
`upper right, lower left, and lower right portion of blockS
`(illustrated in descending order below the block containing
`AY1430), respectively. Output 440 illustrates the 2x2 block
`410 as represented by DC RGB block 420, AY1430, AY2
`431, AY3 432, and AY4 433 produced by blocks 214 and
`216.
`The term AY is the luminance Y difference between each
`pixel and the average luminance Y,
`for the correspond
`ing block, i.e., AY=Y-Y. Effectively, this block Struc
`ture is equivalent to chroma Subsampling by a factor of two
`in the X- and y-directions. Only luminance Y information is
`available on a per-pixel basis, and full color information is
`only present on a per-block basis: AY, AY, AY, AY,
`R, G, and B. At the output of block 216, seven
`components, or bytes, remain to represent the block, which
`is down from the original 4 pixelsX3 bytes per pixel=12
`bytes of the original RGB image for 24-bit RGB color.
`Using this approach, the present invention maintains the
`compression advantages of a YUV-pixel transformation and
`U, V Subsampling without explicitly performing it on a
`per-pixel basis.
`Once the blocks are preprocessed, as described above, the
`number of components is additionally reduced by perform
`ing a simple, two-dimensional transform of the AY values.
`The two-dimensional transform is performed on the lumi
`nance values instead of the full precision color of pixels. For
`the two-dimensional transform, the present invention per
`forms a 2-by-2 Haar transform (a second-order wavelet
`transform) that is very fast when applied to 2-by-2 blocks.
`In block 218 of FIG. 2, a Haar transform of AY values of
`each 2x2 block is performed. Within the 2x2 block, only AC
`luminance variations (i.e., AYS) are encoded. In the present
`invention, the transform involves values for 2-2, 4, 8, etc.,
`where a low integer for k is preferred. In the preferred
`embodiment of the present invention, the integer k is equal
`to one (i.e.,2'-2), so that the wavelet transform is