`US006529634l£l
`
`(12) United States Patent
`Thyagarajan et al.
`
`um Patent No.:
`(45) Date of Patent:
`
`US 6,529,634 B1
`*Mar. 4, 2003
`
`(54) CONTRAST SENSITIVE VARIANCE BASED
`ADAPTIVE BLOCK SIZE DCT IMAGE
`
`O'l‘II]jR PUBLlCA'l‘I(JNS
`
`(75)
`
`COMPRESNON
`Inventors: Kadayum Thyagamjan, San Diego’
`CA (US); Michael J. Merritt, San
`Die 0, CA US
`g
`(
`)
`(73J AS51311“: Q“a'W“““s 1"‘-'-a 53“ Dl°8°s CA (US)
`
`( * ) Noti*e:
`
`Subject to any disclaimer, the term of this
`patent
`is extended or adjusted under 35
`U.S.C.‘. 154(b) by 0 days.
`
`This patent is subject to a terminal dis-
`claimcr»
`
`(21) App]. No.: lJ9t436,085
`
`(22
`
`Filed:
`
`Nov.8, 1999
`
`“A fully adaptive DCT based color image sequence coder”,
`Chen, Rong—Jian el al., Signal Processing Image Commu-
`nication, vol. 6, No. 4, Aug.
`l, 1994, pp. 289-301.‘
`"Adaptive block—size transform coding for im age compres-
`sion,” Javier Bracamonte et 31., 1997 IEEE International
`
`Conference 0“ A‘-'0“5li°5- Speech» and Signal Processing.
`1997, IC/\SSP—9't', vol. 4, 199?, pp. 2721-2724?
`_
`‘
`‘
`‘
`_
`Vaisey J. et al.; “image Compression With Variable Block
`
`Size Segmentation”; IEEB Transactions on Signal Process-
`ing, US, IEEE, Inc. NYvol. 40, No. 8, pp. 2041. 2044, 2047,
`‘ d 2048; A .1, 1992.
`‘m
`"g
`
`4 Ciwdby examine,
`
`(51)
`
`Int. CL?
`
`GOISK W36; H04N "U415
`
`Prirttarj; E,\'rmtfner—Wenpeng Chen
`(74) Attorney, Agent, or Ft'rm—Philip Wadsworth; Gregory
`
`(52) U.S. C].
`
`3321239; 3321240; 3585433
`
`'3' 0饰"= 5a“"iP 3- Min“
`(57)
`ABSTRACT
`
`382239, 240,
`.
`(58) Field of Search
`383248, 232, "?;:58t433, 261.2, 426.14;
`375/[240-03; 34031; 240-2: 24024
`
`(56)
`
`Referenees Cited
`US. l’A'l'l-.‘N'I‘ l)0(.‘UMEN'I'S
`
`A system and method for image compression utilizing
`adaptively sized blocks and sub-blocks of discrete cosine
`
`lransfonn eoeflicient data is presented. A block size assign-
`ment element in the encoder selects the block or sub—bloek
`
`of an input block of pixels to be processed. The selection is
`based on the variance of pixel values. Blocks with variances
`.
`.
`.
`_
`.
`larger than a threshold are .‘Sllbd1\t'1(le(l, while bloeks with
`variances smaller than a threshold are not subdivided. A
`transforin element transforms the pixel values of the selected
`blocks into the frequency domain. The frequency domain
`values may then be quantized, serialized, and variable length
`Coded in prcparalinn for transmission‘
`
`:7:
`2 at
`._
`,
`.eri
`-
`.--
`,
`—
`‘M995 Lee
`5,452,104 A ,=
`lUt’l995 Shin
`5,455,630 A *
`I 1,-‘I995 Bhargava et al.
`5,471,243 A
`l2tl99t') Rodriguez ..
`5,535,944 A
`3,-‘I998 Shin at al.
`5,724,451 A
`==r
`'
`:..<3..<
`5tl.‘J.‘J‘) Hirabayashi
`‘.39 .55? A
`FOREIGN PATENT DOCUMENTS
`
`
`
`A
`
`.3)
`
`.
`358x433
`..... U 332,939
`_ 375,r24()_24
`353/500
`-‘ 382534“
`382232
`
`DH
`
`l‘)5fl3S?l A1 *
`
`3_t'l99fi
`
`HtJ4Nt‘.r’t2t:
`
`35 Claims, 3 Drawing Sheets
`
` M II n~: . umrr r...
`'
`
`
`<.
`
`I
`
`_ /
`
`run-rt-.-=...
`
`I.:iltk\vuk|,\N.'E\>‘l '
`
`.. -u.“-..
`
`mI
`
`._'n-:I\\u4.‘I;t-Bluis .
`
`
`
`1
`
`Google Inc.
`(3000 1008
`[PR of US Pat. No. '.=',974,339
`
`
`
`U.S. Patent
`
`m
`
`Ma
`
`256.,SU
`
`1BM
`
`
`
`F5253mmmm>z_m:m£m<>mmm_m_mwn_7w.flaoz<umo<N2N
`
`
`
`
`
`mmwiflmmmammoomo
`
`
`
`4,.n_mN.:<EmmmmooummmN_.E<:o
`
`
`z<umF525Mo<No_Nm;_m<E<>
`
`m2:j..........-JS".Ezz<:uH"zo_mm=>_m.z$fi“_IIIiIIIIIIILM/2:
`
`NE
`
`mmfimugm
`
`+zm_Ez2mm<
`
`
`
`6Q:.9.,/7.HDE
`
`
`
`
`
`
`U.S. Patent
`
`Mar. 4, 2003
`
`Sheet 2 of3
`
`US 6,529,634 B1
`
`202
`
`204
`
`READ A16 X 16 BLOCK
`
`COM PUTE ITS VARIANCE V16
`
`206 SET R = (I,
`
`WRITE ADDRESS
`OF I6 3: I6 BLOCK
`
` SET Qi = 0.
`
`WRITE ADDRESS OF
`
`OF ith 3 x 8 BLOCK
` COMPUTE jlh 4 x 4
`BLOCK VARIANCE V4 ii
`
`225
`SET Pij = 0,
` WRITE ADDRESS
`0Fj‘h 4 x 4 BLOCK
`
` SET PU = 1, WRITE ADDRESS or
`
`2 x 2 BLOCKS IN jlh 4 x 4 BLOCK
`
`
`
`230
`
`FIG. 2
`
`
`
`U.S. Patent
`
`Mar. 4, 2003
`
`Sheet 3 of3
`
`US 6,529,634 B1
`
`
`
`fl
`
`FIG. 3A
`
`R
`
`WW WW
`
`P11
`
`P22
`
`P13
`
`P14 P31
`
`P32
`
`P33
`
`P34
`
`WW
`
`FIG. 3B
`
`Illflflflflfl
`
`PQR DATA
`
`FIG. 3C
`
`
`
`US 6,529,634 B1
`
`1
`CONTRAST SENSITIVE VARIANCE BASED
`ADAPTIVE BLOCK SIZE DCT IMAGE
`COMPRESSION
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`The present invention relates to image processing. More
`specifically, the present invention relates to a compression
`scheme for image signals utilizing adaptively sized blocks
`and sub-blocks of encoded discrete cosine transform coef-
`ficient data.
`
`10
`
`-
`
`II. Description of the Related Art
`In the field of transmission and reception of video signals
`such as are used for projecting “films” or “movies”, various
`improvements are being made to image compression tech-
`niques. Many of the current and proposed video systems
`make use of digital encoding techniques. Digital encoding
`provides a robustness for the communications link which
`resists impairments such as multipath fading and jamming or
`signal interference, each of which could otherwise serious
`degrade image quality. Furthermore, digital
`techniques
`facilitate the use signal encryption techniques, which are
`found useful or even necessary for governmental and many
`newly developing commercial broadcast applications.
`High definition video is an area which benefits from
`improved image compression techniques. When first
`proposed, over-the-air transmission of high definition video
`(or even over-wire or fiber-optical
`transmission) seemed
`impractical due to excessive bandwidth requirements. Typi-
`cal wireless, or other, transmission systems being designed
`did not readily accommodate enough bandwidth. However,
`it has been realized that compression of digital video signals
`may be achieved to a level that enables transmission using _
`reasonable bandwidths. Such levels of signal compression,
`coupled with digital transmission of the signal, will enable
`a video system to transmit with less power with greater
`immunity to channel impairments white occupying a more
`desirable and useful bandwidth.
`
`4-0
`
`One compression technique capable of offering signifi-
`cant levels of compression while preserving the desired level
`of quality for video signals utilizes adaptively sized blocks
`and sub-blocks of encoded Discrete Cosine Transform
`(])(.'t') coefficient data. This technique will hereinafter he
`referred to as the Adaptive Block Size Differential Cosine
`Transform (ABSDCl'} method. This technique is disclosed
`in U.S. Pat. No. 5,021,891, entitled “Adaptive Block Size
`Image Compression Method And System," assigned to the
`assignee of the present invention and incorporated herein by
`reference. DCT techniques are also disclosed in U.S. Pat.
`No. 5,107,345, entitled "Adaptive Block Size Image Com-
`pression Method And System,” assigned to the assignee of
`the present invention and incorporated herein by reference.
`Further, the use of the ABSDCT technique in combination _
`with a Differential Quadtree Transform technique is dis-
`cussed in US. Pat. No. 5,452,104, entitled "Adaptive Block
`Size Image Compression Method And System,” also
`assigned to the assignee of the present invention and incor-
`porated herein by reference. The systems disclosed in these
`patents utilizes what is referred to as “intra-frame" encoding,
`where each frame of image data is encoded without regard
`to the content of any other frame. Using the ABSDCT
`technique, the achievable data rate may be reduced from
`around 1.5 billion bits per second to approximately 50
`million hits per second without discernible degradation of
`the image quality.
`
`45
`
`S0
`
`60
`
`65
`
`2
`The ABSDCT technique may be used to compress either
`a black and white or a color image or signal representing the
`image. The color input signal may be in a YIQ format, with
`Y being the luminance, or brightness, sample, and I and 0
`being the chrominance, or color, samples for each 4x4 block
`of pixels. Other known fomiats such as the YUV or RG13
`formats may also be used. Because of the low spatial
`sensitivity of the eye to color, most research has shown that
`a sub-sample of the color components by a factor of four in
`the horizontal and vertical directions is reasonable.
`
`Accordingly, a video signal may be represented by four
`luminance components and two chrominancc components.
`Using ABSDCT, a video signal will generally be seg-
`mented into blocks of pixels for processing. For each block,
`the luminance and chrominance components are passed to a
`block interleaver. For example, a 16x16 (pixel) block may
`be presented to the block interleaver, which orders or
`organizes the image samples within each 16x16 block to
`produce blocks and composite sub-blocks of data for dis-
`crete cosine transform
`analysis. The DCT operator is
`one method of converting a time-sampled signal to a fre-
`quency representation of the same signal. By converting to
`a frequency representation, the DCT techniques have been
`shown to allow for very high levels of compression, as
`quantizers can be designed to take advantage of the fre-
`quency distribution characteristics of an image. In a pre-
`ferred embodiment, one 16xl6 DCT is applied to a first
`ordering, four 8x8 DC'I's are applied to a second ordering, 16
`4x4 DCTs are applied to a third ordering, and 64 2x2 DCTs
`are applied to a fourth ordering.
`The DCT operation reduces the spatial redundancy inher-
`ent in the video source. After the DCT is performed, most of
`the video signal energy tends to be concentrated in a few
`DUI‘ coefficients. An additional transform, the Differential
`Quad—Tree Transform (DOT), may be used to reduce the
`redundancy among the DCT coefficients
`For the 16x16 block and each sub-block, the DCT coef-
`ficient values and the DQT value (if the DQT is used) are
`analyzed to determine the number ofbits required to encode
`the block or sub-block. Then, the block or the combination
`ofsub—blocks that requires the least number ofbits to encode
`is chosen to represent the image segment. For example, two
`8x8 sub-blocks, six 4x4 sub-blocks, and eight 2x2 sub-
`blocks may be chosen to represent the image segment.
`The chosen block or combination of sub-blocks is then
`
`properly arranged in order into a 16x16 block. The DCT!
`DOT coetlicient values may then undergo frequency
`weighting, quantization, and coding (such as variable length
`coding) in preparation for transmission.
`Although the ABSDCI" technique described above per-
`forms remarkably well,
`it
`is computationally intensive.
`Thus, compact hardware implementation of the technique
`may be difficult. An alternative technique that would make
`hardware implementation more efficient
`is desired. An
`image compression method and system that is more com-
`putationally efflcient is provided by the present invention in
`the manner described below.
`
`SUMMARY OF TIIE INVENTION
`
`invention is system and method of image
`The present
`compression that utilizes adaptively sized blocks and sub-
`blocks of Discrete Cosine Transform coefficient data. In one
`
`embodiment, a 16x16 block of pixels is input to an encoder.
`The encoder comprises a block size assignment element,
`which segments the input block of pixels for processing. The
`block size assignment is based on the variances ofthe input
`
`
`
`US 6,529,634 B1
`
`3
`block and subdivided blocks. In general, areas with larger
`variances will be subdivided into smaller blocks, whereas
`areas with smaller variances will not be subdivided, pro-
`vided the block and sub-block mean values fall into different
`
`predetermined ranges. Thus, first the variance threshold of a
`block is modified from its nominal value depending on its
`mean value, and then the variance of the block is compared
`with a threshold, and if the variance is greater than the
`threshold, then the block is subdivided.
`
`is provided to a transfomi
`The block size assignment
`element, which transforms the pixel data into frequency
`domain data. The transform is performed only on the block
`and sub-blocks selected through block size assignment. The
`transform data then undergoes quantization and serializa-
`tion. For example, zigzag scanning may be utilized to
`serialize the data to produce a stream of data. The stream of
`data may then be coded by a variable length coder in
`preparation for
`transmission. The encoded data is sent
`through a transmission channel to a decoder, where the pixel
`data is reconstructed in preparation for display.
`
`BRIEF DESCRIPTION OF TIIE DRAWINGS
`
`10
`
`The features, objects, and advantages of the present
`invention will become more apparent
`from the detailed -
`description set forth below when taken in conjunction with
`the drawings in which like reference characters identify
`correspondingly throughout and wherein:
`
`FIG. 1 is a block diagram of an image processing system
`that incorporates the variance based block size assignment
`system and method of the present invention;
`FIG. 2 is a flow diagram illustrating the processing steps
`involved in variance based block size assignment;
`FIGS. 31:, 3b, and 3c illustrate an exemplary block size ‘
`assignment,
`the corresponding quad-tree decomposition,
`and the corresponding PQR data.
`
`DI:'.TAILl:'.D DI:'.SCRlP'l'ION OE 'I'I-IE
`PREFERRED EMBODIMENTS
`
`In order to facilitate digital transmission of digital signals
`and enjoy the corresponding benefits, it is generally neces-
`sary to employ some form of signal compression. To achieve
`high definition in a resulting image, it is also important that
`the high quality of the image be maintained. Furthermore,
`computational efficiency is desired for compact hardware
`implementation, which is important in many applications.
`
`4-0
`
`45
`
`50
`
`The present invention provides a system or apparatus and
`method of image compression that takes into account both
`the image quality and computational eficiency in perform-
`ing image compression. The image compression of the
`present
`invention is based on discrete cosine transform
`(DCT) techniques. Generally, an image to be processed in _
`the digital domain would be composed ofpixel data divided
`into an array of non-overlapping blocks, NxN in size. A
`two-dimensional DCT may be performed on each block. The
`two—dimensional DCT is defined by the following relation-
`ship:
`
`E0
`
`x Ir. it :
`‘
`
`ntktfiitft
`fin’
`
`.u
`
`
`2 L?—\
`.-nu:
`_.t
`
`E (In: + i1m'r
`.:‘(m, it tens;
`EN
`
`
`
`[Eu -t- l1Jr.’§
`cos! is ..
`av
`tJs.r'..is.-‘V--I
`
`65
`
`4
`
`K} =
`
`m Fm
`
`E. if .1;
`-_- 0
`iv’?.tt't-::t
`
`.
`
`wherein
`
`and
`
`x(m,n) is the pixel location (m,n) within an NXN block,
`and
`
`X(k,l) is the corresponding DCT coelficient.
`Since pixel values are non-negative, the DCT component
`)((0,(]) is always positive and usually has the most energy. In
`fact. for typical
`images, most of the transform energy is
`concentrated around the component X(U,O). This energy
`compaction property makes the DCT technique such an
`attractive compression method.
`The image compression technique of the present inven-
`tion utilizes contrast adaptive coding to achieve further bit
`rate reduction. It has been observed that most natural images
`are made up of flat relatively slow varying areas, and busy
`areas such as object boundaries and high-contrast texture.
`Contrast adaptive coding schemes take advantage of this
`factor by assigning more bits to the busy areas and less bits
`to the less busy areas.
`Contrast adaptive coding is also useful for reducing the
`blocking effect. In the implementation of other DCT coding
`techniques, the blocking elTect is perhaps the most important
`impairment
`to image quality. Furthermore,
`the blocking
`eflfect tends to be more perceptible in busy areas of the
`image. However, it has been realized that the blocking effect
`is reduced when a smaller sized DCI‘ is used. The blocking
`elIect becomes virtually invisible when a 2x2 DCT is used,
`although the bit per pixel performance may suffer. Thus,
`contrast adaptive coding may reduce the blocking effect by
`assigning smaller block sizes (and thereby more bits) to the
`busy areas and larger block sizes to the relatively blank
`areas.
`
`Another feature of the present invention is that it utilizes
`intraframe coding [spatial processing) instead of interfranie
`coding (spatio—temporal processing). One reason for the
`adoption of intraframe coding is the high complexity of the
`receiver required to process interframe coding signals. Inter-
`frame coding inherently requires multiple frame buffers in
`addition to more complex processing circuits.
`In many
`applications, reduced complexity is needed for actual imple-
`mentation.
`A second reason for using intraframe coding is that a
`situation, or program material, may exist that can make a
`spatio-temporal coding scheme break down and perform
`poorly. For example, 24 frame per second movies can fall
`into this category since the integration time, due to the
`mechanical shutter, is relatively short. The short integration
`time allows a higher degree of temporal aliasing. The
`assumption of frame to frame correlation breaks down for
`rapid motion as it becomes jerky.
`An additional reason for using intraframe coding is that a
`spatio-temporal coding scheme is more difficult to standard-
`ize when both 50 Hz and 60 Hz power line frequencies are
`involved. Television currently transmits signals at either 50
`Hz or 60 Hz. The use of an intraframe scheme, being a
`digital approach, can adapt
`to both 50 Hz and 60 Hz
`operation, or even to 24 frame per second movies by trading
`off frame rate versus spatial resolution.
`For image processing purposes,
`the DCT operation is
`performed on pixel data that
`is divided into an array of
`non-overlapping blocks. Note that although block sizes are
`discussed herein as being N><N in size, it is envisioned that
`
`
`
`US 6,529,634 B1
`
`5
`various block sizes may be used. For example, a NXM block
`size may be utilized where both N and M are integers with
`M being either greater than or less than N. Another impor-
`tant aspect is that the block is divisible into at least one level
`of sub-blocks, such as Nfi><Ni’i, Ni’i><Nfj, Nr’i><Mf', and etc.
`where i and j are integers. Furthermore, the exemplary block
`size as discussed herein is a 16x 16 pixel block with corre-
`sponding block and sub-blocks of DCT coellicienls. It is
`further envisioned that various other integers such as both
`even or odd integer values may be used, e.g. 9x9.
`Referring now to FIG. 1, an image processing system 100
`which incorporates the compression system of the present
`invention is shown. The image processing system 100
`comprises an encoder 102 that compresses a received video
`signal. The compressed signal
`is transmitted through a
`transmission channel 104, and received by a decoder 106.
`The decoder 106 decodes the received signal into image
`samples, which may then be displayed.
`In general, an image is divided into blocks of pixels for
`processing. A color signal may be converted from RGB
`space to Y(‘.,(‘_, space, with Y being the luminance, or
`brightness, component, and C, and C2 being the
`chrominanee, or color, components. Because of the low
`spatial sensitivity of the eye to color, many systems sub-
`sample the C, and C2 components by a factor of four in the
`horizontal and vertical directions, However,
`the sub-
`sampling is not necessary. A full resolution image, known as
`4:4:4 format, may be either very useful or necessary in some
`applications such as those referred to as covering "digital
`cinema.” Two possible Y(.‘.1C2 representations are, the YIQ
`representation and the YUV representation, both of which
`are well known in the art. It is also possible to employ a
`variation of the YUV representation known as YCbCr.
`In a preferred embodiment, each of the Y, Cb, and Cr
`components is processed without sub-sampling. Thus, an
`input of a 16x16 block of pixels is provided to the encoder
`102. The encoder 102 comprises a block size assignment
`element 108, which performs block size assignment
`in
`preparation for video compression. The block size assign-
`ment element 108 determines the block decomposition of
`the l6><l6 block based on the perceptual characteristics of
`the image in the block. Block size assignment subdivides
`each 16><16 block into smaller blocks in a quad-tree fashion
`depending on the activity within a 16x16 block. The block
`size assignment element 108 generates a quad—tree data,
`called the FOR data, whose length can be between 1 and 21
`bits. Thus, if block size assignment determines that a 16x1 6
`block is to be divided, the R bit of the PQR data is set and
`is followed by four additional bits of P data corresponding
`to the four divided 8x8 blocks. If block size assignment
`determines that any of the 8x8 blocks. is to be subdivided,
`then four additional bits of Q data for each 8x8 block
`subdivided are added.
`
`Referring now to FIG. 2, a flow diagram showing details
`of the operation of the block size assignment element 108 is
`provided. The algorithm uses the variance of a block as a
`metric in the decision to subdivide a block. Beginning at step
`202, a 16x16 block of pixels is read. At step 204,
`the
`variance, v16, of the 16x16 block is computed. The variance
`is computed as follows:
`
`.v—: .v—i
`var: 2 E -
`F-.-1
`'-.-L}
`
`J
`
`H
`
`t'\'
`
`1
`
`2»
`
`.-\.'-t .t\'—l
`
`where N-16, and x,.,. is the pixel in the i"" row, j”°' column
`within the NXN block. At step 206,
`first
`the variance
`
`6
`threshold T16 is modified to provide a new threshold T"l6 if
`the mean value of the block is between two predetermined
`values, then the block variance is compared against the new
`threshold, T‘l6.
`If the variance v16 is not greater than the threshold T16,
`then at step 208, the starting address of the 16x16 block is
`written, and the R bit of the PQR data is set to 0 to indicate
`that the 16x16 block is not subdivided. The algorithm then
`reads the next 16x16 block of pixels. lithe variance v16 is
`greater than the threshold T16, then at step 210, the R bit of
`the PQR data is set to 1 to indicate that the 16x16 block is
`to be subdivided into four 8x8 blocks.
`The four 8x8 blocks, i-l:4, are considered sequentially
`for further subdivision, as shown in step 212. For each 8x8
`. block, the variance, v8,-, is computed, at step 214. At step
`216, first the variance threshold T8 is modified to provide a
`new threshold T'8 if the mean value of the block is between
`
`10
`
`two predetennined values, then the block variance is com-
`pared to this new threshold.
`If the variance v8, is not greater than the threshold T8,
`then at step 218, the starting address of the 8x8 block is
`written, and the corresponding Q bit, Q,-, is set to U. The next
`8x8 block is then processed. If the variance v8‘. is greater
`than the threshold T8, then at step 220, the corresponding 0
`bit, Q,-, is set to 1
`to indicate that the 8x8 block is to be
`subdivided into four 4x4 blocks.
`
`The four 4><4 blocks, j ,.=l:4, are considered sequentially
`for further subdivision. as shown in step 222. For each 4x4
`block, the variance, v4,.,., is computed, at step 224. At step
`226, first the variance threshold T4 is modified to provide a
`new threshold T4 if the mean value of the block is between
`
`two predetermined values, then the block variance is com-
`pared to this new threshold.
`If the variance v4,.,. is not greater than the threshold T4,
`then at step 228, the address ofthe 4x4 block is written, and
`the corresponding P bit, 13,}, is set to 0. The next 4x4 block
`is then processed. If the variance v4,-,- is greater than the
`threshold T4, then at step 230, the corresponding P bit, I-’,-,-,
`is set to 1 to indicate that the 4x4 block is to be subdivided
`into four 2:-<2 blocks. In addition, the address of the 4 2x2
`blocks are written.
`The thresholds T16, T8, and T4 may be predetermined
`constants. This is known as the hard decision. Alternatively,
`an adaptive or soft decision may be implemented. The soft
`decision varies the thresholds for the variances depending on
`the mean pixel value of the 2Nx2N blocks, where N can be
`8, 4, or 2. Thus, functions of the mean pixel values, may be
`used as the thresholds.
`For purposes of illustration, consider the following
`example. Let the predetermined variance thresholds for the
`Y component be 50, 1100, and 880 for the l6x'l6, 8x8, and
`4x4 blocks, respectively. In other words, Tl6=50, 'l‘S=1100,
`and 'l'l6=88U. Let the range of mean values be 80 and 100.
`Suppose the computed variance for the 16><16 block is 60.
`Since 60 and its mean value 90 is greater than T16, the
`16x16 block is subdivided into four 8x8 sub—blocks. Sup-
`pose the eomputed variances for the 8x8 blocks are H80,
`935, 980, and 1210. Since two of the 8x8 blocks have
`variances that exceed T8,
`these two blocks are further
`subdivided to produce a total of eight 4x4 sub-blocks.
`l-‘inally, suppose the variances of the eight 4x4 blocks are
`620, 630, 670, 610, 590, 525, 930, and 690, with corre-
`sponding means values 90, 120, 110, 115. Since the mean
`value of the tirst 4x4 block falls in the range (80, 100), its
`threshold will be lowered to T‘4=200 which is less than 880.
`So, this 4x4 block will be subdivided as well as the seventh
`4x4 block. The resulting block size assignment is shown in
`
`4-0
`
`45
`
`S0
`
`60
`
`65
`
`
`
`US 6,529,634 B1
`
`10
`
`.
`
`4-0
`
`45
`
`50
`
`7
`FIG. 3:1. The corresponding qu ad-tree decomposition is
`shown in FIG. 3b. Additionally, the PQR data generated by
`this block size assignment is shown in FIG. 3c.
`Note that a similar procedure is used to assign block sizes
`for the color components C, and C3. The color components
`may be decimated horizontally, vertically, or both.
`Additionally, note that although block size assignment has
`been described as a top down approach, in which the largest
`block (16><16 in the present example) is evaluated first, a
`bottom up approach may instead be used. The bottom up
`approach will evaluate the smallest blocks (2x2 in the
`present example) first.
`Referring back to FIG. 1, the remainder of the image
`processing system 110 will be described. The PQR data,
`along with the addresses of the selected blocks, are provided
`to a DCT element 110. The DCT element 110 uses the PQR
`data to perform discrete cosine transforms of the appropriate
`sizes on the selected blocks. Only the selected blocks need
`to undergo l)(..‘l‘ processing.
`The image processing system 100 may optionally com-
`prise DQT element 112 for reducing the redundancy among
`the DC coefiicients of the DCI's. A DC coeflicient is encoun-
`tered at
`the top left corner of each DCT block. The DC
`coefficients are,
`in general,
`large compared to the AC
`coefficients. The discrepancy in sizes makes it difficult to -
`design an efficient variable length coder. Accordingly, it is
`advantageous to reduce the redundancy among the DC
`coefficients.
`The DOT element 112 performs 2-I) 1)(.*1:<; on the DC
`coeflicients, taken 2x2 at a time. Starting with 2x2 blocks
`within 4-X4 blocks, a 2-D DCT is performed on the four DC
`eoeflicienls. This 2x2 DCI‘ is called the dilIerential quad-
`tree transform, or DOT, of the four DC coefficients. Next,
`the DC coefficient of the DOT along with the three neigh-
`boring DC coefficients with an 8x8 block are used to .
`compute the next level DOT. Finally, the DC coefficients of
`the four 8:-<8 blocks within a 16x16 block are used to
`compute the DOT. Titus, in a 16x16 block, there is one true
`DC coefiicient and the rest are AC coeflicients correspond-
`ing to the DCT and DOT.
`The transform coeificients (both DCT and DOT) are
`provided to a quantizer 114 for quantization. In a preferred
`embodiment, the DCT coefficients are quantized using fre-
`quency weighting masks (I-'WMs) and a quantization scale
`factor. A FWM is a table of frequency weights of the same
`dimensions as the block of input DCT eoefficients. The
`frequency weights apply dilIerent weights to the ditlierent
`DCT coefficients. The weights are designed to emphasize
`the input samples having frequency content that the human
`visual system is more sensitive to, and to dc-emphasize
`samples having frequency content that the visual system is
`less sensitive to. The weights may also be designed based on
`factors such as viewing distances, etc.
`The weights are selected based on empirical data. A
`method for designing the weighting masks for 8x8 DCT _
`coeificients is disclosed in ISOII EC IfC 1 CD 10918, “Digi-
`tal compression and encoding of continuous-tone still
`images—part 1: Requirements and guidelines,” Interna-
`tional Standards Organization, I994, which is herein incor-
`porated by reference. In general, two FWMS are designed,
`one for the luminance component and one for the chromi-
`nance components. The FWM tables for block sizes 2x2,
`4x4 are obtained by decimation and 16x16 by interpolation
`of that for the 8x8 block. The scale factor controls the
`
`E0
`
`65
`
`quality and bit rate of the quantized coetlicients.
`Thus, each DCT coefficient is quantized according to the
`relationship:
`
`8
`
`_l
`__
`DCT’-[L If "
`
`3-«xDC'TU'. ii
`ftmrfi. JF] H; T
`
`I
`
`where DC’I‘(i,j) is the input DCT coefficient, fwm(i,j) is the
`frequency weighting mask, q is the scale factor, and DCTq
`(i,j) is the quantized coefficient. Note that depending on the
`sign of the DCT coeffieient, the first term inside the braces
`is rounded up or down. The DOT coefficients are also
`quantized using a suitable weighting mask. However, mul-
`tiple tables or masks can be used, and applied to each ofthe
`Y, Cb, and Cr components.
`The quantized coefficients are provided to a zigzag scan
`serializer 116. The serializer 116 scans the blocks of quan-
`tized coelficients in a zigzag fashion to produce a serialized
`stream of quantized coeficients. A number of different
`zigzag scanning patterns, as well as patterns other than
`zigzag may also be chosen. A preferred technique employs
`8x8 block sizes for the zigzag scanning, although other sizes
`may be employed.
`Note that the zigzag scan serializer 116 may be placed
`either before or after the quantizer 114. The net results are
`the same.
`In any case, the stream of quantized coefficients is pro-
`vided to a variable length coder 118. The variable length
`coder 118 may make use of run-length encoding of zeros
`followed by Huffman encoding. This technique is discussed
`in detail in aforementioned U.S. Pat. Nos. 5,021,891, S,l(l?,
`345, and 5,452,104, and is summarized herein. A run-length
`coder would take the quantized coefficients and separate out
`the zero from the non-zero coefficients. The zero values are
`referred to as run-length values, and are tlutfman encoded.
`The non-zero values are separately Hu tfman encoded.
`A modified Ilulfman coding of the quantized coefficients
`is also possible and is used in the preferred embodiment.
`Here, after zigzag scanning, a run-length coder will deter-
`mine the run—length{size pairs within each 8x8 block. These
`run-lengthfsize pairs are then Iluffman encoded.
`Hulfman codes are designed from either the measured or
`theoretical statistics of an image. It has been observed that
`most natural images are made up of blank or relatively
`slowly varying areas, and busy areas such as object bound-
`aries and high-contrast
`texture. Huffman coders with
`frequency-domain transforms such as the DCT exploit these
`features by assigning more bits to the busy areas and fewer
`hits to the blank areas. In general, Huffman coders make use
`of look—up tables to code the run-length and the non—zero
`values. Multiple tables are generally used, with 3 tables
`being preferred in the present invention, although 1 or 2 can
`be employed, as desired.
`The compressed image signal generated by the encoder
`102 are transmitted to the decoder 106 via the transmission
`channel 104. The PQR data, which contains the block size
`assignment information. is also provided to the decoder 106.
`The decoder 106 comprises a variable length decoder 120,
`which decodes the run-length values and the non-zero
`values.
`The output of the variable length decoder 120 is provided
`to an inverse zigzag scan serializer 122 that orders the
`coefficients according to the scan scheme employed. The
`inverse zigzag scan serializer 122 receives the PQR data to
`assist in proper ordering of the coeficients into a composite
`coefficient block.
`The composite block is provided to an inverse quantizer
`124, for undoing the processing due to the use of the
`frequency weighting masks.
`The coefficient block is then provided to an IDQTelement
`12.6, followed by an IDCT element 128, if the Differential
`
`
`
`US 6,529,634 B1
`
`9
`Qu ad-tree transform had been applied. Otherwise, the coef-
`ficient block is provided directly to the IDCT element 128.
`The IDQT element 126 and the EDCT element 128 inverse
`transform the coefficients to produce a block of pixel data.
`The pixel data may be then have to be interpolated, con-
`verted to RGB form, and then stored for future display.
`Accordingly, a system and method is presented for image
`compression that performs block size assignment based on
`pixel variance. Variance based block size assignment offers
`several advantages. Because the Discrete Cosine Transform
`is performed after block sizes are detemtined, eflicient
`computation is achieved. The computationally intensive
`transform need only be perfomied on the selected blocks. In
`addition,
`the block selection process is efficient, as the
`variance of pixel values is mathematically simp