throbber
O!CrTALSl(;NALPROCESSD<<l 2, 204 22611992)
`
`High .. Resolution Still Picture Compression *
`Mladen Victor Wickerhauser
`Department of Mathematics, Washington University,
`Campus Box 1146, St. Louis, Missouri 63130
`
`-We shall consider the problem of storing, transmit(cid:173)
`
`1. INTRODUCTION
`
`ting, and manipulating digital electronic images. Be(cid:173)
`cause of the file sizes involved, transmitting images
`will always consume large amounts of bandwidth, and
`storing images will always require hefty resources.
`Because of the large number N of pixels in a high-reso(cid:173)
`lution image, manipulation o f digital images is infea(cid:173)
`sible without low-complexity algorithms, i.e., O(N)
`or O(N log(N)). Our goal is to describe some new
`methods which are firmly grounded in harmonic anal(cid:173)
`ysis and the mathematical theory of function spaces,
`which promise to combine effective image compres(cid:173)
`sion with low-complexity image processing. We s hall
`take a broad perspective, but we shall a lso compare
`specific new algorithms to the state of the art.
`Roughly speaking, most image compression algo(cid:173)
`rithms split into three parts: invertible transforma(cid:173)
`t ion, lossy quantization or rank reduction, a nd en(cid:173)
`tropy coding (or redundancy removal). There are a
`few algorithms which differ fundamentally from this
`scheme, e.g., the collage coding algorithm [ 4 J, o r pure
`vector quantization of the pixels. The former uses a
`deep observation that pictures of natural objects ex(cid:173)
`hibit self-similarity at different scales; we prefer to
`avoid relying on this phenomenon, since our images
`may not be "natural." The latte r uses a complex algo(cid:173)
`rithm to build a superefficient e mpirical vocabulary
`to describe an ensemble of images; we prefer to avoid
`training our algorithm with any sample of images, to
`avoid the problem of producing a sufficiently large
`and suitable ensemble.
`There has emerged an international standard for
`picture compression, promulgated by the Joint Photo·
`
`• Research eupported in parL by AFOSR Gra nL F49620-92·J ·
`0106 and by FBI Contract Al0?183.
`
`graphic Experts Group (JPEG ), which is remarkably
`effective in reducing the s ize of digitized image files.
`JPEG is two-djmensional discrete cosine transform
`(OCT) coding of 8 X 8 blocks of pixels, followed by a
`possibly proprietary quantization scheme on the OCT
`amplitudes, followed by either Huffman, Lempel(cid:173)
`Ziv Welch, or a rithmetic coding oft he quantized coef(cid:173)
`ficients. It has some drawbacks; for example, several
`incompatible implementations are allowed under t he
`standard. Also, JPEG degrades ungracefully at high
`and ultrahigh compression rat ios, a nd it makes cer(cid:173)
`tain assumptions about the picture thai are violated
`by zooming in or out, or other transformations. It
`works so well on typical photographs and many other
`images, however, that it has become the algorithm to
`beat in most applications. J PEG fai ls most noticeably
`on high-resolution (i.e., oversampled ) data, and on
`images which must be closely examined by humans or
`machines.
`Alternatives to JPEG have recently appeared, and
`we shall discuss three of these: the fast discrete wave(cid:173)
`let transform, t he local trigonometric or lapped or(cid:173)
`thogonal transform, and the best-basis algor ithm.
`These differ in the transform coding step; i.e., instead
`of OCT they first apply the wavelet transform, lapped
`orthogonal t ransform, or wavelet packet transform,
`possibly followed by a best-basis search. The resulting
`stream o f amplitudes is then quantized a nd coded to
`remove rPrlunrlllncy.
`Existing image processing algorithms work on the
`original pixels or else on the ( 2-dimensional ) Fourier
`transform of the pixels. If the image has been com(cid:173)
`pressed, it must be uncompressed prior to such pro(cid:173)
`cessing. Alternatively, we can try to devise algorithms
`which transform the compressed parameters. If com(cid:173)
`pression is accomplished by retaining only a low-rank
`approximation to the signal, then we can use more
`complex algorithms for subsequent processing. To
`
`1061 -2004/92$4.00
`C'opyroaht C 1992 by Ac•dtmoc Pras, Inc.
`All rights of ...,production on •ny fom> m;erved.
`
`Page 1 of 23
`
`

`

`put this idea into practice, we need to retain useful
`analytic properties such as the large derivatives used
`in edge detection. These will not be preserved by
`purely information-theoretic coding such as pure vec(cid:173)
`tor quantization, but we can choose transform coding
`methods whose mathematical properties combine ef(cid:173)
`ficient compression with good analytic behavior.
`
`2. TRANSFORM CODING IMAGE COMPRESSION
`
`-A digitally sampled image can represent only a
`
`band-limited function, since there is no way of resolv(cid:173)
`ing spatial frequencies higher than half the pixel
`pitch. Band-limited functions are smooth; in fact they
`are entire analytic, which means that at each point
`they can be differentiated arbitrarily often and the
`resulting Taylor series converges arbitrarily far away.
`Since digitally sampled images faithfully reproduce
`the originals as far as our eyes can tell, we may confi(cid:173)
`dently assume that our images are in fact smooth and
`well approximated by band-limited functions. An(cid:173)
`other way of saying this is that adjacent pixels are
`highly correlated, or that there is a much lower rank
`description of the image which captures virtually all
`of the independent features. In transform coding, we
`seek a basis of these features, in which the coordi(cid:173)
`nates are less highly correlated or even uncorrelated.
`These coordinates are then approximated to some
`precision, and that approximate representation is fur(cid:173)
`ther passed through a lossless redundancy remover.
`Figure 1 depicts a generic image compression trans(cid:173)
`form coder. It embodies a three-step algorithm. The
`first block (Transform) applies an invertible coordi(cid:173)
`nate transformation to the image. We think of this
`transformation as implemented in real arithmetic,
`with enough precision to keep the truncation error
`below the quantization error introduced by the origi(cid:173)
`nal sampling. The output of this block will be treated
`as a stream of real numbers, though in practice we are
`always limited to a fixed precision.
`The second block (Quantize) replaces the real num(cid:173)
`ber coordinates with lower-precision approximations
`which can be coded in a (small) finite number of dig(cid:173)
`its. If the transform step is effective, then the new
`coordinates are mostly very small and can be set to
`zero, while only a few coordinates are large enough to
`survive. The output of this block is a stream of small
`
`integers, most of which are the same (namely 0). If
`our goal is to reduce the rank ofthe representation, we
`can now stop and take only the surviving amplitudes
`and tag them with some identifiers. If our goal is to
`reduce the number of bits that we must transmit or
`store, then we should proceed to the next step.
`The third block (Remove redundancy) replaces the
`stream of small integers with some more efficient al(cid:173)
`phabet of variable-length characters. In this alphabet
`the frequently occurring letters (like "0") are repre(cid:173)
`sented more compactly than rare letters.
`
`3. DECORRELATION BY TRANSFORMATION
`
`-We will consider six pixel transformations which
`
`have proven useful in decorrelating smooth pictures.
`
`3.1. Karhunen-Loeve
`Let us now fix an image size-say height H and
`width W, with N = H X W pixels-and treat the indi(cid:173)
`vidual pixels as random variables. Our probability
`space will consists of some collection of pictures J' =
`{ S1 , S2 , ••• , SM}, where M is a large number. The
`intensity of the nth pixel S ( n), 1 .:; n .:; N, is a ran(cid:173)
`dom variable that takes a nonnegative real value for
`each individual picture S E J'. Nearby pixels in a
`smooth image are correlated, which means that the
`value of one pixel conveys information about the like(cid:173)
`lihood of its neighbors' values. This implies that hav(cid:173)
`ing transmitted the one pixel value at full expense, we
`should be able to exploit this correlation to reduce the
`cost of transmitting the neighboring pixel values.
`This is done by transforming the picture into a new
`set of coordinates which are uncorrelated over the
`collection J' and then transmitting the uncorrelated
`values.
`More precisely, the collection of smooth pictures J'
`has off-diagonal terms in the autocovariance matrix
`A = (A ( i,j)) ~~1 of the pixels in J',
`1 M
`
`A(i,j) = M ~1 Sm(i) X Sm(j),
`where Sm = Sm- ( 1 I M) :Lm Sm. A can be diagonalized
`because it is symmetric (see [ 2, Theorem 5.4, page
`120] for a proof of this very general fact). We can
`write T for the orthogonal matrix that diagonalizes A;
`
`-
`
`-
`
`(1)
`
`Scanned
`image
`
`Remove
`redundancy
`
`FIG. 1.
`
`Idealized transform coder.
`
`::===::1
`Storage I
`
`'------
`
`Page 2 of 23
`
`

`

`then TAT* is diagonal, and Tis called the Karhunen(cid:173)
`Loeve transform, or alternatively the principal orthogo(cid:173)
`nal decomposition. The rows ofT are the vectors of the
`Karhunen-Loeve basis for the collectionS, or equiva(cid:173)
`lently for the matrix A. The number of positive eigen(cid:173)
`values on the diagonal of TAT* is the actual number
`of uncorrelated parameters, or degrees of freedom, in
`the collection of pictures. Each eigenvalue is the vari(cid:173)
`ance of its degree of freedom. TSm is Sm written in
`these uncorrelated parameters, which is what we
`should transmit.
`Unfortunately, the above method is not practical
`because of the amount of computation required. For
`typical pictures, N is in the range 10 4-10 6
`• To diago(cid:173)
`nalize A and find T requires 0 ( N 3
`) operations in the
`general case. Furthermore, to apply T to each picture
`requires 0 ( N 2 ) operations in general. Hence several
`simplifications are usually made.
`
`3.2. DCT
`For smooth signals, the autocovariance matrix is
`assumed to be of the form
`
`A(i,j) = riHI,
`
`(2)
`
`where r is the adjacent pixel correlation coefficient
`and is assumed to be just slightly less than I. The
`expression I i - j I should be interpreted as I i, - j, I +
`I ic - jc I, where i, and ic are respectively the row and
`column indices of pixel i, and similarly for j. Experi(cid:173)
`ence shows that this is quite close to the truth for
`small sections of large collections of finely sampled
`smooth pictures. It is possible to compute the Kar(cid:173)
`hunen-Loeve basis exactly in the limit N-oo: in that
`case A is the matrix of a two-dimensional convolution
`with an even function, so it is diagonalized by the
`two-dimensional DCT. In one dimension, this trans(cid:173)
`form is an inner product with functions such as the
`one in the Fig. 2. This limit transform can be used
`
`0.5
`
`0
`
`-0.5
`
`-1
`
`-1.~0.6 -0.4 -0.2
`
`0
`
`0.2 0.4 0.6 0.8
`
`1.2
`
`1.4
`
`1.6
`
`FIG. 2. Example DCT basis function.
`
`instead of the exact Karhunen-Loeve basis; it has the
`added advantage of being rapidly computable via the
`fast DCT derived from the fast Fourier transform.
`The JPEG algorithm uses this transform and one
`other simplification. N is limited to 64 by taking 8 X 8
`subblocks of the picture. JPEG applies two-dimen(cid:173)
`sional DCT to the subblocks, and then treats the 64
`vectors of amplitudes individually in a manner which
`we will discuss in the next section.
`
`3.3. LCT or LOT
`Rather than use disjoint 8 X 8 blocks as in JPEG, it
`is possible to use "lapped" or "localized" (but still
`orthogonal) discrete cosine functions which are sup(cid:173)
`ported on overlapping patches of the picture. These
`local cosine transforms ( LCT, as in [ 7]) or lapped or(cid:173)
`thogonal transforms (LOT, as in [ 23]) are modifica(cid:173)
`tions of DCT which attempt to solve the blockiness
`problem by using smoothly overlapping blocks. This
`can be done in such a way that the overlapping blocks
`are still orthogonal; i.e., there is no added redundancy
`from using amplitudes in more than one block to rep(cid:173)
`resent a single pixel. For the smooth blocks to be or(cid:173)
`thogonal we must use DCT-IV, which is the discrete
`cosine transform using half-integer grid points and
`half-integer frequencies. The formulas for the smooth
`overlapping basis functions in two dimensions are de(cid:173)
`rived from the following formulas in one dimension.
`For definiteness we will use a particular symmetric
`bump function
`
`sin : (1 + sin 1TX ) ,
`
`b(x) =
`{
`
`0,
`
`3
`1
`if--< X<-,
`2
`2
`otherwise.
`
`(3)
`
`This function is symmetric about the value x = i· It
`is smooth on ( -t, ~)with vanishing derivatives at the
`boundary points, so that it has a continuous deriva(cid:173)
`tive on R. Note that we can modify b to obtain addi(cid:173)
`tional continuous derivatives by iterating the inner(cid:173)
`most sin 1rx. Let b1 ( x) = b ( x) and define
`
`(4)
`
`Then bn will have (use L'Hopital's rule!) at least 2n-J
`vanishing derivatives at -i and ~-
`Now consider the interval of integers I =
`{ 0, 1, 2,
`... , N- 1} , where N = 2 n is a positive integer power
`of 2. This can be regarded as the "current block" of N
`samples in an array; there are previous samples /' =
`{ · · ·, -2, -1} and future samples I"= {N, N + 1,
`· · · } as well. The lapped orthogonal functions are
`mainly supported on I, but they take values on { - N I
`2, ... , -1} C I' and { N, ... , N I 2 - 1} C I" as well;
`
`Page 3 of 23
`
`

`

`those are the overlapping parts. For integers k E { 0, 1,
`1} , we can define the function
`... , N
`
`Apart from b, these are evidently the basis func(cid:173)
`tions for the so-called DCT-IV transform. Figure 3
`shows one such function, with N chosen large enough
`so that the smoothness is evident. The orthogonality
`of such functions may be checked by verifying the
`equations
`
`3N/2-l
`
`2::
`
`j=-N/2
`
`{ 1
`l{;k(j),p,., (j) =
`,
`0,
`
`if k = k',
`
`if k =/= k'.
`
`(6)
`
`The chosen window function or "bell" allows co(cid:173)
`sines on adjacent intervals to overlap while remaining
`orthogonal. For example, the function ifik( j + N) is
`centered over the rangej E { -N, ~N + 1, ... , -1}
`and overlaps the function ifik· ( j) at values j E {- N I 2,
`- N I 2 + 1, ... , N I 2 - 1} . Yet these two functions
`are orthogonal, which may be checked by verifying
`the equation
`
`N/2-1
`
`2::
`
`j=-N/2
`
`if;,.(j + N)ifik' (j) = 0,
`
`for all integers k, k'.
`
`(7)
`
`Of course, rather than calculate inner products
`with the sequences tPk, we can preprocess data so that
`standard fast DCT-IV algorithms may be used. This
`may be visualized as "folding" the overlapping parts
`of the bells back into the interval. This folding can be
`transposed onto the data, and the result will be dis(cid:173)
`joint intervals of samples which can be "unfolded" to
`produce smooth overlapping segments. This is best
`illustrated by an example. Suppose we wish to fold a
`
`/ __ ,-M----~---r~·-·rr~~\~-
`1 ' I II I I II ' '
`v1
`j·····ri/V
`I I \I I I v
`v v v
`
`0.5
`
`0
`
`-0.5
`
`·l
`
`-0.4 -0.2
`
`0
`
`0.2 0.4 0.6 0.8
`
`I
`
`1.2
`
`1.4
`
`FIG. 3. Example LCT basis function.
`
`smooth function across 0, onto the intervals {-N /2,
`... , -1} and {0, 1, ... , Nl2
`1}, using the bell b
`defined above. Then folding replaces the function f =
`f (j} with the left and right parts fo- and fo+:
`
`fo_(j)
`f(j), ifj<-NI2,
`
`b( -); i) f ( j) b( j ~ i ) f (
`1 ) '
`if j E { - N I 2, ... , -1},
`
`fo+ (j)
`
`b( j ~ i) f ( j) + b(- j;; i) f (- j - 1),
`if j E { 0, 1, ... , N 12 - 1},
`f(j), ifj;;;:.NI2.
`(8)
`
`The symmetry of b allows us to use b (- x) instead of
`introducing the bell attached to the left interval. This
`action divides f into two independent functions (the
`even and odd parts of f ) which merge smoothly
`around the grid point 0. The process is an orthogonal
`transformation. We can fold the smooth function
`around the grid point N in a similar manner:
`
`ft- (j)
`f ( j) ,
`if j < N /2,
`b(-jN t-1)f(j) be~i 1)
`Xf(2N
`1), ifjE {NI2, ... ,N 1},
`
`j
`
`fl+ (j)
`
`. + 1
`b 1 N 2 - 1 f( j)
`(
`
`)
`
`+b(-j;;j-1)t(2N-j 1),
`
`ifj E {N, N + 1, ... , 3NI2- 1},
`if j ;;;:. 3N I 2
`( 9 )
`
`t ( j) ,
`
`The new function fo defined below is a smooth, inde(cid:173)
`pendent segment of the original smooth function{,
`restricted to the interval of values { 0, 1, ... , N
`1}:
`
`if j E { 0, 1, ... , N /2 - 1},
`ifjE {N/2, N/2 + 1, ... ,N -1}.
`(10)
`
`Page 4 of 23
`
`

`

`We can now apply theN-point DCT-IV transform
`directly to /0 •
`We can likewise define f m (j) for the values j E
`{mN, mN + 1, ... , (m + 1)N- 1} by the same
`folding process, which segments a smooth function f
`into smooth independent blocks. Folding to intervals
`of different lengths is easily defined as well. We can
`also generalize to two dimensions by separably fold(cid:173)
`ing in x and then in y.
`Unfolding reconstructs f from fo- and fo+ by the
`formulas
`
`f(j)
`
`b( -j;; i) fo- ( J) + b( j ~ i) fo+ ( - J - 1) ,
`
`ifj E { -N/2, ... , -1},
`
`b( j ~ t) fo+ ( J) - b( - j N- i) fo- ( - J - 1) ,
`
`ifjE{0,1, ... ,N/2-l}. (11)
`
`Composing these relations yields f (j) = [ b (j + V
`N) 2 + b(
`-t/N} 2
`] f(j). This equation is verified
`by the bell b defined above, for which the sum of the
`squares is 1.
`
`3.4. Adapted Block Cosines
`We can also build a library of block LCT bases (or
`block DCT bases) and search it for the minimum of
`some cost function. The chosen "best LCT basis" will
`be a patchwork of different-sized blocks, adapted to
`different-sized embedded textures in the picture. It
`will then be necessary to encode the basis choice to(cid:173)
`gether with the amplitudes. A description of two ver(cid:173)
`sions of this algorithm and some experiments may be
`found in [13].
`
`3.5. Subband Coding
`A (one-dimensional) signal may be divided into
`frequency subbands by repeated application of convo(cid:173)
`lution by a pair of digital filters, one high-pass and
`one low-pass, with mutual orthogonality properties.
`Let { hk} :!i/, { gk} :!(/ be two finite sequences, and
`define two operators H and G as
`
`M-l
`
`(Hf)l< = L hJij+2k•
`j=O
`
`M-1
`
`( G I),= L gJj+2k·
`j=O
`
`(12)
`
`H and G are defined on square-summable signal se(cid:173)
`quences of any length. They are also defined for peri(cid:173)
`odic sequences of (even) period P, where we simply
`interpret the index of f as j + 2k (mod P). In that
`case, the filtered sequences will be periodic with pe(cid:173)
`riod P/2.
`
`The adjoints H* and G* of Hand G are defined by
`
`(H* I h
`
`L
`
`O"".k-2j<M
`
`hk-2jf;,
`
`(G*/h =
`
`L
`
`gk-2jt·
`0"".k-2j<M
`
`(13)
`
`Hand G are called (perfect reconstruction) quadra(cid:173)
`ture mirror filters (or QMFs) if they satisfy a pair of
`orthogonality conditions:
`
`HG* GH*
`
`0; H* H + G*G
`
`I.
`
`(14)
`
`Here I is the identity operator. These conditions
`translate to restrictions on the sequences { hk } , { gk } .
`Let m0 , m 1 be the bounded periodic functions defined
`by
`
`m0 (~)
`
`M-1
`
`:Z: hkeik~, mr(~)
`k=O
`
`M-1
`
`L gkeik~.
`k=O
`
`(15)
`
`Then H, G are quadrature mirror filters if and only if
`the matrix below is unitary for all ~:
`
`m 0 (0 m0 (~ + rr).).
`m1 ( 0 mr( ~ + rr)
`
`(
`
`(16)
`
`This fact is proved in [11]. QMFs can be obtained by
`constructing a sequence { hk} with the desired low(cid:173)
`pass
`filter
`response, and
`then putting gk
`( - 1 ) k h M _ 1_k. That reference also contains an algo(cid:173)
`rithm for constructing a family of such { hk}, one for
`each even filter length M.
`The frequency response of one particular pair of
`QMFs ( "C30") is depicted in Fig. 4. We have plotted
`the absolute values of m0 and m 1 , respectively, over
`one period [
`rr, rr]. Note that m 0 attenuates frequen(cid:173)
`cies far from 0, while m 1 attenuates those near 0.
`The traditional block diagram describing the action
`of a pair of quadrature mirror filters is shown in Fig. 5.
`On the left is convolution and downsampling (by 2);
`on the right is upsampling (by 2) and adjoint convo(cid:173)
`lution, followed by summing of the components. The
`broken lines in the middle represent either transmis(cid:173)
`sion or storage.
`The underlying functions of subband filtering are
`produced by iterating H* and G* until we have
`enough points. For example, 10 iterations of H* ap(cid:173)
`plied to the sequence e0 = { · · ·, 0, 0, 1, 0, 0, · · ·}
`produce a 1024-point approximation to the smooth
`function whose translates span the lowest-frequency
`sub band. Likewise, a single G * after 9 iterations of H*
`applied to e0 produces a 1024-point approximation to
`the next lowest-frequency function. These are distin(cid:173)
`guished examples; the first is called the "scaling" or
`
`Page 5 of 23
`
`

`

`0.8
`
`0.6
`
`0.4
`
`0.2
`
`-3
`
`-1
`
`-3
`
`Low pass C30
`
`High pass C30
`
`FIG. 4. Absolute values of m 0 and m 1 •
`
`"father" function, while the second is called the
`"wavelet" or "mother" function. They are depicted in
`Fig. 6 for a particular QMF ( "C30," with 30 taps).
`Higher-frequency subbands are spanned by func(cid:173)
`tions with more oscillations, which are produced by
`using G* earlier in the iteration. The sequence of
`filters used to generate a function can be converted to
`an integer in binary notation as follows. Put F~ = H*
`and Fr = G* in the formula for the function ( respec(cid:173)
`tively, put F0 = Hand F 1 = G). Then for any pair of
`integers nand L with 0 ~ n < 2L we can write n = n02°
`+ n 12 1 + · · · + nL_12L-I, where ni E { 0, 1} for all i =
`0, 1, ... , L- 1. To that combination (n, L) we can
`.
`F* F*
`F*
`assocmte a vector
`no o
`nL_ 1 eo.
`n 1 o • • • o
`For example, the functions given in Fig. 7 are 1024-
`point sequences H*G*(H* ) 8 e0 and G*G*(H* ) 8 e0 , re(cid:173)
`spectively, given by (n, L) = (2, 10) and (n, L) = (3,
`10). It is well known (and shown in [17]) that the
`number of oscillations of the vector produced in this
`manner increases with n', where n is the Gray-code
`permutation of n'. The renumbering n -
`n' relates
`Paley order to sequency order for Walsh functions
`and has an analogous effect in the smooth case. This
`fact can be used to analyze the spectrum of acoustic
`signals by measuring the amplitudes of wavelet
`packet coefficients. Acoustic signal compression by
`wavelet packet and best-basis methods was discussed
`in [32].
`Now we can define four two-dimensional convolu(cid:173)
`tion -decimation operators in terms of H and G,
`namely the tensor products of the pair of quadrature
`mirror filters:
`
`def
`F0 = H®H,
`F 0v(x,y) = L hihjv(i + 2x,j + 2y)
`
`i,j
`
`def
`F 1 =H0G,
`F 1v(x,y) = L higjv(i + 2x,j + 2y)
`
`i,j
`
`(17)
`
`(18)
`
`def
`F 2 = G0H,
`F 2v(x, y) = L: gihjv(i + 2x,j + 2y)
`
`i,j
`
`def
`F3 = G0G,
`F 3v(x, y) = L: gigjv(i + 2x,j + 2y)
`i,j
`
`(19)
`
`(20)
`
`These convolution decimations have the following
`adjoints:
`
`F~v(x,y) = L: hx-2ihy-2jv(i,j)
`
`i,j
`
`Frv(x, y) = L: hx-2igy-2jv(i,j)
`
`i,j
`
`F;v(x, y) = L gx-2ihy-2jv(i,j)
`
`i,j
`
`F;v(x, y) = L gx-2igy-2jv(i,j).
`
`i,j
`
`(21)
`
`(22)
`
`(23)
`
`(24)
`
`The orthogonality relations for this collection are
`
`FnF':n = OnmJ;
`
`(25)
`
`I= F~Fo G FrFl G F;F2 G F;F3.
`
`(26)
`
`By a "picture" we will mean a finite sequence in(cid:173)
`dexed by two coordinates S = S ( x, y) . It is convenient
`to regard pictures as periodic in both x andy, though
`this is not absolutely necessary. For simplicity of im(cid:173)
`plementation, we shall also assume that the x period
`(the "width" Nx = 2 n,) and they period (the "height"
`Ny = 2 ny) are both positive integer powers of 2, so that
`
`FIG. 5. Block diagram of subband filtering.
`
`Page 6 of 23
`
`

`

`C30 Scaling or father function
`
`C30 Wavelet or mother function
`
`FIG. 6. Lowest-frequency subband basis functions.
`
`we can always decimate by 2 and get an integer period.
`The space of such pictures may be decomposed into a
`partially ordered set W of subspaces W ( n, m) called
`subbands (see below), where m ~ 0, and 0 ,;;; n < 4 m.
`These are the images of orthogonal projections com(cid:173)
`posed of products of convolution decimations. Denote
`the space of Nx X Nv pictures by W ( 0, 0) (it is Nx X
`Ny-dimensional), arid define recursively
`
`W ( 4n + i, m + 1) Fi F; W ( n, m) C W ( 0, 0),
`fori= 0, 1, 2, 3.
`(27)
`
`The orthogonality condition on the QMFs implies
`that the projections from W ( 0, 0) onto W ( n, m) are
`orthogonal; i.e., they conserve energy. The subspace
`W(n, m) is (N,2-m) X (Ny2-m)-dimensional. These
`subspaces may be partially ordered by a relation
`which we define recursively as follows. We say W is a
`precursor of W' (write W-< W') if they are equal or if
`W' F* FW for a convolution decimation Fin the set
`{F0 , F 1, F2 , F 3 }. We also say that W-< W'ifthere is a
`finite sequence vl, ... ' v$ of subspaces in w such
`that W-< V1 -< · · · -< Vs-< W'. This is well defined,
`since each application ofF* F increases the index m.
`Subspaces of a single precursor W E W will be
`called its descendents, while the first generation of
`descendents will naturally be called children. By the
`orthogonality condition,
`
`The right-hand side contains all the children of W.
`The subspaces W ( n, m) are called subbands, and
`
`the transform coding method that first transforms a
`signal into sub band coordinates is called subband cod(cid:173)
`ing. If S E W ( 0, 0) is a picture, then its orthogonal
`projection onto W ( n, m) can be computed in the
`standard coordinates of W ( n, m) by the formula
`F(l) • • • F<rnl W ( 0, 0), where the particular filters
`F < 1 l • · • F (m l are determined uniquely by n. Therefore
`we can express in standard coordinates the orthogo(cid:173)
`nal projections of W ( 0, 0) onto the complete tree of
`subspaces W by recursively convolving and decimat(cid:173)
`ing with the filters.
`The quadrature mirror filters Hand G form a parti(cid:173)
`tion of unity in the Fourier transform (or wavenum(cid:173)
`ber) space. The same is true for the separable filters
`F;. They can be described as nominally dividing the
`support set of the Fourier transform S of the picture
`into dyadic squares. If the filters were perfectly sharp,
`then this would be literally true, and the children of W
`would correspond to the four dyadic subsquares one
`scale smaller. We illustrate this in Fig. 8.
`Figure 9 shows two generations of descendents, the
`complete decomposition of R 4 X R 4
`• The sub bands
`are labeled by the "n" index in W ( n, m). Within the
`dyadic squares are the n indices of the corresponding
`subspaces at that level. If we had started with a pic(cid:173)
`ture of N X N pixels, then we could repeat this decom(cid:173)
`position process log2 (N) times.
`All subbands together form a quadtree, in which
`each subspace forms a node and directed edges go
`from precursors to descendents. The orthogonality
`relation among the subs paces implies that every con(cid:173)
`nected subtree which contains the root W ( 0, 0)
`corresponds to an orthonormal subband decomposi(cid:173)
`tion of the original picture: the sub bands correspond
`
`Higher frequency C30 wavelet packets
`
`FIG. 7. Higher-frequency subband basis functions.
`
`Page 7 of 23
`
`

`

`FIG. 8. Four subband descendants of a picture.
`
`to the leaves of the subtree. Having stated this general
`nonsense result, let us consider specific examples.
`The subbands W ( n, m) as we define them are in
`one-to-one correspondence with rectangular regions
`(in fact squares) in wavenumber space, and the quad(cid:173)
`tree stacks these regions one on top of the other. We
`can idealize various orthogonal subband bases as dis(cid:173)
`joint covers of wavenumber space. A few of these are
`schematically represented in Fig. 10.
`The leftmost decomposition is a sub band decompo(cid:173)
`sition in which we have all the subbands at a fixed
`level-in this case, level 2. The subbands are labeled
`( 0, 2), ( 1, 2), ( 2, 2), ( 3, 2), ... , (15, 2) as in Fig. 9.
`The middle decomposition produces two-dimensional
`"isotropic" wavelets, i.e., those which have the same
`scale in both the x and y directions. The sub bands in
`this decomposition are labeled by ( 0, 4), ( 1, 4), ( 2, 4),
`(3, 4), (1, 3), (2, 3), (3, 3), (1, 2), (2, 2), (3, 2), (1,
`1), ( 2, 1), and ( 3, 1).
`The rightmost decomposition is an adapted sub(cid:173)
`band basis such as might be discovered by minimizing
`some cost function over all subband decompositions.
`In that particular example, the subspaces are labeled
`by ( 0, 4)' ( 1, 4)' ( 2, 4)' ( 3, 4)' ( 4, 4)' ( 5, 4)' ( 6, 4)' ( 7,
`4)' ( 8, 4)' ( 9, 4)' ( 10, 4)' ( 11, 4)' ( 3, 3)' ( 16, 4)' ( 17'
`4)' ( 18, 4)' ( 19, 4)' ( 5, 3)' ( 6, 3)' ( 7' 3)' ( 32, 4)' ( 33,
`4), (34, 4), (35, 4), (9, 3), (10, 3), (11, 3), (3, 2), (4,
`2), (5, 2), (24, 3), (25, 3), (26, 3), (27, 3), (7, 2), (8,
`2), (36, 3), (37, 3), (38, 3), (39, 3), (10, 2), (11, 2),
`( 12, 2), (13, 2), ( 14, 2), and (15, 2). It is similar to a
`subband basis which is particularly effective at repre(cid:173)
`senting fingerprint images.
`3.5.1. Complete subband level. We can divide a
`picture into patches and code the spatial frequencies
`of the patches by a decimation-in-space algorithm
`(sub band coding) rather than a decimation-in-fre(cid:173)
`quency algorithm (block DCT) . This is computation(cid:173)
`ally equivalent, but has the advantage of using
`smooth basis functions (if the filters are chosen appro(cid:173)
`priately) and also greater flexibility, since there are
`additional degrees of freedom available in the choice
`of filters. The drawback is poorer time-frequency lo(cid:173)
`calization. Subband basis functions spread out in the
`spatial domain in proportion to their smoothness, and
`they have rather complicated Fourie:..· transforms.
`The sub band correlations are computed recursively
`as depicted in Fig. 11. This method is popular largely
`due to its simplicity. The amplitudes within subbands
`correspond to equal-sized basis functions distin-
`
`guished only by their spatial frequencies. Hence vari(cid:173)
`able bit allocation is simple to implement, using the
`many transform values within a subband as the ran(cid:173)
`dom variables, and allocating bits to a subband ac(cid:173)
`cording to its variance. In addition, it is possible to use
`a simple visibility weighting formula based on the
`eye's spatial frequency response. See the quantization
`section below for further details. With these addi(cid:173)
`tional steps, subband coding becomes quite competi(cid:173)
`tive with other transform coding compression algo(cid:173)
`rithms.
`3.5.2. Wavelets, or multiresolution analysis. We
`can attempt pixel decorrelation by assuming that the
`picture looks more or less the same at various differ(cid:173)
`ent scales. This assumption underlies certain models
`of vision (see [ 24] ) , and partially accounts for the
`natural appearance of fractal computer images. In
`this case, the decorrelating transformation should do
`the same thing at each scale of the picture.
`We can build an orthonormal basis with this prop(cid:173)
`erty (in one dimension) by choosing a special
`"mother function" if; = 1/;( x) and then generating its
`descendants 1/lab = 1/;ab(x) = 2a/ 21/;(2ax- b), with in(cid:173)
`tegers a and b. A (separable) two-dimensional wave(cid:173)
`let basis can be built by taking products if; ab ( x) if; cd (y),
`although recent work by several authors (for exam(cid:173)
`ple, [ 6,21,30] has produced nonseparable multidi(cid:173)
`mensional wavelet bases which could also be used. It
`is not immediately obvious that a function if; with the
`requisite properties exists. The surprising and fortu(cid:173)
`nate recent discovery of many such functions
`( [11,22,36] and others) also provided a fast O(N) al(cid:173)
`gorithm for computing the associated wavelet trans(cid:173)
`forms.
`The wavelet basis down to level L consists of the
`elements spanning the sub bands W ( 1, 1), W ( 2, 1),
`W(3, 1), W(1, 2), W(2, 2), W(3, 2), ... , W(l,
`L-1), W(2, L-1), W(3, L-1), W(1, L), W(2, L),
`W ( 3, L) and the largest-scale average W ( 0, L). The
`pixel values may be transformed into this basis via the
`two-dimensional version of the "pyramid scheme"
`described in [ 22]. This is depicted graphically in Fig.
`12. The pyramid scheme is also called a multiresolu(cid:173)
`tion analysis, and has been extensively studied. It
`provides an algorithm of complexity O(N) for the
`transformation of anN-pixel picture.
`
`2 3 6 7
`8 9 12 13
`10 11 14 15
`
`DGE 0 1 4 5
`
`Javel 0
`
`Iaveii
`
`level2
`
`FIG. 9. Two levels of subband decomposition.
`
`Page 8 of 23
`
`

`

`" · -
`
`I
`I
`
`-···
`
`i
`
`I
`
`I
`
`Basis ot subbands
`at one level
`
`Mulliresolution or
`wavelet basis
`
`Adapted subband or
`wavelet packet basis
`
`FIG. 10. Various decompositions into subbands.
`
`3.5.3. Custom subbands. There may be features
`which are most efficiently described

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket