`
`.
`
`'9
`
`than 8 bits per character). the entropy of the alphabet consisting of 2-letter pairs is lower than the entropf Of
`the original single letters. This idea may be extended to Jr-letter pairs, with the caveat that building the code
`grows in complexity exponentially with the number of bits. Hence there are some compromise algorithms
`which search a small range of alphabet lengths. Some of these, like U:tix.'"" ‘4:ompressfl)' which is based on
`the Lempel-Ziv algorithm, are remarkably effective as well as quite fast .[Ziv,Lempel77i, [Ziv,Le:npel7ll[.
`Shannon's theorem must he applied carefully, since in practice there are no infinite signals. The infinite-
`lengtb signal assumption allows us to ignore the the hired overhead of transmitting the new alphabet to
`the receiver.
`In practice, there must always be agreement between transmitter and receiver on both the
`original and the minimal alphabets, which must be established before any communication occurs. L-osslesa
`redundancy removers must always account this overhead. ‘Indeed, if the alphabet were taken to be huge
`(i.e., Large enough so that a particular finite signal consisted of only one ‘‘letter'‘'), then the entropy of the
`message would be 0 and no hits would be required to transmit it. In that case, the receiver must know the
`origins] alphabet and also the minimal alphabet, which is equivalent to assuming that the receiver already
`knows the message. having received it as part of the overhead.
`
`8.1: JPE-G.
`
`6: [mos cowncsslon .u.ooJurHMs
`
`The figure below deocribea, in block diagram form, the Joint Photographic Experts Group (JPE-G] picture
`compression algorithm.
`
`
`
`Figure 6.1-18.
`
`JPEG visibility-weighted transform coding.
`
`It specializes the generic transform coder by that cutting up the picture into independent 8 x 8 blocks,
`and also by multiplying the transformed pixel value: by “visibility weights“ determined from psychovisusl
`measurements. Small sub-bloclcs are needed to limit the complexity of the algorithm. A striking visual
`justifimtion for the Weighting can be found in [DeVore..l'awer1h.L1.lcier], p.73-1.
`There are two main dificultiee with JPEG. First,
`the algorithm cannot take advantage of large-scale
`correlations among pixels. such as those caused by regular textures covering large areas. This limitation is
`imposed by the choice of an 3 X 3 block size, and in some cases is overcome by using a. larger bloclr site.
`It is known, for example. that 16 x 13 or even 32 x 32 bloc]: sizes give better compreuion ratios for the
`same distortion when the images are high resolution fingerprint images. Second. the artifacts introduced by
`JPEG losses severely afiect those image processing algorithms which detect and treat sharp edges. Narnelyz
`small pixels errors introduced along the sharp boundaries between blocks appear to be false edges because
`they line up. This artifact is called “htoctriness” when it becomes noticeable to the eye at high compression
`rates, and is mitigated by over-quantizing the (0,0)-frequency or “DC" component of the transformed pixel
`values. The JPEG standard allows for proprietary algorithms to be used for visibility weighting, and various
`implementations use this flexibility to gain considerable reduction in visible distortion at fixed compression
`ratios.
`
`'
`
`_
`
`Page 226 of 437
`
`Page 226 of 437
`
`
`
`'30
`
`\\
`
`MLADEN VICTOR WICKEEEAUSER
`
`The definitive description of the JPEG standard may he found in [JPEG1] and [.lPEG2l. A more expusn
`itory description which includes more details than this nrticle may he iound in [WollaA:ei.
`
`‘
`6.2: LOT or LOT.
`The specialization of this method from the generic trsnsiorm coder is depicted in the figure below:
`
`
`
`Figure 8.2-19.
`
`Block LCT transform coding with frequency welghting.
`
`The main difference (the use of smooth blocks). is noted by the shaded block. As with JPEG, we can
`take advantage of the varying frequency response of the eye and use frequency weighting of the transformed
`amplitudes to reduce perceived distortion at a given compression rate.
`
`6.3: Orthogonal wavelets.
`The special features which set the wavelet transform apart from the generic: transform coder are depicted
`in the figure below:
`
`‘
`
`
`
`Figure 8.3-20.
`
`Wavelet transform signal compression.
`
`One of the imsge compression algorithms available on o VLSI microprocessor (for example from Aware.
`Inc. in Cambridge, Mass.) Inflows the above scheme, using short QMFS. a small number of decomposition
`levels, fixed-point arithmetic, varinble.b'1t-allocation, and arithmetic coding. The operations are optimized
`for speed. to make the implementation useful for motion-picture image compression.
`Another nlgorithm is described in [A.nlon1'oni,Bn.rlaud,Ma:hieu,Dsubet:h.ies]. it uses short QMF wavelet
`transformation, followed by vector quontizsiion of the triples of corresponding-position values from subhsnds
`W(l,k}, W[2,-7:), and W{3,k) for each levei k.
`
`Page 227 of 437
`
`Page 227 of 437
`
`
`
`. HIGH-RESOLUTION STILL PICTURE COMPRESSION
`
`21
`
`3.4: Custom wavelet packets.
`
`We can chooee some other basis subset of subbenda, rather than the one giving the wavelet basis. That will
`require refiltering the '“deta.i1" values, so the pyramid scheme is not enough. Schematically, this compression
`algorithm uses a. aubliand selection criterion:
`
`
`
`Figure 5.4-21.
`Signal compression with custom wavelet. packets.
`
`' The proposed FBI-Yale fingerprint image compression algorithm [Hopper] uses just such a custom selection I
`of aubbanda, discovered by experiments with the beat-baaia method, and trials involving amplifying or
`attenuating the various aubbanda. Values {tom the custom aubbeuds are weighted by visibility, then scaled
`and truncated to integers. with the integers pasaed to Lernpe!—Ziv or Huifman coding Compression. Another
`feature of the FBI-Yale algorithm in its use of biorthogonal QMFe (see [Cohen]]. These can be linear in
`phase so their underlying wavnlet Eunctiona are refiection-symmetric. unlike the orthogonal wavelets which
`will always make an artificial diltiuction between left and right. Biorthogoual wavelets do not decor-relate the
`signal aa well an orthogonal wavelets, but the difference appears to be of minor importance in practice. As
`can be seen from Figures 3.5-6 and 3.5-? above, the orthogonal “C 30" wavelets are quite close to rellection
`symmetric; conversely, some of the symmetric biorthogonal vvzweleta are quite close to orthogonal.
`
`6.5: Best-bula of wavelet packets.
`
`We can also atore all the Amplitudes in the complete quadtree subhand expansion. and then search for the
`most efficient representation by the "beat-basis” algorithm. The stream of amplitudes thus produced can be
`quantized and coded as before. Scltetnatically. this compression algorithm is depicted below:
`
`Page 228 of 437
`
`Page 228 of 437
`
`
`
`22
`
`,‘
`
`MLADEN VICTOR. WIICKEHJIAUSETI
`
`
`
`Vlauolupoehlllramtorm
`
`Figure 3.5-22.
`
`Wavelet packet (best-basis} signal compression.
`
`The transformation from a. picture to its best-basis representation is nonlinear. Since the “best” basis
`choice depends upon the picture, it must also be encoded together with the transformed pixel values. Unlike
`those vslua, the basis description must be coded without any losses. After the lossless coding of the nonlinear
`information. the new pixel values are an orthogonal linear trasnformation of the original pixels. The whole
`process therefore has condition number 1.
`
`The greater the number oi bones in the Lihrargr, the more side information must be transmitted to deséribe
`which one was selected. In [lInm('ha-ndi“&n,Vetterl.i1]. for example, only subbsnds close to the wavelet sub-
`bands are included in the beat-bards search.
`in other schemes, the number of decomposition levels is kept
`small but all bases within that collection of levels are considered.
`
`There are at least two ways to include the basis information. Besrrbaais amplitudes may be individually
`tagged with their coordinates in the best-basis tree. Unless the number of transformed pixei values is
`very small, or the precision very high, tagging large amplitudes will not be the most eficient compression
`method. Alternatively, we may agree upon an ordering of the amplitude and a. regular scheme for describing
`the basis. We will include some side information which describes the chosen basis. and we shall then write
`all the (quantized) amplitude: from that basis out into a. stream for entropy coding. We obtain compression
`because the quantized stream of transfonned pixels has a lower entropy that the original stream of pixels.
`There are various orderings one could choose; which one is most eificient depends upon the constraints in
`the basis search. This method is essential for a. competitive picture compression algorithm.
`
`Imagine L +1 arrays oi’ N x N numbers. The fir.-at array reprments the original signal, which we may call
`2. The second is s concatenation ol the -1 subspaces obtained via separable filter convolution-decimation,
`i.e., the spaces Fo[X)Fg(Y)Z, F;(X)Fo{Y)Z, Fg{X).F'1(Y)Z, and F;_(X)F1(Y]Z. Array or represents the
`concatenation: of the 4"‘ subsploes that make up level In of the wavelet packet decomposition. Of course,
`we must have I} 5 m 5 L5 log3(N). To understand the qliadtree coding schemes dmcrihed lnelow, it is
`helpful to visualize these arrays stacked one atop the other. as in the figure below:
`
`Page 229 of 437
`
`Page 229 of 437
`
`
`
`‘HIGH-RESOLUTION S'I‘ll.L PICTURE COMPRESSION
`
`_
`
`2.‘!
`
`
`£'&".%"%%.%'.r——'%
`
`
`
`Flgure s.s.-ss.
`'I‘wo-dimensional wavelet packet qundtree.
`
`A basis subset ol' the suhbands has the the property that if one element of a suhbsnds is in the basis, then
`that whole subband is in the basis. Also. if a subspace is in the basis, then none of its descendant or ancestor
`subbands are in the basis. Such a. subset can be identified with a cover by dyadic subarrays. Looking down
`through the stool: of arrays in Figure 6.5-23, this cover gives a tiling of the original N X N array by square
`subarrsys of size 2""'N K 2"""N, where m is the level from which that particular subspace was chosen.
`
`5.5.1: Tn; the lorpe uohm. Suppose that B is a beat-basis subset of W, chosen by counting amplitudes
`above a predetermined threshold. We may then extract just thae non-negligible amplitudes and transmit
`them, together with their locations in the tree. This number of amplitudes is no greater than thevnumher
`of pixels, since it is chosen after comparison with the original basis. among others. Define the compression
`ratio in this case to be the ratio of the number of retained amplitudes to the number of original pistols. With
`thresholding to e and counting. the compression ratio measures how well a library represents a picture 5 at
`a fixed precision e.
`'
`In practice it is sometimes necessary to fix the ooropression ratio, for example due to bandwidth limitations.
`In that case we may use th entropy coil: function to obtain the most concentrated representation, and then
`take only as many of the largest amplitudes as we can afford. This may be accomplished by first sorting
`into decreasing order by absolute Value, than reading all the desired number of amplitudes. Alternatively,
`since we know in advance how mI.l.'|j' amplitudes we can use, it may he more eflicient to bubble up the top
`- few amplitudes and discard the rest oi’ the array. The second method is better if the number of retained
`amplitudes is less than log,{N}, where N is the number of pixels.
`Together with the transformed pixel value we must include the extra information describing which basis
`was used. and we must somehow indicate which basis vector ea.|:l1 quantized value represents. Suppose the
`quadtree begins with nail! x N signal and decomposes it down to level L, where L 5 log,[N). Then there
`are LN’ wavelet packet amplitudes, and it takes log,{LN3] hits to encode each individual one. This method
`is used and documented in the wavelet paclnst software programs available by anonymous ftp from the Yale
`Mathematics Department [pascal].
`‘
`The overhead of this method is a constant number of bits per retained value. This technique is used in
`
`Page 230 of 437
`
`Page 230 of 437
`
`
`
`‘
`
`MLABEN VICTOR WICKBRHAUSER
`
`2-l
`
`\
`numerical analysis. where the “picture” might be a matrix and the pixels are double-precision floating point
`values. Then we will retain the nonnegligible values to full precision and ignore the rest. We win because
`we have reduced the number of parameters in the problem. and thus the complexity of the calculation.
`
`6.5.2: Coding the complete basis. In this scheme, we use two arrays to describe the transformation, a.
`levels list and an amplitudes list. The subspaces in the but basis are encountered in depth-lirst order as
`they are selected, and this order can be used to code the quadtree. The side infonnation consists of an
`array of integers which describe at which level the next subbsnd in the best basis resides. The subbands
`themselves are traversed in depth-llrst order, and the level of the next subband is determined by a very short
`integer of at most 1og,(L) hits {for on L-level decomposition). Some extra economy is possible, since the
`presence in the best heels of a node at level :11 sometimes implies that the following nodes can only be in
`levels m,... .L. The extreme case, For example, is that each first occurence in the levels list of a subspace
`at the deepest level L implies that all of its siblings are also in the best basis. Hence whenever the deepest
`level appears, it is not necessary to follow it with 3 “L” symbols. Further, the only levels list which contains
`a marker for level 0 is the list {D}, which implies that the root of the tree. or the original signal. is the best
`representation. That unique situation can be encoded with a single hit, and we can use the value 0 instead
`of L to represent the deepest level. thus using only the range 0. 1.. .
`. .L - 1 in the coding of the tree. It
`then becomes convenient to reverse the meaning of “level” so that it describes distance flow the bottom
`level 0. The original signal will now be labeled with L. the firs: decomposition level L — 1, end so on. This
`side information costs at most vl;L“‘ log,L bite, or 4L'1 log,L}N’ bits per pixel. The worst cases occur
`when every subspace at the deepest level is chosen and the levels list is {[1, 0,. .. ,0} (4’5" 0’s'}, or when the
`mutt-to-deepest level is dtosen and the levels list is {1,. .
`. , 1} (4"“ 1's). The side information will also be
`compressed loselesely so put. of the compression algorithm, and because of the redundancy in both of these
`cases the loesless compression will be extremely efiicient.
`Let us consider an ettaxnple. Suppose that the picture consists of a single wavelet packet froth subbend
`W[5,3) (i.e., frequency 5 and level 3). The best basis algorithm would have a choice of decompositions since
`many comparisons of children and parents would show equal inioi-motion costs.
`If we use the convention
`that levels closer to the root (i.e., parents) are preferable, then the canonical best-basis decomposition of
`this picture includes the following collection of subbnnde:
`
`
`
`Figure 5.5.2-24.
`Various descriptions of the eubbands in the best basis.
`
`If the maxi-
`Then the depth-first-search encounter order levels list would be {2.3.3,3,3.2,2, 1,1, 1}.
`mum level were declared to be 3, we could economize in the ‘conventional manner above and write this as
`{l_.,tl,1.1.2.2.2}.
`
`Page 231 of 437
`
`Page 231 of 437
`
`
`
`2.’:
`\l-IIGH-RESOLUTTON STILL PICTURE COMPRESSION
`Suppose also that the dimensions of the picture are 16 x 16. At level 3 each subbond has 2 x 2 amplitudes.
`in positions [k.,l:,,) E {(D.0),(1,0),{0,l),(il,1)}; let. us suppose that the wavelet packet in our picture has
`amplitude l at position (0.0). We can write the amplitudes list in any predefined manner, but it is convenient
`and conventional to dump the amplitudes from each subspace as it is encountered in depth-first order. This
`approach. in the exariaple case. produces a. stream of small square arrays, each transmitted as lists of rows:
`
`(3.1)
`(2.1)
`(1.1)
`(3.2)
`(2.2)
`(L53 (5.3! (0.3) (7.3)
`(0.?)
`0000 00
`10 00
`00 000000000000000000000000 00000000
`0000 00
`00
`00
`00 00000000 0000000000000000 00000000
`0000
`0000 0000 00000000 00000000 00000000
`0000
`000000000000000000000000 00000000
`000O000000000000 00000000
`00fl0000000000000 00000000
`000000000000000000000000
`00000000 00000000 00000000
`
`_ The various subbands present in the best-basis of wavelet packets provide a segmentation of the picture
`in the time-frequency domain. Some selection criterion can be applied to preselect (in a. crude manner) a.
`desired feature of the signal, such as a given texture or a feature at A. selected scale‘. Then the amplitudes
`within the subbsnd can be used as a. signature of the selected feature. while the positions of large amplitudes
`an be used to more precisely locate the feature. In the example above, we could track down. the position
`of 3. (5.3) wavelet packet in a noisy signal by finding the largest amplitude in the (5, 3) suhband. Features
`characterized by linear Cflmpinll-I-i0:’I8 of wavelet plrlrets can likewise be detected. Since wavelet packets form
`a basis this idea prividea a general feature-detection algorithm. Of course. its effectivenims depends upon
`the simplicity of the wnvelet packet description of the desired feature.
`
`6.3: Adapted LET and DOT.
`It is possible to apply the best-basis algorithm to the library of bases obtained from black DCT or LCT
`transforms with varying block sizes. The algorithms are depicted schematically below:
`
`
`
`-Illlbd bhdl aberewloeel
`DOIHII |f'I|‘lH'lN'|'lI
`
`Fllgura 0.6-25.
`
`Adapted block local cosine transform signal compression.
`
`The structure oi the subspaces in and: decomposition is also a quadtree, only the descendents are produced
`by recursive subdivision into subbloclrs rather than recursive filtering and decimation. Each suhblo_ck contains
`all the frequencies title to the maximum possible with its number of samples, and each block is the direct sum
`
`Page 232 of 437
`
`Page 232 of 437
`
`
`
`M
`
`~.
`
`-,
`
`MLADBN VICTOR. WICKERHAUSER
`
`of its four children. Because the subspaces (now suhhlocks) have the same quadtree structure as subbands
`of a wavelet packet basis, th some basis-coding convention may be applied.
`In the adapted IJCT case (ADCT), the blocks account {or independent regions of the original picture, with
`the usual sharp boundaries and attendant blocking artifacts. In the adapted LCT case LALCT}. adjacent
`blocks are snreared'tI_:gether. In both cases, the best-basis algorithm chooses a particular segmentation of
`the picture into blocks. This choice conveys‘ information about the contents of the picture; the spectrum of
`a chosen block can then he used to identify what is present in that part of the picture.
`_
`A decomposition down to a deep level, where the subblodts are Very small, results in rather sharp bells
`in the ALCT algorithm. This can be avoided by using “multiple folding,” an alternative to ALCT which
`preserves the product of window width and window steepness. The cost is poorer spatial localization for a _
`given block size. This algorithm is described in detail in _[I'lang.Seré].
`6.7: Other methods.
`
`One may also consider the correlation of a picture with the complete set of tensor product: oi wavelet
`packets. These form a. larger uouhomogeueous tree of subspaces which must be labelled with an s-scale and
`y-scale, rather than with a single scale as above. There is a more general notion of admissible subset, and
`s. best-basis search algorithm to find extrema. This basis will produce higher compression ratios at a given
`threshold, at a cost of greater computational complexity and increased overhead describing the basis. The
`practical disadvantages rule out this generalization for image compression: further details may be found in
`[INRIA].
`
`7: COMPARISONS
`
`their eflectiveness as measured by
`The compression algorithms described above may be competed for:
`rate-distorion curves; the usefulness of the transformed pixel values as input to various transformations; the
`types of artifiacts which appear when they are pushed to the point of visible distortion; and the complexity of
`. computing the transforms. In addition, we would like to have an inherent measure of the “compressibility”
`of an image for use in practice: can we apply such a measure algorithmically or will it always be necessary
`to involve humans in judging acceptable distortion?
`'
`
`1.1: 'I‘ransforming oorrtprossod pictures.
`Wavelet pockets contain a mixture of spatial and spectral information, as do wavelets, DCT and LCT
`values. A list of the -moat energetic subspaces used in a compressed picture conveys alsignatute for the
`picture. Certain operators are very eflicieutly represented by their action on the wavelet packet amplitudes.
`Some examples include spatial filtering and local image enhancement [INRJA], edge and texture detection
`[Hwnng,Ma.llat|, principal orthogonal factor classification [W2], and local rescaling.
`For purposes of explanation no will assume that the picture is a function of 2 real variables supported
`in the region [0,1] x {(1. I], with a resolution of 2"". Let (n.-m.Ir) be the index of an amplitude in the
`complete wavelet packet expansion of a picture .5‘. Here 111 = 0, 1,. .
`. , L. We may divide U 5 n < 4"‘ into
`:1, and r1, by talring the odd and even bits in its binary expansion, respectively. These are then arranged
`' in “sequency” order [by Gray-encoding; see [INRIAI] and then will approximately correspond to 2 and y
`frequencies. Likewise,
`it may be divided into its .1: and to components It. and Icy. Then the transforms
`described above may be defined by their action on the amplitudes c = c(rt,rn,Ir}. Since the functions
`underlying the wavelet pacltet transiorm are smooth, small errors in the transformed amplitudes c appear as.
`smoothly Varying deviations, i.e., low-contrast distortion. This is a. desirable property for transformations
`that might be coupled, for example, with edge detection algorithms or other procedures which are sensitive
`to abrupt changes or large derivatives.
`- For some examples oi one-dimensional signal processing algorithms which are defined by theiractions on
`wavelet packet and local cosine coefficieuts, see [INRIA] or [CMW].
`
`.
`
`Page 233 of 437
`
`Page 233 of 437
`
`
`
`lH.IGH—-RE‘-5O|LUTIOf1‘ STILL PICTURE COM?RE5Sl0N
`
`'2?
`
`7.1.1: Spatial filtering. To remove [or attenuate} high frequency components in a. particular direction
`a = tan :1, simply discard any amplitude for which [:1 — tan -3-}! < r with n, > C‘. n, > C. where the cutoff
`frequency 0 and the directionality e are parameters of the filter.
`
`‘$11.2: Local image erilioncernent. To remove high frequency noise from a particular region of the picture.
`employ spatial filtering as described above. ‘but only on those amplitudes for which 2"“ is less then the
`diameter of the region, and for which 2"'(It..}ir,.] is a point. in the region.
`'
`11.3: Edge detection. Suppose we wish to find an edge of scale mg in a. picture of resolution L, for
`example it white region which darkens to block in A distance 2'“""'. Such no edge will contribute large
`amplitudes to scales 1,2,. .. ,m.; at high frequencies. We may graph it by selecting only thine amplitudes
`c(n,m_l:) above a (large) threshold, with m < mo and n greater than an appropriate monotone function of
`rat, and then plot points at 2'''[i:,,k,).
`'
`7.14: Ttzlure detection. Textures may be characterized by lines: dependencies among wavelet packet sm-
`plitudes at nearby translations. Suppose for example that we wish to detect a texture in which c(nu,mo,l:+
`I] = --c(ng.mu,k +1} for all It in same region. An operator which added Ln amplitude at it to its neighbor
`at it + 1 would have We in its range at it, indicating where that texture was located. ‘
`
`7.1.5: Local rescaling. We simply replace c(n,m,Ir) with c{n',m‘,it’] for a restricted range of k’s. The
`map it i-t rt’, etc., is determined by the rescaling. For exnrnple, if rt’ = 2n and m’ = m + l, then we will
`increase the magnification of the picture locally with little change in the frequency content.
`7.2: Theoretical dimension and expected compression.
`Roughly speaking, entropy measures the logarithm of the number of meaningful amplitudes in the signal.
`The lower is this quantity, the better the compression for as given level of distortion. Some experimental
`results [Devore,Jawerth,Luoier] suggest tha.t.l'or pictures. there is 1. strong correlation between the rate'of
`decrease of transformed pixel values s.nd the entropy of the PDF of the values. This relationship depends
`upon the pictures having it definite degree of smoothness (technically. sampled functions from it smooth
`Bmov splice).
`Define the theoretical dimension d(s:) of a. sequence 2 = {z.};’;. to be the exponential of its entropy.-
`
`[ 7,2-35)
`
`d(z] = exp
`
`log.
`
`,
`
`l:,'l'. This quantity always lies between l and it if there are no more than it nonzero
`Where ]|:l|: =
`values 2.-. It can serve as a. measure of the compressibility of picture sfter transform coding. The higher the
`theoretical dimension 41(3) of the transformed pixel Values 2:, the more quantization birts should be used and
`the lower the expected compression for at given distortion leveL
`7.3: Operation counts.
`
`':'..5'.J‘: Tinmfannalion ooornplczily. The complexity of'.lPEG is the cornplerrity of the 2-dimensional
`factored DCT restricted to 8 X 8 blocks, multiplied by the number of such blocks in the picture. Restriction
`makes it on 0(N) algorithm, since the Huffman coding is 0(N]. The constant is rather small, around 10 or
`‘ BI}.
`
`LCT adds one step which costs N operations in each dimension. so the complexity is the some 0(N) as
`for JPEG with a constant of 12 or so.
`Adopted-block DCT and LCT require computing up to !og‘(N} difierent block DCTILCTS, with block
`sizes from 8 x 3 to
`This unites the complexity 0[N[log" (’N)}') with constants around 3. This can he
`controlled by limiting the range of biotic siz.
`
`Page 234 of 437
`
`Page 234 of 437
`
`
`
`28
`
`__
`
`MLADEN VICTOR WICKERIIAUSER
`
`Orthogonal wavelet transformation has complexity 0(N), with the constant controlled by the length of
`the Ql.-!Fs. Really long QMI"s can be factored and then the constant is controlled by the logarithm of QMF
`length, but such fllters are more useful in acoustic signal compression than image compression. Typical
`cnnntants are around 10.
`
`Adapted wavelet padet algorithms are more expensive. Suppose that S is an N-element picture. The
`convolution-decimatinns to generate the tree of amplitudes require 0(N log‘ (N}} nperations. The informa-
`tion cost calculationhas complexity O[N1og,{N}}. Labeling “ltept" subspaces is equivalent to a breadth-lirst
`search through the tree, which has complexity O[N). Locating topmost “kept” subspaces is equivalent to a.
`depth-first search, with complexity O(N]. and filling an output register with amplitudes from the best-basis
`takes an additional O{N) operations.
`
`Reconstruction from the retained amplitudes has no more than the same complexity as generating all the
`amplitudes. since the transformations are orthogonal and have the same complexity as their adjoints
`
`in the adapted transforms we must in general reproduce the entire tree to reconstruct the signal. so the
`reconstruction cornplezclty is of the order as the compression compleieity. The constant in smaller, though.
`because on average the chosen basis resides in a strictly smaller aubtree, and because no additional steps
`such as searching for a best basin are needed during reconstruction.
`
`_
`
`.
`
`7.3.3; Quantization campleriiy. We can perform a first sort to determine the largest amplitudes in the
`output register: this has complexity O(Nlog._.(N)). The alternative, extracting the top tarnplitndes, requires
`O(tN) operations. We choose the more eflcierit method: in either case the total cornpletdty of this step is
`0(N1osz(Nl}.
`
`0a the other hand, if we choose to qnantire all amplitudes to some fixed precision, it is clear that we need
`0(N} operations to replace each amplitude with is bin number.
`
`7.3.3: Entropy coding complexity. If we choose to transrnit all the quantized amplitudes, then We need
`to employ a standard entropy coding algorithm to gain compression. Those algorithms we have mentioned
`(arithmetic, Hufirnan, and Lempel-Ziv) all have complexity O[N], which is dominated by other steps in the
`algorithm.
`
`8: ExPan1MEN':‘nL COMPM-usorls
`
`To put the algorithms into perspective, we l'I.l.ve prepared compressions of three images: an artificial still
`life generated by the ray tracing program "rayshade“ written by Craig Kalb; a fingerprint digitized to 500
`dpi at 8 bits grayacale, and a. fingerprint digitized to 900 dpi at 8 bits grayscale provided by 'Ibrn Hopper.
`The fingerprints can be called high resolution images; the smaller one is sampled to the draft FBI standard
`[Hopper], while the larger is sampled to a higher standard used for early evaluation. The first irnage is
`sampled much more coarsely bltt still causing a. great deal of information at all scales.
`
`The reproductions below are provided to indicate what the images loo]: like, not for any kind of distortion
`analysis. The raw data files are available from the author by request.
`.
`
`L
`
`Page 235 of 437
`
`Page 235 of 437
`
`
`
`_ HIGH-RESOLUTION STILL PICTURE COMPRESSION
`
`2'9
`
`
`
`A -In-|‘0"|"'ElI“lII.ll.
`ZSGJZSJIHN
`
`B Fhgu-prH|rlI|u,gear5aJ¢L
`H2131‘! 13:‘!-‘F
`
`C Flllurpriucluupnlflfiohi.
`Idzlxlflurflhpp
`
`Figure 8-26.
`
`Three gray-scale test Images.
`
`We will compare the rate-distortion curves for 4 transform described above: JPEG, LCT, orthogonal
`wavelets, and best-basis of wavelet pockets.
`
`8.1: Implementation parameters. The JPEG implementation uses the following luminance visibility
`matrix, which was provided by Dan Fuhrmann and Arun Kumar:
`16
`11
`10
`16
`24
`40
`12
`12
`14
`I9
`26
`55
`14
`13
`16
`24
`40
`5'?
`14
`1?
`22
`29
`51
`8?
`18
`22
`37
`56
`I58
`1B9
`24
`35
`55 64
`81
`104
`49
`54
`T3
`87
`193
`121
`T2
`92
`95 98
`I12
`100
`
`( 8.1-3?)
`
`5 I.
`60
`69
`80
`1B3
`113
`120
`103
`
`51
`55
`56
`62
`?T
`92
`101
`99
`
`The z-frequencies increase from left to right. and the y-frequencies increase from top to bottom, just as
`inclines in 3 matrix. What is displayed are the reciprocals of the weights, that is, the algorithm specifies that
`corresponding amplitudes in each 8 X 8 block be divided by the numbers above before uniform quantization.
`ln [act the weights are scaled an that weighted qriantizntion can be implemented by division and truncation
`to the nearest integer.
`The LCT implementation ulea 8 X 8 blocks as well, and the bell function in the one given by Eq.(3.3—ii)
`above. It also use the brequency weighting matrix. in Eq.(B.1-37).
`_
`The orthogonal wavelet and wavelet packet transforms were computed using the periodized “C 6" QMFs,
`' whose nonzero coeflcienta are given below:
`.-.0 = 95 = o.o3a5so7m4raam'49,
`n, = -94 = -0.126969l25396205200.
`--0.07T161555-195773493.
`—o.eo7491s41aasss412o,
`l}.T45(iET553934-134280,
`D.22658426519?()6856D.
`
`er:1»3-:-uau»IIIilll Il‘:9‘.9‘S‘E IIIIllII
`
`(8138)
`
`Page 236 of 437
`
`Page 236 of 437
`
`
`
`30
`
`\__
`
`MLADEN VICTOR ‘NICKERHAUSER
`
`All pictures were decomposed only to level 8. Amplitudes were quantized uniformly. In the best-basis case,
`the information cost filnction Wu H(:) = — E51‘? lug(r.j§). The 46 subspaces chosen for image “A” by the
`best-basin algorithm are as follows: {:1}, 0, 0. 0. 1, 0. 0, G, U. 0. 0, 0. 0. 2. 2. 0. O. 0. D, 0. 0, D, I}, t}. CI. 0, 0. I3. 0.
`CI, 0. 3. 3. 3, 4. 4, 4. 5,45. 5, 0, B. 6, 7, '1', 7}. There were 22,513 subbends in the best-basis chosen for image
`"'13," contributing TDQ6 bytes or 0.026 bits per pixel (packed. but not entropy coded) of overhead on top of
`the storage requirements for the quantized omplitudes. There were 9349 suhbends in the best-basis chosen
`for image “C.” contributing 2995 bytes or 0.0028 bits per pixel (again. packed but not entropy coded) of
`overhead. Since the wavelet basis for all the images would contain only 25 suhbends, it is clear that the beat
`bases for images “B” and “C” are quite diflerent h-om the wavelet bmle. Image "A." on the other heed. is
`best represented by something quite close to the wavelet basis.
`I
`
`‘
`‘
`
`5
`
`8.2: Rate-distortion nomperleona.
`
`Mean square error is the measure of distortion in this comparison. Some other error norms should be
`mentioned, although they are not used here. A method described in [Fh.n§.Se:é] aliows arbitrary distortion
`measures in the adapted local cosine best-basis method. In another approach, (see [R.amr:handran,Vetterli2]),
`the rate-distortion curve is used as a distortion measure for the transform. Also there are projeetsfior
`example [E‘uhrmann]) to find a perceived distortion measure for the JP