the algorithm is fairly general and performs remarkably well with most types of images.
ACKNOWLEDGMENT

The author would like to thank Joel Zdepski who suggested incorporating the sign of the significant values into the significance map to aid embedding, Rajesh Hingorani who wrote much of the original C code for the QMF-pyramids, Allen Gersho who provided the original "Barbara" image, and Gregory Wornell whose fruitful discussions convinced me to develop a more mathematical analysis of zerotrees in terms of bounding an optimal estimator. I would also like to thank the editor and the anonymous reviewers whose comments led to a greatly improved manuscript.
`
Jerome M. Shapiro (S'85-M'90) was born April 29, 1962 in New York City. He received the B.S., M.S., and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, MA, in 1985, 1985, and 1990, respectively.

From 1981 to 1984, he was in Concord, MA, as part of the VI-A Cooperative Program, where he did his Master's thesis on phase-locked loop frequency synthesis. From 1985 to 1987, he was a Research Assistant in the Video Image Processing Group of the MIT Research Laboratory of Electronics. From 1988 to 1990, while pursuing his doctoral studies, he was a Research Assistant in the Sensor Processor Technology Group of MIT Lincoln Laboratory, Lexington, MA. In 1990, he joined the Digital HDTV Research Group of the David Sarnoff Research Center, a subsidiary of SRI International, Princeton, NJ. His research interests are in the areas of video and image data compression, digital signal processing, adaptive filtering, and systolic array algorithms.
Compressing Still and Moving Images with Wavelets*

Michael L. Hilton

Bjorn D. Jawerth

Ayan Sengupta

April 18, 1994†

Abstract
`
The wavelet transform has become a cutting-edge technology in image compression research. This article explains what wavelets are and provides a practical, nuts-and-bolts tutorial on wavelet-based compression that will help readers to understand and experiment with this important new technology.

Keywords: image coding, signal compression, wavelet transform, image transforms
`
1 Introduction

The advent of multimedia computing has led to an increased demand for digital images. The storage and manipulation of these images in their raw form is very expensive; for example, a standard 35mm photograph digitized at 12 µm per pixel requires about 18 MBytes of storage, and one second of NTSC-quality color video requires almost 23 MBytes of storage. To make widespread use of digital imagery practical, some form of data compression must be used.
`
`'
`
`
`
`three typeoofl
`
`-
`
`=.—
`
`-
`
`strongly correlated.
`
`almost all natural images, the values of nei
`
`$11-l3C|1"iI!g pixels are
`
`—S images composed of more than one spectral band, the spectral
`values for the same pixel location are often correlated.
`
`-
`
`change.
`
`Adjacent frames in a video.sequence often show very little
`i
`
The removal of spatial and spectral redundancies is often accomplished by transform coding, which uses some reversible linear transform to decorrelate the image data (Rabbani and Jones 1991). Temporal redundancy is exploited by techniques that only encode the differences between adjacent frames in the image sequence, such as motion prediction and compensation (Jain and Jain 1981; Liu and Zaccarin 1993).

In the last few years, the wavelet transform has become a cutting-edge technology in image compression research. Although the literature on wavelets is vast, most of the papers

*The work in this paper was supported by Summus, Ltd.
†To appear in Multimedia Systems, Vol. 2, No. 3.
dealing with wavelet-based image compression are written by specialists for the specialist. The purpose of this article is to provide a practical, nuts-and-bolts tutorial on wavelet-based compression that will (hopefully) help you to understand and experiment with this important new technology.

This paper is organized into three main sections. Section 2 discusses the theory behind wavelets and why they are useful for image compression. Section 3 describes how the wavelet transform is implemented and used in still image compression systems, and presents some results comparing several different wavelet coding schemes with the JPEG (Wallace 1991) still image compression standard. In Section 4 we describe some initial results with a novel software-only video decompression scheme for the PC environment. We conclude the paper with some remarks about current and future trends in wavelet-based compression.
`
1.1 A Note on Performance Measures

Throughout this paper, numbers are given for two measures of compression performance: compression ratio and peak signal-to-noise ratio (PSNR). The results of both of these performance measures can be used to mislead the unwary reader, so it is important to explain exactly how these figures were computed. We define compression ratio as

\frac{\text{the number of bits in the original image}}{\text{the number of bits in the compressed image}}.

In this paper we confine our measurements to 8 bits per pixel (bpp) greyscale images, so the peak signal-to-noise ratio in decibels (dB) is computed as

\mathrm{PSNR} = 20 \log_{10} \frac{255}{\mathrm{RMSE}}
`
where RMSE is the root mean-squared error defined as

\mathrm{RMSE} = \sqrt{ \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ f(i,j) - \hat{f}(i,j) \right]^2 }

and N and M are the width and height, respectively, of the images in pixels, f is the original image, and \hat{f} is the reconstructed image. Note that the original and the reconstructed images must be the same size.
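The two measures above can be computed directly from their definitions. The following is our own minimal sketch in plain Python (the function names and the nested-list image representation are ours, not the authors'):

```python
import math

def compression_ratio(original_bits, compressed_bits):
    # ratio = bits in the original image / bits in the compressed image
    return original_bits / compressed_bits

def psnr(f, f_hat):
    """PSNR in dB for 8-bpp images (peak value 255).
    f and f_hat are equally sized 2-D nested lists of pixel values."""
    n, m = len(f), len(f[0])
    se = sum((f[i][j] - f_hat[i][j]) ** 2
             for i in range(n) for j in range(m))
    rmse = math.sqrt(se / (n * m))
    return 20 * math.log10(255 / rmse)
```

For instance, a 512 x 512 8-bpp image compressed to 16384 bytes gives a compression ratio of (512 x 512 x 8) / (16384 x 8) = 16.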
`
2 Wavelets

The purpose of this section is to provide an intuitive understanding of what wavelets are and why they are useful for signal compression. For a more rigorous introduction to wavelets, see (Daubechies 1992), (Chui 1992), or (Jawerth and Sweldens 1992).

One of the most commonly used approaches for analyzing a signal f(x) is to represent it as a weighted sum of simple building blocks, called basis functions:

f(x) = \sum_i c_i \Psi_i(x)

where the \Psi_i(x) are basis functions and the c_i are coefficients, or weights. Since the basis functions \Psi_i are fixed, it is the coefficients which contain the information about the signal.
The simplest such representation uses translates of the impulse function as its only bases, yielding a representation that reveals information only about the time domain behavior of the signal. Choosing the sinusoids as the basis functions yields a Fourier representation that reveals information only about the signal's frequency domain behavior.

For the purposes of signal compression, neither of the above representations is ideal. What we would like to have is a representation which contains information about both the time and frequency behavior of the signal. More specifically, we want to know the frequency content of the signal at a particular instant in time. However, resolution in time (\Delta t) and resolution in frequency (\Delta \omega) cannot both be made arbitrarily small at the same time because their product is lower bounded by the Heisenberg inequality

\Delta t \, \Delta \omega \ge \frac{1}{2}.

This inequality means that we must trade off time resolution for frequency resolution, or vice versa. Thus, it is possible to get very good resolution in time if you are willing to settle for low resolution in frequency, and you can get very good resolution in frequency if you are willing to settle for low resolution in time.

The situation is really not all that bad from a compression standpoint. By their very nature, low frequency events are spread out (or non-local) in time and high frequency events are concentrated (or localized) in time. Thus, one way that we can live within the confines of the Heisenberg inequality and yet still get useful time-frequency information about a signal is if we design our basis functions to act like cascaded octave bandpass filters, which repeatedly split the signal's bandwidth in half.
To gain insight into designing a set of basis functions that will satisfy both our desire for information and the Heisenberg inequality, let us compare the impulse function and the sinusoids. The impulse function cannot provide information about the frequency behavior of a signal because its support (the interval over which it is non-zero) is infinitesimally small. At the opposite extreme are the sinusoids, which cannot provide information about the time behavior of a signal because they have infinite support. The basis functions we seek should have support widths somewhere between these two extremes. The different support widths allow us to trade off time and frequency resolution in different ways; for example, a wide basis function can examine a large region of the signal and resolve low frequency details accurately, while a short basis function can examine a small region of the signal to resolve time details accurately.
To simplify things, let us constrain all of the basis functions in \{\Psi_i(x)\} to be scaled and translated versions of the same prototype function \Psi, known as the mother wavelet. The scaling is accomplished by multiplying x by some scale factor; if we choose the scale factor to be a power of 2, yielding \Psi(2^v x) where v is some integer, we get the cascaded octave bandpass filter structure we desire. Because \Psi has finite support, it will need to be translated along the time axis in order to cover an entire signal. This translation is accomplished by considering all the integral shifts of \Psi,

\Psi(2^v x - k), \quad k \in \mathbb{Z}.

Note that this really means that we are translating \Psi in steps of size 2^{-v}k.¹ Putting this all together gives us a wavelet decomposition of the signal:

f(x) = \sum_{v \text{ finite}} \; \sum_{k \text{ finite}} c_{v,k} \Psi_{v,k}(x)

¹That is, because \Psi(2^v x - k) = \Psi(2^v (x - 2^{-v} k)).
`
`
`
`
where

\Psi_{v,k}(x) = 2^{v/2} \, \Psi(2^v x - k)

(the multiplication by 2^{v/2} is needed to make the bases orthonormal). So far we have said nothing about the coefficients c_{v,k}. They are computed by the wavelet transform, which is just the inner product of the signal f(x) with the basis functions \Psi_{v,k}(x).

The comparison between wavelets and octave bandpass filters was not made just for pedagogical reasons. Wavelets can, in fact, be thought of and implemented as octave bandpass filters, and we shall treat them as such for the remainder of this paper.
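To make the inner-product view concrete, here is a small numerical sketch of our own using the Haar mother wavelet (chosen for brevity; the paper itself works with the smoother W6 wavelet, and the function names are ours). The coefficient c_{v,k} is approximated as a Riemann sum of f(x) \Psi_{v,k}(x) on [0, 1):

```python
def haar(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def psi(v, k, x):
    # Psi_{v,k}(x) = 2^{v/2} * Psi(2^v x - k)
    return 2.0 ** (v / 2.0) * haar(2.0 ** v * x - k)

def coeff(f, v, k, n=4096):
    """Approximate c_{v,k} = <f, Psi_{v,k}> on [0, 1) by a Riemann sum."""
    dx = 1.0 / n
    return sum(f(i * dx) * psi(v, k, i * dx) for i in range(n)) * dx
```

For f(x) = x this recovers c_{0,0} = -1/4, and a constant signal yields zero coefficients, reflecting the zero mean of the wavelet.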
`
`3 Still Image Compression
`
`A wide variety of wavelet-based image compression schemes have been reported in the lit-
`erature, ranging from simple entropy coding to more complex techniques such as vector
`quantization [Antonini et al. 1992; Hopper and Preston 1992), adaptive transforms {De-
`sarte et a1. 1992; Wickerhouser I992), tree encodings (Shapiro 1993; Lewis and Knowles
`1.992), and edge-based coding [Fhment and Mallet 1992). All of these schemes can be de-
`scribed in terms of the general framework depicted in Fig. 1.
`By appmre "m5né1s€€’m' fi$fi§ ng
`ouafonn ‘eaIfii_¢i0!¢Oi**‘nnd'*'0n1iiug~‘the"qua:rrtfle::Y" vaIiié§T" "Ifi'E‘g’é“"reE'a'm‘i‘ffiEt1h‘ii‘"f€"§!(§!n1-
`itI‘?e1'f!flg" tflfhofliifefikffifdpefifflflifih. We now dcribe each of the boxes in
`Fig. 1 in more detail.
`
`
`
Figure 1: Block diagram of wavelet-based image coders.
`
`
`
3.1 Implementing the Wavelet Transform

The forward and inverse wavelet transforms can each be efficiently implemented in O(n) time by a pair of appropriately designed Quadrature Mirror Filters (QMFs) (Croisier et al. 1976). Therefore, wavelet-based image compression can be viewed as a form of subband coding (Woods and O'Neil 1986). Each QMF pair consists of a lowpass filter (H) and a highpass filter (G) which split a signal's bandwidth in half. The impulse responses of H and G are mirror images, and are related by

g_n = (-1)^n h_{1-n}.   (1)
The impulse responses of the forward and inverse transform QMFs, denoted (\tilde{H}, \tilde{G}) and (H, G) respectively, are related by

\tilde{g}_n = g_{-n}   (2)

\tilde{h}_n = h_{-n}   (3)
To illustrate how the wavelet transform is implemented, we shall use Daubechies' W6 wavelet (Daubechies 1988). We chose this wavelet because it is well known and has some nice properties. One such property is that it has three vanishing moments, which means the transform coefficients will be zero (close to zero) for any signal that can be described by (approximated by) a polynomial of degree 2 or less. The mother wavelet basis for W6 is shown in Fig. 2. The filter coefficients for H of W6 are

h_0 =  0.332670552950
h_1 =  0.806891509313
h_2 =  0.459877502119
h_3 = -0.135011020010
h_4 = -0.085441273882
h_5 =  0.035226291882

from which the coefficients for G, \tilde{H}, and \tilde{G} can be derived using Equations 1, 2, and 3. The impulse responses of H and G are shown in Fig. 3.
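The listed coefficients can be sanity-checked numerically: an orthonormal wavelet lowpass filter should sum to \sqrt{2}, have unit energy, and be orthogonal to its own even shifts. The following quick check is ours, not from the paper:

```python
import math

# Daubechies W6 lowpass taps as listed in the text
h = [0.332670552950, 0.806891509313, 0.459877502119,
     -0.135011020010, -0.085441273882, 0.035226291882]

sum_h = sum(h)                                   # should be close to sqrt(2)
energy = sum(c * c for c in h)                   # should be close to 1
shift2 = sum(h[k] * h[k + 2] for k in range(4))  # even-shift orthogonality, ~0

print(sum_h, energy, shift2)
```

These identities are exactly what makes the inverse transform a simple transpose of the forward transform.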
A one-dimensional signal s can be filtered by convolving the filter coefficients c_k with the signal values:

\hat{s}_i = \sum_{k=0}^{M-1} c_k \, s_{i-k}

where M is the number of coefficients, or taps, in the filter. The one-dimensional forward wavelet transform of a signal s is performed by convolving s with both \tilde{H} and \tilde{G} and downsampling by 2. As dictated by Equation 1, the relationship of the \tilde{H} and \tilde{G} filter coefficients with the beginning of signal s is

[Alignment diagram: the \tilde{H} taps line up starting at the first sample s_0, while the \tilde{G} taps extend to the left of s_0.]

Note that the \tilde{G} filter extends before the signal in time; if s is finite, the \tilde{H} filter will extend beyond the end of the signal. A similar situation is encountered with the inverse wavelet transform filters H and G. In an implementation, one must make some choice about what
`
Figure 2: The mother wavelet basis function for W6.
`
Figure 3: Frequency response of the W6 QMFs.
`
`
`
values to pad the extensions with. A choice which works well in practice is to wrap the signal around its endpoints, i.e.,

\ldots \; s_{n-2} \; s_{n-1} \; s_0 \; s_1 \; s_2 \; \ldots \; s_{n-2} \; s_{n-1} \; s_0 \; s_1 \; s_2 \; \ldots

thereby creating a periodic extension of s.
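A single analysis/synthesis level with periodic extension can be sketched as follows. This is our own plain-Python illustration, not the authors' code; for the highpass taps it uses the alternating-flip form g_k = (-1)^k h_{M-1-k}, which is equivalent to Equation 1 up to a shift, and synthesis is the transpose of analysis (valid because the W6 filter bank is orthonormal):

```python
H = [0.332670552950, 0.806891509313, 0.459877502119,
     -0.135011020010, -0.085441273882, 0.035226291882]

def highpass(h):
    # alternating flip: g_k = (-1)^k * h[M-1-k]
    return [(-1) ** k * h[len(h) - 1 - k] for k in range(len(h))]

def dwt1(s, h=H):
    """One forward level with wrap-around (periodic) extension; len(s) must be even."""
    n, g = len(s), highpass(h)
    lo = [sum(h[k] * s[(2 * i + k) % n] for k in range(len(h)))
          for i in range(n // 2)]
    hi = [sum(g[k] * s[(2 * i + k) % n] for k in range(len(h)))
          for i in range(n // 2)]
    return lo, hi

def idwt1(lo, hi, h=H):
    """Inverse level: upsample by 2 and filter (the transpose of dwt1)."""
    n, g = 2 * len(lo), highpass(h)
    s = [0.0] * n
    for i in range(len(lo)):
        for k in range(len(h)):
            s[(2 * i + k) % n] += h[k] * lo[i] + g[k] * hi[i]
    return s
```

Because the transform is orthonormal, reconstruction is exact up to the rounding of the published coefficients, and the energy of the signal is split between the lowpass and highpass halves.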
Fig. 4 illustrates a single 2-D forward wavelet transform of an image, which is accomplished by two separate 1-D transforms. The image f(x,y) is first filtered along the x dimension, resulting in a lowpass image f_L(x,y) and a highpass image f_H(x,y). Since the bandwidth of f_L and f_H along the x dimension is now half that of f, we can safely downsample each of the filtered images in the x dimension by 2 without loss of information. The downsampling is accomplished by dropping every other filtered value. Both f_L and f_H are then filtered along the y dimension, resulting in four subimages: f_LL, f_LH, f_HL, and f_HH. Once again, we can downsample the subimages by 2, this time along the y dimension. As illustrated in Fig. 4, the 2-D filtering decomposes an image into an average signal (f_LL) and three detail signals which are directionally sensitive: f_LH emphasizes the horizontal image features, f_HL the vertical features, and f_HH the diagonal features. The directional sensitivity of the detail signals is an artifact of the frequency ranges they contain.
`
Figure 4: Block diagram of the 2-D forward wavelet transform.
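The row/column factorization just described can be sketched compactly with the 2-tap Haar filter standing in for W6 (our own illustration, chosen so the arithmetic is easy to follow; the subimage names follow the text):

```python
import math

R = 1.0 / math.sqrt(2.0)

def rows_split(img):
    """Lowpass/highpass filter each row and downsample by 2 (Haar taps)."""
    lo = [[R * (row[2 * i] + row[2 * i + 1]) for i in range(len(row) // 2)]
          for row in img]
    hi = [[R * (row[2 * i] - row[2 * i + 1]) for i in range(len(row) // 2)]
          for row in img]
    return lo, hi

def transpose(m):
    return [list(col) for col in zip(*m)]

def dwt2(img):
    """One 2-D level: filter along x, then along y, giving f_LL, f_LH, f_HL, f_HH."""
    fL, fH = rows_split(img)
    LL, LH = [transpose(s) for s in rows_split(transpose(fL))]
    HL, HH = [transpose(s) for s in rows_split(transpose(fH))]
    return LL, LH, HL, HH
```

On the 2 x 2 image [[1, 2], [3, 4]] the average subimage carries almost all of the energy, while the three detail subimages capture the horizontal, vertical, and diagonal variation.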
`
It is customary in wavelet compression to recursively transform the average signal.² The number of transformations performed depends on several factors, including the amount of compression desired, the size of the original image, and the length of the QMF filters. In general, the higher the desired compression ratio, the more times the transform is performed. After the forward wavelet transform is completed, we are left with a matrix of coefficients that comprise the average signal and the detail signals of each scale. No compression of the original image has been accomplished yet; in fact, each application of the forward wavelet

²A more sophisticated decomposition strategy is to use the wavelet packets of Coifman and Meyer (Wickerhauser 1992; Coifman and Wickerhauser 1992).
`
`
transform causes the magnitude of the coefficients to grow, so there has actually been an increase in the amount of storage required for the image. Compression is achieved by quantizing and encoding the wavelet coefficients.

The 2-D inverse wavelet transform is illustrated in Fig. 5. The average and detail signals are first upsampled by 2 along the y dimension. Upsampling is accomplished by inserting a zero between each pair of values in the y dimension. The upsampling is necessary to recover the proper bandwidth required to add the signals back together. After upsampling, the signals are filtered along the y dimension and added together appropriately. The process is then repeated in the x dimension, yielding the final reconstructed image.
`
`
Figure 5: Block diagram of the 2-D inverse wavelet transform.
`
3.2 Quantization

The forward wavelet transform decorrelates the pixel values of the original image and concentrates the image information into a relatively small number of coefficients. Fig. 6 (Left) is a histogram of the pixel values for the 8-bits per pixel (bpp) 512 x 512 Lena image, and Fig. 6 (Right) is a histogram of the wavelet coefficients of the same image after the forward wavelet transform is applied. The "information packing" effect of the wavelet transform is readily apparent from the scarcity of coefficients with large magnitudes.

The sharply peaked coefficient distribution of the wavelet transformed image has a lower zero-th order entropy (4.24 bpp) than the original image (7.46 bpp), thereby increasing the amount of lossless compression possible.
We can also take advantage of the energy invariance property of the wavelet transform to achieve high-quality lossy compression. The energy invariance property says that the total amount of energy in an image does not change when the wavelet transform is applied. This property can also be viewed in a slightly different way: any changes made to the values of the wavelet coefficients will result in proportional changes in the pixel values of the reconstructed image. In other words, we can eliminate (set to zero) those coefficients with small magnitudes without creating significant distortion in the reconstructed image. In
`
`
`
`
Figure 6: LEFT) Normalized histogram of the pixel values in the original Lena image. RIGHT) Normalized histogram of the wavelet transform coefficients of the same image.
`
practice, it is possible to eliminate all but a few percent of the wavelet coefficients and still get a reconstructed image of reasonable quality. The elimination of small valued coefficients can be accomplished by applying a thresholding function

T(x, t) = \begin{cases} 0 & \text{if } |x| < t \\ x & \text{otherwise} \end{cases}

to the coefficient matrix. The amount of compression obtained can now be controlled by varying the threshold parameter t.
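Applied to a flattened coefficient matrix, the thresholding function looks like this (a minimal sketch; the function names are ours):

```python
def threshold(coeffs, t):
    """Zero out every coefficient whose magnitude is below t."""
    return [0.0 if abs(c) < t else c for c in coeffs]

def survivors(coeffs, t):
    """Fraction of coefficients kept at threshold t (useful when tuning t)."""
    kept = sum(1 for c in coeffs if abs(c) >= t)
    return kept / len(coeffs)
```

Raising t zeroes more coefficients, which lengthens the zero runs seen by the encoder and so raises the compression ratio at the cost of reconstruction quality.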
Higher compression ratios can be obtained by quantizing the nonzero wavelet coefficients before they are encoded. A quantizer is a many-to-one function Q(x) that maps many input values into a (usually much) smaller set of output values. Quantizers are staircase functions characterized by a set of numbers \{d_i, i = 0, \ldots, N\} called decision points and a set of numbers \{r_i, i = 0, \ldots, N-1\} called reconstruction levels. An input value x is mapped to a reconstruction level r_i if x lies in the interval (d_i, d_{i+1}].
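The staircase mapping can be written directly from the definition (a sketch of ours; the example decision points and levels in the test below are made up for illustration, not taken from Table 1):

```python
import bisect

def quantize(x, decision_points, levels):
    """Map x to reconstruction level r_i when x lies in (d_i, d_{i+1}];
    requires len(levels) == len(decision_points) - 1."""
    i = bisect.bisect_left(decision_points, x) - 1
    i = max(0, min(i, len(levels) - 1))  # clamp values outside [d_0, d_N]
    return levels[i]
```

`bisect_left` finds the first decision point that is >= x, so subtracting one lands on the half-open interval (d_i, d_{i+1}] that the definition assigns to r_i.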
To achieve the best results, a separate quantizer should be designed for each scale, taking into account both the properties of the Human Visual System (Marr 1982) and the statistical properties of the scale's coefficients. The characteristics of the Human Visual System guide the allocation of bits among the different scales, and the coefficient statistics guide the quantizer design for each scale. Descriptions of various bit allocation strategies can be found in (Matic and Mosley 1993) and (Clarke 1985).

The distribution of coefficient values in the various detail signals can be modeled reasonably well by the Generalized Gaussian Distribution (GGD). The probability density function of the coefficient distribution at each scale v, then, can be given by (Abramowitz and Stegun 1965):
`
P_v(x) = \frac{\alpha_v \, \gamma(\alpha_v, \sigma_v)}{2 \, \Gamma(1/\alpha_v)} \exp\left( -\left[ \gamma(\alpha_v, \sigma_v) \, |x| \right]^{\alpha_v} \right)
`
`
`
`
Scale v    Codeword Size (in bits)    Decision Points and Reconstruction Levels

[Most rows of Table 1 did not survive the scan. The recoverable entries include the level sequences 20, 33, 51, 73, 99, 128, 161, 201, 245, 291, 339, 388, 438, 500, 591, 738, 2048 and 28, 41, 61, 85, 113, 143, 179, 223, 267, 314, 362, 410, 487, 539, 644, 834.]

Table 1: Lloyd-Max quantizers generated using magnitude data from the W6 transformed Lena image.
`
`
`
`
`
`
`
where

\gamma(\alpha_v, \sigma_v) = \sigma_v^{-1} \left[ \frac{\Gamma(3/\alpha_v)}{\Gamma(1/\alpha_v)} \right]^{1/2}

and

\Gamma(x) = \int_0^\infty e^{-t} t^{x-1} \, dt.

Here \sigma_v is the standard deviation of the coefficient distribution at scale v and \alpha_v is a shape parameter describing the exponential rate of decay of the distribution at scale v. For example, when \alpha_v = 1 the GGD becomes the Laplacian pdf, while \alpha_v = 2 leads to the Gaussian pdf. The \alpha_v appropriate for a particular class of images can be computed using the \chi^2 test or by simple observation. For the W6 wavelet and the Lena image, several of the appropriate values are \alpha_v = 0.53, 0.43, 0.39 with corresponding \sigma_v = 5.18, 14.33, 20.17.
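The GGD density above is straightforward to evaluate with the standard gamma function; setting \alpha = 2 recovers the Gaussian and \alpha = 1 the Laplacian, which makes a handy sanity check (our own sketch, with our own names):

```python
import math

def ggd_pdf(x, alpha, sigma):
    """Generalized Gaussian density with shape alpha and standard deviation sigma."""
    g = (1.0 / sigma) * math.sqrt(math.gamma(3.0 / alpha) / math.gamma(1.0 / alpha))
    return (alpha * g / (2.0 * math.gamma(1.0 / alpha))) * math.exp(-((g * abs(x)) ** alpha))
```

At x = 0 with sigma = 1, alpha = 2 gives 1/sqrt(2*pi) (the standard Gaussian peak) and alpha = 1 gives 1/sqrt(2) (the unit-variance Laplacian peak).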
`
The design of scalar quantizers will also depend on the type of encoder to be used. If the encoder uses fixed-length codewords, the Lloyd-Max algorithm (Max 1960) can be used to design a quantizer that minimizes the mean-squared quantization error. If a variable-length entropy coder is used, uniform quantization is optimal (in the mean-squared error sense) when the coefficient distribution is Laplacian; for other distributions, the algorithm in (Wood 1969) can be used to design an optimal quantizer. Vector quantization (Gersho and Gray 1992) has also been used in wavelet compression systems, for example (Antonini et al. 1992) and (Bradley and Brislawn 1993).

Table 1 lists the decision points and reconstruction levels for a set of Lloyd-Max quantizers generated using magnitude data from the W6 transformed Lena image. One extra bit per codeword is needed to represent the sign of the quantized coefficient. The codeword sizes were chosen by experimentation.
`
`
`
`
3.3 Coding the Coefficients

The encoder/decoder pair, or codec, has the task of losslessly compressing and decompressing the sparse matrix of quantized coefficients. Codec design has received a tremendous amount of attention, and a wide variety of schemes exist (Lelewer and Hirschberg 1987). The design of a codec is usually a compromise between (often conflicting) requirements for memory use, execution speed, available bandwidth, and reconstructed image quality.

For applications requiring fast execution, simple run-length coding (Pratt 1978) of the zero-valued coefficients has proven very effective. (The distribution of non-zero coefficients is such that rarely is it profitable to run-length encode them.) The zero run-lengths can be encoded using either fixed-length codewords or variable-length entropy coding; entropy coding is more expensive to implement, but can improve the peak signal to noise ratio (PSNR) of reconstructed images by as much as 3 dB, depending upon the energy-packing ability of the wavelet in use.
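Run-length coding of the zero coefficients can be sketched as follows (our own minimal encoding, pairing each zero run with the nonzero value that ends it; a production codec would further entropy code these pairs):

```python
def rle_encode(coeffs):
    """Encode a coefficient list as (zero_run, value) pairs.
    A final (run, None) pair flushes any trailing zeros."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append((run, None))
    return pairs

def rle_decode(pairs):
    """Expand the (zero_run, value) pairs back into the coefficient list."""
    coeffs = []
    for run, value in pairs:
        coeffs.extend([0] * run)
        if value is not None:
            coeffs.append(value)
    return coeffs
```

The scheme pays off precisely because thresholding and quantization leave long runs of zeros between the few surviving coefficients.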
For applications requiring the best possible image quality at a particular compression ratio, a technique such as Shapiro's Zero Tree encoding (Shapiro 1993) is a better choice. The execution speed tradeoff between these two coders is quite dramatic: our run-length entropy coder takes less than one second to compress a 512 x 512 8-bpp image on a 66-MHz 80486 computer, and Zero Tree-like coders can take up to 45 seconds to compress the same image on the same machine. However, the quality of the Zero Tree image is much better: 36.28 dB PSNR (Shapiro 1993) vs. 33.2 dB PSNR at a compression ratio of 16:1.
`
3.4 Compression Results

The peak signal to noise ratios of several different wavelet compression techniques applied to the 512 x 512 8-bpp Lena image are compared in Fig. 7. The graphs show that both the encoding technique and the particular wavelet used can make a significant difference in the performance of a compression system: the Zerotree coder performs the best; biorthogonal wavelets (Antonini et al. 1992; Cohen 1992; Averbuch et al. 1993) perform better than W6; and variable length coders perform better than fixed length coders.

The performance of a baseline JPEG (Wallace 1991) image compressor³ is also indicated in Fig. 7. At compression ratios less than 25:1 or so, JPEG performs better numerically than the simple wavelet coders. At compression ratios above 30:1, JPEG performance rapidly deteriorates, while wavelet coders degrade gracefully well beyond ratios of 100:1. Figure 8 compares the visual quality of several image coders.
`
4 Video Compression

The wavelet transform can also be used in the compression of image sequences, or video. Video compression techniques are able to achieve high quality image reconstruction at low bit rates by exploiting the temporal redundancies present in an image sequence (Le Gall 1991; Liu 1991). Wavelet-based implementations of at least two standard video compression techniques, hierarchical motion compensation (Uz et al. 1991) and 3-D subband coding (Karlsson and Vetterli 1988), have been reported (Zhang and Zafar 1992; Lewis and Knowles 1990). However, the computational expense of the wavelet transform has so far prevented its use in real-time, software-only video codecs for PC-class computers. In this section, we

³The JPEG coder that is included in Version 3.21 of John Bradley's xv program was used to generate the JPEG performance data shown in Fig. 7.
`
`
`
`
Figure 7: A comparison of the image reconstruction quality of several different wavelet coders and JPEG. The tests were performed on the 512 x 512, 8-bpp Lena image. "VLC" means Variable Length Coder, and "FLC" means Fixed Length Coder.
`
describe a new technique for rapidly evaluating the inverse wavelet transform and illustrate its use in the context of a software-only video decoder based on frame differencing.
`
4.1 The Basic Idea

The playback speed of a wavelet-based video coder depends in large part upon how long it takes to perform the inverse wavelet transform. A 66-MHz 80486 computer takes about 0.25 seconds to compute a complete inverse wavelet transform for a 256 x 256, 8-bpp grayscale image. Unless one finds a way to avoid performing a complete inverse transform each time an image frame is reconstructed, wavelets are not viable for software-only video of reasonably sized images.
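The frame-differencing idea that motivates this section can be sketched at its simplest as follows. This is our own toy encoding at the pixel level (each frame after the first is stored as a sparse map of changed positions); the decoder described in the text applies the same idea to wavelet coefficients rather than pixels:

```python
def encode_diffs(frames):
    """Turn a list of equal-length frames into (first_frame, list of sparse diffs)."""
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append({i: c - p
                      for i, (p, c) in enumerate(zip(prev, cur)) if c != p})
    return frames[0], diffs

def decode_diffs(first, diffs):
    """Replay the sparse diffs to reconstruct the full sequence."""
    frame = list(first)
    out = [list(frame)]
    for d in diffs:
        for i, delta in d.items():
            frame[i] += delta
        out.append(list(frame))
    return out
```

For a slowly varying sequence each diff dictionary is tiny, so playback work is proportional to what changed, not to the frame size.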
Fortunately, it is not necessary to perform the complete inverse transform for each frame in a slowly varying image sequence. The value of an arbitrary pixel p in an image is determined by a weighted sum of all the basis vectors in the wavelet decomposition that include p in their region of support. If the weights (i.e., the wavelet coefficients) of these basis vectors do not change between frames in an image