`
`IEEE TRANSACTIONS ON FMAGF, PROCESSING, VOL. 3. N0. 5. SEPTEMBER I994
`
`Rate-Distortion Optimal Fast Thresholding with
`Complete JPEGIMPEG Decoder Compatibility
`
`Kannan Ramchandran and Martin Vetlerli
`
`I
`
`cum
`
`THRESHOLD“
`
`I
`
`Abstmct— We show a rate-distortion optimal way to threshold or
`drop the DCT coefficients of the JPEG {1] and MPEG ll] compression
`standards. Out- optirnei algorithm uses a last dynamic programming
`recursive structure. The primary advantage of our approach lies in its
`complete compatibility with standard JPEG and MPEG decoders.
`
`I. Block diagram of problem statement; X is the input image. X is
`Fig.
`the quantized prethreshoidcd version [after JPEG or MPEG]. and X is the
`output of the thresholder. We want to minimize the distortion. for a fixed
`quantizer. between the original and the threshoided images subject to a bit
`budget constraint.
`
`l.
`
`lNTRODUCTiON
`
`JPEG [I] and MPEG [2] are popular DCT-based compression
`standards for still images and Video sequences. respectively. Standard-
`ization of these compression formats has spurred the wide usage of
`JPEG {and recently MPEG) decoders. The key to good compression
`when using these standards while still being compatible with the
`standard decoder lies in determining an optimal thresholding strategy
`for the DCT coefficients. We are thus interested in retaining that
`subset of the DCT coefficients for the image or video frame that is
`the “best“ in a rate—distortion (RAD) sense. Thresholding or dropping
`the less significant DCT coefficients may be desirable in the R-D
`sense. as it may lead. at a marginal sacrifice of coded. quality.
`to
`a significant reduction in coding bit rate. due to fewer coefficients
`needing to be transmitted. This is especially so when deciding the
`last nonzero coefficient, which is typically followed by an inexpensive
`"end-of-block“ code [1]. [2]. Thus. it may especially pay to get rid of
`compression—hindering sparsely interspersed insignificant coefficients
`that represent the last nonzero values before the end—of—block. In the
`case of JPEG. where the standard does not permit variable scaling
`of the different image blocks. intelligent adaptive thresholding of the
`coefficients would essentially amount to changing the quantization
`scales at the block level. without breaking the rules of the game!
`in this paper. we formulate an R—D optimal strategy to threshold
`the quantized DCT coefficients by using a fast recursive dynamic
`programming (DP)
`technique. Starting from the “highest quality
`point" after quantization at a fixed scale (for JPEG [l] or QP-level (for
`MPEG [2]). one can sweep the entire thresholding R-D curve over
`a continuous range of target bit rates (or equivalently target-coding
`qualities) by dropping insignificant coefficients in tire image or video
`frame. Thus, our algorithm could find all points that reside on the
`convex hull of the thresholding R-D curve. The appeal of the strategy
`lies in its combination of RB optimality. speed of operation. and
`its complete compatibility with standard JPEG and MPEG decoders,
`which remain blissfully oblivious to the thresholding gymnastics
`performed by the encoder. A point to note is that our algorithm
`exploits the monotonic nature of bit rate versus the zero run length
`count preceding a nonzero coefficient inherent in the Huffman tables
`of JPEG and MPEG.
`to specify a fast “pruning“ operation in the
`
`l993. The work
`Manuscript received October .h’l. 1992; revised August 4.
`of K. Ramchandran was supported by the New York State Science and
`Technology Foundation‘s CAT. The work of M. Vettcrli was supported in
`part by the National Science Foundation under grants ECD-SS-ll I ii and
`HIP-904M139.
`K. Ramchandran is with the Department of Electrical and Computer
`Engineering. University of illinois at Urbanofhampaign. Urbana. lL filSOl.
`M. Vetterli is with the Dopartment of Electrical Engineering and Center for
`Telecommunications Research. Columbia University, New York. NY lllOZ'l-
`6699.
`JEEP: Log Number 940224].
`
`DP recursion. Thus. the computational complexity of the algorithm
`becomes low enough for it to be implementable.
`The problem of thresholding using an “efficiency" measure was
`tackled in [3] and was the inspiration for the work described here.
`The approach of [3] does not attain R-D optimality. as is our goal
`here. Moreover. in this work. we use a faster dynamic programming
`formulation than that of [3] and additionally exploit the monotonicin
`of the bit rate versus zero run-length counts of the 1 PEG and MPEG
`codebooits to appreciably reduce the computational complexity of the
`algorithm. The subject of finding a locally optimal quantization matrix
`that matched the image was tackled in [7]. using a computationally
`intensive descent algorithm which is not block-adaptive.
`This paper is organized as follotvs. Section II defines the problem
`quantitatively, Section III describes the optimal solution. which
`is presented in algorithmic detail
`in Section IV. Finally. coding
`applications using JPEG and MPEG are described in Section V.
`
`ll. PROBLEM STATEMENT
`
`We wish to find that optimal set of quantized DCT coefficients to
`be retained for every 8 x 8 block of an image or video frame such
`that the mean—squared-error {MSBJ distortion (any additive distortion
`metric is feasible in general. e.g.. activity—weighted MSE) between
`the original image and the thresholded version is minimized subject
`to a maximum target coding-blt-mte constraint. or equivalently.
`the coding bit rate is minimized subject
`to a maximum allowable
`distortion constraint.
`
`1111.15. if X is the input signal. X the quantized output correspond—
`ing to a fixed scale or “anchor" level representing the maximum
`quality operating point, and Ji'
`the thresholded version of X. we
`seek to minimize the MSE distortion between X and X given the
`quantized image X. subject to a tour] coding-bit budget of Roms...
`for X. That is. our goal
`is to find
`
`utin[DlX. ;‘ifljfi'j subject to’RL-‘h 3 Hbudget
`
`(l)
`
`where X’ is a thresholded version of .i' (see Fig. I).
`
`Ill. OFl'lMAi. Sowrron
`
`A. R—D Optimality: The Constant Slope Condition
`The “hard“ constrained thresholding problem of (1) can be solved
`by being convened to an "easy“ equivalent unconstrained problem by
`“merging" rate and distortion through the Lagrange multiplier A [4}.
`The unconstrained thresholding problem becomes the detennination
`(for a fixed A) of that set of coefficients. which results in the minimum
`total Lagrangian cost defined as
`
`Jun = Drill") + mill}.
`
`(2)
`
`lflST-"ll49l945041lll (E) I994 lEEE
`
`Petitioner Microsoft Corporation - Ex. 1048, p. 700
`Petitioner Microsoft Corporation - EX. 1048, p. 700
`
`
`
`IEEE TRANSACTIONS ON IMAGE PROCESSING. VOL. 3, NO. 5. SEPTEMBER I994
`
`'l'fll
`
`The optima] coefficient search for the image can be done indepen—
`dently for every 8 x 8 image block for the fixed quality “slope" A.
`which trades off distortion for rate. This is because it can be shown
`
`{4]. [5] that, at R-D optimality. all blocks must operate at a constant
`slope point /\ on their R-D curves.
`This result is fairly intuitive as seen from the argument that. at
`optimality. each allocated bit must do an equal amount of "good" {in
`the distortion-removal sense). for otherwise one could redistribute
`bits from 'fitnprofitable“ to "profitable" blocks. It is obviously cost‘
`effective to keep doing this until an optimum is reached. where no
`block can spend the next available hit any more profitably than any
`other. i.e.. until all signal blocks have the same slope (A) on their
`rate—distortion function. The desired optimal constant slope value A'
`is not known a pt't‘rtrr' and depends on the particular target budget or
`quality constraint. Fortunately. however. A' can be obtained relatively
`painlessly via a fast convex recursion in A using the bisection
`algorithm [5}. The main advantage of the Lagrangian approach is its
`independent optimization property for each signal clement. enabling
`the independent analysis of the 8 x 8 image blocks in our case!
`For a more mathematical formulation. if T is {{l. l. 2. .
`- - . I33}. the
`set of all 8 x 8 coefficients in each block ordered in the 1—D zigzag
`scan order. 3 j T any feasible ordered subset of T. and 13(5). Rt 5 i
`the distortion and bit rate. respectively, associated with retaining the
`coefficients in 5‘. our problem of finding:
`
`Dmin = 32!} DIS} filllljt‘fl ['0 His) 3 Hinltluyl
`
`(3)
`
`is solved by introducing JtA) = [Dt31+ thSfl representing the
`Lagrangian cost of 5 associated with the quality factor A and solving
`the following equivalent unconstrained problem
`
`Jm'th = min JIM : min[D(S}I + ARISI].
`
`[4)
`
`is not known a priori
`The desired optimal constant slope value A‘
`and depends on the particular target budget or quality constraint but is
`obtained using a fast convex search using the bisection algorithm [5]
`
`ItIrninldl-l = lilgéil-IminlAl _ ARluulevtl
`
`(S)
`
`3. Fast Dynamic Programming Algorithm
`Since the optimal convex-hull solution can be found. as described
`above. by independently finding the minimum—Lagrangian-cost op-
`erating point (i.e.. one that minimizes .t = Dltlut—L + A' Raina.) for
`each block of the sequence.
`it suffices to consider a single block
`for analysis. The problem is sotved using a dynamic programming
`approach.
`is part of the standards ill. [2] is used to
`The zigzag scan that
`order the 2-D coefficients. As an initialization. one has to gather the
`Anus—‘5 associated with the incremental Lagrangian cost of going
`from coefficient j to coefficient 1- (Le. dropping all the coefficients
`between them) for all nonzero valued (j. 1r) coefficient pairs with
`j < tr. AJJIt : -Ek + ARM.— represents the “net gain" of including
`5- conditioned on the previous nonthresholded coefficient being j.
`E1. represents the “goodness“ measure as calculated by the decrease
`in squared error caused by no} thresholding It. and is given by
`Cf. - (Ci,— —Cfi ). where Cr. and Cr arc the unquantized and quantized
`coefficient values. respectively, while R). is the conditional bit rate
`in encoding coefficient it. given that the previous nonzero coefficient
`is j. i.e..
`it is the conditional cost of not thresholding It. See Fig. 2.
`the values 12,}.— can be “read off“ from the standard Huffman coding
`tables for both JPEG and MPEG and can be prestored. Note that to
`find the optimal algorithm. only the run lengths need to be stored.
`and not the actual Huffman—coded bit stream. and this represents a
`trivial memory requirement.
`
`
`
`1
`let-ci-tck-Cp l i. “Mum-mama wilhmeffldentlk
`“,1: a in maniacal bit rat: out «normals; eoefidentlk given that
`“negation: non-moneffieienthfi.
`
`DCI‘ eaefiicientrt
`
`Fig. 2, DCT coefficients of a typical 8 X 8 image block of lPEGt’MPEG
`ordered in l—D according to the zigzag scan order. Coefficient #0 is the
`dc coefficient. E". and R”. are shown as the thresholding distortion and
`nonthresholding cost awtciatcd with coefficient Li conditioned on the previous
`nonzero eoetlicient being 3'.
`
`The optimal operating point. for a fixed value of A. can be done
`(see Fig. 3)
`in a recursive fashion by finding (i)
`the minimum
`Lagrangian cost. J " [ fr}. and (ii) the optimal predecessor coefficient.
`predecessor Hr), associated with choosing coefficient
`li' as the last
`nonzero coefficient for all k : l. 2. -
`-
`- . 63. Then. starting from that
`coefficient fr'. which is the cheapest to retain as the last nonzero
`coefficient. i.e.. minimum J‘i - l. the optimal set can be "backtracked"
`from the optimal predecessor chain calculated for all predecessors of
`l." .
`
`A more elaborate step—by—step description of the algorithm follows.
`The recursion begins with coefficient 0. The cost of dropping all ac
`coefficients is stored in J'tfl‘t. Then. one proceeds with the minimum
`cost “path" that ends in coefficient
`|. There is not much choice here.
`as there is only one path that ends in coefficient l. namely. dropping
`all coefficients from 2 to 63. This cost is saved in 4"“). and the
`optimal predecessor to l
`is obviously 0. Proceeding to coefficient 2.
`the most efficient recursive way of determining the best path that
`ends in 2 is to find the optimal predecessor to 2. Le. either 0 or 1.
`Since the optimal costs associated with ending at 0 and l are known
`from .t‘lOJ and .l‘tl ]. respectively. the job of finding the cheapest
`cost path ending in 2 is simply the minimum of [J°(U} + 3.102]
`{where AJDQ is the incremental cost of going from 0 to 2) and
`[J' t l l +'_\J'12]. The smaller ofthese two costs is saved in J" (2). and
`the optimal predecessor of 2 {i.c.. the one among 0 or ] responsible
`for the smaller total cost leading to 2) is saved in predecessor (2).
`Proceeding similarly to coefficient 3. the best path ending in 3 has to
`have a direct predecessor that is either i} or i or 2. As the best costs
`associated with ending at all predecessors are known from previous
`iterations and are stored in J ° {predecessor}. and. the incremental cost
`of going from each predecessor to 3 is known from the precomputcd
`'_\Jp....irm....,3 for all predecessors 0. l. and 2. the best path ending
`in 3 is computed as the cheapest of the total costs [J' (predecessor)
`+ Amemt-ms for all predecessors 0. l. and 2. This cheapest cost
`is saved in J't3). the optimal predecessor is saved in predecessor
`l3). and the recursion continues to coefficient 4 and so on until the
`last coefficient 63 is done. At this point.
`the optimal
`last nonzero
`coefficient A"
`is obviously the one with the smallest
`.l‘tlr) for
`1- = 0. l.---..63 See Fig. 3. By backtracking from k', one can
`
`Petitioner Microsoft Corporation - Ex. 1048, p. 701
`Petitioner Microsoft Corporation - EX. 1048, p. 701
`
`
`
`1'0?
`
`[BEE TRANSACTIONS ON EMAGE PROCESSING. VOL. 3. NO. 5. SEPTEMBER 199$
`
`
`
`Fig. 3. Block diagram of the dynamic programming recursion of the optimal
`algorithm for each 8 x 3 image block. E is the as energy, AJ..,- is the
`incremental Lagrangian cost of going from coefficient i to coefficient 3' while
`dropping all coefficients inbetween. and I; is the minimum Lagrangian cost
`associated with ending at coefficient it. Note that k‘ = maggflm J; is
`the optima] "last" coefficient.
`' _
`
`find the optimal predecessor chain sequence starting from predecessor
`(k’) and going back to 0. at which point the entire optima] set of
`coefficients to be retained for each block is known for the given A.
`in finding the optimal predecessor at a particular iteration l.- as
`described above. one generally has to consider as candidates all
`coefficients 1' < k. However, for the particular case of monotonicity
`of Bi]: in the zero run length count (it -— j — 1) (see Fig. 4}. which
`is true for the default coding tables of JPEG and MPEG.‘ 3 fast
`pruning algorithm can be used to speed up the search. This results
`in a substantial decrease in computational complexity and leads to
`a fast optimal algorithm. The above optimal dynamic programming
`algorithm is performed independently on all blocks. The composite
`R-D point for the picked A is obtained simply as the sum of the
`optimally obtained R—D points for each block for that A. Finally. the
`optimal slope )t'
`that solves the desired budget or quality constraint
`is found using a fast convex search.
`
`IV. ALGORITHM
`
`We now explain somewhat rigorously the algorithm employed to
`find the optimal solution to our problem. The optimal algorithm
`flowchart for a fixed operating slope A will be described for a single
`typical 8X8 image block in Phase 1. Note that the algorithm is applied
`independently and in parallel to each signal block to detem'tine the
`optimal coefficient sequence to be retained for that block. Then in
`Phase II. the optimal operating slope A' for the composite problem
`will be obtained.
`
`'1‘his is a highly reasonable condition even if custom codebooks are used.
`
`O-ZO-IOZD:
`CIZHm>maunmU20':
`
`HemiCANNDTbu-apdnaldhumuh
`
`is made
`Fig. 4. Fast pruning of nonoptimal predecessors to coefficient
`possible thanks to the monotonicity of the JPEGMPEG codeboolts in the
`l'unlcngths preceding a nonzero coefficient.
`
`A. Data Gathering
`
`Prior to running the optimal algorithm. a one-time fixed cost of
`gathering the statistics needed for running the optimal algorithm
`must be endured. See Fig. 2. This involves gathering. for each
`DCT coefficient is. its thresholding distortion E. and its conditiomd
`nonthresholding coding cost R3]; conditioned on every preceding
`nonzero coefficient 3' < k.
`
`8. Phase l': Finding the Optinwl Set of Coeficienrsfar :1 Given )1
`Note that.
`in the algorithm flowchart to be described. E refers
`to the total unquantieed ac energy in the signal block;
`i.e., E =
`2:1 CE. Err refers to the thresholding distortion associated with
`coefficient Jr. RJ 1. refers to the incremental bit-rate cost of coding is
`after j. AJJ-i. refers to the incremental Lagrangian cost of including
`k after j. J; is the minimum Lagrangian cost associated with having
`k as the last nonzero coefficient. and Si;
`is the set of all candidate
`optimal predecessor coefficients to It. See Fig. 3.
`I) Finding the Optimal MST Coe‘fiict'ent:
`
`J5 <—
`
`3)
`
`4)
`
`1) For the A of the current iteration. compute AL,- = —E_.-+,\R.-,-
`for all nonzero coefficient pairs i.j with 3' > i.
`2} (Initialization) k'
`.— 0;
`i.-
`«— 0:
`SJ i— {0}:
`E;
`predessorlll) *— oil.
`I.-
`-— k +1;
`if k = 64. go to Step 7'. Else, continue to the
`next step.
`If Er = 0. set Sr ~— Sr_. and go to Step 3. Else. continue
`to the next step.
`5) J; ~— tnin,€3*_l[J.‘ + AJ.;.]. If .1; g fink: «— k.
`6) 5i ._ {k} U {i|(i E Sir—i and J.’ < m}.
`predecessoflk) — min—'3‘.”I [J.-' + A-Lk]. Go to Step 3 .
`The best "path“ ends in k'. i.e.. the optimal set of coefficients to be
`retained for the given A for the current block has coefficient k' as
`its LAST nonzero coefficient.
`
`Petitioner Microsoft Corporation - Ex. 1048, p. 702
`Petitioner Microsoft Corporation - EX. 1048, p. 702
`
`
`
`[BEE TRANSACTIONS ON IMAGE PROCESSING. VOL. 3. NO. 5. SEPTEMBER I994
`
`1'03
`
`WWW“ 1—D“ fu'llnue': lulu-0.1.1.0
`
`2) Backtrackt'ng to Find Entire Optinwl Set of Coefficients:
`l) Initialize the set of optimal coefficients as optset ._ {k' }.
`2)
`If predecessor (it) 2 nil. go to Step I0. Else continue to the
`next step.
`3) Get the optimal predecessor to it and include its membership in
`the set {optset}. {opted} ‘— {optscr} Ll {predecessor(i;)}.
`Go to Step 8.
`4) Done! Optimal solution of coefficients to be retained for given
`A is the set {optset}.
`Note that a key operation that ensures a fast algorithm is the
`pruning action described in Step 6. This step (see Fig. 4) eliminates
`from contention for predecessor to the next nonrero coefficient. all
`prior coefficients whose lowest cost of retaining as the last nonzero
`coefficient exceeds that of the current iteration‘s coefficient. Thus.
`if the current coefficient produces the lowest cost so far.
`it is the
`only candidate for predecessor to the next nonzero coefficient! This
`is due to the monotonic nature of the bit rate versus zero run length
`Huffman tables for JPEG and MPEG. where the cost of coding a
`nonzero coefficient is monotonically nondecreasing in the length of
`the zero-run preceding that coefficient.
`
`V. Armcnnou
`
`The optimal algorithm outlined in Section [II can be used to
`quantify the benefits of adaptive thresholding applied to the IPEG and
`MPEG coding environments. R-D curves are obtained by sweeping
`the Lagrange multiplier A through all positive values for typical
`quantization scales of interest. Fig. 5 shows the R-D curves for a
`typical image (“House") using JPEG for prethresholding quantization
`scales of 1.0 and 0.1 Note the significance of Fig. 5. Point X on curve
`“a" is the unthreshnlded ‘Teference" obtained for a scale of 1.0. Let
`us fix the bit rate for the problem at the reference X‘s bit rate of
`0.615 blpixel. Now. to see the advantage of optimal thresholding.
`observe curve “b" corresponding to a finer quantization scale of
`s = 0.? instead of 1.0. For this finer scale.
`the nonthresholded
`bit rate. corresponding to point Z. is obviously greater than that of
`X. However.
`if we now start thresholding optimally until the bit
`budget constraint imposed by curve "a“ is satisfied. we will get an
`adaptive tluesholding gain in terms of increased SNR for the same
`bit rate. Thus. point Y enjoys a 0.7 dB gain at the same bit rate over
`X. Altcmatively. if we fix the PSNR according to that of X {35.7
`dB). point W enjoys a compression advantage at the same PSNR of
`roughly 15%. Fig. 6 shows a similar curve using optimal thresholdirtg
`using an MPEG intraframe codehook. applied to an intraframe coded
`frame of the “mit” sequence.
`In our experiments. we found that “backing off" to a finer quan—
`tization scale and thresholding optimally until we achieved the same
`reference bit rate {or PSNR) as an untlu-esholded coarser quantized
`version resulted in a decent coding gain. However.
`there was an
`optimal back-off point. beyond which the performance started to
`degrade. See Fig. 6. Intuitively. overdoing the act of thresholding
`after starting with a finer quantization scale is inadvisable beyond
`a point. as the gain of representing the nonthresholded coefficients
`with less distortion is no longer outweighed by the drastic step of
`dropping entire coefficients. since for fine quantization scales. there
`is not much distortion to begin with.
`While the “optimal backed" point" depends on he particular input
`image as well as the target bit budget (or quality) constraints.
`it
`was found in all cur simulations {corresponding to typical images
`and video sequences used in the image—processing community) that
`the thresholding gain is a convex function of the amount of “back-
`off“ from the reference point. Thus. not “backing off" enough is
`suboptimal. as is “backing off" too much. with the best performance
`
`
`
`03
`
`0.6 w 0.1
`0.16
`Rat: MI: III linh
`
`0.15
`
`08
`
`Fig. 5. Optimal [Molding R-D curves for "House" image using JPFJG.
`Curve “it" corresponds to a JPE'JG quantization scale of 1.0. while curve "b"
`corresponds to a finer scale of 0.7. Note that if we fix the reference at point
`X on curve "a" corresponding to a scale of 1.0. we can achieve point Y
`by “backing off” to a finer scale of 0.? (point Z on curve “13") and then
`thresholding optimally to point Y at the same bit rate as X. Note that
`the
`thresholding gain for this example (reference X) is approximately 0.7 dB at
`a bit rate of 0.615 bpp (point Y). or alternatively about 15% reduction in hit
`rate {point W of curve “b") at the same PSNR of 35.1? dB.
`
`wmma-ommwrw
`
`moans)
`
`mill-IFM
`
`Fig. 5. Optimal tlnesholding R-D curves for an intraframe coded frame of
`the "mil" sequence using MPEG. Curve "a" corresponds to a Q? level of
`4%. curve “it“ corresponds to a finer quantizer 0? level of 40. and curve "c"
`corresponds to a still finer Q? = 32. Note that if we fix the reference at point X
`on curve “a" corresponding to a QP of 48. we can achieve point Y by “backing
`ofi" to the finer QF = 40 and thresholding optimally to point Y at the same
`bit rate as X. Note that the threeholding gain for this example {reference X)
`is approximately 0.52 dB at a bit rate of 0.3701 hpp (point Y). or alternatively
`about 12% reduction in hit rate (point W of curve "b") at the same MSE of
`”2.5. Note that if we back off too much to curve "c" with QP = 32, the
`achieved coding gain diminishes to point Z (about 0.26 dB) over point X.
`
`achieved somewhere (uniquely) in between. Due to the convex nature
`of this relationship. binary—search methods can be used to find this
`optimal back—off point.
`thresholding
`Coding results obtained from performing optimal
`on typical
`images and video-sequence frames used in the image—
`processing community revealed a coding gain of about (1&1 dB
`or altematively about [245% bit—rate compression improvement
`
`Petitioner Microsoft Corporation - Ex. 1048, p. 703
`Petitioner Microsoft Corporation - EX. 1048, p. 703
`
`
`
`T94
`
`IEEE, TRANSACTIONS ON IMAGE PROCESSING. VOL.
`
`.1. N0. 5. SEPTEMBER 1994
`
`mun-um(use;
`
`lhrcsholding (curve "b“J
`Fig. 8. Comparison in performance of oplimai
`versus "dumb" heuristic (curve “3"] of retaining the largest
`is coefficients
`of each block for k = ]. 2.1 .
`-
`. G4. The highest quality "pivot" point for both
`schemes corresponds to an unthresholdcd JPEG scale of 0.6. Note how the
`optimal scheme beats the heuristic by over 5 dB for bit rates below 0.6 bits
`per pixei {at 0.52 bits per pixel. the gain is 5.6 :13].
`
`[i] S_-W. Wu and A. Gersho. "Rate-constrained picture-adaptive quanti-
`zation for JPEG baseline coders." in Pm: fCASSP—E‘jl (Minneapolis.
`MN). Apr.
`r993.
`
`
`
`{at
`
`(b)
`
`Fig. T. Subjective results of optimal thresholding in a JPEG framework for
`the "House" image: [a] Linthresholded image {JPEG scale = 3.0. PSNR : 31.5
`dB, bit rate -—- 0.28 bits per pixel]; (h) optimally thresholded image [JPEG scale
`= 2.0. PSNR = 32.3 dB. hit rate = {1.23 bits per pixel).
`
`while retaining complete decoder compatibility. Optimal thresholding
`seems to be most subjectively beneficial
`in the case of low to
`medium bit-rate coding. as evidenced by Fig. T. Fig. Ila} shows
`as a nonthresholded reference the “House“ image coded with 1 PEG
`using a quantization scale of 3.0. The thresholded versions using
`a scale of 2.0 are shown in Fig. ?(b). The coding gain is 0.8 dB.
`and as can be seen. the subjective quality is also better. Intuitively.
`this is because for low—bit—rate applications. it is better to represent
`the low—frequency coefficients with maximum fidelity while dropping
`the expensive high-frequency coefficients. This gives a smoother but
`less noisy image. which is the best one can do at low bit rates. Thus.
`adaptive Lhresholdirtg can take the place of noise shaping or low—pass
`filtering without any external processing and without
`the decoder
`"skipping a beat.“
`
`Using the optimal algorithm as a benchmark. we compared the
`performance of a fast heuristic algorithm that retains the It'
`largest
`(in absolute magnitude) DCT coefficients for each 8 x 8 block
`for K < 64 as in [6]. When the heuristic of [6]
`is extended to
`a JPEG quantization framework.
`it can be seen from Fig. 8 that
`considerabie gain can be obtained (over 5 dB) by resorting to our
`optimal algorithm. with the gain getting larger as the number of
`retained coefficients K per block decreases. Our optimal thresholding
`algorithm took a CPU time of about 1.3 s on a SPARC-II workstation
`to run a typical iteration of the thresholding algorithm on a 256 x 256
`image and about 6.1 s on a .512 x 512 image.
`Thus we have established, both subjectively and objectively. the
`benefits of optimally thresholdirig the DCT coefficients in a JPEG or
`MPEG coding environment. The primary advantage of our approach
`is that it is completely compatible with existing JPEG and MPEG
`decoders.
`
`REFERENCES
`
`[1] "IPEG technical specification: Revision (DRAFT), joint photographic
`experts group. ISOJ'IFJC I'I‘CHSCZMGS. CCITT SGVlll." Aug. I990.
`[2] "MPEG video simulation model
`three. ISO. coded representation of
`picture and audio information." 1990.
`thresholding for MPEG-basod
`[3] Y. Yu and D. Anastassiou. "Optimal
`video coding.“ Tech. Rep. Elec. Eng. Depl.. Center for Telecommun..
`Columbia University. New York, NY. 1991.
`{4] Y. Shoham and A. Gersho, "Efficient bit allocation for an arbitrary set
`of quantiaers." {EEE Trans. Awash. Speech. Signal Processing. vol. 36.
`pp. 1445—1453. Sept. 1988.
`[5] K. Ramchandran and M. Vetterli. "Best wavelet packet bases in a
`rate-distonion sense." to be published.
`[6] C.-T. Chen and D. LeGall. "A K-th order adaptive transform coding
`algorithm for image data compression." in PFOC. SHE: Applications of
`Digimi Image Processing XII. vol. 1153. 1939. pp. 71—18.
`
`Petitioner Microsoft Corporation - Ex. 1048, p. 704
`Petitioner Microsoft Corporation - EX. 1048, p. 704
`
`