Probability estimation for the Q-Coder

by W. B. Pennebaker and J. L. Mitchell

The Q-Coder is an important new development in binary arithmetic coding. It combines a simple but efficient arithmetic approximation for the multiply operation, a new formalism which yields optimally efficient hardware and software implementations, and a new technique for estimating symbol probabilities which matches the performance of any method known. This paper describes the probability-estimation technique. The probability changes are estimated solely from renormalizations in the coding process and require no additional counters. The estimation process can be implemented as a finite-state machine, and is simple enough to allow precise theoretical modeling of single-context coding. Approximate models have been developed for a more complex multi-rate version of the estimator and for mixed-context coding. Experimental studies verifying the modeling and showing the performance achieved for a variety of image-coding models are presented.

Copyright 1988 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

1. Introduction

Arithmetic coding, introduced several years ago by Rissanen [1] and Pasco [2] and generalized by Langdon and Rissanen [3] (see Langdon [4] for a comprehensive review article), is a
powerful technique for coding of strings of data symbols. It derives its power from an ability to approach the entropy limit in coding efficiency and to dynamically alter the estimate of the probability of the symbol being encoded.

A new binary arithmetic coding system, the Q-Coder, has been developed as a joint effort between the authors of this paper and colleagues at the IBM Almaden Research Center. The new probability-estimation technique used in the Q-Coder is presented in this paper; companion papers describe the basic principles of the Q-Coder [5], software implementations of the Q-Coder [6], and the arithmetic coding procedures which allow compatible yet optimal hardware and software structures [7, 8]. The Q-Coder is part of a proposal submitted to the CCITT and ISO Joint Photographic Experts Group (JPEG) for color photographic image compression [9].

A description of the general structure of the Q-Coder arithmetic coding section is given in [5, 6]. Briefly, the arithmetic coder contains two key registers, the interval register A and the code register C. The interval register contains the measure of the current probability interval, and the code register contains a pointer to the interval. In order to use fixed-precision integer arithmetic for the coding process, the interval and code registers must be periodically renormalized.

When a given symbol is coded, the interval measure in A is reduced to the subinterval for that symbol, and the code string is repositioned to point within the subinterval. Ideally, the scaling of the interval is done by multiplying the current interval measure A by the probability estimate of the symbol which occurred. If the less probable symbol (LPS or L) probability q is estimated as Qe and the more probable symbol (MPS or M) probability p is estimated as 1 - Qe, the binary coding process divides the interval into two subintervals, A x Qe and A - (A x Qe). The multiplication
IBM J. RES. DEVELOP. VOL. 32 NO. 6 NOVEMBER 1988

W. B. PENNEBAKER AND J. L. MITCHELL

Unified Patents, Ex. 1013
can be avoided by introducing a tight constraint on the renormalization [3-5, 10]. If the probability-interval measure A falls within the bounds 0.75 <= A < 1.5, A can be approximated by 1 when multiplying by Qe. The subintervals are then approximated by Qe and A - Qe, and the multiplication is avoided.

Although the renormalization is required for the arithmetic approximation, it can serve another important purpose: it can also be used to estimate the probability of the symbol being coded.

A number of different but somewhat related techniques have been used to estimate symbol probabilities. Langdon and Rissanen [11] and Pennebaker and Mitchell [12] have both used confidence-interval techniques to determine whether the current estimate Qe of the LPS probability should be changed. In [12], the degree to which the confidence limit is exceeded is used to determine the degree to which Qe should be changed; this gives a multi-rate estimation process. Goertzel and Mitchell [13] used a counting technique with periodic renormalization; although in principle a divide operation is required, the precision is small enough that a lookup table inversion and multiply can be used. Mohiuddin, Rissanen, and Wax [14] devised an intriguing multi-rate adapter in which the choice of estimation rate is based on a local minimization of the code string being generated. Although relatively complex to implement, this technique is quite powerful; we regard it as a standard against which other estimation techniques can be measured. Finally, Helman et al. [15] used a Monte Carlo technique involving the LPS renormalization and symbol counts for updating the estimate of the LPS probability in the Skew Coder [3].

The rest of this paper is devoted to an analysis of a probability-estimation technique in which the probability is estimated solely from renormalization.¹ The renewed attempt to use renormalization as the basis for probability estimation was inspired by earlier work on the Log Coder [12]. In that work the computations for probability estimation were minimized by estimating probability each time one byte of compressed data was generated. While this system proved to be simple to implement and provided accurate estimates, it failed to adapt quickly enough in coding of facsimile data sets. On the other hand, calculating a new probability after the coding of each symbol, as was done by Rissanen and Mohiuddin [10], provided good estimates and fast adaptation but involved far too much computation. Estimation after each renormalization appeared to be an attractive compromise between these two schemes.

In Section 2 the estimation process is described. Section 3 develops the exact theoretical modeling of that process for a single context. Section 4 continues the theoretical modeling for mixed contexts and a random-interval model. Theory and experiments are compared for a single context in Section 5. Section 6 extends the probability estimation to a multi-rate system. Mixed-context coding is analyzed in Section 7.

¹ In unpublished work G. Goertzel and J. L. Mitchell explored and abandoned this possibility because they were unable to obtain good coding efficiencies.

2. Estimation process

The basic concept is as follows: The estimated LPS probability, Qe, is taken from a fixed table of allowed values. Renormalization occurs either when an L event is encountered or when the interval falls below 0.75 following an M event. When renormalization after an LPS is encountered, the index to the current Qe is shifted to a larger Qe. Conversely, whenever the MPS renormalization is encountered, the index is shifted to a smaller Qe. (The terms MPS renormalization and LPS renormalization are usually abbreviated as "MPS renorm" and "LPS renorm" in the text following.)

The following approximate calculation suggests that the estimate of the probability obtained from the table of allowed Qe values will adapt to and closely approach the true LPS probability q of a binary symbol sequence. Given a starting value A for the interval register immediately following the last renormalization, N successive MPS events must occur to reach the MPS renorm point:

    N = 1 + [dA/Qe],                                        (1)

where Qe is the current estimated value of q, dA is the change in the interval (0.75 > dA >= 0), and the brackets denote the greatest integer function (rounding down to the nearest integer). The probability of getting N MPS events in a row (and an MPS renorm) is

    P_mps = (1 - q)^N.                                      (2)

For simplicity, consider the case where q is small. Taking the natural logarithm of Equation (2) and approximating ln(1 - q) by -q,

    P_mps ~ e^(-q dA/Qe).                                   (3)

The magnitude of dA is dependent on the type of renormalization. If the MPS renorm occurred last, dA is close to 0.75; if the LPS renorm occurred last, dA is typically somewhat smaller than 0.75. If the effective value of dA is assumed to be an appropriate average and the change in Qe is the same for both types of renorms, the renorm probabilities are balanced at the point where

    e^(-q dA/Qe) ~ 1/2,                                     (4)

that is, where

    q ~ Qe ln 2 / dA.                                       (5)
The equilibrium is stable at this balance point. If Qe is too
large, P_mps is also large and the system tends to move to smaller Qe. Conversely, if Qe is too small, P_mps is small and the system tends to move to larger Qe. Therefore, the system adapts to and balances approximately at the point q = Qe. Although these calculations are approximate, exact calculations which follow the same general approach and prove the point more rigorously are described below.
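The adaptation argument above can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the optimized estimator of Table 1: it uses a hypothetical geometric ladder of allowed Qe values with single-state jumps on both renorm types. It applies the arithmetic approximation (MPS subinterval A - Qe, LPS subinterval Qe) and changes the estimate only at renormalizations, as the text describes.

```python
import random

# Hypothetical ladder of allowed Qe values (NOT the paper's Table 1).
QE_TABLE = [0.5 * (0.8 ** k) for k in range(30)]

def simulate(q, n_symbols=200_000, seed=1):
    """Coder-style estimation: Qe changes only at renormalization events."""
    rng = random.Random(seed)
    k = 0                      # index of the current Qe estimate
    a = 1.0                    # interval register, kept in [0.75, 1.5)
    for _ in range(n_symbols):
        qe = QE_TABLE[k]
        if rng.random() < q:               # LPS event: always renormalizes
            a = qe                         # LPS subinterval
            k = max(k - 1, 0)              # shift to a larger Qe
        else:                              # MPS event
            a -= qe                        # approximation: A*(1-Qe) ~ A - Qe
            if a < 0.75:                   # MPS renorm
                k = min(k + 1, len(QE_TABLE) - 1)  # shift to a smaller Qe
        while a < 0.75:                    # renormalize by doubling
            a *= 2.0
    return QE_TABLE[k]

# The final estimate settles near the true LPS probability q.
print(simulate(0.05))
```

Running this for various q shows the estimate wandering within a few table steps of the balance point, which is the behavior the exact calculations below quantify.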
As will be seen, the coding efficiencies achieved by the Q-Coder with this probability estimator for pseudorandom data sets are usually not as good as those obtainable with simple estimates of probability from counts [13]. Coding inefficiency is due partly to the lower coding efficiency inherent in the arithmetic approximation to the multiply, partly to small but systematic errors in Qe, partly to the granularity of the set of allowed values of Qe, and partly to the intrinsic distribution in Qe resulting from the stochastic estimation process. However, the estimation process tracks variations in symbol probability very well. Consequently, the coding efficiency achieved with the less stable symbol probabilities encountered in many real coding environments is extremely good, competitive with the best that can be done by any technique currently known.

This estimation process works well for both single-context and mixed-context coding. For a single-context system, the renormalization process is used to estimate only one probability. For mixed-context coding, the coding decisions are conditioned by past history, and a different probability must be estimated for each conditioning state or context. It is perhaps somewhat unexpected that renormalization of a single A register can be used to estimate the many different probabilities required in the mixed-context case.
3. Modeling of the estimation process for a single context

The estimator can be defined as a finite-state machine, that is, a table of Qe values and associated next states for each type of renorm (i.e., new table positions). The rate of change of Qe is determined by the granularity of the table of Qe values and by the new state associated with each Qe value for the two types of renorms. Figure 1 diagrams sections of the actual finite-state machine used to estimate the probabilities. The leftmost section illustrates the exchange of MPS and LPS definitions at Qe ~ 0.5 (k_ex is defined to be the particular state index, k, where this exchange occurs). The center section shows a region where the finite-state machine changes from a single-state jump on LPS to a double-state jump. Some parts of the finite-state machine require a jump of more than one state in order to correctly estimate the probability. The rightmost section shows the diagram for the smallest values of Qe. This last section shows how the transition at the MPS renorm for the smallest Qe value is returned to that state.

Conditional changes in estimated probability, such as changing Qe only after the occurrence of two MPS renorms in a row, can readily be incorporated by allowing multiple entries of a given Qe value. A related form of this can be seen in the diagram for the lowest Qe state, where entry to that state can only occur after two MPS renorms in sequence. Handling conditional effects in this manner greatly simplifies the theoretical treatment, the only penalty being the need to solve a relatively large number of simultaneous equations when complex conditional structures are being considered.
Figure 2 illustrates the sequencing of the probability estimator for an LPS followed by a sequence of MPSs. In Figure 2 the ordinate is the interval (A-register) value, and the abscissa is the discrete allowed values of Qe. The LPS renormalization causes a transition to a known A-register value and a known state in the finite-state machine (in this case from a Qe of 0.42206 to the appropriate starting A-register value at Qe = 0.46893). The particular Qe values in the figure are taken from the actual optimized 5-bit Qe values in Table 1, shown later. As MPSs are coded, the interval decays until it drops below 0.75. At that point a transition is made to a smaller Qe, and the interval is renormalized by doubling until it is greater than 0.75. In most cases only one doubling is needed. Thus, the pair of doublings shown at Qe = 0.32831 is the exception rather than the rule. Whether one or two doublings occur is of no consequence for the probability estimation. However, since each doubling produces one bit in the code string (ignoring bit stuffing for a carry), the extra doubling is important in the calculation of the coding efficiency.

Figure 2 illustrates the sequencing behavior which underlies the calculation of the estimation process. The first half of the problem is determining the probability that the estimate will be at each of the allowed Qe values. If we define n_k as the occupation probability for the state corresponding to Qe[k] (the kth allowed value of Qe), balance of transition probabilities into and out of the kth state gives

    sum_j n_j X_j [(1 - X_k)^t_kj - (1 - X_k)^r_kj] = n_k X_k,        (6)

where X_k = q for k >= k_ex, and X_k = 1 - q for k < k_ex. The exponents r_kj and t_kj are defined below. The table of allowed Qe is defined to have mirror symmetry at the boundary between k_ex and k_ex - 1. Thus, k_ex is the index in the table of Qe where the definitions of least probable symbol and most probable symbol are exchanged. (For k < k_ex the table provides an estimate of 1 - q rather than q.)

The first term in the summation represents the transition probability into the state at Qe[k] from all states j which can reach the state k by an LPS followed by a sequence of MPSs. The exponent t_kj is the number of MPSs needed to just enter the kth state when starting from an LPS at state j. Thus, for the example sketched in Figure 2, state j is marked with an asterisk, and state k could be any one of the states which is reached by the MPS sequence following the LPS.
The second term in the summation represents the transition probability out of the state k, given that the MPS sequence continues until the interval decays below 0.75 for the kth state. The exponent r_kj is the number of MPSs needed to just leave the kth state, counting from the symbol after the LPS at state j.

[Figure 1: Sections of the state diagram for the probability estimator.]
The summation therefore represents the net gain in n_k due to transitions from all states j which can reach state k through an LPS followed by a sequence of MPSs. It is equated to the probability of transition out of state k due to an LPS event. (All probabilities are per symbol encoded.) The probability of transition out of state k via the LPS path is the probability of the LPS multiplied by the occupation probability n_k.

Normalization requires

    sum_j n_j = 1.                                          (7)

The numerical solution of these equations can present problems, in that the n_k can be vanishingly small when the index k is far from the value for the most probable value of Qe. Therefore, the equations are reduced to a subset involving only the n_k near the most probable value of Qe. Contributions from members outside this range are assumed to be zero. The set of equations must be large enough that the error in truncating the set is small, yet small enough to avoid arithmetic-precision problems in the calculation of determinants by the method of Gaussian elimination. Except near the end of the table, the center value of k for the subset is defined as the index for which Qe[k] is closest to q.

Because the table of allowed values of Qe is finite in extent, the equations must be reformulated to take end conditions into account. This is done by assuming that either t_kj = 0 or r_kj = infinity in Equation (6). The latter condition exactly describes the closure of the state diagram for k_max in Figure 1, in that the system cannot exit from the k_max state after the MPS renorm. It also approximates the closure if the equation set is truncated before k reaches k_max. Truncation near the other endpoint, k = k_min, is described by one of the two approximations, the choice depending on whether a particular k is less than the exchange index, k_ex, or not. The two assumptions are equivalent to assuming that either LPS or MPS renormalization is highly unlikely. Note that the approximate closure condition at the smallest value of k is not needed, as it is replaced by Equation (7).

Given a table of Qe values and associated index changes to new Qe values for each renormalization path, these equations provide an exact solution of the probability of the system being at each Qe[k]. However, they hold only for single-context coding.
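The occupation probabilities that Equations (6) and (7) determine analytically can also be estimated by direct simulation, which is a useful cross-check. The sketch below again uses a hypothetical geometric table with single-state jumps (not the paper's Table 1) and simply counts, per coded symbol, how often the estimator sits at each state.

```python
import random

# Hypothetical ladder of allowed Qe values (NOT the paper's Table 1).
QE = [0.5 * (0.8 ** k) for k in range(30)]

def occupation(q, n_symbols=300_000, seed=3):
    """Monte Carlo estimate of the per-symbol occupation probabilities n_k."""
    rng = random.Random(seed)
    counts = [0] * len(QE)
    k, a = 0, 1.0
    for _ in range(n_symbols):
        counts[k] += 1                     # tally per-symbol state occupancy
        qe = QE[k]
        if rng.random() < q:               # LPS renorm: move to larger Qe
            a = qe
            k = max(k - 1, 0)
        else:                              # MPS event
            a -= qe
            if a < 0.75:                   # MPS renorm: move to smaller Qe
                k = min(k + 1, len(QE) - 1)
        while a < 0.75:                    # renormalize by doubling
            a *= 2.0
    return [c / n_symbols for c in counts]

n = occupation(0.05)
peak = max(range(len(n)), key=lambda i: n[i])
print(peak, QE[peak])   # the distribution peaks near Qe ~ q
```

The resulting histogram shows the spread in Qe that the text identifies as one source of coding inefficiency.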
The second half of the problem is the calculation of coding rate. Refer again to Figure 2. The current occupation of each state in the system is determined by the balance of LPS and MPS transitions into and out of that state. For each Qe the probability of the LPS renormalization is known by definition (q), and the probability of each succeeding MPS renormalization is readily calculated. The bits generated by each renormalization are also readily calculated. The net bit rate R_k for the kth state is thus

    R_k = n_k X_k [B_LPS,k + sum_j (1 - X_k)^r_kj B_MPS,j],           (8)

where j ranges over all MPS renormalizations which can occur following the LPS renorm. B_LPS,k is the number of bits generated in renormalizing Qe[k] to the allowed interval range. B_MPS,j is the number of bits generated by the jth MPS renorm. As defined earlier, X_k = q for k >= k_ex, X_k = 1 - q for k < k_ex, and the exponent r_kj is the number of MPSs needed to reach the jth MPS renorm after the LPS event from state k.

The total coding rate in bits per symbol is therefore

    R = sum_k R_k.                                          (9)
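Under the arithmetic approximation, the interval immediately after an LPS is just Qe[k], so B_LPS,k is the number of doublings needed to bring that interval back above 0.75, at one code bit per doubling. A quick sketch, applied to a few decimal Qe values taken from Table 1:

```python
def bits_lps(qe):
    """Doublings needed to renormalize an interval of size qe back above 0.75.

    Each doubling emits one bit into the code string (ignoring bit stuffing
    for a carry), so this is B_LPS for a state whose estimate is qe.
    """
    bits = 0
    a = qe
    while a < 0.75:
        a *= 2.0
        bits += 1
    return bits

# A few of the 5-bit table's Qe values (decimal form, from Table 1):
for qe in [0.46893, 0.32831, 0.10565, 0.00714]:
    print(qe, bits_lps(qe))   # -> 1, 2, 3, and 7 bits respectively
```

The same doubling count, applied to the interval left after an MPS renorm, gives B_MPS.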
4. Mixed contexts: The random-interval model

The calculations in Section 3 are not applicable to coding of mixed-context symbols. If the context varies from one symbol to the next, Qe also varies. The calculation of probabilities of MPS renormalization and the associated bit rate is therefore far more complex. Let us consider the following hypothesis: The probability of the various interval values is sufficiently randomized by the effects of multiple contexts that the interval-register values are uniformly distributed in the interval from 0.75 to 1.5.

Assuming that the above hypothesis is valid, the following equations give the LPS renormalization probability P_l,k and the MPS renormalization probability P_m,k:

    P_l,k = X_k,                                            (10)

    P_m,k = (1 - X_k) Qe[k]/0.75.                           (11)
[Figure 2: Example of the sequencing of the probability estimator following an LPS.]

The equations describing the balance in transition probabilities are similar to those developed in Section 3, except that the probability of the MPS renorm is calculated from Equation (11) rather than from the equations for the probability of a sequence of MPSs. Therefore, the balance for state k is given by

    sum_s n_s P_l,s + sum_t n_t P_m,t - n_k (P_m,k + P_l,k) = 0,      (12)

where s is summed over all states which can make a transition to k via LPS, and t is summed over all states which can make a transition to k via a single MPS renorm. The normalization condition, Equation (7), completes the set of equations to be solved. Numerical precision again requires that the set of equations be truncated. Therefore, endpoint conditions are handled in the same manner as discussed in Section 3.
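Because the random-interval model makes every per-symbol transition probability explicit, the balance equations (11)-(12) define an ordinary Markov chain whose stationary distribution can be found by simple iteration instead of Gaussian elimination. The sketch below does this for a hypothetical single-jump ladder table (not the paper's Table 1), ignoring the MPS/LPS exchange at Qe ~ 0.5 for simplicity.

```python
# Hypothetical ladder of allowed Qe values (NOT the paper's Table 1).
QE = [0.5 * (0.8 ** k) for k in range(30)]
q = 0.05                                   # true LPS probability

def stationary(qe_table, q, iters=20_000):
    """Iterate the per-symbol transition map of the random-interval model.

    From state k: LPS renorm with P_l = q moves one state toward larger Qe;
    MPS renorm with P_m = (1 - q) * Qe[k] / 0.75 moves one state toward
    smaller Qe; otherwise the state is unchanged.
    """
    n = [1.0 / len(qe_table)] * len(qe_table)
    last = len(qe_table) - 1
    for _ in range(iters):
        new = [0.0] * len(qe_table)
        for k, nk in enumerate(n):
            p_l = q
            p_m = (1.0 - q) * qe_table[k] / 0.75
            new[max(k - 1, 0)] += nk * p_l          # LPS renorm
            new[min(k + 1, last)] += nk * p_m       # MPS renorm
            new[k] += nk * (1.0 - p_l - p_m)        # no renorm
        n = new
    return n

n = stationary(QE, q)
peak = max(range(len(n)), key=lambda k: n[k])
print(peak, QE[peak])      # the occupation distribution peaks near Qe ~ q
```

At equilibrium P_l = P_m, i.e. Qe ~ 0.75 q / (1 - q), which for q = 0.05 is about 0.04, and the computed distribution indeed peaks there.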
The calculation of coding efficiency is done differently for the random-interval model. For a given interval A and a given estimated LPS probability Qe, the relative coding efficiency is

    E = H / sum_k R_k,                                      (13)

where H is the entropy and R_k is the bit rate per symbol for state k. Defining p = 1 - q, the entropy is given by

    H = -q log2(q) - p log2(p),                             (14)

and, for a uniform distribution of A values in the interval 0.75 to 1.5, R_k is given by

[Figure 3: Coding inefficiency for a single context using Table 1. The measured points were obtained from pseudorandom data sets of fixed q.]

Table 1  Probability estimates for 5-bit Qe.

Qe      Qe value    Decimal     Decr    Incr    MPS
index   (hex)       Qe value    (LPS)   (MPS)   exch

 0      X'0AC1'     0.50409     0       1       1
 1      X'0A81'     0.49237     1       1       0
 2      X'0A01'     0.46893     1       1       0
 3      X'0901'     0.42206     1       1       0
 4      X'0701'     0.32831     1       1       0
 5      X'0681'     0.30487     1       1       0
 6      X'0601'     0.28143     1       1       0
 7      X'0501'     0.23456     2       1       0
 8      X'0481'     0.21112     2       1       0
 9      X'0441'     0.19940     2       1       0
10      X'0381'     0.16425     2       1       0
11      X'0301'     0.14081     2       1       0
12      X'02C1'     0.12909     2       1       0
13      X'0281'     0.11737     2       1       0
14      X'0241'     0.10565     2       1       0
15      X'0181'     0.07050     2       1       0
16      X'0121'     0.05295     2       1       0
17      X'00E1'     0.04120     2       1       0
18      X'00A1'     0.02948     2       1       0
19      X'0071'     0.02069     2       1       0
20      X'0059'     0.01630     2       1       0
21      X'0053'     0.01520     2       1       0
22      X'0027'     0.00714     2       1       0
23      X'0017'     0.00421     2       1       0
24      X'0013'     0.00348     3       1       0
25      X'000B'     0.00201     2       1       0
26      X'0007'     0.00128     3       1       0
27      X'0005'     0.00092     2       1       0
28      X'0003'     0.00055     3       1       0
29      X'0001'     0.00018     2       0       0

    R_k,LPS = [X_1 ln(X_1) - X_0 ln(X_0) - X_1 + X_0] / [(X_1 - X_0) ln 2],   (17)

where X_0 = 0.75/Qe[k] and X_1 = 1.5/Qe[k].

5. Comparison between theory and experiments for single-context coding

In Figure 3, the results of calculations (solid curve) based on the equations derived in Sections 3 and 4 are compared to experimental results (circles). For single-context coding, the agreement between experiment and the exact theory for that case is excellent (as expected). The corresponding table of estimated Qe and the associated schedule of changes in index following renormalization are given in Table 1. This table was selected after much experimentation [5]; it represents the best compromise among simplicity, minimum storage requirements for each context (6 bits),² reasonable coding efficiency for fixed statistics, and good performance on mixed-context data obtained from both facsimile-compression models and continuous-tone image-compression models.
The Qe values in Table 1 are expressed as hexadecimal integers. Divide these Qe values by X'1000' x 4/3 to convert to the decimal fractional representation. The "Decr" column shows the decrement in the Qe index when moving to larger Qe following an LPS renormalization. The "Incr" column shows the increment in the Qe index when moving to smaller Qe following an MPS renormalization. Where the "MPS exch" column entry is 1, an LPS will cause an exchange in the MPS definition.
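The conversion described above can be sketched in a couple of lines; X'1000' x 4/3 = 4096 x 4/3, so the integer X'1000' itself represents the renormalization threshold 0.75.

```python
def qe_decimal(hex_entry):
    """Convert a hexadecimal Qe table entry to its decimal fractional value.

    Per the text, divide the integer by X'1000' * 4/3 (i.e., 4096 * 4/3).
    """
    return int(hex_entry, 16) / (0x1000 * 4 / 3)

print(round(qe_decimal("0AC1"), 5))   # -> 0.50409 (first table entry)
print(round(qe_decimal("0001"), 5))   # -> 0.00018 (last table entry)
```

Every decimal value in Table 1 can be regenerated this way, which makes the hex and decimal columns mutually checkable.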
Figure 4 shows several theoretical (dashed curves) and experimental (solid curves) distributions of Qe for the 5-bit case. Again, agreement between calculation and measurement is excellent.

Figures 5 and 6 show similar experimental and theoretical calculations for the 6-bit Qe table (Table 2). As might be expected, the finer granularity of this table significantly decreases the coding inefficiency for stationary statistics. (Compare Figure 3 with Figure 5.)

Figure 6 shows several calculated and measured distributions of Qe for the 6-bit case. Comparing these distributions to those in Figure 4, the increase in coding
² The initial impetus to use very small amounts of storage per context came from G. G. Langdon. Somewhat to our surprise, he was able to demonstrate that our estimation technique could achieve relatively good performance with a table of allowed Qe having less than 32 entries. We subsequently worked jointly to optimize this single-rate system.
efficiency with a 6-bit Qe is due to the reduced spread in the distribution. In both cases, the peak in the distribution is quite close to the desired Qe. The coarser granularity of the 5-bit Qe table is not the only factor. The changes in table position for the 5-bit Qe are often larger as well. This increases the spread in the Qe distribution.

The effect of the estimation process can be eliminated by choosing the Qe so as to minimize the coding inefficiency. Such a procedure might be appropriate when the probability is known a priori. The coding inefficiency for this special case is shown in Figure 7 for two cases. The first case (solid line) is for the 12-bit integer representation of Qe with no additional granularity introduced. This curve gives the lower bound for the coding inefficiency that can be achieved with this integer representation. Quantization of Qe to the 12-bit integer representation causes the sequence of distinct minima which is noticeable for q < 2^(-11). The sequence of distinct minima for q larger than about 2^(-3) reflects the fact that the bit rate stays constant over short intervals in Qe. These intervals of constant bit rate result from the arithmetic approximation and renormalization used in the Q-Coder. The dashed line in Figure 7 represents the 12-bit integer representation and the granularity of the 5-bit Qe table.

One feature of Tables 1 and 2 is the avoidance of Qe values which renormalize to X'1000'. For the integer representation chosen for the tables this is the minimum A value allowed before MPS renormalization must occur. Consequently, when the Qe value renormalizes to an A value too close to X'1000', the probability of the MPS renorm becomes very large. If A is exactly X'1000', any MPS will
immediately trigger an MPS renorm. In single-context coding, this creates a trap for the estimator. Note, however, that the smallest Qe entry in the table must renormalize to X'1000'. When the system is at that entry, the LPS renorm moves the Qe pointer such that at least two MPS renorms in succession are needed to return to the smallest entry. This avoids the trap.³

[Figure 7: Coding inefficiency as a function of q. The solid curve is for a Qe table which allows all values between X'0AC1' and X'0001'. The Qe value used for any given q is the smallest value in the table that minimizes the coding inefficiency. The dashed curve is for the 5-bit Qe table (Table 1), again with the smallest value of Qe chosen that minimizes the coding inefficiency.]
The calculations and measurements described in this section show that coding inefficiency is strongly influenced by the Qe table granularity and the amount of change in index following renormalization. Table granularity and the amount of change directly influence the rate at which the system adapts to a change in q. However, the faster the adaptation rate, the more the distribution in Qe is spread, and the larger the coding inefficiency is for stationary statistics.
6. Multi-rate probability estimation

In real data sets the symbol probability can vary quite widely. Consequently, the adaptive nature of the probability estimator is extremely important in achieving good coding efficiencies. The two different Qe table granularities discussed in the preceding section exhibit quite different coding efficiencies for fixed-probability coding. However, the coarser granularity of the 5-bit table allows the estimator to arrive

³ G. G. Langdon suggested the particular integer representation (X'1000' corresponding to 0.75) used in developing these tables. His suggestion was motivated by simplicity in hardware implementation, in that this representation allowed a single-bit test to determine when renormalization of the interval was needed. However, this representation also guaranteed that the estimator would be trapped at the smallest integer value, Qe = 1. We had explored conditional renormalization as a means of centering the estimator at the correct Qe. Langdon suggested using it to avoid the trap.
at the appropriate Qe value at a cost of about half as many bits in the compressed data string. This illustrates the trade-off that must be made between good coding efficiency for stationary probabilities and rapid estimation of changing probabilities.

A second-order process has been developed, however, which allows use of the 6-bit Qe table (retaining most of the coding efficiency for stable statistics) and yet provides very fast estimation of changing probabilities. This second-order process is based on measurement of the correlation of renormalizations. If the probability of a given renormalization is about 0.5, the probability of two renormalizations in sequence being the same is also about 0.5. Only if the current estimate of Qe is poor will the renormalization correlation probability significantly exceed 0.5.

Correlation of renormalizations is detected by comparing the previous renormalization (for a given context) to the current one. A 4-bit counter, R, kept individually for each context, is used to determine the degree to which this correlation is occurring. The counter is incremented by one if the renormalization is the same as the last and decremented by two if the renormalization is different. The one exception to this rule is at the minimum value of Qe. A spurious indication of correlation would result if R were incremented in that state. If R is zero, the schedule for change in the index shown
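The counter update just described can be sketched as follows. The +1/-2 rule and the special case at the minimum Qe state come directly from the text; the clamping of the 4-bit counter to the range 0..15 is an assumption, since the section is cut off before the full multi-rate schedule is given.

```python
def update_correlation_counter(r, same_renorm, at_min_qe):
    """Update the per-context 4-bit renormalization-correlation counter.

    r            -- current counter value, assumed clamped to 0..15 (4 bits)
    same_renorm  -- True if this renorm is the same type as the previous one
    at_min_qe    -- True at the minimum Qe state, where incrementing would
                    give a spurious indication of correlation
    """
    if same_renorm:
        if not at_min_qe:              # exception: no increment at minimum Qe
            r = min(r + 1, 15)         # +1 when consecutive renorms agree
    else:
        r = max(r - 2, 0)              # -2 when they differ
    return r

# Three matching renorms followed by one mismatch:
r = 0
for same in [True, True, True, False]:
    r = update_correlation_counter(r, same, at_min_qe=False)
print(r)   # -> 1
```

With a well-matched Qe the +1 and -2 steps roughly cancel, so the counter hovers near zero; a persistently poor estimate drives it upward, signaling that a faster estimation rate is warranted.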
