Practical Implementations of
Arithmetic Coding

Paul G. Howard and Jeffrey Scott Vitter

Brown University
Department of Computer Science
Technical Report No. 92-18
Revised version, April 1992
(Formerly Technical Report No. CS-91-45)
`
Appears in Image and Text Compression,
James A. Storer, ed., Kluwer Academic Publishers, Norwell, MA, 1992, pages 85-112.
`
A shortened version appears in the proceedings of the
International Conference on Advances in Communication and Control (COMCON 3),
Victoria, British Columbia, Canada, October 16-18, 1991.
`
`
PRACTICAL IMPLEMENTATIONS OF
ARITHMETIC CODING¹

Paul G. Howard²    Jeffrey Scott Vitter³

Department of Computer Science
Brown University
Providence, R.I. 02912-1910
`
Abstract

We provide a tutorial on arithmetic coding, showing how it provides nearly
optimal data compression and how it can be matched with almost any
probabilistic model. We indicate the main disadvantage of arithmetic coding,
its slowness, and give the basis of a fast, space-efficient, approximate
arithmetic coder with only minimal loss of compression efficiency. Our coder
is based on the replacement of arithmetic by table lookups coupled with a new
deterministic probability estimation scheme.

Index terms: Data compression, arithmetic coding, adaptive modeling, analysis
of algorithms, data structures, low precision arithmetic.
`
¹A similar version of this paper appears in Image and Text Compression, James A. Storer, ed.,
Kluwer Academic Publishers, Norwell, MA, 1992, 85-112. A shortened version of this paper appears
in the proceedings of the International Conference on Advances in Communication and Control
(COMCON 3), Victoria, British Columbia, Canada, October 16-18, 1991.
²Support was provided in part by NASA Graduate Student Researchers Program grant NGT-50420
and by a National Science Foundation Presidential Young Investigators Award grant with matching
funds from IBM. Additional support was provided by a Universities Space Research
Association/CESDIS associate membership.
³Support was provided in part by National Science Foundation Presidential Young Investigator
Award CCR-9047466 with matching funds from IBM, by NSF research grant CCR-9007851, by Army
Research Office grant DAAL03-91-G-0035, and by the Office of Naval Research and the Defense
Advanced Research Projects Agency under contract N00014-91-J-4052, ARPA Order No. 8225.
Additional support was provided by a Universities Space Research Association/CESDIS associate
membership.
`
`
`
`
`
1 Data Compression and Arithmetic Coding
`
Data can be compressed whenever some data symbols are more likely than others.
Shannon [54] showed that for the best possible compression code (in the sense
of minimum average code length), the output length contains a contribution of
− lg p bits from the encoding of each symbol whose probability of occurrence
is p. If we can provide an accurate model for the probability of occurrence of
each possible symbol at every point in a file, we can use arithmetic coding to
encode the symbols that actually occur; the number of bits used by arithmetic
coding to encode a symbol with probability p is very nearly − lg p, so the
encoding is very nearly optimal for the given probability estimates.

In this paper we show by theorems and examples how arithmetic coding achieves
its performance. We also point out some of the drawbacks of arithmetic coding
in practice, and propose a unified compression system for overcoming them. We
begin by attempting to clear up some of the false impressions commonly held
about arithmetic coding; it offers some genuine benefits, but it is not the
solution to all data compression problems.

The most important advantage of arithmetic coding is its flexibility: it can
be used in conjunction with any model that can provide a sequence of event
probabilities. This advantage is significant because large compression gains
can be obtained only through the use of sophisticated models of the input
data. Models used for arithmetic coding may be adaptive, and in fact a number
of independent models may be used in succession in coding a single file. This
great flexibility results from the sharp separation of the coder from the
modeling process [47]. There is a cost associated with this flexibility: the
interface between the model and the coder, while simple, places considerable
time and space demands on the model's data structures, especially in the case
of a multi-symbol input alphabet.

The other important advantage of arithmetic coding is its optimality.
Arithmetic coding is optimal in theory and very nearly optimal in practice, in
the sense of encoding using minimal average code length. This optimality is
often less important than it might seem, since Huffman coding [25] is also
very nearly optimal in most cases [8,9,18,39]. When the probability of some
single symbol is close to 1, however, arithmetic coding does give considerably
better compression than other methods. The case of highly unbalanced
probabilities occurs naturally in bilevel (black and white) image coding, and
it can also arise in the decomposition of a multi-symbol alphabet into a
sequence of binary choices.

The main disadvantage of arithmetic coding is that it tends to be slow. We
shall see that the full precision form of arithmetic coding requires at least
one multiplication per event and in some implementations up to two
multiplications and two divisions per event. In addition, the model lookup and
update operations are slow because of the input requirements of the coder.
Both Huffman coding and Ziv-Lempel [59,60] coding are faster because the model
is represented directly in the data structures used for coding. (This reduces
the coding efficiency of those methods by narrowing the range of possible
models.) Much of the current research in arithmetic coding concerns finding
approximations that increase coding speed without compromising compression
efficiency. The most common method is to use an approximation to the
multiplication operation [10,27,29,43]; in this paper we present an
alternative approach using table lookups and approximate probability
estimation.
Another disadvantage of arithmetic coding is that it does not in general
produce a prefix code. This precludes parallel coding with multiple
processors. In addition, the potentially unbounded output delay makes
real-time coding problematical in critical applications, but in practice the
delay seldom exceeds a few symbols, so this is not a major problem. A minor
disadvantage is the need to indicate the end of the file.

One final minor problem is that arithmetic codes have poor error resistance,
especially when used with adaptive models [5]. A single bit error in the
encoded file causes the decoder's internal state to be in error, making the
remainder of the decoded file wrong. In fact this is a drawback of all
adaptive codes, including Ziv-Lempel codes and adaptive Huffman codes
[12,15,18,26,55,56]. In practice, the poor error resistance of adaptive coding
is unimportant, since we can simply apply appropriate error correction coding
to the encoded file. More complicated solutions appear in [5,20], in which
errors are made easy to detect, and upon detection of an error, bits are
changed until no errors are detected.
`
Overview of this paper. In Section 2 we give a tutorial on arithmetic coding.
We include an introduction to modeling for text compression. We also restate
several important theorems from [22] relating to the optimality of arithmetic
coding in theory and in practice.

In Section 3 we present some of our current research into practical ways of
improving the speed of arithmetic coding without sacrificing much compression
efficiency. The center of this research is a reduced-precision arithmetic
coder, supported by efficient data structures for text modeling.
`
2 Tutorial on Arithmetic Coding
`
In this section we explain how arithmetic coding works and give implementation
details; our treatment is based on that of Witten, Neal, and Cleary [58]. We
point out the usefulness of binary arithmetic coding (that is, coding with a
2-symbol alphabet), and discuss the modeling issue, particularly high-order
Markov modeling for text compression. Our focus is on encoding, but the
decoding process is similar.
`
2.1 Arithmetic coding and its implementation
`
Basic algorithm. The algorithm for encoding a file using arithmetic coding
works conceptually as follows:
`
`
Figure 1: Subdivision of the current interval based on the probability of the
input symbol a_i that occurs next.
`
1. We begin with a "current interval" [L, H) initialized to [0, 1).

2. For each symbol of the file, we perform two steps (see Figure 1):

   (a) We subdivide the current interval into subintervals, one for each
       possible alphabet symbol. The size of a symbol's subinterval is
       proportional to the estimated probability that the symbol will be the
       next symbol in the file, according to the model of the input.

   (b) We select the subinterval corresponding to the symbol that actually
       occurs next in the file, and make it the new current interval.

3. We output enough bits to distinguish the final current interval from all
   other possible final intervals.
`
The length of the final subinterval is clearly equal to the product of the
probabilities of the individual symbols, which is the probability p of the
particular sequence of symbols in the file. The final step uses almost exactly
− lg p bits to distinguish the file from all other possible files. We need
some mechanism to indicate the end of the file, either a special end-of-file
symbol coded just once, or some external indication of the file's length.

In step 2, we need to compute only the subinterval corresponding to the symbol
a_i that actually occurs. To do this we need two cumulative probabilities,
$P_C = \sum_{k=1}^{i-1} p_k$ and $P_N = \sum_{k=1}^{i} p_k$. The new
subinterval is $[L + P_C(H - L),\, L + P_N(H - L))$. The need to maintain and
supply cumulative probabilities requires the model to have a complicated data
structure; Moffat [35] investigates this problem, and concludes for a
multi-symbol alphabet that binary search trees are about twice as fast as
move-to-front lists.
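
To make the subdivision step concrete, here is a minimal sketch in Python (our
illustration, not code from the paper; the function and symbol names are
ours). It applies the rule above with exact cumulative probabilities and no
incremental output, and reproduces the final interval of Example 1 below.

    # A sketch of the conceptual algorithm: subdivide [L, H) using the
    # cumulative probabilities P_C and P_N; no interval expansion yet.

    def encode(symbols, probs, message):
        """Return the final interval [L, H) for `message`, a symbol list."""
        cum = [0.0]                      # cum[i] = p_1 + ... + p_i
        for p in probs:
            cum.append(cum[-1] + p)
        L, H = 0.0, 1.0
        for s in message:
            i = symbols.index(s)
            width = H - L                # select the subinterval for s
            L, H = L + cum[i] * width, L + cum[i + 1] * width
        return L, H

    # The model of Example 1: pa = 0.4, pb = 0.5, pEOF = 0.1; file bbb + EOF.
    print(encode(["a", "b", "EOF"], [0.4, 0.5, 0.1], ["b", "b", "b", "EOF"]))
    # prints (approximately) (0.8125, 0.825)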
Example 1: We illustrate a non-adaptive code, encoding the file containing the
symbols bbb using arbitrary fixed probability estimates p_a = 0.4, p_b = 0.5,
and p_EOF = 0.1. Encoding proceeds as follows:
`
`
Current interval   Action      Subintervals                                      Input
                               a               b               EOF
[0.000, 1.000)     Subdivide   [0.000, 0.400)  [0.400, 0.900)  [0.900, 1.000)    b
[0.400, 0.900)     Subdivide   [0.400, 0.600)  [0.600, 0.850)  [0.850, 0.900)    b
[0.600, 0.850)     Subdivide   [0.600, 0.700)  [0.700, 0.825)  [0.825, 0.850)    b
[0.700, 0.825)     Subdivide   [0.700, 0.750)  [0.750, 0.812)  [0.812, 0.825)    EOF
[0.812, 0.825)
`
The final interval (without rounding) is [0.8125, 0.825), which in binary is
approximately [0.11010 00000, 0.11010 01100). We can uniquely identify this
interval by outputting 1101000. According to the fixed model, the probability
p of this particular file is (0.5)^3 (0.1) = 0.0125 (exactly the size of the
final interval) and the code length (in bits) should be − lg p = 6.322. In
practice we have to output 7 bits. □

The idea of arithmetic coding originated with Shannon in his seminal 1948
paper on information theory [54]. It was rediscovered by Elias about 15 years
later, as briefly mentioned in [1].
`
Implementation details. The basic implementation of arithmetic coding
described above has two major difficulties: the shrinking current interval
requires the use of high precision arithmetic, and no output is produced until
the entire file has been read. The most straightforward solution to both of
these problems is to output each leading bit as soon as it is known, and then
to double the length of the current interval so that it reflects only the
unknown part of the final interval. Witten, Neal, and Cleary [58] add a clever
mechanism for preventing the current interval from shrinking too much when the
endpoints are close to 1/2 but straddle 1/2. In that case we do not yet know
the next output bit, but we do know that whatever it is, the following bit
will have the opposite value; we merely keep track of that fact, and expand
the current interval symmetrically about 1/2. This follow-on procedure may be
repeated any number of times, so the current interval size is always longer
than 1/4.

Mechanisms for incremental transmission and fixed precision arithmetic have
been developed through the years by Pasco [40], Rissanen [48], Rubin [52],
Rissanen and Langdon [49], Guazzo [19], and Witten, Neal, and Cleary [58]. The
bit-stuffing idea of Langdon and others at IBM that limits the propagation of
carries in the additions is roughly equivalent to the follow-on procedure
described above.
We now describe in detail how the coding and interval expansion work. This
process takes place immediately after the selection of the subinterval
corresponding to an input symbol. We repeat the following steps (illustrated
schematically in Figure 2) as many times as possible; a code sketch of this
loop appears after the list:

a. If the new subinterval is not entirely within one of the intervals
   [0, 1/2), [1/4, 3/4), or [1/2, 1), we stop iterating and return.
`
`
`
`
Figure 2: Interval expansion process. (a) No expansion. (b) Interval in
[0, 1/2). (c) Interval in [1/2, 1). (d) Interval in [1/4, 3/4) (follow-on
case).
`
b. If the new subinterval lies entirely within [0, 1/2), we output 0 and any
   1s left over from previous symbols; then we double the size of the interval
   [0, 1/2), expanding toward the right.

c. If the new subinterval lies entirely within [1/2, 1), we output 1 and any
   0s left over from previous symbols; then we double the size of the interval
   [1/2, 1), expanding toward the left.

d. If the new subinterval lies entirely within [1/4, 3/4), we keep track of
   this fact for future output; then we double the size of the interval
   [1/4, 3/4), expanding in both directions away from the midpoint.
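
As promised above, here is one way to realize steps a-d in Python (our
illustration, not code from the paper), operating on real-valued intervals
with the follow-on count of step d represented as an integer:

    def expand(L, H, follow, bits):
        """Apply steps a-d to [L, H), returning the expanded interval, the
        pending follow-on count, and the accumulated output bit string."""
        while True:
            if H <= 0.5:                      # step b: interval in [0, 1/2)
                bits += "0" + "1" * follow    # emit 0, then deferred opposites
                follow = 0
                L, H = 2 * L, 2 * H
            elif L >= 0.5:                    # step c: interval in [1/2, 1)
                bits += "1" + "0" * follow    # emit 1, then deferred opposites
                follow = 0
                L, H = 2 * L - 1, 2 * H - 1
            elif 0.25 <= L and H <= 0.75:     # step d: middle half, follow-on
                follow += 1
                L, H = 2 * L - 0.5, 2 * H - 0.5
            else:                             # step a: no further expansion
                return L, H, follow, bits

For instance, expand(0.60, 0.85, 0, "") returns (0.20, 0.70, 0, "1"), matching
the first expansion in Example 2 below.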
`
Example 2: We show the details of encoding the same file as in Example 1.
`
`
Current interval   Action             Subintervals                                Input
                                      a             b             EOF
[0.00, 1.00)       Subdivide          [0.00, 0.40)  [0.40, 0.90)  [0.90, 1.00)    b
[0.40, 0.90)       Subdivide          [0.40, 0.60)  [0.60, 0.85)  [0.85, 0.90)    b
[0.60, 0.85)       Output 1
                   Expand [1/2, 1)
[0.20, 0.70)       Subdivide          [0.20, 0.40)  [0.40, 0.65)  [0.65, 0.70)    b
[0.40, 0.65)       follow
                   Expand [1/4, 3/4)
[0.30, 0.80)       Subdivide          [0.30, 0.50)  [0.50, 0.75)  [0.75, 0.80)    EOF
[0.75, 0.80)       Output 10
                   Expand [1/2, 1)
[0.50, 0.60)       Output 1
                   Expand [1/2, 1)
[0.00, 0.20)       Output 0
                   Expand [0, 1/2)
[0.00, 0.40)       Output 0
                   Expand [0, 1/2)
[0.00, 0.80)       Output 0
`
The "follow" output in the sixth line indicates the follow-on procedure: we
keep track of our knowledge that the next output bit will be followed by its
opposite; this "opposite" bit is the 0 output in the ninth line. The encoded
file is 1101000, as before. □
`
Clearly the current interval contains some information about the preceding
inputs; this information has not yet been output, so we can think of it as the
coder's state. If a is the length of the current interval, the state holds
− lg a bits not yet output. In the basic method (illustrated by Example 1) the
state contains all the information about the output, since nothing is output
until the end. In the implementation illustrated by Example 2, the state
always contains fewer than two bits of output information, since the length of
the current interval is always more than 1/4. The final state in Example 2 is
[0, 0.8), which contains − lg 0.8 = 0.322 bits of information.
`
Use of integer arithmetic. In practice, the arithmetic can be done by storing
the current interval in sufficiently long integers rather than in floating
point or exact rational numbers. (We can think of Example 2 as using the
integer interval [0, 100) by omitting all the decimal points.) We also use
integers for the frequency counts used to estimate symbol probabilities. The
subdivision process involves selecting non-overlapping intervals (of length at
least 1) with lengths approximately proportional to the counts. To encode
symbol a_i we need two cumulative counts, $C = \sum_{k=1}^{i-1} c_k$ and
$N = \sum_{k=1}^{i} c_k$, and the sum T of all counts,
$T = \sum_{k=1}^{n} c_k$. (Here and elsewhere we denote the alphabet size
by n.) The new subinterval is
$[L + \lfloor C(H-L)/T \rfloor,\, L + \lfloor N(H-L)/T \rfloor)$.
`
`
`
`
`
(In this discussion we continue to use half-open intervals as in the real
arithmetic case. In implementations [58] it is more convenient to subtract 1
from the right endpoints and use closed intervals. Moffat [36] considers the
calculation of cumulative frequency counts for large alphabets.)
`
Example 3: Suppose that at a certain point in the encoding we have symbol
counts c_a = 4, c_b = 5, and c_EOF = 1, and current interval [25, 89) from the
full interval [0, 128). Let the next input symbol be b. The cumulative counts
for b are C = 4 and N = 9, and T = 10, so the new interval is
$[25 + \lfloor 4(89-25)/10 \rfloor,\, 25 + \lfloor 9(89-25)/10 \rfloor)
= [50, 82)$; we then increment the follow-on count and expand the interval
once about the midpoint 64, giving [36, 100). It is possible to maintain
higher precision, truncating (and adjusting to avoid overlapping subintervals)
only when the expansion process is complete; this makes it possible to prove a
tight analytical bound on the lost compression caused by the use of integer
arithmetic, as we do in [22], restated as Theorem 1 below. In practice this
refinement makes the coding more difficult without improving compression. □
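
The integer computation of Example 3 can be checked directly; the following
few lines of Python (ours, with variable names chosen to match the text)
perform the subdivision with floor division and the single follow-on
expansion:

    FULL = 128                  # full interval [0, FULL); midpoint 64
    L, H = 25, 89               # current interval
    C, N, T = 4, 9, 10          # cumulative counts for b and the total count

    width = H - L
    L, H = L + (C * width) // T, L + (N * width) // T
    print(L, H)                 # 50 82  -- the new interval [50, 82)

    # [50, 82) lies within [32, 96), the middle half, so we apply step d:
    follow = 1
    L, H = 2 * L - FULL // 2, 2 * H - FULL // 2
    print(L, H, follow)         # 36 100 1  -- matches [36, 100)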
`
Analysis. In [22] we prove a number of theorems about the code lengths of
files coded with arithmetic coding. Most of the results involve the use of
arithmetic coding in conjunction with various models of the input; these will
be discussed in Section 2.3. Here we note two results that apply to
implementations of the arithmetic coder. The first shows that using integer
arithmetic has negligible effect on code length.
`
Theorem 1 If we use integers from the range [0, N) and use the high precision
algorithm for scaling up the subrange, the code length is provably bounded by
1/(N ln 2) bits per input symbol more than the ideal code length for the file.

For a typical value N = 65,536, the excess code length is less than 10^-4 bit
per input symbol.
The second result shows that if we indicate end-of-file by encoding a special
symbol just once for the entire file, the additional code length is
negligible.
`
Theorem 2 The use of a special end-of-file symbol when coding a file of length
t using integers from the range [0, N) results in additional code length of
less than 8t/(N ln 2) + lg N + 7 bits.
`
Again the extra code length is negligible, less than 0.01 bit per input symbol
for a typical 100,000 byte file.
Since we seldom know the exact probabilities of the process that generated an
input file, we would like to know how errors in the estimated probabilities
affect the code length. We can estimate the extra code length by a
straightforward asymptotic analysis. The average code length L for symbols
produced by a given model in a given state is given by

$$L = -\sum_{i=1}^{n} p_i \lg q_i,$$
`
`
`
`
where p_i is the actual probability of the ith alphabet symbol and q_i is its
estimated probability. The optimal average code length for symbols in the
state is the entropy of the state, given by
`
$$H = -\sum_{i=1}^{n} p_i \lg p_i.$$
The excess code length is E = L − H; if we let d_i = q_i − p_i and expand
asymptotically in d, we obtain
$$E = \frac{1}{2 \ln 2} \sum_{i=1}^{n} \frac{d_i^2}{p_i}
    + O\!\left(\sum_{i=1}^{n} \frac{d_i^3}{p_i^2}\right).$$
(This corrects a similar derivation in [5], in which the factor of 1/ln 2 is
omitted.) The vanishing of the linear terms means that small errors in the
probabilities used by the coder lead to very small increases in code length.
Because of this property, any coding method that uses approximately correct
probabilities will achieve a code length close to the entropy of the
underlying source. We use this fact in Section 3.1 to design a class of fast
approximate arithmetic coders with small compression loss.
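
As a quick numeric check of this approximation (our illustration, not from the
paper): for a binary source with true probabilities p = (0.5, 0.5) coded with
estimates q = (0.6, 0.4), the exact excess is about 0.0295 bit per symbol and
the quadratic term alone gives about 0.0289.

    # Numeric check of the asymptotic excess-code-length formula.
    from math import log2, log

    p, q = (0.5, 0.5), (0.6, 0.4)
    L = -sum(pi * log2(qi) for pi, qi in zip(p, q))   # average code length
    H = -sum(pi * log2(pi) for pi in p)               # entropy of the state
    d = [qi - pi for pi, qi in zip(p, q)]
    approx = sum(di**2 / pi for pi, di in zip(p, d)) / (2 * log(2))
    print(L - H, approx)   # ~0.0295 vs ~0.0289: the quadratic term dominates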
`
`
`
`
`
`
2.2 Binary arithmetic coding

The preceding discussion and analysis has focused on coding with a
multi-symbol alphabet, although in principle it applies to a binary alphabet
as well. It is useful to distinguish the two cases since both the coder and
the interface to the model are simpler for a binary alphabet. The coding of
bilevel images, an important problem with a natural two-symbol alphabet, often
produces probabilities close to 1, indicating the use of arithmetic coding to
obtain good compression.