A Universal Algorithm for Sequential Data Compression

JACOB ZIV, FELLOW, IEEE, AND ABRAHAM LEMPEL, MEMBER, IEEE

Abstract—A universal algorithm for sequential data compression is presented. Its performance is investigated with respect to a nonprobabilistic model of constrained sources. The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source.

I. INTRODUCTION

IN MANY situations arising in digital communications and data processing, the encountered strings of data display various structural regularities or are otherwise subject to certain constraints, thereby allowing for storage- and time-saving techniques of data compression. Given a discrete data source, the problem of data compression is first to identify the limitations of the source, and second to devise a coding scheme which, subject to certain performance criteria, will best compress the given source.

Once the relevant source parameters have been identified, the problem reduces to one of minimum-redundancy coding. This phase of the problem has received extensive treatment in the literature [1]-[7].

When no a priori knowledge of the source characteristics is available, and if statistical tests are either impossible or unreliable, the problem of data compression becomes considerably more complicated. In order to overcome these difficulties one must resort to universal coding schemes whereby the coding process is interlaced with a learning process for the varying source characteristics [8], [9]. Such coding schemes inevitably require a larger working memory space and generally employ performance criteria that are appropriate for a wide variety of sources.

In this paper, we describe a universal coding scheme which can be applied to any discrete source and whose performance is comparable to certain optimal fixed code book schemes designed for completely specified sources. For lack of adequate criteria, we do not attempt to rank the proposed scheme with respect to other possible universal coding schemes. Instead, for the broad class of sources defined in Section III, we derive upper bounds on the compression efficiency attainable with full a priori knowledge of the source by fixed code book schemes, and

Manuscript received June 23, 1975; revised July 6, 1976. This paper was previously presented at the IEEE International Symposium on Information Theory, Ronneby, Sweden, June 21-24, 1976.
J. Ziv was with the Department of Electrical Engineering, Technion—Israel Institute of Technology, Haifa, Israel. He is now with Bell Telephone Laboratories, Murray Hill, NJ 07974.
A. Lempel was with the Department of Electrical Engineering, Technion—Israel Institute of Technology, Haifa, Israel. He is now with the Sperry Research Center, Sudbury, MA 01776.

then show that the efficiency of our universal code, with no a priori knowledge of the source, approaches those bounds.

The proposed compression algorithm is an adaptation of a simple copying procedure discussed recently [10] in a study on the complexity of finite sequences. Basically, we employ the concept of encoding future segments of the source output via maximum-length copying from a buffer containing the recent past output. The transmitted codeword consists of the buffer address and the length of the copied segment. With a predetermined initial load of the buffer and the information contained in the codewords, the source data can readily be reconstructed at the decoding end of the process.

The main drawback of the proposed algorithm is its susceptibility to error propagation in the event of a channel error.

II. THE COMPRESSION ALGORITHM

The proposed compression algorithm consists of a rule for parsing strings of symbols from a finite alphabet $A$ into substrings, or words, whose lengths do not exceed a prescribed integer $L_s$, and a coding scheme which maps these substrings sequentially into uniquely decipherable codewords of fixed length $L_c$ over the same alphabet $A$.

The word-length bounds $L_s$ and $L_c$ allow for bounded-delay encoding and decoding, and they are related by
$$L_c = 1 + \lceil \log (n - L_s) \rceil + \lceil \log L_s \rceil, \tag{1}$$
where $\lceil x \rceil$ is the least integer not smaller than $x$, the logarithm base is the cardinality $\alpha$ of the alphabet $A$, and $n$ is the length of a buffer, employed at the encoding end of the process, which stores the latest $n$ symbols emitted by the source. The exact relationship between $n$ and $L_s$ is discussed in Section III. Typically, $n \approx L_s \alpha^{h L_s}$, where $0 < h < 1$. For on-line decoding, a buffer of similar length has to be employed at the decoding end also.

To describe the exact mechanics of the parsing and coding procedures, we need some preparation by way of notation and definitions.

Consider a finite alphabet $A$ of $\alpha$ symbols, say $A = \{0, 1, \cdots, \alpha - 1\}$. A string, or word, $S$ of length $\ell(S) = k$ over $A$ is an ordered $k$-tuple $S = s_1 s_2 \cdots s_k$ of symbols from $A$. To indicate a substring of $S$ which starts at position $i$ and ends at position $j$, we write $S(i,j)$. When $i \le j$, $S(i,j) = s_i s_{i+1} \cdots s_j$, but when $i > j$, we take $S(i,j) = \Lambda$, the null string of length zero.

The concatenation of strings $Q$ and $R$ forms a new string $S = QR$; if $\ell(Q) = k$ and $\ell(R) = m$, then $\ell(S) = k + m$, $Q = S(1,k)$, and $R = S(k+1, k+m)$.
For each $j$, $0 \le j \le \ell(S)$, $S(1,j)$ is called a prefix of $S$; $S(1,j)$ is a proper prefix of $S$ if $j < \ell(S)$.

Given a proper prefix $S(1,j)$ of a string $S$ and a positive integer $i$ such that $i \le j$, let $L(i)$ denote the largest nonnegative integer $\ell \le \ell(S) - j$ such that
$$S(i, i + \ell - 1) = S(j + 1, j + \ell),$$
and let $p$ be a position of $S(1,j)$ for which
$$L(p) = \max_{1 \le i \le j} \{L(i)\}.$$
The substring $S(j+1, j+L(p))$ of $S$ is called the reproducible extension of $S(1,j)$ into $S$, and the integer $p$ is called the pointer of the reproduction. For example, if $S = 00101011$ and $j = 3$, then $L(1) = 1$ since $S(j+1, j+1) = S(1,1)$ but $S(j+1, j+2) \ne S(1,2)$. Similarly, $L(2) = 4$ and $L(3) = 0$. Hence, $S(3+1, 3+4) = 0101$ is the reproducible extension of $S(1,3) = 001$ into $S$ with pointer $p = 2$.

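The following Python sketch is an editorial illustration (not part of the original paper) of this definition: it finds the pointer $p$ and match length $L(p)$ by direct search, with positions 1-indexed to match the notation and ties broken toward the largest pointer, the convention used in the worked example of Section II.

```python
def reproducible_extension(s: str, j: int):
    """Return (p, L(p)) for the reproducible extension of S(1, j) into S.

    `s` is the full string S; positions are 1-indexed as in the text, and
    the copied segment may run past position j (self-overlapping copying).
    """
    best_p, best_len = 1, 0
    for p in range(1, j + 1):                    # candidate pointers 1..j
        length = 0
        while (j + length < len(s)               # enforce L(i) <= l(S) - j
               and s[p - 1 + length] == s[j + length]):
            length += 1
        if length >= best_len:                   # break ties toward larger p
            best_p, best_len = p, length
    return best_p, best_len

# The paper's example: S = 00101011, j = 3 yields pointer 2, length 4.
assert reproducible_extension("00101011", 3) == (2, 4)
```
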
Now, to describe the encoding process, let $S = s_1 s_2 \cdots$ denote the string of symbols emitted by the source. The sequential encoding of $S$ entails parsing $S$ into successive source words, $S = S_1 S_2 \cdots$, and assigning a codeword $C_i$ for each $S_i$. For bounded-delay encoding, the length $\ell_i$ of each $S_i$ is at most equal to a predetermined parameter $L_s$, while each $C_i$ is of fixed length $L_c$ as given by (1).

To initiate the encoding process, we assume that the output $S$ of the source was preceded by a string $Z$ of $n - L_s$ zeros, and we store the string $B_1 = ZS(1, L_s)$ in the buffer. If $S(1,j)$ is the reproducible extension of $Z$ into $ZS(1, L_s - 1)$, then $S_1 = S(1, j+1)$ and $\ell_1 = j + 1$. To determine the next source word, we shift out the first $\ell_1$ symbols from the buffer and feed into it the next $\ell_1$ symbols of $S$ to obtain the string $B_2 = B_1(\ell_1 + 1, n) S(L_s + 1, L_s + \ell_1)$. Now we look for the reproducible extension $E$ of $B_2(1, n - L_s)$ into $B_2(1, n - 1)$, and set $S_2 = Es$, where $s$ is the symbol next to $E$ in $B_2$. In general, if $B_i$ denotes the string of $n$ source symbols stored in the buffer when we are ready to determine the $i$th source word $S_i$, the successive encoding steps can be formally described as follows.

1) Initially, set $B_1 = 0^{n - L_s} S(1, L_s)$, i.e., the all-zero string of length $n - L_s$, followed by the first $L_s$ symbols of $S$.

2) Having determined $B_i$, $i \ge 1$, set
$$S_i = B_i(n - L_s + 1, n - L_s + \ell_i), \tag{2}$$
where the prefix of length $\ell_i - 1$ of $S_i$ is the reproducible extension of $B_i(1, n - L_s)$ into $B_i(1, n - 1)$.

3) If $p_i$ is the reproduction pointer used to determine $S_i$, then the codeword $C_i$ for $S_i$ is given by
$$C_i = C_{i1} C_{i2} C_{i3},$$
where $C_{i1}$ is the radix-$\alpha$ representation of $p_i - 1$ with $\ell(C_{i1}) = \lceil \log (n - L_s) \rceil$, $C_{i2}$ is the radix-$\alpha$ representation of $\ell_i - 1$ with $\ell(C_{i2}) = \lceil \log L_s \rceil$, and $C_{i3}$ is the last symbol of $S_i$, i.e., the symbol occupying position $n - L_s + \ell_i$ of $B_i$. The total length of $C_i$ is given by
$$\ell(C_i) = \lceil \log (n - L_s) \rceil + \lceil \log L_s \rceil + 1,$$
in accordance with (1).

4) To update the contents of the buffer, shift out the symbols occupying the first $\ell_i$ positions of the buffer while feeding in the next $\ell_i$ symbols from the source to obtain
$$B_{i+1} = B_i(\ell_i + 1, n) S(h_i + 1, h_i + \ell_i),$$
where $h_i$ is the position of $S$ occupied by the last symbol of $B_i$.

This completes the description of the encoding process. It is easy to verify that the parsing rule defined by (2) guarantees a bounded, positive source-word length in each iteration; in fact, $1 \le \ell_i \le L_s$ for each $i$, thus allowing for a radix-$\alpha$ representation of $\ell_i - 1$ with $\lceil \log L_s \rceil$ symbols from $A$. Also, since $1 \le p_i \le n - L_s$ for each $i$, it is possible to represent $p_i - 1$ with $\lceil \log (n - L_s) \rceil$ symbols from $A$.

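As an editorial illustration of steps 1)-4) (a sketch, not the authors' implementation), the following Python routine drives the parsing loop. It assumes the `reproducible_extension` helper sketched earlier; the radix-$\alpha$ digit strings assume $\alpha \le 10$, and handling of the very end of the source is simplified (a production coder would pad the tail).

```python
def encode(source: str, alpha: int, L_s: int, n: int) -> list:
    """Parse `source` by steps 1)-4) and emit codewords C_i = C_i1 C_i2 C_i3."""
    def digits_needed(x):                 # smallest d with alpha**d >= x
        d, power = 0, 1
        while power < x:
            d, power = d + 1, power * alpha
        return d

    def radix(x, width):                  # fixed-width radix-alpha string
        out = ""
        for _ in range(width):
            out = str(x % alpha) + out
            x //= alpha
        return out

    p_digits = digits_needed(n - L_s)     # length of C_i1
    l_digits = digits_needed(L_s)         # length of C_i2
    buf = "0" * (n - L_s) + source[:L_s]  # step 1): B_1 = 0^{n-L_s} S(1, L_s)
    fed, done, codewords = L_s, 0, []
    while done < len(source):
        # step 2): reproducible extension of B_i(1, n-L_s) into B_i(1, n-1),
        # plus one more symbol, gives the source word S_i
        p, l = reproducible_extension(buf[:n - 1], n - L_s)
        word = buf[n - L_s : n - L_s + l + 1]
        # step 3): C_i = pointer digits, length digits, last symbol of S_i
        codewords.append(radix(p - 1, p_digits)
                         + radix(len(word) - 1, l_digits) + word[-1])
        done += len(word)
        # step 4): shift out l_i symbols, feed in the next l_i source symbols
        buf = buf[len(word):] + source[fed:fed + len(word)]
        fed += len(word)
    return codewords
```
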
Decoding can be performed simply by reversing the encoding process. Here we employ a buffer of length $n - L_s$ to store the latest decoded source symbols. Initially, the buffer is loaded with $n - L_s$ zeros. If $D_i = d_1 d_2 \cdots d_{n - L_s}$ denotes the contents of the buffer after $C_{i-1}$ has been decoded into $S_{i-1}$, then
$$S_{i-1} = D_i(n - L_s - \ell_{i-1} + 1, n - L_s),$$
where $\ell_{i-1} = \ell(S_{i-1})$, and where $D_{i+1}$ can be obtained from $D_i$ and $C_i$ as follows.

Determine $p_i - 1$ and $\ell_i - 1$ from the first $\lceil \log (n - L_s) \rceil$ and the next $\lceil \log L_s \rceil$ symbols of $C_i$. Then, apply $\ell_i - 1$ shifts while feeding the contents of stage $p_i$ into stage $n - L_s$. The first of these shifts will change the buffer contents from $D_i$ to
$$D_i^{(1)} = d_2 d_3 \cdots d_{n - L_s} d_{p_i} = d_1^{(1)} d_2^{(1)} \cdots d_{n - L_s}^{(1)}.$$
Similarly, if $j \le \ell_i - 1$, the $j$th shift will transform $D_i^{(j-1)} = d_1^{(j-1)} d_2^{(j-1)} \cdots d_{n - L_s}^{(j-1)}$ into $D_i^{(j)} = d_2^{(j-1)} \cdots d_{n - L_s}^{(j-1)} d_{p_i}^{(j-1)} = d_1^{(j)} d_2^{(j)} \cdots d_{n - L_s}^{(j)}$. After these $\ell_i - 1$ shifts are completed, shift once more, while feeding the last symbol of $C_i$ into stage $n - L_s$ of the buffer. It is easy to verify that the resulting load of the buffer contains $S_i$ in its last $\ell_i = \ell(S_i)$ positions.

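The shift-register decoding just described can be mimicked with a list, as in this editorial sketch (same assumptions as the encoder sketch above); `int(digits, alpha)` undoes the fixed-width radix-$\alpha$ representation.

```python
def decode(codewords, alpha: int, L_s: int, n: int) -> str:
    """Invert the encoder with a buffer of n - L_s stages, as described above."""
    def digits_needed(x):                 # smallest d with alpha**d >= x
        d, power = 0, 1
        while power < x:
            d, power = d + 1, power * alpha
        return d

    p_digits = digits_needed(n - L_s)
    l_digits = digits_needed(L_s)
    buf = ["0"] * (n - L_s)               # initial load: all zeros
    out = []
    for c in codewords:
        p = int(c[:p_digits], alpha) + 1                     # pointer p_i
        l = int(c[p_digits:p_digits + l_digits], alpha) + 1  # word length l_i
        for _ in range(l - 1):            # l_i - 1 shifts; stage p_i feeds the end
            buf.append(buf[p - 1])
            buf.pop(0)
        buf.append(c[-1])                 # final shift feeds the literal symbol
        buf.pop(0)
        out.extend(buf[-l:])              # S_i occupies the last l_i stages
    return "".join(out)
```
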
The following example will serve to illustrate the mechanics of the algorithm. Consider the ternary ($\alpha = 3$) input string
$$S = 001010210210212021021200 \cdots,$$
and an encoder with parameters $L_s = 9$ and $n = 18$. (These parameters were chosen to simplify the illustration; they do not reflect the design considerations to be discussed in Section III.) According to (1), the corresponding codeword length is given by
$$L_c = 1 + \log_3 (18 - 9) + \log_3 9 = 5.$$
Initially, the buffer is loaded with $n - L_s = 9$ zeros, followed by the first $L_s = 9$ digits of $S$, namely,
$$B_1 = \underbrace{000000000}_{n - L_s = 9}\,\underbrace{001010210}_{L_s = 9}.$$

To determine the first source word $S_1$, we have to find the longest prefix $B_1(10, 9 + \ell_1 - 1)$ of
$$B_1(10, 17) = 00101021$$
which matches a substring of $B_1$ that starts in position $p_1 \le 9$, and then set $S_1 = B_1(10, 9 + \ell_1)$. It is easily seen that the longest match in this case is $B_1(10, 11) = 00$, and hence $S_1 = 001$ and $\ell_1 = 3$. The pointer $p_1$ for this step can be any integer between one and nine; we choose to set $p_1 = 9$. The two-digit radix-3 representation of $p_1 - 1$ is $C_{11} = 22$, and that of $\ell_1 - 1$ is $C_{12} = 02$. Since $C_{i3}$ is always equal to the last symbol of $S_i$, the codeword for $S_1$ is given by $C_1 = 22021$.

To obtain the buffer load $B_2$ for the second step, we shift out the first $\ell_1 = 3$ digits of $B_1$ and feed in the next 3 digits $S(10, 12) = 210$ of the input string $S$. The details of steps 2, 3, and 4 are tabulated below; for each buffer load $B_i$ we list the pointer position $p_i$ (marked by an arrow in the original) and the source word $S_i$ (set in italics in the original), together with the resulting codeword:

$B_2 = 000000001\,010210210$, with $p_2 = 8$ and $S_2 = 0102$: $C_2 = 21102$;

$B_3 = 000010102\,102102120$, with $p_3 = 7$ and $S_3 = 10210212$: $C_3 = 20212$;

$B_4 = 210210212\,021021200$, with $p_4 = 3$ and $S_4 = 021021200$: $C_4 = 02220$.

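Running the encoder and decoder sketches given earlier on this input (an editorial check, not from the paper) reproduces the four codewords and recovers the source exactly; 24 source digits map into $4 \times 5 = 20$ code digits.

```python
S = "001010210210212021021200"
C = encode(S, alpha=3, L_s=9, n=18)
print(C)                                  # ['22021', '21102', '20212', '02220']
assert decode(C, alpha=3, L_s=9, n=18) == S
```
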
III. COMPRESSION OF CONSTRAINED SOURCES

In this section, we investigate the performance of the proposed compression algorithm with respect to a nonprobabilistic model of constrained information sources. After defining the source model, we derive lower bounds on the compression ratios attainable by block-to-variable and variable-to-block codes under full knowledge of the source, and then show that the compression ratio achieved by our universal code approaches these bounds.

A. Definition of the Source Model

Let $A = \{0, 1, \cdots, \alpha - 1\}$ be the given $\alpha$-symbol alphabet, and let $A^*$ denote the set of all finite strings over $A$. Given a string $S \in A^*$ and a positive integer $m \le \ell(S)$, let $S\{m\}$ denote the set of all substrings of length $m$ contained in $S$, and let $S(m)$ denote the cardinality of $S\{m\}$. That is,
$$S\{m\} = \bigcup_{i=0}^{\ell(S) - m} \{S(i+1, i+m)\}, \qquad S(m) = |S\{m\}|.$$

Given a subset $\sigma$ of $A^*$, let
$$\sigma\{m\} = \{S \in \sigma \mid \ell(S) = m\},$$
and let $\sigma(m)$ denote the cardinality of $\sigma\{m\}$.

A subset $\sigma$ of $A^*$ is called a source if the following three properties hold:

1) $A \subset \sigma$ (i.e., $\sigma$ contains all the unit-length strings),
2) $S \in \sigma$ implies $SS \in \sigma$,
3) $S \in \sigma$ implies $S\{m\} \subset \sigma\{m\}$.

Typically, such a source $\sigma$ is defined by specifying a finite set of strings over $A$ which are forbidden to appear as substrings of elements belonging to $\sigma$, and therefore $\sigma(m) < \alpha^m$ for all $m$ exceeding some $m_0$.

With every source $\sigma$, we associate a sequence $h(1), h(2), \cdots$ of parameters, called the $h$-parameters of $\sigma$, where¹
$$h(m) = \frac{1}{m} \log \sigma(m), \qquad m = 1, 2, \cdots.$$
It is clear that $0 \le h(m) \le 1$ for all $m$ and, by 2), it is also clear that $m h(m)$ is a nondecreasing function of $m$. The sequence of $h$-parameters, however, is usually nonincreasing in $m$. To avoid any possible confusion in the sequel, we postulate this property as an additional defining property of a source. Namely, we require

4) $h(m) = \frac{1}{m} \log \sigma(m)$ is a nonincreasing function of $m$.

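As a concrete editorial illustration, consider binary strings in which the substring 11 is forbidden, a constraint of the forbidden-substring type just mentioned (used here only to exhibit the behavior of $\sigma(m)$ and $h(m)$, computed directly from the definition above):

```python
import math
from itertools import product

def sigma(m):
    """Number of binary strings of length m avoiding the substring 11."""
    return sum("11" not in "".join(w) for w in product("01", repeat=m))

for m in range(1, 9):
    print(m, sigma(m), round(math.log2(sigma(m)) / m, 3))
# sigma(m) = 2, 3, 5, 8, 13, ... (Fibonacci) and h(m) = 1.0, 0.792, 0.774,
# 0.75, ... is nonincreasing, approaching log2((1 + sqrt(5)) / 2) ~ 0.694.
```
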
B. Some Lower Bounds on the Compression Ratio

Consider a compression coding scheme for a source $\sigma$ which employs a block-to-variable (BV) code book of $M$ pairs $(X_i, Y_i)$ of words over $A$, with $\ell(X_i) = L$ for $i = 1, 2, \cdots, M$. The encoding of an infinitely long string $S \in \sigma$ by such a code is carried out by first parsing $S$ into blocks of length $L$, and then replacing each block $X_i$ by the corresponding codeword $Y_i$. It is assumed, of course, that the code book is exhaustive with respect to $\sigma$ and uniquely decipherable [2]. Hence, we must have
$$\{X_i\}_{i=1}^{M} = \sigma\{L\}, \qquad M = \sigma(L) = \alpha^{L h(L)}, \tag{3}$$
and
$$\max_{1 \le i \le M} \ell(Y_i) \ge \log M = L h(L). \tag{4}$$

The compression ratio $\rho_i$ associated with the $i$th word-pair of the code is given by
$$\rho_i = \frac{\ell(Y_i)}{L}.$$
The BV compression ratio $\rho_{BV}(\sigma, M)$ of the source $\sigma$ is defined as the minimax value of $\rho_i$, where the maximization is over all word-pairs of a given code, and the minimization is over the set $C_{BV}(\sigma, M)$ of all BV code books consisting of $M$ word-pairs. Thus,
$$\rho_{BV}(\sigma, M) = \min_{C_{BV}(\sigma, M)} \max_{1 \le i \le M} \frac{\ell(Y_i)}{L} \ge \frac{\log M}{L} = h(L).$$

¹Throughout this paper, $\log x$ means the base-$\alpha$ logarithm of $x$.

For later reference, we record this result in the following lemma.

Lemma 1:
$$\rho_{BV}(\sigma, M) \ge h(L),$$
where $L h(L) = \log M$.

Now, consider a compression scheme which employs a variable-to-block (VB) code book of $M$ word-pairs $(X_i, Y_i)$, with $\ell(Y_i) = L$ for all $i = 1, 2, \cdots, M$. In this case, the compression ratio associated with the $i$th word-pair is given by
$$\rho_i = \frac{L}{\ell(X_i)},$$
and similarly, the VB compression ratio $\rho_{VB}(\sigma, M)$ of $\sigma$ is defined as the minimax value of $\rho_i$ over all word-pairs and over the set $C_{VB}(\sigma, M)$ of VB code books with $M$ word-pairs.

Lemma 2:
$$\rho_{VB}(\sigma, M) \ge h(L_M),$$
where
$$L_M = \max \{\ell \mid M \ge \sigma(\ell)\}.$$

Proof: We may assume, without loss of generality, that in every code under consideration
$$\ell(X_1) \le \ell(X_2) \le \cdots \le \ell(X_M),$$
and hence for each $C \in C_{VB}(\sigma, M)$,
$$\max_i \rho_i(C) = \frac{L(C)}{\ell(X_1)}. \tag{5}$$
Since $C$ is exhaustive with respect to $\sigma$, we have
$$M \ge \sigma(\ell_1), \tag{6}$$
where $\ell_1 = \ell(X_1)$; and since $C$ is uniquely decipherable, we have
$$L(C) \ge \log M. \tag{7}$$

From the definition of $L_M$, inequality (6), and the nondecreasing property of $\sigma(\ell)$, we obtain
$$\ell_1 \le L_M. \tag{8}$$
From (5), (7), and (8), we have
$$\max_i \rho_i(C) \ge \frac{\log M}{L_M},$$
and since
$$M \ge \sigma(L_M) = \alpha^{L_M h(L_M)},$$
it follows that for each $C \in C_{VB}(\sigma, M)$,
$$\max_i \rho_i(C) \ge h(L_M). \qquad \text{Q.E.D.}$$

Remarks:

1) Since the value of $L$ in the context of Lemma 1 satisfies the definition of $L_M$ as given in Lemma 2, it follows that the bounds of both lemmas are essentially the same, despite the basic difference between the respective coding schemes.

2) By 2), the second defining property of a source, the per-word bounds derived above apply to indefinitely long messages as well, since the whole message may consist of repeated appearances of the same worst case word.

3) By 4), the nonincreasing property of the $h$-parameters, the form of the derived bounds confirms the intuitive expectation that an increase in the size $M$ of the employed code book causes a decrease in the lower bound on the attainable compression ratio.

C. Performance of the Proposed Algorithm

We proceed now to derive an upper bound on the compression ratio attainable by the algorithm of Section II. To this end, we consider the worst case source message of length $n - L_s$, where $n$ is the prospective buffer length and $L_s$ is the maximal word-length. The bound obtained for this case will obviously apply to all messages of length $n - L_s$ or greater.

First, we assume that only the $h$-parameters of the source under consideration are known to the designer of the encoder. Later, when we discuss the universal performance of the proposed algorithm, we will show that even this restricted a priori knowledge of the source is actually unessential.

We begin by choosing the buffer length $n$ to be an integer of the form
$$n = \sum_{m=1}^{\lambda} m \alpha^m + \sum_{m=\lambda+1}^{\ell} m \sigma(\ell) + (\ell + 1)(N_{\ell+1} + 1), \tag{9}$$
where
$$N_{\ell+1} = \sum_{m=1}^{\lambda} (\ell - m) \alpha^m + \sum_{m=\lambda+1}^{\ell} (\ell - m) \sigma(\ell), \tag{10}$$
$\lambda = \lfloor \ell h(\ell) \rfloor$ is the integer part of $\log \sigma(\ell) = \ell h(\ell)$, and
$$\ell = L_s - 1. \tag{11}$$
The specific value of the parameter $L_s$ is left for later determination (see the first remark following the proof of Theorem 1). The reasoning that motivates the given form of $n$ will become clear from subsequent derivations.

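For a numerical feel, the following editorial sketch (with assumed parameter values, taking $\sigma(\ell) = \alpha^{\ell h(\ell)}$) evaluates (9)-(11) for a given $\ell = L_s - 1$ and $h(\ell)$; $n$ grows exponentially in $L_s$, in line with the relation $n \approx L_s \alpha^{h L_s}$ quoted in Section II.

```python
def buffer_length(alpha, ell, h):
    """Evaluate n of (9), with N_{ell+1} from (10) and lambda from (11)."""
    lam = int(ell * h)                        # lambda = floor(ell * h(ell))
    sig = alpha ** (ell * h)                  # sigma(ell) = alpha^{ell h(ell)}
    cnt = lambda m: alpha ** m if m <= lam else sig
    N_next = sum((ell - m) * cnt(m) for m in range(1, ell + 1))       # (10)
    n = sum(m * cnt(m) for m in range(1, ell + 1)) + (ell + 1) * (N_next + 1)
    return round(n)                           # (9)

for ell in (7, 15, 31):                       # ell = L_s - 1
    print(ell + 1, buffer_length(alpha=2, ell=ell, h=0.5))
```
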
Consider a string $S \in \sigma\{n - L_s\}$, and let $N(S)$ denote the number of words into which $S$ is parsed by the algorithm during the encoding process. Recalling that each of these words is mapped into a codeword of fixed length $L_c$ (see (1)), it follows that the compression ratio $\rho(S)$ associated with the string $S$ is given by
$$\rho(S) = \frac{L_c}{n - L_s} N(S).$$

Hence, the compression ratio $\rho$ attainable by the algorithm for a given source $\sigma$ is
$$\rho = \frac{L_c}{n - L_s} N, \tag{12}$$
where
$$N = \max_{S \in \sigma\{n - L_s\}} N(S).$$

Let $Q \in \sigma\{n - L_s\}$ be such that $N(Q) = N$, and suppose that the algorithm parses $Q$ into $Q = Q_1 Q_2 \cdots Q_N$. From step 2) of the encoding cycle it follows that if $\ell(Q_i) = \ell(Q_j) < L_s$ for some $i < j < N$, then $Q_i \ne Q_j$. (Note that when $Q_j$ is being determined at the $j$th cycle of the encoding process, all of $Q_i$ is still stored in the buffer, and since $\ell(Q_i) < L_s$, the longest substring in the buffer that precedes $Q_j$ and is a prefix of $Q_j$ must be of length $\ell(Q_j) - 1$.)

Denoting by $K_m$ the number of $Q_i$, $1 \le i \le N - 1$, of length $m$, $1 \le m \le L_s$, we have
$$N = 1 + \sum_{m=1}^{L_s} K_m.$$
By the above argument, and by property 3) of the source, we have
$$K_m \le \sigma(m), \qquad \text{for } 1 \le m \le \ell = L_s - 1.$$
Since
$$\sum_{m=1}^{L_s} m K_m + \ell(Q_N) = n - L_s,$$
and $n$ and $L_s$ are both fixed, it is clear that by overestimating the values of $K_m$ for $1 \le m \le \ell$ at the expense of $K_{\ell+1}$, we can only overestimate the value of $N$. Therefore, since $\sigma(m) \le \sigma(m+1)$ and $\sigma(m) = \alpha^{m h(m)} \le \alpha^m$, we obtain
$$N \le K'_{\ell+1} + \sum_{m=1}^{\ell} K'_m = N', \tag{13}$$
where
$$K'_m = \begin{cases} \alpha^m, & \text{for } 1 \le m \le \lambda = \lfloor \ell h(\ell) \rfloor \\ \sigma(\ell), & \text{for } \lambda < m \le \ell \end{cases} \tag{14}$$
and
$$\sum_{m=1}^{\ell} m K'_m + (\ell + 1) K'_{\ell+1} = n - L_s. \tag{15}$$
From (14), (15), and (9), we obtain $K'_{\ell+1} = N_{\ell+1}$ and
$$N' = N_{\ell+1} + \sum_{m=1}^{\lambda} \alpha^m + \sum_{m=\lambda+1}^{\ell} \sigma(\ell),$$
which, together with (9) and (10), yields
$$n - L_s = \sum_{m=1}^{\lambda} m \alpha^m + \sum_{m=\lambda+1}^{\ell} m \sigma(\ell) + (\ell + 1) N_{\ell+1} = \ell \left[ N_{\ell+1} + \sum_{m=1}^{\lambda} \alpha^m + \sum_{m=\lambda+1}^{\ell} \sigma(\ell) \right] = \ell N'.$$
Hence,
$$N \le N' = \frac{n - L_s}{\ell} = \frac{n - L_s}{L_s - 1}, \tag{16}$$
and from (12) and (16), we have
$$\rho \le \frac{L_c}{L_s - 1}. \tag{17}$$
Note that despite the rather rudimentary overestimation of $N$ by $N'$, the upper bound of (17) is quite tight, since the fact that no source word is longer than $L_s$ immediately implies $\rho \ge L_c / L_s$.

We can state now the following result.

Theorem 1: If the buffer length $n$ for a source with known $h$-parameters is chosen according to (9), then
$$\rho \le h(L_s - 1) + \varepsilon(L_s),$$
where
$$\varepsilon(L_s) = \frac{3 + 3 \log (L_s - 1) + \log (L_s / 2)}{L_s - 1}.$$

Proof: From (1) we have
$$L_c = 1 + \lceil \log L_s \rceil + \lceil \log (n - L_s) \rceil \le 3 + \log (L_s - 1) + \log (n - L_s).$$
From (9) and (10) we obtain
$$n - L_s = \ell \left[ \sum_{m=1}^{\lambda} (\ell - m + 1) \alpha^m + \sum_{m=\lambda+1}^{\ell} (\ell - m + 1) \sigma(\ell) \right],$$
and since $\alpha^m \le \sigma(\ell)$ for $1 \le m \le \lambda$, we have
$$n - L_s \le \ell \sigma(\ell) \sum_{m=1}^{\ell} (\ell - m + 1) = \frac{1}{2} \ell^2 (\ell + 1) \sigma(\ell),$$
$$\log (n - L_s) \le 2 \log \ell + \log \frac{\ell + 1}{2} + \ell h(\ell).$$
Since $\ell = L_s - 1$, we obtain
$$L_c \le 3 + 3 \log (L_s - 1) + \log \frac{L_s}{2} + (L_s - 1) h(L_s - 1),$$
or
$$L_c \le (L_s - 1)[h(L_s - 1) + \varepsilon(L_s)].$$
Substituting this result into (17), we obtain the bound of Theorem 1. Q.E.D.

Remarks:

1) The value of $\varepsilon(L_s)$ decreases with $L_s$ and, consequently, the compression ratio $\rho$ approaches the value of $h(L_s - 1)$, the $h$-parameter associated with the second largest word-length processed by the encoder. Given any $\delta > 0$, one can always find the least integer $\ell_\delta$ such that $\rho - h(\ell_\delta - 1) \le \delta$. The magnitude of the acceptable
deviation $\delta$ determines the operational range of $L_s$, namely, $L_s \ge \ell_\delta$.

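To see how quickly the excess term of Theorem 1 decays, one can tabulate $\varepsilon(L_s)$ directly (an editorial sketch, assuming $\alpha = 2$):

```python
import math

def eps(L_s, alpha=2):
    """epsilon(L_s) = [3 + 3 log(L_s - 1) + log(L_s / 2)] / (L_s - 1)."""
    log = lambda x: math.log(x, alpha)        # logs to base alpha
    return (3 + 3 * log(L_s - 1) + log(L_s / 2)) / (L_s - 1)

for L_s in (16, 64, 256, 1024, 4096):
    print(L_s, round(eps(L_s), 4))
# epsilon falls roughly like (3 log L_s) / L_s, so the bound of Theorem 1
# approaches h(L_s - 1) as the word-length bound grows.
```
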
2) Since our code maps source words of variable length at most $L_s$ into codewords of fixed length $L_c$, we adopt as a reference for comparison the best VB code $C_{VB}$ discussed in Subsection III-B. The counterpart of our block-length $L_c$ is $L(C)$ of (5), and by (7) we can write
$$L_c \approx \log M = L_M h(L_M), \tag{18}$$
where $M$ is the code book size and $h(L_M)$ is the lower bound (see Lemma 2) on the compression ratio $\rho_{VB}$ attainable by $C_{VB}$. Since for sufficiently large $L_s$, we have also
$$L_c \approx (L_s - 1)\rho \approx (L_s - 1) h(L_s - 1), \tag{19}$$
it follows that $\rho \approx \rho_{VB}$.

We turn now to investigate the universal performance of the proposed algorithm, i.e., the performance when no a priori knowledge of the source to be compressed is available.

Given $\delta_1$ and $h_1$ such that $0 < \delta_1 < h_1 < 1$, let $\ell_1$ be the least positive integer satisfying
$$\delta_1 \ge \varepsilon(\ell_1 + 1) = \frac{1}{\ell_1} \left( 3 + 3 \log \ell_1 + \log \frac{\ell_1 + 1}{2} \right),$$
and let $K = \alpha^{\ell_1 h_1}$ and $\lambda_1 = \lfloor \ell_1 h_1 \rfloor$. Consider the encoder $E_1$ which employs a buffer of length $n_1 = n(h_1, \ell_1)$, obtained from (9) and (10) by setting $\ell = \ell_1$, $\lambda = \lambda_1$, and $\sigma(\ell) = K$, and whose maximal word-length $L_s$ is equal to $\ell_1 + 1$.

It follows from Theorem 1 that if this encoder is applied to a source $\sigma_1$ such that $h_{\sigma_1}(\ell_1) = h_1$, then the resulting compression ratio $\rho_1(\sigma_1)$ satisfies
$$\rho_1(\sigma_1) \le h_{\sigma_1}(\ell_1) + \delta_1 = h_1 + \delta_1. \tag{20}$$

Suppose now that $h_1$ and $\delta_1$ were chosen to fit the prescribed upper value $\rho_1 = h_1 + \delta_1$ of a prospective compression ratio range $(\rho_0, \rho_1)$ with
$$0 < \delta_1 R < \rho_0 < \rho_1 < 1, \qquad R > 1, \tag{21}$$
where $R$ is an adjustment parameter to be determined later. As shown above, the encoder $E_1$ is then matched exactly to the upper value $\rho_1$ of the prospective compression range. In order to accommodate the whole given range, we propose to employ a slightly larger encoder $E_0$ whose buffer is of length $n_0 = n_1 - \ell_1 + \ell_0$, where $n_1$ and $\ell_1$ are the parameters of $E_1$, and where $\ell_0$ is an integer, greater than $\ell_1$, for which the solution $h_0$ of the equation
$$n_1 - \ell_1 + \ell_0 = n(h_0, \ell_0) \tag{22}$$
satisfies
$$\rho_0 - \varepsilon(\ell_0) < h_0 \le \rho_0 - \varepsilon(\ell_0 + 1). \tag{23}$$
Noting that $n_0 - \ell_0 = n_1 - \ell_1$ is fixed, it is clear from (9) and (10) that as $\ell_0$ increases $h_0$ decreases; also, (21) and the fact that $\ell_0 > \ell_1$ imply that $\rho_0 - \varepsilon(\ell_0 + 1) > \rho_0 - \varepsilon(\ell_1 + 1) \ge \rho_0 - \delta_1 > 0$. Hence, there exist $h_0 > 0$ and $\ell_0 > \ell_1$ that satisfy both (22) and (23).

In analogy with (20), it is also clear that if $E_0$ is applied to a source $\sigma_0$ such that $h_{\sigma_0}(\ell_0) = h_0$, then the resulting compression ratio $\rho_0(\sigma_0)$ satisfies
$$\rho_0(\sigma_0) \le h_{\sigma_0}(\ell_0) + \delta_0 = h_0 + \delta_0, \tag{24}$$
where $\delta_0 = \varepsilon(\ell_0 + 1)$.

From (23) and (24), we have $\rho_0(\sigma_0) \le \rho_0 - \varepsilon(\ell_0 + 1) + \delta_0 = \rho_0$, and
$$\delta_0 \le \rho_0 - h_0 < \varepsilon(\ell_0) \le \varepsilon(\ell_1 + 1) \le \delta_1 < \frac{\rho_0}{R}.$$
Hence, $h_0$ can be made arbitrarily close to $\rho_0$, and consequently, the encoder $E_0$ matches as closely as desired the lower end $\rho_0$ of the given compression range.

Theorem 2: Let $\sigma$ be a source for which a matched encoder $E$ achieves a compression ratio $\rho(\sigma)$ within the range $(\rho_0, \rho_1)$. Then the compression ratio $\rho_0(\sigma)$, achieved for $\sigma$ by $E_0$, satisfies
$$\rho_0(\sigma) \le \rho(\sigma) + \frac{1}{\ell_1} \lceil \log d \rceil,$$
where
$$d = \max \left\{ \frac{h_1}{h_0}, \frac{1}{1 - h_0} \right\}.$$
(Typically, $(h_1/h_0) > (1/(1 - h_0))$ and $d = (h_1/h_0)$.)

Proof: To prove the theorem we shall consider the obviously worst case of applying $E_0$ to the source $\sigma_1$ whose matched encoder $E_1$ realizes $\rho_1$.

Let $\rho_0(\sigma_1)$ denote the compression ratio achievable by $E_0$ when applied to $\sigma_1$. According to (12), we have
$$\rho_0(\sigma_1) = \frac{L_{c0}}{n_0 - (\ell_0 + 1)} N_0(\sigma_1),$$
where
$$L_{ci} = 1 + \lceil \log (\ell_i + 1) \rceil + \lceil \log (n_i - \ell_i - 1) \rceil, \qquad i \in \{0, 1\},$$
and $N_i(\sigma_j)$, $i, j \in \{0, 1\}$, is the maximum number of words into which a string $S \in \sigma_j\{n_i - \ell_i - 1\}$ is parsed by $E_i$.

Since $n_0 - \ell_0 = n_1 - \ell_1$ and $\ell_0 > \ell_1$, it is easy to verify that² $N_0(\sigma_1) \le N_1(\sigma_1)$. Also, by (16),
$$N_1(\sigma_1) \le \frac{n_1 - (\ell_1 + 1)}{\ell_1} = \frac{n_0 - (\ell_0 + 1)}{\ell_1}.$$
Hence,
$$\rho_0(\sigma_1) \le \frac{L_{c0}}{\ell_1} = \frac{L_{c1}}{\ell_1} + \frac{L_{c0} - L_{c1}}{\ell_1} = \rho_1 + \frac{L_{c0} - L_{c1}}{\ell_1},$$
and since
$$L_{c0} - L_{c1} = \lceil \log (\ell_0 + 1) \rceil - \lceil \log (\ell_1 + 1) \rceil \le \lceil \log k \rceil,$$

²The assertion here is analogous to that of [10, Theorem 1].

where $k = \ell_0 / \ell_1$, we obtain
$$\rho_0(\sigma_1) \le \rho_1 + \frac{1}{\ell_1} \lceil \log k \rceil. \tag{25}$$
To obtain an upper bound on $k$, we first observe that
$$\sum_{m=\lambda_0+1}^{\ell_0} (\ell_0 - m + 1) \alpha^{\ell_0 h_0} \le n_0 - \ell_0 = n_1 - \ell_1 \le \ell_1 \sum_{m=1}^{\ell_1} (\ell_1 - m + 1) \alpha^{\ell_1 h_1},$$
which reduces to
$$k^2 (1 - h_0)^2 \le \alpha^{\ell_1 h_1 (1 - k h_0 / h_1)}. \tag{26}$$
Now, either $k(1 - h_0) < 1$, or else the exponent on the right side of (26) must be nonnegative and thus $k \le (h_1 / h_0)$. In either case,
$$k \le \max \left\{ \frac{h_1}{h_0}, \frac{1}{1 - h_0} \right\} = d. \qquad \text{Q.E.D.}$$

Theorems 1 and 2 demonstrate the efficiency and universality of the proposed algorithm. They show that an encoder designed to operate over a prescribed compression range performs, practically, as well as one designed to match a specific compression ratio within the given range. Moreover, for a given complexity, i.e., a given codeword length, the compression efficiency is comparable to that of an optimal variable-to-block code book designed to match a given source.

REFERENCES

[1] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, 1952.
[2] R. M. Karp, "Minimum-redundancy coding for the discrete noiseless channel," IRE Trans. Inform. Theory, vol. IT-7, pp. 27-38, Jan. 1961.
[3] B. F. Varn, "Optimal variable length codes," Inform. Contr., vol. 19, pp. 289-301, 1971.
[4] Y. Perl, M. R. Garey, and S. Even, "Efficient generation of optimal prefix code: Equiprobable words using unequal cost letters," J. ACM, vol. 22, pp. 202-214, April 1975.
[5] A. Lempel, S. Even, and M. Cohn, "An algorithm for optimal prefix parsing of a noiseless and memoryless channel," IEEE Trans. Inform. Theory, vol. IT-19, pp. 208-214, March 1973.
[6] F. Jelinek and K. S. Schneider, "On variable-length-to-block coding," IEEE Trans. Inform. Theory, vol. IT-18, pp. 765-774, Nov. 1972.
[7] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[8] J. Ziv, "Coding of sources with unknown statistics—Part I: Probability of encoding error," IEEE Trans. Inform. Theory, vol. IT-18, pp. 384-394, May 1972.
[9] L. D. Davisson, "Universal noiseless coding," IEEE Trans. Inform. Theory, vol. IT-19, pp. 783-795, Nov. 1973.
[10] A. Lempel and J. Ziv, "On the complexity of finite sequences," IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976.
[11] B. M. Fitingof, "Optimal coding in the case of unknown and changing message statistics," Prob. Inform. Transm., vol. 2, pp. 3-11, 1966.
