`
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-23, NO. 3, MAY 1977
`
`A Universal Algorithm for Sequential Data Compression
`
`JACOB ZIV, FELLOW, IEEE, AND ABRAHAM LEMPEL, MEMBER, IEEE
`
Abstract—A universal algorithm for sequential data compression is presented. Its performance is investigated with respect to a nonprobabilistic model of constrained sources. The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source.
`
`I. INTRODUCTION
`
IN MANY situations arising in digital communications and data processing, the encountered strings of data display various structural regularities or are otherwise subject to certain constraints, thereby allowing for storage- and time-saving techniques of data compression. Given a discrete data source, the problem of data compression is first to identify the limitations of the source, and second to devise a coding scheme which, subject to certain performance criteria, will best compress the given source.
`
Once the relevant source parameters have been identified, the problem reduces to one of minimum-redundancy coding. This phase of the problem has received extensive treatment in the literature [1]-[7].

When no a priori knowledge of the source characteristics is available, and if statistical tests are either impossible or unreliable, the problem of data compression becomes considerably more complicated. In order to overcome these difficulties one must resort to universal coding schemes whereby the coding process is interlaced with a learning process for the varying source characteristics [8], [9]. Such coding schemes inevitably require a larger working memory space and generally employ performance criteria that are appropriate for a wide variety of sources.
In this paper, we describe a universal coding scheme which can be applied to any discrete source and whose performance is comparable to certain optimal fixed code book schemes designed for completely specified sources. For lack of adequate criteria, we do not attempt to rank the proposed scheme with respect to other possible universal coding schemes. Instead, for the broad class of sources defined in Section III, we derive upper bounds on the compression efficiency attainable with full a priori knowledge of the source by fixed code book schemes, and then show that the efficiency of our universal code with no a priori knowledge of the source approaches those bounds.

Manuscript received June 23, 1975; revised July 6, 1976. Paper previously presented at the IEEE International Symposium on Information Theory, Ronneby, Sweden, June 21-24, 1976.
J. Ziv was with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel. He is now with the Bell Telephone Laboratories, Murray Hill, NJ 07974.
A. Lempel was with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel. He is now with the Sperry Research Center, Sudbury, MA 01776.
`
The proposed compression algorithm is an adaptation of a simple copying procedure discussed recently [10] in a study on the complexity of finite sequences. Basically, we employ the concept of encoding future segments of the source output via maximum-length copying from a buffer containing the recent past output. The transmitted codeword consists of the buffer address and the length of the copied segment. With a predetermined initial load of the buffer and the information contained in the codewords, the source data can readily be reconstructed at the decoding end of the process.

The main drawback of the proposed algorithm is its susceptibility to error propagation in the event of a channel error.
`
II. THE COMPRESSION ALGORITHM
`
The proposed compression algorithm consists of a rule for parsing strings of symbols from a finite alphabet A into substrings, or words, whose lengths do not exceed a prescribed integer $L_s$, and a coding scheme which maps these substrings sequentially into uniquely decipherable codewords of fixed length $L_c$ over the same alphabet A.

The word-length bounds $L_s$ and $L_c$ allow for bounded-delay encoding and decoding, and they are related by

$$L_c = 1 + \lceil \log (n - L_s) \rceil + \lceil \log L_s \rceil, \eqno{(1)}$$

where $\lceil x \rceil$ is the least integer not smaller than x, the logarithm base is the cardinality $\alpha$ of the alphabet A, and n is the length of a buffer, employed at the encoding end of the process, which stores the latest n symbols emitted by the source. The exact relationship between n and $L_s$ is discussed in Section III. Typically, $n \approx L_s \alpha^{h L_s}$, where $0 < h < 1$. For on-line decoding, a buffer of similar length has to be employed at the decoding end also.
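As a concrete check of (1), the sketch below (our own illustration, not part of the paper) evaluates $L_c$ with exact integer arithmetic; the parameters $\alpha = 3$, $n = 18$, $L_s = 9$ are those of the ternary example worked later in this section.

```python
# Sketch (ours): evaluate (1) with integer arithmetic, avoiding
# floating-point logarithms.

def ceil_log(x, alpha):
    """Least k with alpha**k >= x, i.e., the base-alpha ceiling log."""
    k, p = 0, 1
    while p < x:
        k += 1
        p *= alpha
    return k

def codeword_length(n, Ls, alpha):
    return 1 + ceil_log(n - Ls, alpha) + ceil_log(Ls, alpha)   # equation (1)

print(codeword_length(n=18, Ls=9, alpha=3))   # -> 1 + 2 + 2 = 5
```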
`To describe the exact mechanics of the parsing and
`coding procedures, we need some preparation by way of
`notation and definitions.
`
Consider a finite alphabet A of $\alpha$ symbols, say $A = \{0, 1, \cdots, \alpha - 1\}$. A string, or word, S of length $\ell(S) = k$ over A is an ordered k-tuple $S = s_1 s_2 \cdots s_k$ of symbols from A. To indicate a substring of S which starts at position i and ends at position j, we write $S(i,j)$. When $i \le j$, $S(i,j) = s_i s_{i+1} \cdots s_j$, but when $i > j$, we take $S(i,j) = \Lambda$, the null string of length zero.

The concatenation of strings Q and R forms a new string $S = QR$; if $\ell(Q) = k$ and $\ell(R) = m$, then $\ell(S) = k + m$, $Q = S(1,k)$, and $R = S(k+1, k+m)$. For each j, $0 \le j \le \ell(S)$, $S(1,j)$ is called a prefix of S.
`
`1
`
`Oracle 1015
`
`Oracle 1015
`
`
`
`2
`
`
`
`
which matches a substring of $B_1$ that starts in position $p_1 \le 9$, and then set $S_1 = B_1(10, 9 + \ell_1)$. It is easily seen that the longest match in this case is $B_1(10,11) = 00$, and hence $S_1 = 001$ and $\ell_1 = 3$. The pointer $p_1$ for this step can be any integer between one and nine; we choose to set $p_1 = 9$. The two-digit radix-3 representation of $p_1 - 1$ is $C_{11} = 22$, and that of $\ell_1 - 1$ is $C_{12} = 02$. Since $C_{i3}$ is always equal to the last symbol of $S_i$, the codeword for $S_1$ is given by $C_1 = 22021$.

To obtain the buffer load $B_2$ for the second step, we shift out the first $\ell_1 = 3$ digits of $B_1$ and feed in the next 3 digits $S(10,12) = 210$ of the input string S. The details of steps 2, 3, and 4 are tabulated below, giving each buffer load $B_i$ together with the corresponding codeword $C_i$:
`
B2 = 000000001010210210,   C2 = 21102

B3 = 000010102102102120,   C3 = 20212

B4 = 210210212021021200,   C4 = 02220.
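The single encoding cycle illustrated above can be sketched in code. This is our own rendering (helper names ours, not the paper's notation) of the copying rule: find the longest prefix of the lookahead that also starts somewhere in the first $n - L_s$ buffer positions, where the match may overlap into the lookahead itself, then append one literal symbol. Ties among pointers are resolved toward the largest one, matching the choice $p_1 = 9$ above.

```python
# A minimal sketch of one encoding cycle of the Section II scheme, for the
# ternary example: alpha = 3, buffer length n = 18, maximal word length
# Ls = 9, so codewords have Lc = 5 digits.

def to_radix(x, base, width):
    """Fixed-width base-`base` digit string for the integer x."""
    digits = []
    for _ in range(width):
        digits.append(str(x % base))
        x //= base
    return "".join(reversed(digits))

def encode_step(buffer, hist_len, Ls, alpha):
    """One cycle: longest match starting in the first hist_len positions
    (overlap into the lookahead allowed), then pointer, length, literal."""
    lookahead = buffer[hist_len:]
    best_p, best_len = 0, 0
    for p in range(hist_len):
        m = 0
        while (m < Ls - 1 and m < len(lookahead) - 1
               and buffer[p + m] == lookahead[m]):
            m += 1
        if m >= best_len:          # ties go to the largest pointer, as with p1 = 9
            best_p, best_len = p, m
    word_len = best_len + 1        # the word is the match plus one fresh symbol
    c1 = to_radix(best_p, alpha, 2)        # pointer p - 1, two radix-3 digits
    c2 = to_radix(word_len - 1, alpha, 2)  # length - 1, two radix-3 digits
    c3 = lookahead[word_len - 1]           # last symbol of the word
    return c1 + c2 + c3, word_len

B1 = "000000000" + "001010210"     # initial load: zeros, then first 9 input digits
print(encode_step(B1, 9, 9, 3))    # -> ('22021', 3), i.e., C1 = 22021, l1 = 3
```

Decoding reverses the process: from each $C_i$ the decoder recovers $p_i$ and $\ell_i$, copies $\ell_i - 1$ symbols starting at position $p_i$ of its own identically initialized buffer, appends the literal symbol, and shifts, so the two buffers stay synchronized.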
`
III. COMPRESSION OF CONSTRAINED SOURCES
`
In this section, we investigate the performance of the proposed compression algorithm with respect to a nonprobabilistic model of constrained information sources. After defining the source model, we derive lower bounds on the compression ratios attainable by block-to-variable and variable-to-block codes under full knowledge of the source, and then show that the compression ratio achieved by our universal code approaches these bounds.
`
`A. Definition of the Source Model
`
Let $A = \{0, 1, \cdots, \alpha - 1\}$ be the given $\alpha$-symbol alphabet, and let $A^*$ denote the set of all finite strings over A. Given a string $S \in A^*$ and a positive integer $m \le \ell(S)$, let $S\{m\}$ denote the set of all substrings of length m contained in S, and let $S(m)$ denote the cardinality of $S\{m\}$. That is,

$$S\{m\} = \bigcup_{i=0}^{\ell(S)-m} \{S(i+1, i+m)\}$$

and

$$S(m) = |S\{m\}|.$$

Given a subset $\sigma$ of $A^*$, let

$$\sigma\{m\} = \{S \in \sigma \mid \ell(S) = m\},$$

and let $\sigma(m)$ denote the cardinality of $\sigma\{m\}$.

A subset $\sigma$ of $A^*$ is called a source if the following three properties hold:

1) $A \subset \sigma$ (i.e., $\sigma$ contains all the unit length strings),
2) $S \in \sigma$ implies $SS \in \sigma$,
3) $S \in \sigma$ implies $S\{m\} \subset \sigma\{m\}$.
`
`3
`3
`
Typically, such a source $\sigma$ is defined by specifying a finite set of strings over A which are forbidden to appear as substrings of elements belonging to $\sigma$, and therefore $\sigma(m) < \alpha^m$ for all m exceeding some $m_0$.

With every source $\sigma$, we associate a sequence $h(1), h(2), \cdots$ of parameters, called the h-parameters of $\sigma$, where¹

$$h(m) = \frac{1}{m} \log \sigma(m), \qquad m = 1, 2, \cdots. \eqno{(2)}$$

It is clear that $0 \le h(m) \le 1$ for all m and, by 2), it is also clear that $mh(m)$ is a nondecreasing function of m. The sequence of h-parameters, however, is usually nonincreasing in m. To avoid any possible confusion in the sequel, we postulate this property as an additional defining property of a source. Namely, we require

4) $h(m) = (1/m) \log \sigma(m)$ is a nonincreasing function of m.
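To make the counting behind these definitions concrete, the sketch below (our own; the forbidden word is a hypothetical choice, not from the paper) enumerates $\sigma(m)$ by brute force for a ternary set constrained to avoid one substring, and evaluates $h(m)$ per (2).

```python
# Sketch (ours): brute-force sigma(m) and h(m), cf. (2), for a toy ternary
# set avoiding the (hypothetical) forbidden substring "00".  We ignore here
# whether this toy constraint meets all four source properties; it only
# illustrates the counting behind the h-parameters.
import math
from itertools import product

ALPHA, FORBIDDEN = 3, "00"

def sigma(m):
    """Number of admissible strings of length m."""
    return sum(1 for t in product("012", repeat=m)
               if FORBIDDEN not in "".join(t))

def h(m):
    """h(m) = (1/m) * log_alpha sigma(m)."""
    return math.log(sigma(m), ALPHA) / m

for m in range(1, 8):
    print(m, sigma(m), round(h(m), 4))
# Here h(1) = 1 and h(m) decreases toward log_3(1 + sqrt(3)) ~ 0.915.
```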
`
`B. Some Lower Bounds on the Compression Ratio
`
Consider a compression coding scheme for a source $\sigma$ which employs a block-to-variable (BV) code book of M pairs $(X_i, Y_i)$ of words over A, with $\ell(X_i) = L$ for $i = 1, 2, \cdots, M$. The encoding of an infinitely long string $S \in \sigma$ by such a code is carried out by first parsing S into blocks of length L, and then replacing each block $X_i$ by the corresponding codeword $Y_i$. It is assumed, of course, that the code book is exhaustive with respect to $\sigma$ and uniquely decipherable [2]. Hence, we must have
`
$$\{X_i\}_{i=1}^{M} = \sigma\{L\}, \quad \text{or} \quad M = \sigma(L) = \alpha^{Lh(L)}, \eqno{(3)}$$

and

$$\max_{1 \le i \le M} \{\ell(Y_i)\} \ge \log M = Lh(L). \eqno{(4)}$$

The compression ratio $\rho_i$ associated with the ith word-pair of the code is given by

$$\rho_i = \frac{\ell(Y_i)}{L}.$$
`
The BV compression ratio, $\rho_{BV}(\sigma,M)$, of the source $\sigma$ is defined as the minimax value of $\rho_i$, where the maximization is over all word-pairs of a given code, and the minimization is over the set $C_{BV}(\sigma,M)$ of all BV code books consisting of M word-pairs. Thus,

$$\rho_{BV}(\sigma,M) = \min_{C_{BV}(\sigma,M)} \max_{1 \le i \le M} \left\{ \frac{\ell(Y_i)}{L} \right\} \ge \frac{\log M}{L} = \frac{Lh(L)}{L} = h(L).$$
`
¹ Throughout this paper, log x means the base-α logarithm of x.
`
`
`
`340
`
For later reference, we record this result in the following lemma.

Lemma 1:
$$\rho_{BV}(\sigma,M) \ge h(L),$$
where
$$Lh(L) = \log M.$$
`
Now, consider a compression scheme which employs a variable-to-block (VB) code book of M word-pairs $(X_i, Y_i)$, with $\ell(Y_i) = L$ for all $i = 1, 2, \cdots, M$. In this case, the compression ratio associated with the ith word-pair is given by

$$\rho_i = \frac{L}{\ell(X_i)},$$

and similarly, the VB compression ratio $\rho_{VB}(\sigma,M)$ of $\sigma$ is defined as the minimax value of $\rho_i$ over all word-pairs and over the set $C_{VB}(\sigma,M)$ of VB code books with M word-pairs.
Lemma 2:
$$\rho_{VB}(\sigma,M) \ge h(L_M),$$
where
$$L_M = \max \{\ell \mid M \ge \sigma(\ell)\}.$$
`
Proof: We may assume, without loss of generality, that in every code under consideration

$$\ell(X_1) \le \ell(X_2) \le \cdots \le \ell(X_M),$$

and hence for each $C \in C_{VB}(\sigma,M)$,

$$\max_i \rho_i(C) = \frac{L(C)}{\ell(X_1)}. \eqno{(5)}$$
`
Since C is exhaustive with respect to $\sigma$, we have

$$M \ge \sigma(\ell_1), \eqno{(6)}$$

where $\ell_1 = \ell(X_1)$; and since C is uniquely decipherable, we have

$$L(C) \ge \log M. \eqno{(7)}$$
From the definition of $L_M$, inequality (6), and the nondecreasing property of $\sigma(\ell)$, we obtain

$$\ell_1 \le L_M. \eqno{(8)}$$

From (5), (7), and (8), we have

$$\max_i \rho_i(C) \ge \frac{\log M}{L_M},$$

and since

$$M \ge \sigma(L_M) = \alpha^{L_M h(L_M)},$$

it follows that for each $C \in C_{VB}(\sigma,M)$,

$$\max_i \rho_i(C) \ge h(L_M). \qquad \text{Q.E.D.}$$

Remarks

1) Since the value of L in the context of Lemma 1 satisfies the definition of $L_M$ as given in Lemma 2, it follows that the bounds of both lemmas are essentially the same, despite the basic difference between the respective coding schemes.

2) By 2), the second defining property of a source, the per-word bounds derived above apply to indefinitely long messages as well, since the whole message may consist of repeated appearances of the same worst case word.

3) By 4), the nonincreasing property of the h-parameters, the form of the derived bounds confirms the intuitive expectation that an increase in the size M of the employed code book causes a decrease in the lower bound on the attainable compression ratio.

C. Performance of the Proposed Algorithm

We proceed now to derive an upper bound on the compression ratio attainable by the algorithm of Section II. To this end, we consider the worst case source message of length $n - L_s$, where n is the prospective buffer length and $L_s$ is the maximal word-length. The bound obtained for this case will obviously apply to all messages of length $n - L_s$ or greater.

First, we assume that only the h-parameters of the source under consideration are known to the designer of the encoder. Later, when we discuss the universal performance of the proposed algorithm, we will show that even this restricted a priori knowledge of the source is actually unessential.

We begin by choosing the buffer length n to be an integer of the form

$$n - L_s = \sum_{m=1}^{\lambda} m\alpha^m + \sum_{m=\lambda+1}^{\ell} m\sigma(\ell) + (\ell+1)(N_{\ell+1}+1), \eqno{(9)}$$

where

$$N_{\ell+1} = \sum_{m=1}^{\lambda} (\ell-m)\alpha^m + \sum_{m=\lambda+1}^{\ell} (\ell-m)\sigma(\ell), \eqno{(10)}$$

$\lambda = \lfloor \ell h(\ell) \rfloor$, the integer part of $\log \sigma(\ell) = \ell h(\ell)$, and

$$\ell = L_s - 1. \eqno{(11)}$$

The specific value of the parameter $L_s$ is left for later determination (see the first remark following the proof of Theorem 1). The reasoning that motivates the given form of n will become clear from subsequent derivations.

Consider a string $S \in \sigma\{n - L_s\}$, and let $N(S)$ denote the number of words into which S is parsed by the algorithm during the encoding process. Recalling that each of these words is mapped into a codeword of fixed length $L_c$ (see (1)), it follows that the compression ratio $\rho(S)$ associated with the string S is given by

$$\rho(S) = \frac{L_c}{n - L_s} N(S).$$
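The buffer-length prescription (9)-(11) is straightforward to evaluate numerically. The sketch below is our reading of those formulas (variable names and the example arguments are ours), computing n from $\alpha$, $L_s$, and the single h-parameter $h(\ell)$ at $\ell = L_s - 1$ that (9) and (10) require.

```python
# Sketch of the buffer-length prescription as we read (9)-(11).
import math

def buffer_length(alpha, Ls, h_l):
    l = Ls - 1                               # (11)
    lam = int(l * h_l)                       # lambda = floor(l * h(l))
    sigma_l = alpha ** (l * h_l)             # sigma(l) = alpha**(l h(l))
    # (10): N_{l+1}, the allowance for words of maximal length l + 1
    N_next = (sum((l - m) * alpha**m for m in range(1, lam + 1))
              + sum((l - m) * sigma_l for m in range(lam + 1, l + 1)))
    # (9): n - Ls = sum m alpha^m + sum m sigma(l) + (l+1)(N_{l+1} + 1)
    n_minus_Ls = (sum(m * alpha**m for m in range(1, lam + 1))
                  + sum(m * sigma_l for m in range(lam + 1, l + 1))
                  + (l + 1) * (N_next + 1))
    return math.ceil(n_minus_Ls) + Ls

print(buffer_length(alpha=3, Ls=9, h_l=0.5))   # a small illustrative case
```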
`
`
`
Hence, the compression ratio $\rho$ attainable by the algorithm for a given source $\sigma$ is

$$\rho = \frac{L_c}{n - L_s} N, \eqno{(12)}$$

where

$$N = \max_{S \in \sigma\{n - L_s\}} N(S).$$

Let $Q \in \sigma\{n - L_s\}$ be such that $N(Q) = N$, and suppose that the algorithm parses Q into $Q = Q_1 Q_2 \cdots Q_N$. From step 2) of the encoding cycle it follows that if $\ell(Q_i) = \ell(Q_j) < L_s$ for some $i < j < N$, then $Q_i \ne Q_j$. (Note that when $Q_j$ is being determined at the jth cycle of the encoding process, all of $Q_i$ is still stored in the buffer, and since $\ell(Q_i) < L_s$, the longest substring in the buffer that precedes $Q_j$ and is a prefix of $Q_j$ must be of length $\ell(Q_j) - 1$.)

Denoting by $K_m$ the number of $Q_i$, $1 \le i \le N - 1$, of length m, $1 \le m \le L_s$, we have

$$N = 1 + \sum_{m=1}^{L_s} K_m. \eqno{(13)}$$

By the above argument, and by property 3) of the source, we have

$$K_m \le \sigma(m), \qquad \text{for } 1 \le m \le \ell = L_s - 1.$$

Since

$$n - L_s = \ell(Q_N) + \sum_{m=1}^{\ell+1} m K_m,$$

and n and $L_s$ are both fixed, it is clear that by overestimating the values of $K_m$ for $1 \le m \le \ell$ at the expense of $K_{\ell+1}$, we can only overestimate the value of N. Therefore, since $\sigma(m) \le \sigma(m+1)$ and $\sigma(m) = \alpha^{mh(m)} \le \alpha^m$, we obtain

$$N \le K'_{\ell+1} + 1 + \sum_{m=1}^{\ell} K'_m = N',$$

where

$$K'_m = \begin{cases} \alpha^m, & \text{for } 1 \le m \le \lambda = \lfloor \ell h(\ell) \rfloor \\ \sigma(\ell), & \text{for } \lambda < m \le \ell \end{cases} \eqno{(14)}$$

and

$$K'_{\ell+1} = \left\lceil \frac{1}{\ell+1} \left( n - L_s - \sum_{m=1}^{\ell} m K'_m \right) \right\rceil - 1. \eqno{(15)}$$

From (14), (15), and (9), we obtain $K'_{\ell+1} = N_{\ell+1}$, and

$$N' = N_{\ell+1} + 1 + \sum_{m=1}^{\lambda} \alpha^m + \sum_{m=\lambda+1}^{\ell} \sigma(\ell),$$

which, together with (9) and (10), yields

$$n - L_s - \ell N' = \sum_{m=1}^{\lambda} m\alpha^m + \sum_{m=\lambda+1}^{\ell} m\sigma(\ell) + N_{\ell+1} + 1 - \ell \left[ \sum_{m=1}^{\lambda} \alpha^m + \sum_{m=\lambda+1}^{\ell} \sigma(\ell) \right] \ge 0,$$

or

$$N \le N' \le \frac{n - L_s}{L_s - 1}. \eqno{(16)}$$

Hence, from (12) and (16), we have

$$\rho \le \frac{L_c}{L_s - 1}. \eqno{(17)}$$

Note that despite the rather rudimentary overestimation of N by N', the upper bound of (17) is quite tight, since the fact that no source word is longer than $L_s$ immediately implies $\rho \ge L_c/L_s$.

We can state now the following result.

Theorem 1: If the buffer length n for a source with known h-parameters is chosen according to (9), then

$$\rho \le h(L_s - 1) + \epsilon(L_s),$$

where

$$\epsilon(L_s) = \frac{1}{L_s - 1} \left( 3 + 3 \log (L_s - 1) + \log \frac{L_s}{2} \right).$$

Proof: From (1) we have

$$L_c = 1 + \lceil \log L_s \rceil + \lceil \log (n - L_s) \rceil \le 3 + \log (L_s - 1) + \log (n - L_s).$$

From (9) and (10) we obtain

$$n - L_s = \ell \left[ \sum_{m=1}^{\lambda} (\ell - m + 1)\alpha^m + \sum_{m=\lambda+1}^{\ell} (\ell - m + 1)\sigma(\ell) \right] + (\ell + 1),$$

and since $\alpha^m \le \sigma(\ell)$, for $1 \le m \le \lambda$, we have

$$n - L_s \le \ell \sigma(\ell) \sum_{m=1}^{\ell} (\ell - m + 1) = \frac{1}{2} \ell^2 (\ell + 1) \sigma(\ell),$$

or

$$\log (n - L_s) \le 2 \log \ell + \log \frac{\ell + 1}{2} + \ell h(\ell).$$

Since $\ell = L_s - 1$, we obtain

$$L_c \le 3 + 3 \log (L_s - 1) + \log \frac{L_s}{2} + (L_s - 1) h(L_s - 1),$$

or

$$L_c \le (L_s - 1)[h(L_s - 1) + \epsilon(L_s)].$$

Substituting this result into (17), we obtain the bound of Theorem 1. Q.E.D.
`
`
`
`342
`
`IEEE TRANSACTIONS ON INFORMATION THEORY, MAY 1977
`
`In analogy with (20), it is also clear that if E0 is applied
`to a source :70 such that h,,(,(Z0) = ho, then the resulting
`compression ratio p0(a0) satisfies
`
`Po(00) S ha0(€o) + 50 = ho + 50.
`
`(24)
`
`where 60 = e(€0 + 1).
`From (23) and (24), we have p0(a0) 5 p0 — e(£0 + 1) + 60
`= p(), and
`
`1
`()0Spo—hQ<e(€0)S6([1+1)S51<Ep0.
`
`Hence, h0 can be made arbitrarily close to p0, and conse—‘
`quently, the encoder E0 matches as closely as desired the
`lower end p0 of the given compression range.
`Theorem 2: Let a be a source for which a matched en-
`
`coder E achieves a compression ratio p(a) within the range
`(p0,p1). Then the compression ratio po(a), achieved for a
`by E0, satisfies
`
`00(0) S 0(0) + A,
`
`where
`
`flog dl
`
`1
`hl
`AS
`(1
`max {ho 1 " ho
`(Typically, (h1/h0) > (1/1 - I10) and d = (h1/h0).)
`
`d =
`
`—,
`
`viation 6 determines the operational range of LS, namely,
`L, 2 Z3.
`2) Since our code maps source words of variable
`length at most Ls into codewords of fixed length LC, we
`adopt as a reference for comparison the best VB code C VB
`discussed in Subsection III—B. The counterpart of our
`block-length LC is L(C) of (5), and by (7) we can write
`
`LC % lOg'M %
`
`(18)
`
`~
`
`where M is the code book size and h(LM) is the lower
`bound (see Lemma 2) on the compression ratio pVB at-
`tainable by C VB. Since for sufficiently large Ls, we have
`also
`
`LC ~ (Ls _ 1)P Q (Ls _
`
`" 1):
`
`(19)
`
`it follows that p % pvg.
`We turn now to investigate the universal performance
`of the proposed algorithm, i.e., the performance when no
`a priori knowledge of the source to be compressed is
`available.
`
`Given 61 and h1 such that0 < 61< h1 < 1, let [1 be the
`least positive integer satisfying
`1
`c)1Ze(€1+l)=2-<3+3log€1+log
`1
`
`and let K = a"1h1 and )\1 = |_Z1h1j. Consider the encoder
`E1 which employs a buffer of length n1 = n(h1,€ 1), ob-
`tained from (9) and (10) by setting I = Z1, A = A1, and 0(3)
`= K, and whose maximal word—length L, is equal to 6’ 1 +
`1.
`
It follows from Theorem 1 that if this encoder is applied to a source $\sigma_1$ such that $h_{\sigma_1}(\ell_1) = h_1$, then the resulting compression ratio $\rho_1(\sigma_1)$ satisfies

$$\rho_1(\sigma_1) \le h_{\sigma_1}(\ell_1) + \delta_1 = h_1 + \delta_1. \eqno{(20)}$$

Suppose now that $h_1$ and $\delta_1$ were chosen to fit the prescribed upper value $p_1 = h_1 + \delta_1$ of a prospective compression ratio range $(p_0, p_1)$ with

$$0 < \delta_1 R < p_0 < p_1 < 1, \qquad R > 1, \eqno{(21)}$$

where R is an adjustment parameter to be determined later. As shown above, the encoder $E_1$ is then matched exactly to the upper value $p_1$ of the prospective compression range. In order to accommodate the whole given range, we propose to employ a slightly larger encoder $E_0$ whose buffer is of length $n_0 = n_1 - \ell_1 + \ell_0$, where $n_1$ and $\ell_1$ are the parameters of $E_1$, and where $\ell_0$ is an integer, greater than $\ell_1$, for which the solution $h_0$ of the equation

$$n_1 - \ell_1 + \ell_0 = n(h_0, \ell_0) \eqno{(22)}$$

satisfies

$$p_0 - \epsilon(\ell_0) < h_0 \le p_0 - \epsilon(\ell_0 + 1). \eqno{(23)}$$

Noting that $n_0 - \ell_0 = n_1 - \ell_1$ is fixed, it is clear from (9) and (10) that as $\ell_0$ increases $h_0$ decreases; also, (21) and the fact that $\ell_0 > \ell_1$ imply that $p_0 - \epsilon(\ell_0 + 1) > p_0 - \epsilon(\ell_1 + 1) \ge p_0 - \delta_1 > 0$. Hence, there exist $h_0 > 0$ and $\ell_0 > \ell_1$ that satisfy both (22) and (23).

In analogy with (20), it is also clear that if $E_0$ is applied to a source $\sigma_0$ such that $h_{\sigma_0}(\ell_0) = h_0$, then the resulting compression ratio $\rho_0(\sigma_0)$ satisfies

$$\rho_0(\sigma_0) \le h_{\sigma_0}(\ell_0) + \delta_0 = h_0 + \delta_0, \eqno{(24)}$$

where $\delta_0 = \epsilon(\ell_0 + 1)$.

From (23) and (24), we have $\rho_0(\sigma_0) \le p_0 - \epsilon(\ell_0 + 1) + \delta_0 = p_0$, and

$$0 \le p_0 - h_0 < \epsilon(\ell_0) \le \epsilon(\ell_1 + 1) \le \delta_1 < \frac{1}{R} p_0.$$

Hence, $h_0$ can be made arbitrarily close to $p_0$ and, consequently, the encoder $E_0$ matches as closely as desired the lower end $p_0$ of the given compression range.

Theorem 2: Let $\sigma$ be a source for which a matched encoder E achieves a compression ratio $\rho(\sigma)$ within the range $(p_0, p_1)$. Then the compression ratio $\rho_0(\sigma)$, achieved for $\sigma$ by $E_0$, satisfies

$$\rho_0(\sigma) \le \rho(\sigma) + \Delta,$$

where

$$\Delta \le \frac{1}{\ell_1} \lceil \log d \rceil, \qquad d = \max \left\{ \frac{h_1}{h_0}, \frac{1}{1 - h_0} \right\}.$$

(Typically, $(h_1/h_0) > (1/(1 - h_0))$ and $d = h_1/h_0$.)

Proof: To prove the theorem we shall consider the obviously worst case of applying $E_0$ to the source $\sigma_1$ whose matched encoder $E_1$ realizes $p_1$.

Let $\rho_0(\sigma_1)$ denote the compression ratio achievable by $E_0$ when applied to $\sigma_1$. According to (12), we have

$$\rho_0(\sigma_1) = \frac{L_{c0}}{n_0 - \ell_0 - 1} N_0(\sigma_1),$$

where

$$L_{ci} = 1 + \lceil \log (\ell_i + 1) \rceil + \lceil \log (n_i - \ell_i - 1) \rceil, \qquad i \in \{0, 1\},$$

and $N_i(\sigma_j)$, $i, j \in \{0, 1\}$, is the maximum number of words into which a string $S \in \sigma_j\{n_i - \ell_i - 1\}$ is parsed by $E_i$.

Since $n_0 - \ell_0 = n_1 - \ell_1$ and $\ell_0 > \ell_1$, it is easy to verify that² $N_0(\sigma_1) \le N_1(\sigma_1)$. Also, by (16),

$$N_1(\sigma_1) \le \frac{n_1 - (\ell_1 + 1)}{\ell_1} = \frac{n_0 - (\ell_0 + 1)}{\ell_1}.$$

Hence

$$\rho_0(\sigma_1) \le \frac{L_{c0}}{\ell_1} = \frac{L_{c1}}{\ell_1} + \frac{L_{c0} - L_{c1}}{\ell_1} \le p_1 + \frac{L_{c0} - L_{c1}}{\ell_1},$$

and since

$$L_{c0} - L_{c1} = \lceil \log (\ell_0 + 1) \rceil - \lceil \log (\ell_1 + 1) \rceil \le \lceil \log k \rceil,$$

where $k = \ell_0/\ell_1$, we obtain

$$\rho_0(\sigma_1) \le p_1 + \frac{1}{\ell_1} \lceil \log k \rceil. \eqno{(25)}$$

² The assertion here is analogous to that of [10, theorem 1].
`
`
`
To obtain an upper bound on k, we first observe that

$$\ell_0 \sum_{m=\lambda_0+1}^{\ell_0} (\ell_0 - m + 1)\alpha^{\ell_0 h_0} \le n_0 - \ell_0 = n_1 - \ell_1 \le \ell_1 \sum_{m=1}^{\ell_1} (\ell_1 - m + 1)\alpha^{\ell_1 h_1},$$

which reduces to

$$k^2 (1 - h_0)^2 \le \alpha^{\ell_1 h_1 (1 - k(h_0/h_1))}. \eqno{(26)}$$

Now, either $k(1 - h_0) < 1$, or else the exponent on the right side of (26) must be nonnegative and thus $k \le (h_1/h_0)$. In either case,

$$k \le \max \left\{ \frac{h_1}{h_0}, \frac{1}{1 - h_0} \right\} = d. \qquad \text{Q.E.D.}$$

Theorems 1 and 2 demonstrate the efficiency and universality of the proposed algorithm. They show that an encoder designed to operate over a prescribed compression range performs, practically, as well as one designed to match a specific compression ratio within the given range. Moreover, for a given complexity, i.e., a given codeword length, the compression efficiency is comparable to that of an optimal variable-to-block code book designed to match a given source.

REFERENCES

[1] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proc. IRE, vol. 40, pp. 1098-1101, 1952.
[2] R. M. Karp, "Minimum-redundancy coding for the discrete noiseless channel," IRE Trans. Inform. Theory, vol. IT-7, pp. 27-38, Jan. 1961.
[3] B. F. Varn, "Optimal variable length codes," Inform. Contr., vol. 19, pp. 289-301, 1971.
[4] Y. Perl, M. R. Garey, and S. Even, "Efficient generation of optimal prefix code: Equiprobable words using unequal cost letters," J. ACM, vol. 22, pp. 202-214, Apr. 1975.
[5] A. Lempel, S. Even, and M. Cohn, "An algorithm for optimal prefix parsing of a noiseless and memoryless channel," IEEE Trans. Inform. Theory, vol. IT-19, pp. 208-214, Mar. 1973.
[6] F. Jelinek and K. S. Schneider, "On variable length to block coding," IEEE Trans. Inform. Theory, vol. IT-18, pp. 765-774, Nov. 1972.
[7] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[8] J. Ziv, "Coding of sources with unknown statistics—Part I: Probability of encoding error," IEEE Trans. Inform. Theory, vol. IT-18, pp. 384-394, May 1972.
[9] L. D. Davisson, "Universal noiseless coding," IEEE Trans. Inform. Theory, vol. IT-19, pp. 783-795, Nov. 1973.
[10] A. Lempel and J. Ziv, "On the complexity of finite sequences," IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976.
[11] B. M. Fitingof, "Optimal coding in the case of unknown and changing message statistics," Prob. Inform. Transm., vol. 2, pp. 3-11, 1966.
`