Marc Fossorier Hideki Imai Shu Lin Alain Poli (Eds.)
`
`Applied Algebra,
`Algebraic Algorithms and
`Error-Correcting Codes
`
`13th International Symposium, AAECC-13
`Honolulu, Hawaii, USA, November 15-19, 1999
`Proceedings
`
`1 3
`
`Apple 1024
`
`1
`
`
`
`Series Editors
`Gerhard Goos, Karlsruhe University, Germany
`Juris Hartmanis, Cornell University, NY, USA
`Jan van Leeuwen, Utrecht University, The Netherlands
`
`Volume Editors
`
`Marc Fossorier
`Shu Lin
`University of Hawaii, Department of Electrical Engineering
`2540 Dole St., # 483, Honolulu, HI 96822, USA
`E-mail: marc@aravis.eng.hawaii.edu; slin@spectra.eng.hawaii.edu
`
`Hideki Imai
`University of Tokyo, Institute of Industrial Science
`7-22-1, Roppongi, Minato-ku, Tokyo 106, Japan
`E-mail: imai@iis.u-tokyo.ac.jp
`
`Alain Poli
Université Paul Sabatier, AAECC/IRIT
`118, Route de Narbonne, F31067 Toulouse Cedex, France
`E-mail: poli@cict.fr
`
`Cataloging-in-Publication data applied for
`
`Die Deutsche Bibliothek - CIP-Einheitsaufnahme
`Applied algebra, algebraic algorithms and error-correcting codes
`: 13th international symposium ; proceedings / AAECC 13, Honolulu,
`Hawaii, USA, November 15 - 19, 1999. Marc Fossorier . . . (ed.). -
`Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ;
`Milan ; Paris ; Singapore ; Tokyo : Springer, 1999
`(Lecture notes in computer science ; Vol. 1719)
`ISBN 3-540-66723-7
`
CR Subject Classification (1998): E.4, I.1, E.3, G.2, F.2
`
`ISSN 0302-9743
`ISBN 3-540-66723-7 Springer-Verlag Berlin Heidelberg New York
`
`This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
`or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
`in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
`liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1999
`Printed in Germany
`Typesetting: Camera-ready by author
`SPIN 10704606
`06/3142 – 5 4 3 2 1 0
`
`Printed on acid-free paper
`
`2
`
`
`
`New Sequences of Linear Time Erasure Codes
`Approaching the Channel Capacity
`
`M. Amin Shokrollahi
`
`Bell Labs, Room 2C-353, 700 Mountain Ave, Murray Hill, NJ 07974, USA
`amin@research.bell-labs.com
`
`Abstract. We will introduce a new class of erasure codes built from
`irregular bipartite graphs that have linear time encoding and decoding
`algorithms and can transmit over an erasure channel at rates arbitrarily
`close to the channel capacity. We also show that these codes are close
`to optimal with respect to the trade-off between the proximity to the
`channel capacity and the running time of the recovery algorithm.
`
`1
`
`Introduction
`
A linear error-correcting code of block length n and dimension k over a finite field
𝔽_q—an [n, k]_q-code for short—is a k-dimensional linear subspace of the standard
vector space 𝔽_q^n. The elements of the code are called codewords. To the code C
there corresponds an encoding map Enc which is an isomorphism of the vector
spaces 𝔽_q^k and C. A sender who wishes to transmit a vector of k elements in
𝔽_q to a receiver uses the mapping Enc to encode that vector into a codeword.
The rate k/n of the code is a measure of the amount of real information in each
codeword. The minimum distance of the code is the minimum Hamming distance
between two distinct codewords. A linear code of block length n, dimension k,
and minimum distance d over 𝔽_q is called an [n, k, d]_q-code.
`Linear codes can be used to reliably transmit information over a noisy chan-
`nel. Depending on the nature of the errors imposed on the codeword during
`the transmission, the receiver then applies appropriate algorithms to decode the
`received word. In this paper, we assume that the receiver knows the position of
`each received symbol within the stream of all encoding symbols. We adopt as
`our model of losses the erasure channel, introduced by Elias [3], in which each
`encoding symbol is lost with a fixed constant probability p in transit indepen-
`dent of all the other symbols. As was shown by Elias [3], the capacity of this
`channel equals 1 − p.
It is easy to see that a code of minimum distance d is capable of recovering
d − 1 or fewer erasures. In the best case, the codeword can be recovered from any
set of k coordinates of the encoding, which means that d − 1 = n − k. Such codes
are called MDS-codes. A standard class of MDS-codes is given by Reed-Solomon
codes [10]. The connection of these codes with polynomial arithmetic allows for
encoding and decoding in time O(n log² n log log n). (See [2, Chapter 11.7] and
[10, p. 369].) However, these codes do not reach the capacity of the erasure
channel, since there is no infinite sequence of such codes over a fixed field.
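To make the MDS erasure-recovery property concrete, here is a small Python sketch (illustrative only, not from the paper; the prime P = 257 and all names are our own choices). It encodes k message symbols as evaluations of a polynomial over a prime field and recovers the full codeword from any k surviving coordinates by Lagrange interpolation:

P = 257  # a prime, so integers mod P form the field GF(P)

def rs_encode(msg, xs):
    # Codeword = evaluations of the degree-(k-1) message polynomial at points xs.
    return [sum(c * pow(x, i, P) for i, c in enumerate(msg)) % P for x in xs]

def rs_value_at(points, x0):
    # Lagrange interpolation from k surviving (x, y) pairs, evaluated at x0.
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x0 - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

msg = [7, 3, 1]                     # k = 3 message symbols
xs = list(range(6))                 # n = 6 distinct evaluation points
cw = rs_encode(msg, xs)
survived = list(zip(xs, cw))[:3]    # any k of the n symbols suffice
print([rs_value_at(survived, x) for x in xs] == cw)  # True: codeword recovered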
`
`
`
`
`66
`
`M. Amin Shokrollahi
`
`Elias [3] showed that a random linear code can be used to transmit over the
`erasure channel at any rate R < 1 − p, and that encoding and decoding can be
accomplished with O(n²) and O(n³) arithmetic operations, respectively. Hence,
`we have on the one hand codes that can be encoded and decoded faster than
`general linear codes, but do not reach the capacity of the erasure channel; and
`on the other hand we have random codes which reach the capacity but have
`encoding and decoding algorithms of higher complexity.
`The paper [1] was the first to design codes that could come arbitrarily close
`to the channel capacity while having linear time encoding and decoding algo-
`rithms. Improving these results, the authors of [8] took a different approach and
designed fast linear-time algorithms for transmitting just below channel capacity.
For all ϵ > 0 they were able to produce rate R = 1 − p(1 + ϵ) codes along
`with decoding algorithms that could recover from the random loss of a p fraction
`of the transmitted symbols in time proportional to n ln(1/ϵ) with high proba-
`bility, where n is the block length. These codes could also be encoded in time
`proportional to n ln(1/ϵ). They belong to the class of low-density parity check
`codes of Gallager [4]. In contrast to Gallager codes, however, the graphs used to
`construct the asymptotically good codes obtained in [8] are highly irregular.
`The purpose of the present paper is twofold. First, we prove a general trade-off
`theorem between the proximity of a given Gallager code to the channel capacity
`in terms of the loss fraction and the running time of the recovery algorithm of [8].
`We show that in this respect, the codes constructed in that paper are close to
`optimal. Next, we exhibit a different sequence of asymptotically close to optimal
`codes which have better parameters than the codes in [8]. An interesting feature
`of these codes is that the underlying bipartite graphs are right regular, i.e., all
`nodes on the right hand side of the graph have the same degree. Since they are
`theoretically better than their peers, we expect them to also perform better in
`practice.
`The organization of the paper is as follows. In the next section we will re-
`view the construction of Gallager codes. Next, we prove upper bounds on the
`maximum tolerable loss fraction in terms of the running time of the decoding al-
`gorithm. The last two sections are concerned with the derivation of the sequence
`of right regular erasure codes.
`
`2 Codes from Bipartite Graphs
`
`In this section, we will briefly review the class of codes we are interested in, and
`the erasure recovery algorithm associated to them.
`Our codes are similar to the Gallager codes [4] in that they are built from
`sparse bipartite graphs. In contrast to Gallager codes, however, our codes will
`be constructed from graphs that have a highly irregular degree pattern on the
`left.
`Let G be a bipartite graph with n nodes on the left and n − k nodes on the
`right. G gives rise to a binary code of block-length n and dimension ≥ k in the
`
`4
`
`
`
`New Sequences of Erasure Codes
`
`67
`
following way: let the adjacency matrix of the graph G be given as

    A = ( 0    H^⊤ )
        ( H    0   ),

where H is some (n − k) × n matrix describing the connections in the graph. The
`code defined by the graph is the code with parity check matrix H. A different
`way of describing the code is as follows: we index the coordinate positions of
`the code with the n nodes on the left hand side of the graph. The code consists
`of all binary vectors (c1, . . . , cn) such that for each right node in the graph the
sum of the coordinates at the places adjacent to it equals zero. The block-length of this
`code equals n, and its dimension is at least k since we are imposing n − k linear
`conditions on the coordinates of a codeword. Expressed in terms of the graph,
`the fraction of redundant symbols in a codeword is at most aL/aR where aL and
`aR are the average node degrees on the left and the right hand side of the graph,
`respectively. In other words, the rate of the code is at least 1 − aL/aR. This
`description of the rate will be useful in later analysis. In the following, we will
`assume that the rate is in fact equal to this value. This is because the statements
`we will prove below will become even stronger if the rate is larger.
The above construction needs asymptotically O(n²) arithmetic operations to
`find the encoding of a message of length k, if the graph is sparse. One can apply
`a trick to reduce the running time to O(n) by a modification of the construction.
`Details can be found in [8].
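In code, with the graph stored as adjacency lists for the right (check) nodes—an assumed representation, not one fixed by the paper—membership in the code is just the parity test described above:

def is_codeword(word, right_adj):
    # word: list of n bits; right_adj[j]: left positions adjacent to check j.
    # The word is in the code iff every check sums to zero mod 2.
    return all(sum(word[i] for i in nbrs) % 2 == 0 for nbrs in right_adj)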
`Suppose now that a codeword (c1, . . . , cn) is sent and that certain erasures
`have occurred. The erasure recovery algorithm works as follows. We first initialize
`the contents of the right hand nodes of G with zero. Then we collect the non-
`erased coordinate positions, add their value to the current value of their right
`neighbors, and delete the left node and all edges emanating from it from the
`graph. After this stage, the graph consists of the erased nodes on the left and
`the edges emanating from these nodes. In the next step we look for a right node in
`the graph of degree one, i.e., a node that has only one edge coming out of it. We
`transport the value of this node to its unique left neighbor ℓ, thereby recovering
`the value of cℓ. We add cℓ to the current value of all the right neighbors of ℓ,
`delete the edges emanating from ℓ, and repeat the process until we cannot find a
`node of degree one on the right, or until all nodes on the left have been recovered.
`It is obvious that, on a RAM with unit cost measure, the amount of arithmetic
`operations to finish the algorithm is at most proportional to the number of edges
`in the graph, i.e., to naL, where aL is the average node degree on the left. The
`aim is thus to find graphs with constant aL for which the recovery algorithm
`finishes successfully.
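A sketch of this recovery process in Python (a direct transcription of the steps above; the data layout and names are our own) might look as follows:

from collections import defaultdict

def recover_erasures(received, right_adj):
    # received: codeword coordinates, with None marking an erasure.
    # right_adj[j]: left positions adjacent to check node j.
    word = list(received)
    check_val = [0] * len(right_adj)          # right nodes initialized with zero
    erased = [set() for _ in right_adj]       # erased neighbors of each check
    touching = defaultdict(set)               # erased position -> checks on it
    for j, nbrs in enumerate(right_adj):
        for i in nbrs:
            if word[i] is None:
                erased[j].add(i)
                touching[i].add(j)
            else:
                check_val[j] ^= word[i]       # fold known values into the check
    stack = [j for j in range(len(right_adj)) if len(erased[j]) == 1]
    while stack:
        j = stack.pop()
        if len(erased[j]) != 1:
            continue                          # stale entry; degree changed
        (i,) = erased[j]
        word[i] = check_val[j]                # recover c_i from the lone check
        for j2 in touching.pop(i):            # update every check adjacent to i
            check_val[j2] ^= word[i]
            erased[j2].discard(i)
            if len(erased[j2]) == 1:
                stack.append(j2)
    return word if all(v is not None for v in word) else None  # None: stalled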
`The main contribution of [8] was to give an analytic condition on the maxi-
`mum fraction of tolerable losses in terms of the degree distribution of the graph.
`More precisely, define the left and the right degree of an edge in the graph as
`the degree of the left, resp. right node it is emanating from. Further, denote
`by λi and ρi the fraction of edges of left, resp. right degree i, and consider the
`
generating functions λ(x) := ∑_i λi x^{i−1} and ρ(x) := ∑_i ρi x^{i−1}. In the following
we will call the pair (λ, ρ) a degree distribution.
`
`5
`
`
`
`68
`
`M. Amin Shokrollahi
`
Table 1. Rate 1/2 codes built from the heavy tail/Poisson distribution.

  N     aR       θ        δ/(1 − R)   δ         δ̂         δ/δ̂
  8     5.9266   5.9105   0.91968     0.45984   0.49085   0.93682
  16    7.0788   7.0729   0.95592     0.47796   0.49609   0.96345
  27    8.0054   8.0027   0.97256     0.48628   0.49799   0.97648
  47    9.0256   9.0243   0.98354     0.49177   0.49902   0.98547
  79    10.007   10.007   0.98990     0.49495   0.49951   0.99087
  132   10.996   10.996   0.99378     0.49689   0.49975   0.99427
  221   12.000   12.000   0.99626     0.49813   0.49988   0.99650
`
The main theorem of [8] states that if the graph is chosen at random with
degree distribution (λ, ρ) and if the erasures occur at random positions, then
the above erasure recovery algorithm can correct a δ-fraction of losses if
ρ(1 − δλ(x)) ≥ 1 − x for x ∈ [0, 1]. In a later paper [6], this condition was slightly
relaxed to

    δλ(1 − ρ(1 − x)) < x    for x ∈ (0, δ].   (1)
The paper [8] further exhibited for any ϵ > 0 and any rate R an infinite
sequence of degree distributions (λ, ρ) giving rise to codes of rate at least R such
that the above inequality is valid for δ = (1 − R)(1 − ϵ), and such that the average
left degree of the graph is O(log(1/ϵ)). In other words, δ can get arbitrarily close
to its optimal value (1 − R) with a “logarithmic” sacrifice in the running time
of the algorithm. Explicitly, for any given 0 < ϵ < 1, one chooses an integer N
close to 1/ϵ and considers the pair (λ_N, ρ_N) with

    λ_N(x) = (1/H(N − 1)) ∑_{k=1}^{N−1} x^k/k,    ρ_N(x) = e^{θ_N·(x−1)},   (2)

where θ_N is chosen so that the ratio between the number of nodes on the right
and on the left equals 1 − R, and H(N − 1) is the harmonic sum ∑_{k=1}^{N−1} 1/k. This sequence
`is referred to as the heavy tail/Poisson sequence in the following. Table 1 gives
`an example of the performance of this sequence for some codes of rate 1/2. In
that table, aR denotes the average right degree, δ is the maximum tolerable
loss fraction, and δ̂ is a theoretical upper bound on δ; see Remark 1 following
Corollary 1.
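As an illustration (our own sketch, with a numerical scheme chosen for simplicity, not the procedure used in [8]), the pair (2) can be constructed and condition (1) checked numerically; θ_N is obtained by bisection from the rate constraint, and for N = 8, R = 1/2 this reproduces θ ≈ 5.9105 from Table 1:

import math

def heavy_tail_poisson(N, R=0.5):
    # lam(x) = (1/H(N-1)) sum_{k=1}^{N-1} x^k/k,  rho(x) = exp(theta*(x-1)).
    # theta solves integral(rho)/integral(lam) = 1 - R, i.e.
    # (1 - exp(-t))/t = (1 - R)*(1 - 1/N)/H(N-1); the LHS decreases in t.
    H = sum(1.0 / k for k in range(1, N))
    target = (1.0 - R) * (1.0 - 1.0 / N) / H
    lo, hi = 1e-9, 1e3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (1.0 - math.exp(-mid)) / mid > target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    lam = lambda x: sum(x ** k / k for k in range(1, N)) / H
    rho = lambda x: math.exp(theta * (x - 1.0))
    return lam, rho, theta

def max_delta(lam, rho, grid=5000):
    # Largest delta with delta*lam(1 - rho(1 - x)) <= x for grid x in (0, delta]:
    # bisect on delta, testing delta <= min of x/lam(1 - rho(1 - x)).
    xs = [i / grid for i in range(1, grid + 1)]
    g = [x / lam(1.0 - rho(1.0 - x)) for x in xs]
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if min(v for x, v in zip(xs, g) if x <= mid) >= mid:
            lo = mid
        else:
            hi = mid
    return lo

lam, rho, theta = heavy_tail_poisson(8)
print(round(theta, 4))                 # 5.9105, as in Table 1
print(round(max_delta(lam, rho), 4))   # numerical estimate of the largest delta in (1)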
`In the following we will show that this sort of relationship between the average
`degree and the maximum tolerable loss fraction is the best possible. Furthermore,
`we will exhibit a new sequence of degree distributions for which the same type of
`relationship holds between the average degree and the maximum tolerable loss
`fraction.
`
`6
`
`
`
`
3 Upper Bounds for the Maximum Tolerable Loss Fraction
`
`In this section we will prove some upper bounds on the maximum tolerable loss
`fraction δ for which the algorithm given in the previous section is successful when
`applied to a random graph with a given degree distribution. Our main tool will
`be the inequality (1). We will first need a preliminary result.
Lemma 1. Let G be a bipartite graph with edge degree distribution (λ, ρ). Then
the average left, resp. right, node degree of G equals

    1/∫_0^1 λ(x)dx,    resp.    1/∫_0^1 ρ(x)dx.

Let the polynomials Λ and R be defined by

    Λ(x) := ∫_0^x λ(t)dt / ∫_0^1 λ(t)dt,    R(x) := ∫_0^x ρ(t)dt / ∫_0^1 ρ(t)dt.

Then the coefficient of x^i in Λ(x), resp. R(x), equals the fraction of nodes of
degree i on the LHS, resp. RHS, of G.

Proof. Let L denote the number of nodes on the LHS of G. Then the number of
edges in G equals aL·L, and the number of edges of degree i equals λi·aL·L. Hence,
the number of nodes of degree i on the LHS of the graph equals λi·aL·L/i, since
each such node contributes to exactly i edges of degree i. Hence, the fraction of
nodes of degree i equals λi·aL/i. Since the sum of all these fractions equals 1, we
obtain aL^{−1} = ∑_i λi/i = ∫_0^1 λ(x)dx, and this yields both assertions for the LHS
of the graph. The assertions on the RHS follow similarly. ⊓⊔
`
The following theorem shows a rather strong upper bound on δ.

Theorem 1. Let λ and ρ denote the left and right edge degree distributions of
a bipartite graph. Let δ be a positive real number such that

    δλ(1 − ρ(1 − x)) ≤ x    for 0 < x ≤ δ.

Then we have

    δ ≤ (aL/aR)·(1 − (1 − δ)^{aR}),

where aL and aR are the average node degrees of the graph on the left, resp. right,
hand side.

Proof. As a real valued function the polynomial λ(x) is strictly increasing for
positive values of x. Hence it has a unique inverse λ^{−1} which is also strictly
increasing. The first of the above inequalities is thus equivalent to

    1 − ρ(1 − x) ≤ λ^{−1}(x/δ)
`7
`
`
`
`70
`
`M. Amin Shokrollahi
`
`0
`
`0
`
`0
`
`0 λ(x)dx.) Invoking the previous lemma, we thus
`
`for 0 < x ≤ δ. Integrating both sides from 0 to δ and simplifying the expressions,
`we obtain
`ρ(x)dx −& 1−δ
`& 1
`ρ(x)dx ≥ δ& 1
`λ(x)dx.
`(Note that% 10 λ−1(x)dx = 1−% 1
`
`aR ’1 − % 1−δ
`0 ρ(x)dx ( = aL
`aR − aL& 1−δ
`ρ(x)dx = aL
`δ ≤
`% 1
`where the polynomial R is defined as in the statement of Lemma 1. To finish
`the proof we only need to show that R(1− δ) ≥ (1 − δ)aR. To see this, first note
`that if a1, . . . , aM are nonnegative real numbers adding up to 1 and µ is any
`nonnegative real number, then
`
`have
`
`aL
`
`0
`
`ρ(x)dx
`
`0
`
`(1 − R(1 − δ)) ,
`
`aR
`
`iai.
`
`(3)
`
`aiµi ≥ µ#i
`$i
`This follows from taking logarithms of both sides and noting that the log-function
`is concave. Denoting by ai the coefficients of R(x) and setting µ := 1 − δ, this
`gives our desired inequality: the ai are by the previous lemma the fractions of
`nodes of degree i, and hence #i iai is the average right degree aR.
`⊓&
`
Corollary 1. With the same assumptions as in the previous theorem, we have

    δ ≤ (aL/aR)·(1 − (1 − aL/aR)^{aR}).

Proof. Follows from δ ≤ aL/aR. ⊓⊔
`
Remark 1. A more refined upper bound for δ is given by the following. Let r
denote the quantity aL/aR, i.e., one minus the rate of the code. Suppose that
aR > 1/r—an assumption that is automatically satisfied if the average left degree
is larger than one—and consider the function f(x) = x − r(1 − (1 − x)^{aR}). We have
f(0) = 0, f(1) = 1 − r > 0, f′(0) < 0, and f′(x) has exactly one root. Hence, f
has exactly one nonzero root, denoted by δ̂_{aR,r}, and this root is between 0 and 1.
The previous theorem asserts that the maximum δ satisfying the inequality (1)
is smaller than δ̂_{aR,aL/aR}.
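The root δ̂_{aR,r} is easy to compute numerically; a small sketch (our own, using plain bisection) that reproduces the δ̂ column of Table 1:

def delta_hat(aR, r):
    # Unique nonzero root of f(x) = x - r*(1 - (1 - x)**aR) in (0, 1).
    # f < 0 just right of 0 (since f'(0) = 1 - r*aR < 0) and f(1) = 1 - r > 0.
    f = lambda x: x - r * (1.0 - (1.0 - x) ** aR)
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

print(round(delta_hat(5.9266, 0.5), 5))   # 0.49085: first row of Table 1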
The following definition is inspired by Corollary 1.

Definition 1. A sequence (λ_m, ρ_m) of degree distributions giving rise to codes
of rate R ≥ R_0 for a fixed R_0 > 0 is called asymptotically quasi-optimal if there
exists a constant µ (possibly depending on R) and for each m there exists a
positive δ_m with δ_m λ_m(1 − ρ_m(1 − x)) ≤ x for x ∈ (0, δ_m] such that the average
right degree aR satisfies aR ≤ µ log(1 − δ_m/(1 − R)).
`
`8
`
`
`
`
The significance of quasi-optimality is the following: we want to construct codes
that can recover from as many erasures as possible in as little time as possible.
The running time of the erasure recovery algorithm is proportional to the block-
length of the code times aR, the average right degree. Hence, we are looking for
a trade-off between aR and δ. Corollary 1 shows that we cannot make aR too
small. In fact, it implies that aR ≥ log(1 − δ/(1 − R))/log R. However, we might
hope that we can make aR smaller than log(1 − δ/(1 − R)) times a (negative)
constant µ. In this way, we have maintained a qualitative optimality of the code
sequence in the sense that the running time increases only logarithmically with
the relative increase of the erasure correction capability.
`The heavy tail/Poisson sequence (2) is asymptotically quasi-optimal as is
`shown in [8]. Starting in the next section, we will introduce another quasi-optimal
`sequence which has certain advantages over the heavy tail/Poisson sequence.
`We close this section by stating another useful upper bound on the maximum
`possible value of δ.
Lemma 2. Suppose that λ and ρ are polynomials and δ is such that δλ(1 − ρ(1 −
x)) < x for x ∈ (0, δ]. Then δ ≤ 1/(λ′(0)ρ′(1)).

Proof. Let f(x) = δλ(1 − ρ(1 − x)) − x. By assumption, f(x) < 0 for x ∈ (0, δ].
This implies that f′(0) ≤ 0, where f′(x) is the derivative of f with respect to x.
But f′(0) = δλ′(0)ρ′(1) − 1, which gives the assertion. ⊓⊔
`4 Fractional Binomial Coefficients
`
`As can be seen from their description (2), the heavy tail/Poisson sequence is
`closely related to the Taylor series of − ln(1− x). The new sequence that we will
`describe in the next section will be related to the Taylor expansion of (1 − x)α
`where 1/α is an integer. The coefficients of this expansion are fractional binomial
`coefficients. For this reason, we will recall some well-known facts about these
`numbers.
For real α and a positive integer N we define

    (α choose N) := α(α − 1)···(α − N + 1)/N! = (−1)^{N−1} (α/N)(1 − α)(1 − α/2)···(1 − α/(N − 1)).   (4)

For convenience, we also define (α choose 0) := 1. We have the Taylor expansion

    1 − (1 − x)^α = ∑_{k=1}^{∞} (α choose k)(−1)^{k+1} x^k.   (5)

Note that for 0 < α < 1 the coefficients of the right hand power series above are
all positive. Furthermore, we have

    ∑_{k=1}^{N−1} (α choose k)(−1)^{k+1} = 1 − (N/α)·(α choose N)·(−1)^{N+1},
`
`9
`
`
`
`72
`
`M. Amin Shokrollahi
`
and

    ∑_{k=1}^{N−1} (α choose k)·(−1)^{k+1}/(k + 1) = (α − (α choose N)(−1)^{N+1})/(α + 1),   (6)

as can easily be proved by induction.
In the following, we will derive some estimates on the size of the binomial
coefficients.

Proposition 1. Let α be a positive real number less than 1/2 and let N ≥ 2 be
an integer. There is a constant c independent of α and N such that

    cα/N^{α+1} ≤ (α choose N)(−1)^{N+1} ≤ α/N^{α+1}.

Proof. First note that (α choose N)(−1)^{N+1} is positive. Taking its logarithm and
expanding the series −ln(1 − x) = ∑_i x^i/i, we obtain

    ln((α choose N)(−1)^{N+1}) = ln(α/N) − α ∑_{k=1}^{N−1} 1/k − (α²/2) ∑_{k=1}^{N−1} 1/k² − ··· .   (7)

(Note that the series involved are absolutely convergent, so that we can rearrange
the orders of the summations.) For an upper bound on this sum we replace
∑_k 1/k^s by 1 for s ≥ 2 and use the left hand side of the inequality

    ln(N) + γ < ∑_{k=1}^{N} 1/k < ln(N) + γ + 1/(2N),

where γ is Euler's constant [5, pp. 480–481]. For the lower bound we use the
right hand side of the above inequality and replace ∑_k 1/k^s for s ≥ 2 by 2.
This gives, after exponentiation,

    (α/N^{α+1}) e^{α(2−γ−1/(2N))} (1 − α)² ≤ (α choose N)(−1)^{N+1} ≤ (α/N^{α+1}) e^{α(1−γ)} (1 − α).

Noting that 0 < α ≤ 1/2, we obtain our assertion. ⊓⊔
`
`5 Right Regular Sequences
`
`A closer look at the proof of Theorem 1 reveals the following: one of the reasons
`we might obtain a lower than optimal upper bound for the largest δ satisfying (1)
`is that the inequality (3) is not sharp. In fact, that inequality is sharp iff all
`the ai except for one are zero, i.e., iff ρ(x) has only one nonzero coefficient,
`i.e., iff the graph has only one degree on the right hand side. We call such
`graphs right regular in the following. We will study in this section right regular
`degree distributions and design the left hand side in such a way as to obtain
`asymptotically quasi-optimal sequences. We remark that right regular sequences
`
`10
`
`
`
`
`,
`
`(8)
`
`(x).
`
`(9)
`
`were previously studied in an unpublished manuscript [9]. Our approach differs
`however from the one given in that paper.
`From a theoretical point of view, codes obtained from these sequences should
`perform better than the heavy tail/Poisson distribution, since they allow for a
`larger loss-fraction for a given rate and a given average left degree. Furthermore,
`the analysis given in [6] suggests that the actual performance of the code is
`related to how accurately the neighborhood of a message node is described by a
`tree given by the degree distributions. For instance, the performance of regular
`graphs is much more sharply concentrated around the value predicted by the
`theoretical analysis, than the performance of irregular graphs. Hence, if the graph
`is right regular, one should expect a smaller variance in the actual performance
`than for corresponding irregular codes.
For integers a ≥ 2 and N ≥ 2 we define

    ρ_a(x) := x^{a−1},    λ_{α,N}(x) := α ∑_{k=1}^{N−1} (α choose k)(−1)^{k+1} x^k / (α − N(α choose N)(−1)^{N+1}),   (8)

where here and in the following we set α := 1/(a − 1). Note that ρ_a(1) = 1, that
λ_{α,N}(1) = 1 by (5), and that λ_{α,N} has positive coefficients. Hence, (ρ_a, λ_{α,N})
indeed defines a degree distribution.
Let ν < 1 be a positive constant. The sequence we are interested in is given
by the following degree distributions:

    (ρ_a(x), λ_{a,⌊ν^{−1/α}⌋}(x)).   (9)
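In code, the coefficients of λ_{α,N} follow from the recurrence (α choose k+1)(−1)^{k+2} = (α choose k)(−1)^{k+1}·(k − α)/(k + 1); the sketch below (our own) builds them and checks the normalization λ_{α,N}(1) = 1:

def right_regular_lambda(a, N):
    # Coefficients lam[k] of x^k in lambda_{alpha,N}, alpha = 1/(a-1), per (8).
    alpha = 1.0 / (a - 1)
    coeffs, b = {}, alpha                 # b = (alpha choose k)(-1)^(k+1), k = 1
    for k in range(1, N):
        coeffs[k] = b
        b *= (k - alpha) / (k + 1)        # advance to k + 1; stays positive
    denom = alpha - N * b                 # b is now (alpha choose N)(-1)^(N+1)
    return {k: alpha * v / denom for k, v in coeffs.items()}

lam = right_regular_lambda(a=6, N=13)
print(abs(sum(lam.values()) - 1.0) < 1e-12)   # True: lambda(1) = 1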
First we consider the rate of the codes defined by these distributions.

Proposition 2. The rate R = R_{a,ν} of the code defined by the distributions
in (9) satisfies

    (1 − ν)/(1 − cν·ν^{1/α}) ≤ 1 − R ≤ (1 − cν)/(1 − ν·ν^{1/α}),

where c is the constant from Proposition 1.

Proof. In the following, we denote by N the quantity ⌊ν^{−1/α}⌋ and by λ(x) the
function λ_{α,N}(x). We need to compute the integral of λ(x). We have

    ∫_0^1 λ(x)dx = (α/(α + 1)) · (α − (α choose N)(−1)^{N+1}) / (α − N(α choose N)(−1)^{N+1}).   (10)

Further, ∫_0^1 ρ_a(x)dx = 1/a = α/(α + 1). Hence, the rate of the code is

    1 − R = (α − N(α choose N)(−1)^{N+1}) / (α − (α choose N)(−1)^{N+1}) =: 1 − r_{α,N}.

Next, we estimate r_{α,N} using Proposition 1 again:

    (1 − ν)/(1 − cν·ν^{1/α}) ≤ 1 − r_{α,N} ≤ (1 − cν)/(1 − ν·ν^{1/α}).

This gives the desired assertion on the rate. ⊓⊔
`
`11
`
`
`
`74
`
`M. Amin Shokrollahi
`
Theorem 2. The sequence of degree distributions given in (9) is asymptotically
quasi-optimal.

Proof. Let us compute the maximum value of δ such that δλ(1 − ρ(1 − x)) < x
for x ∈ (0, δ), where (λ, ρ) is the pair given in (9). We have

    δλ(1 − ρ(1 − x)) < δα (1 − ρ(1 − x)^α) / (α − N(α choose N)(−1)^{N+1}) = δα x / (α − N(α choose N)(−1)^{N+1}).

So, for δλ(1 − ρ(1 − x)) < x we need

    δ ≤ (α − N(α choose N)(−1)^{N+1}) / α.

This is, by the way, the same upper bound as the one obtained from Lemma 2. In
the following we assume that δ is equal to the above upper bound, and compute
δ/(1 − R), R being the rate of the code estimated in Proposition 2:

    δ/(1 − R) = (α − (α choose N)(−1)^{N+1}) / α ≥ 1 − 1/N^{α+1}.   (11)

Ignoring diophantine constraints, we assume in the following that N = ν^{−1/α}.
This gives

    δ/(1 − R) ≥ 1 − ν^{(α+1)/α} = 1 − ν^{aR},

where aR = a = (α + 1)/α is the average right degree of the code. Indeed,
1 − δ/(1 − R) ≤ ν^{aR} means aR ≤ log(1 − δ/(1 − R))/log ν, so Definition 1 is
satisfied with µ = 1/log ν. This proves the assertion. ⊓⊔
`In practical situations, one is interested in designing a code of a given fixed rate.
`Since the parameter α in the definition of the sequence in (9) is the inverse
`of an integer, the range of this parameter is discrete. Hence, we do not have
`a continuous range of rates for these codes. However, we can come arbitrarily
`close to our desired rate by making α smaller, i.e., by allowing high degrees
`on the RHS of the graph. Some examples for different target rates are given in
`Table 2. The value of N has been hand-optimized in these examples to come as
close to the desired rate as possible. The column δ̂ in that table corresponds
to the value defined in Remark 1 following Corollary 1. It gives a theoretical
upper bound on the best value of δ. As can be observed from the table, the
`maximum value of δ converges very quickly to the maximum possible value.
`Also, a comparison between these codes and the heavy tail/Poisson distribution
`in Table 1 reveals that the new codes are better in terms of the trade-off between
`δ and aR. However, the new codes need larger degrees on the left than the heavy
`tail/Poisson distribution.
`To obtain codes that have a fixed rate, one can modify α slightly. Such ex-
`amples are given in Table 3. Both parameters (α and N) of these codes are
`hand-optimized to give the best results.
`
`12
`
`
`
`Table 2. Right regular sequences for rates R close to 2/3, 1/2, and 1/3.
`
`
  N      aR   1 − R     δ/(1 − R)   δ         δ̂         δ/δ̂
  2      6    0.33333   0.60000     0.20000   0.29099   0.68731
  3      7    0.31677   0.74537     0.23611   0.28714   0.82230
  6      8    0.32886   0.88166     0.28994   0.31243   0.92801
  11     9    0.33645   0.93777     0.31551   0.32690   0.96514
  17     10   0.33357   0.96001     0.32024   0.32724   0.97860
  27     11   0.33392   0.97502     0.32558   0.32984   0.98711
  42     12   0.33381   0.98401     0.32847   0.33113   0.99197
  64     13   0.33312   0.98953     0.32963   0.33134   0.99484
  13     6    0.50090   0.96007     0.48090   0.49232   0.97679
  29     7    0.50164   0.98251     0.49287   0.49759   0.99052
  60     8    0.49965   0.99159     0.49545   0.49762   0.99563
  12.    9    0.49985   0.99598     0.49784   0.49885   0.99797
  257    10   0.50000   0.99805     0.49903   0.49951   0.99904
  523    11   0.50002   0.99904     0.49954   0.49977   0.99953
  1058   12   0.49999   0.99953     0.49975   0.49986   0.99977
  111    6    0.66677   0.99698     0.66475   0.66584   0.99837
  349    7    0.66667   0.99904     0.66603   0.66636   0.99950
  1077   8    0.66663   0.99969     0.66642   0.66653   0.99984
  3298   9    0.66669   0.99990     0.66662   0.66665   0.99995
`
`6 Conclusions and Open Questions
`
`In this paper we have analyzed the theoretical performance of a simple erasure
`recovery algorithm applied to Gallager codes by deriving upper bounds on the
`maximum fraction of tolerable losses given a parameter that describes the run-
`ning time for encoding and decoding of the codes. We have shown that there is
`a trade-off between proximity to the optimal value of tolerable losses, i.e., one
`minus the rate of the code, and the average degree of nodes in the graph, in
the sense that the relative gap between the maximum tolerable loss fraction and
its optimal value decreases exponentially when the average degree is multiplied
by a constant factor. Further,
`we have introduced a new sequence of graphs which are asymptotically close to
`optimal with respect to this criterion. These graphs are right regular and their
`node degree distribution on the left hand side is closely related to the power
`series expansion of (1 − x)α, where α is the inverse of an integer. Previously,
`the only known such sequence was the heavy tail/Poisson sequence introduced
`in [8]. We have included examples which show that the new codes tolerate a
`higher fraction of losses if the average degree of the graph is fixed.
`It would be very interesting to extend the analysis given in this paper to
`other decoding algorithms, e.g., to the simple decoding algorithm of Gallager
`for error correcting codes [4,7]. Such an analysis would probably give clues for
`constructing infinite sequences of graphs that perform asymptotically optimally
`with respect to the decoding algorithm in question.
`
`13
`
`
`
`76
`
`M. Amin Shokrollahi
`
Table 3. Rate 1/2 codes, ρ(x) = x^{aR−1}, λ_{α,N}(x) as defined in (8) for arbitrary α.
`
  N      α         aR   δ/(1 − R)   δ         δ̂         δ/δ̂
  11     0.17662   6    0.95982     0.47991   0.49134   0.97673
  27     0.16412   7    0.98190     0.49095   0.49586   0.99009
  59     0.14225   8    0.99142     0.49571   0.49798   0.99543
  124    0.12480   9    0.99596     0.49798   0.49901   0.99794
  256    0.11112   10   0.99800     0.49900   0.49951   0.99898
  521    0.09993   11   0.99896     0.49948   0.49975   0.99945
  1057   0.09090   12   0.99950     0.49975   0.49988   0.99974
`
`References
`
`1. N. Alon and M. Luby. A linear time erasure-resilient code with nearly optimal
`recovery. IEEE Trans. Inform. Theory, 42:1732–1736, 1996.
`2. R.E. Blahut. Theory and Practice of Error Control Codes. Addison Wesley, Read-
`ing, MA, 1983.
`3. P. Elias. Coding for two noisy channels. In Information Theory, Third London
`Symposium, pages 61–76, 1955.
`4. R. G. Gallager. Low Density Parity-Check Codes. MIT Press, Cambridge, MA,
`1963.
5. R.L. Graham, D.E. Knuth, and O. Patashnik. Concrete Mathematics. Addison-
`Wesley, 1994.
`6. M. Luby, M. Mitzenmacher, and M.A. Shokrollahi. Analysis of random processes
`via and-or tree evaluation. In Proceedings of the 9th Annual ACM-SIAM Sympo-
`sium on Discrete Algorithms, pages 364–373, 1998.
`7. M. Luby, M. Mitzenmacher, M.A. Shokrollahi, and D. Spielman. Analysis of low
`density codes and improved designs using irregular graphs. In Proceedings of the
`30th Annual ACM Symposium on Theory of Computing, pages 249–258, 1998.
`8. M. Luby, M. Mitzenmacher, M.A. Shokrollahi, D. Spielman, and V. Stemann.
`Practical loss-resilient codes. In Proceedings of the 29th annual ACM Symposium
`on Theory of Computing, pages 150–159, 1997.
`9. M. Luby and M.A. Shokrollahi. Right regular erasure codes. Unpublished, 1998.
`10. F.J. MacWilliams and N.J.A. Sloane. The Theory of Error-Correcting Codes.
`North-Holland, 1988.
`
`14
`
`