TABLE I
WEIGHT DISTRIBUTION OF THE CODE

The minimum distance of the code is found through a computer search to be two more than that of the best code given in [2] as of May 3, 1999. The weight distribution of the code is shown in Table I.
`
REFERENCES
[1] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, 2nd ed. Cambridge, MA: MIT Press, 1972.
[2] A. E. Brouwer, "Bounds on the minimum distance of linear codes," database. [Online]. Available: http://www.win.tue.nl/~aeb/voorlincod.html
`
`Linear-Time Binary Codes Correcting Localized Erasures
`Alexander Barg and Shiyu Zhou
`
Abstract—We consider a communication model over a binary channel in which the transmitter knows which bits of the n-bit transmission are prone to loss in the channel. We call this model channel with localized erasures in analogy with localized errors studied earlier in the literature. We present two constructions of binary codes with about t(1 + ε) check bits, where ε > 0 and t is the maximal possible number of erasures. One construction is on-line; its encoding and decoding complexity is linear in n and inversely proportional to ε. The other construction is recursive. Its encoding/decoding algorithms assume a delay of n bits, i.e., rely on the entire codeword; the encoding and decoding complexity is again linear in n.
`Index Terms—Defects, linear-time codes, localized erasures.
`
I. INTRODUCTION
We consider a communication model in which binary information is transmitted from one party to the other over a channel which can occasionally erase some of the transmitted bits. Information is transmitted in n-bit blocks. In contrast to the usual channel with erasures, we assume that the sender knows the part of the block where erasures can (but not necessarily will) occur. Therefore, we call such erasures localized. Both the sender and the receiver know that at most t out of n bits can possibly be erased in the channel. The bit positions of possible erasures become known to the sender before transmitting a block and can vary from transmission to transmission. This is the only difference between our model and the standard erasure channel. The receiver, as usual, can only detect that certain bits of the n-bit block are lost upon receipt, i.e., erasures are visible at the receiving end.
This problem has a game-like flavor. Namely, supposing that the channel always destroys exactly t bits would make the problem trivial: just send n − t message bits outside these t positions and write anything in the bits that will be erased. However, if some of these t bits arrive unscathed, the receiver would read them off and mix them with the actual information, since it cannot tell the bits that are prone to loss from the reliable bits of the block.
Our correspondence shows that the transmission can be organized with a data overhead of about t(1 + ε) bits (ε is any positive number), which is essentially the optimum, and that the delays introduced by the sender/receiver are linear in n.
`Typically, codes correcting erasures are discussed in connection
`with packet routing in networks employing the asynchronous transfer
`mode and in the Internet. Generally, the most common causes for
`the loss of packets are traffic congestion and buffer overflows,
`which result in excessive delays. If a part of the message is not
`received within a certain time interval, the receiving node issues a
`retransmission request. However, in on-line applications such as, for
`instance, real-time video transmission, retransmission is impossible.
`Manuscript received March 8, 1998; revised March 8, 1999. The material
`in this correspondence was presented in part at the 35th Annual Allerton
`Conference, Monticello, IL, September 1997.
`A. Barg is with Bell Laboratories, Lucent Technologies, 2C-375, Murray
`Hill, NJ 07974 USA (e-mail: abarg@research.bell-labs.com).
`S. Zhou was with Bell Laboratories, Lucent Technologies. He is now with
`the Department of Computer and Information Science, University of Pennsyl-
`vania, Philadelphia, PA 19104-6389 USA (e-mail: shiyu@cis.upenn.edu).
`Communicated by F. R. Kschischang, Associate Editor for Coding Theory.
`Publisher Item Identifier S 0018-9448(99)07007-8.
`
0018–9448/99$10.00 © 1999 IEEE
`
`
`
`
`
`IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 7, NOVEMBER 1999
`
In this context we face the problem of retrieving the lost data at the receiving end at the expense of having a certain data overhead in the transmission. Consequently, coding theory solutions for load balancing and decreasing delays in global networks have been a recurrent topic in the literature [3], [8], [12], [13], [15]. A general idea of these works is the same. Namely, it is suggested to encode the message with maximum distance separable (MDS) codes. If the code has length n and is used to encode k message symbols, then the message can be recovered from any k symbols of the codeword, and the data overhead is n − k symbols. There are three general methods of constructing MDS codes known: redundant-residue codes, Cauchy matrices, and Reed–Solomon (RS) codes (see [11, Sec. 10.9, Exercise 11.7, and Ch. 10], respectively). These methods are utilized in [3] (redundant-residue codes), [13] (Cauchy matrices), and [8], [15] (Reed–Solomon codes). The complexity of these constructions is quadratic in n, counted in q-ary operations (q is the alphabet size), for encoding and decoding. The encoding is performed either by matrix multiplication or by polynomial evaluations; the decoding is basically Lagrange interpolation for RS codes or Chinese remaindering for redundant-residue codes. The alphabet size q is related to the transmission protocol. In [8], q equals the length of the packet; in [15] it is the number of packets in the image.
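The MDS recovery idea recalled above can be seen in miniature. The following sketch is ours, not part of the correspondence: the prime field GF(97), the parameters, and the names rs_encode and lagrange_eval are all illustrative choices.

```python
# Toy illustration (ours) of the MDS idea: k message symbols become the
# evaluations of a degree-(k-1) polynomial over a prime field; any k
# received symbols recover every lost one by Lagrange interpolation.
P = 97  # prime field size, our choice

def rs_encode(msg, n):
    """Evaluate the message polynomial at x = 1..n."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(msg)) % P
            for x in range(1, n + 1)]

def lagrange_eval(points, x0):
    """Evaluate the unique interpolating polynomial at x0 (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total
```

With k = 3 message symbols and n = 6 codeword symbols, any three surviving symbols reconstruct an erased one, so the overhead is n − k symbols, as stated above.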
In our situation, the sender knows the set E of positions of possible erasures. This information is used in the encoding. More precisely, the encoding mapping, which assigns a codeword to a given message m, depends both on the set M of possible messages and the set E of positions of possible erasures. This implies that to every message we associate a subset of codewords, and the code is formed by the union of these subsets. A binary code C of length n corrects t localized erasures if for different messages these subsets are disjoint (then the encoding mapping is invertible). Precise definitions are given in the next section.
Two transmission problems studied previously are close to our problem. The first is called codes for channels with (memory) defects [9]. Under this model, codewords are stored in a memory with some cells stuck at either 0 or 1. Therefore, no matter what is written in these bits by the "sender," the "receiver" always reads a fixed value. The second model deals with codes correcting localized errors [5]. The definition is the same as for codes correcting localized erasures except that now the transmitted bits inside a given set E are either received correctly or flipped in the channel.
There is a substantial difference between the minimal possible number of check bits in binary codes correcting localized erasures and in binary codes correcting usual erasures. Namely, to correct t nonlocalized erasures one needs binary codes with minimum distance at least t + 1. By the Varshamov–Gilbert bound, the redundancy of the best known binary codes with distance t + 1 is of order t log(n/t). On the contrary, the minimum redundancy of a code of length n that corrects t localized erasures is t (see Proposition 2.1). Our results show that this bound is essentially tight. The same lower bound is valid for codes correcting defects [9]. For localized errors the upper and lower bounds on the number of checks have order t log(n/t) [5]. A more detailed survey of the bounds and complexity results is given in [4].
As remarked above, for larger alphabets there exist MDS codes that correct t usual erasures with t check symbols. The encoding/decoding time is quadratic in n for RS codes and linear in n for the codes of Alon and Luby [2], which use about t(1 + ε) check symbols. However, an easy consequence of the Singleton bound [11] shows that the size q of the alphabet of an MDS code must be large, namely, q ≥ n − t + 1, where t is the number of erasures.
Note also that the assumption of deterministic encoding/decoding is essential, since it is possible to construct probabilistic encoding/decoding methods that ensure recovery of binary sequences with at most t erasures using about t(1 + ε) check bits; see Luby et al. [10].
A number of recursive constructions of low-complexity codes for different transmission models is known in the literature; see Ahlswede et al. [1], Alon and Luby [2], Dumer [7], and Spielman [14]. The idea of constructing low-complexity codes is essentially the same for all models. Namely, first the code of length n is partitioned into a number of shorter segments. A recursion is applied on each or some of these segments. The recursion stops after reaching constant length or some small length, depending on the complexity of the codes used on these small segments, so that the encoding and decoding of an optimal code can be implemented by exhaustive search. This general construction varies for different models since optimal codes are constructed in different ways.
This idea was used by Spielman [14] to construct codes of length n and exponential size correcting a small constant fraction of Hamming errors. The encoding/decoding complexity is of order n. These codes were used as a basis of recursion by Alon and Luby [2], who suggested codes over a very large alphabet correcting t usual erasures. The number of check symbols of this construction is about t(1 + ε). The complexity of encoding/decoding is linear in n. Here ε must be very small since the construction relies on error-correcting codes from [14]. In principle, a similar technique can be used in our problem, but the parameters of the construction are sharply restricted by the condition on ε.
Dumer [7] introduced polynomial-time codes that use close to t checks to correct t defects; the complexity of encoding and of decoding is polynomial in n. In our work we rely on these constructions to construct codes correcting localized erasures. This enables us to achieve linear encoding/decoding complexity for localized erasures. The redundancy of our codes is about t(1 + ε), which is not possible for usual erasures with deterministic encoding/decoding algorithms.
`
`II. PRELIMINARIES
Let us define formally our problem. Let {0,1}^n be the set of all binary words of length n. A binary code C of length n is any subset of {0,1}^n. The quantity log |C| is called the number of message bits (all logs in the correspondence are base 2). Let E ⊆ {1, …, n}, |E| = t. Suppose x = (x_1, …, x_n) ∈ C is a codeword submitted to the transmission channel. We say that erasures are localized at E if the output y of the channel is given by

    y_i = x_i,          i ∉ E
    y_i ∈ {x_i, ∗},     i ∈ E                                   (1)

where ∗ is the erasure symbol. In other words, symbols outside E are never erased in the channel and symbols in E may or may not be erased. Suppose E can be any t-subset of {1, …, n}, which becomes known to the encoder (but not to the decoder) before transmission and can vary from one block to the other. Then we speak of transmitting over the channel with t localized erasures.
Let M be the set of messages. By encoding we mean assigning a codeword used to transmit a given message m ∈ M over the channel with erasures localized at E. The encoding mapping is defined by

    φ : M × E_t → {0,1}^n

where E_t is the set of all t-subsets of {1, …, n}.
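A minimal executable model of this channel can be written as follows. This sketch is ours: the function name and the erasure probability 0.5 are illustrative assumptions, not part of the correspondence.

```python
import random

# Minimal model (ours) of the channel with localized erasures, eq. (1):
# bits outside E are delivered intact; each bit inside E is independently
# erased or not (the 0.5 probability is an arbitrary illustration).
def localized_erasure_channel(x, E, rng):
    """x: list of bits; E: set of positions prone to erasure; '*' = erasure."""
    return ['*' if i in E and rng.random() < 0.5 else b
            for i, b in enumerate(x)]
```

Only positions in E can come out as '*', and the receiver sees which positions were erased, matching the model above.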
`
`
Let ψ be a decoding mapping which sends any word with at most t bits missing to M. By definition, the code C corrects t localized erasures if ψ(y) = m for any m ∈ M, any E ∈ E_t, and any word y obtained from φ(m, E) by erasures localized at E.
We consider the problem of constructing codes correcting t localized erasures. Below we assume that n is sufficiently large and that t/n is bounded away from 1.
Proposition 2.1: Let C be a binary code of length n used to transmit |M| messages from a set M over the channel with t localized erasures. Then log |M| ≤ n − t.
Proof: Let σ_E be the erasure operator which maps a codeword x to a sequence y that satisfies (1). A necessary and sufficient condition for C to correct t localized erasures is

    ψ(σ_E(φ(m, E))) = m                                         (2)

for any m ∈ M and any E ∈ E_t. Let E be a fixed t-subset of {1, …, n}. The outcome of the transmission can be any set of erasures on E. Assume that σ_E erases all t bits in E. The binary vector in the remaining n − t bits should satisfy (2); thus |M| ≤ 2^(n−t).
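The counting in this proof can be checked mechanically for small parameters. The following sketch is ours (not part of the original); the values n = 5, t = 2 are an arbitrary illustration.

```python
# Counting behind Proposition 2.1 (our sketch): if the adversary erases
# every bit of a fixed t-subset E, the decoder sees only the n - t bits
# outside E, so at most 2^(n-t) messages can be told apart.
n, t = 5, 2
E = set(range(t))                       # any fixed t-subset works
outside = [i for i in range(n) if i not in E]
# all possible decoder views after the worst-case erasure:
views = {tuple((w >> i) & 1 for i in outside) for w in range(2 ** n)}
assert len(views) == 2 ** (n - t)
```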
Below we adapt to our problem certain codes correcting defects. The channel with defects can be defined as follows. Let E ⊆ {1, …, n}, |E| = t, and suppose x is a codeword submitted to a channel with defects. Suppose e = (e_i, i ∈ E) is a fixed binary vector of defect values, known to the encoder. Then the output of the channel is a word y with

    y_i = x_i,   i ∉ E;        y_i = e_i,   i ∈ E.              (3)

Again we assume that E becomes known to the encoder before the transmission and can vary from block to block.
Let M be a set of messages. The code is defined as the union of the images of an encoding mapping φ over all messages, defect sets, and defect values. Codes for which the encoding mapping is invertible are said to correct t defects.
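For comparison with the erasure model, the defect channel of (3) admits an equally small executable model. The sketch and its names are ours.

```python
# The channel with defects, per (3): positions in E are stuck at fixed
# values e_i regardless of what is written.  Function name is ours.
def defect_channel(x, stuck):
    """x: list of bits; stuck: dict mapping position -> stuck-at value."""
    return [stuck.get(i, b) for i, b in enumerate(x)]
```

Unlike erasures, the receiver cannot tell which positions were defective, which is why the encoder, who knows E and the stuck values, must do all the work.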
`
`III. THE CONSTRUCTIONS
`
`A. Overview
We shall present two constructions of binary codes correcting localized erasures. As we have pointed out in Section I, the problem of correcting localized erasures is much related to the problem of correcting defects [6], [7]. An advantage of locality is that in our coding scheme the encoder knows exactly what the decoder will receive. To achieve this, the encoder forces all the bits that can possibly be erased to be 0's in its encoding, i.e., for any given message and set E the encoder forces all bits of the codeword contained in E to be 0. The decoder replaces all erased bits in the received sequence by zeros. It turns out that in this way the encoder can convey certain useful information (such as the number of possible erasures in a segment) to the decoder in a fairly straightforward manner, which is something difficult to achieve when dealing with usual erasures.
In order to achieve linear time complexity, we propose to break a block of linear size into segments of constant size and apply the optimal code on each segment. In this way, even though the time complexity of the optimal code is superlinear, it only causes a constant-factor increase in the overall linear complexity. The optimal code we shall use is due to Dumer [6]. The original purpose of the code was to deal with asymmetric defects, i.e., defects with e_i ≡ 0 in (3). It turns out that this construction can be easily reformulated to correct localized erasures. We give the following theorem with proof since it is important for understanding our constructions.
Theorem 3.1 [6]: For all t < n, there exists an easily constructible binary nonlinear code C_0 of length n that corrects t localized erasures. The redundancy of the code is m = t + 1. The encoding and the decoding time of the code are polynomial in n. Moreover, in any codeword, all the bits corresponding to the possible erasures are equal to 0.
Proof: Let E be the subset of positions of possible erasures, |E| ≤ t, and put m = t + 1. Let u be the binary message of n − m bits that we want to send. Suppose that n − m is divisible by m (the opposite case will be treated later). Partition u into segments u_1, …, u_s of length m, where s = (n − m)/m. The encoding is performed by viewing the vectors u_i as elements of GF(2^m) and multiplying each of them by one and the same element λ ∈ GF(2^m). Each product λu_i is expanded into a binary vector v_i; the codeword of length n is formed by concatenating all these vectors and the binary expansion of λ as follows:

    c = (v_1, v_2, …, v_s, λ).

Element λ is chosen so that all the bits of this word with numbers in E be 0. Each bit of a product λu_i is a linear form of the m binary coordinates of λ. In total, at most t binary coordinates in these products must be 0. Each of these conditions defines a linear equation with the m unknown coordinates of λ. A nontrivial system of at most t homogeneous equations with respect to m = t + 1 unknowns always has a nontrivial (not all-zero) solution.
Then the message u can be encoded as follows. It is partitioned into segments as described. Then the encoder computes λ, finds the products λu_i, and their binary expansions v_i. This defines the codeword. Note that λ itself is a part of the sequence transmitted and is also subject to erasures. If the set E of possible erasures includes some of the last m coordinates of the transmitted sequence, then the corresponding coordinates of λ are simply set to 0.
The decoder sets the erased coordinates to 0. Now the received word exactly equals the codeword. Therefore, the decoder splits it into segments of length m and finds λ. The message is found by computing u_i = λ^(−1) v_i, i = 1, …, s, and taking the binary expansions of these products.
Finally, if n − m is not divisible by m, say n − m = sm + r with 0 < r < m, then we partition the first part of the message into s segments of length m and take the last segment of length r. All segments except the last are treated exactly as above. The last segment is considered as an element of GF(2^r) and multiplied by an element of GF(2^r) chosen by the same argument. It is easy to see that the construction described above remains valid, and the number of check bits is still m = t + 1. This completes the proof.
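The proof above is constructive, and the divisible case can be implemented end to end in a few dozen lines. The sketch below is ours: it fixes GF(2^4) by the primitive polynomial x^4 + x + 1 (so m = 4 and t ≤ 3), finds λ by brute force instead of Gaussian elimination, and uses names of our own choosing; a production version would pick m = t + 1 per block.

```python
# Sketch (ours) of the code C_0 from Theorem 3.1, divisible case only.
M = 4                      # m = t + 1, so up to t = M - 1 = 3 erasures
POLY = 0b10011             # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiplication in GF(2^M) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= POLY
    return r

def to_bits(v, width):
    return [(v >> i) & 1 for i in range(width)]

def codeword_bit(segs, lam, p):
    """Bit p of the codeword (v_1, ..., v_s, lam); linear in lam's bits."""
    seg, off = divmod(p, M)
    val = gf_mul(lam, segs[seg]) if seg < len(segs) else lam
    return (val >> off) & 1

def solve_lambda(segs, erasable):
    """Nonzero lam making all erasable codeword bits 0; exists because
    |erasable| <= t < M.  Brute force is fine for small M."""
    rows = [[codeword_bit(segs, 1 << j, p) for j in range(M)]
            for p in erasable]
    for lam in range(1, 1 << M):
        if all(sum(r[j] * ((lam >> j) & 1) for j in range(M)) % 2 == 0
               for r in rows):
            return lam
    raise AssertionError("no solution; needs |erasable| < M")

def encode(msg_bits, erasable):
    """msg_bits: n - M bits, length divisible by M; erasable: positions."""
    assert len(msg_bits) % M == 0 and len(erasable) < M
    segs = [sum(b << i for i, b in enumerate(msg_bits[j:j + M]))
            for j in range(0, len(msg_bits), M)]
    lam = solve_lambda(segs, sorted(erasable))
    word = []
    for u in segs:
        word += to_bits(gf_mul(lam, u), M)
    return word + to_bits(lam, M)

def decode(received):
    """Receiver must first replace erased bits by 0."""
    lam = sum(b << i for i, b in enumerate(received[-M:]))
    inv = next(x for x in range(1, 1 << M) if gf_mul(lam, x) == 1)
    out = []
    for j in range(0, len(received) - M, M):
        v = sum(b << i for i, b in enumerate(received[j:j + M]))
        out += to_bits(gf_mul(inv, v), M)
    return out
```

Because every bit of the codeword at a position in E equals 0, the receiver's zero-filling restores the codeword exactly, as in the proof.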
In what follows, we use the term C_0-encoding/decoding when we talk about the encoding/decoding of the code C_0.
Remark: Note that the C_0-decoding is defined on the whole space {0,1}^n except for the subspace spanned by the first n − m coordinates (i.e., the subspace of vectors whose last m coordinates are zeros). In particular, the code C_0 does not contain the all-zero codeword.
`
`
In our constructions we need to extend the definition of C_0-decoding to the entire space {0,1}^n. Assume by definition that C_0-decoding of any vector whose last m coordinates are zeros outputs a "message" of n − m zeros.
In the first construction that we shall present, the coding scheme is implemented on-line. That is, the codeword is broken into segments each of constant length and the encoding/decoding is applied independently on each segment. The second construction is an extension of the work in [7]. It is a more complicated recursive construction and cannot be implemented on-line, but it has asymptotically better performance in terms of the encoding/decoding complexity.
`
`B. On-Line Construction
`In this section, we present a code construction and prove the
`following.
Theorem 3.2: Given an integer t and ε > 0 such that t(1 + ε) < n, there exists an easily constructible binary code of length n that corrects t localized erasures and has redundancy at most t(1 + ε). The encoding and the decoding complexity of the code are linear in n and inversely proportional to ε. Moreover, the coding scheme can be implemented on-line.
`Proof: We will describe the construction of the code, describe
`encoding and decoding, compute its redundancy, and estimate its
`complexity.
We begin with a message of k = n − r bits, where the redundancy r is estimated below. By assumption, the set E of possible erasures is known to the encoder. The set of n code coordinates is divided into subblocks of a smaller, constant length b, called segments (we assume without loss of generality that n is a multiple of b). The encoding procedure is applied to each of these segments independently. The same holds for the decoding, so the construction introduces only a constant delay and can be implemented on-line.
Thus we need to describe what happens within a given segment S of length b, where b is taken to be a constant depending only on ε. (We remark that this is not the minimal possible value of the constant.) The construction relies on two more parameters, a subsegment length l and an erasure threshold t_1:

                                                                (4)
`
First, if the number of possible erasures in S exceeds t_1, the entire segment S is filled with zeros and is not used to transmit information. Otherwise, since S is of constant length, we could apply the optimal code of Theorem 3.1 on it. This would solve our problem, were the number of possible erasures in S known to the decoder. To communicate this number, we use a part (subsegment) of segment S with relatively few possible erasures. Such a subsegment of length l exists since S itself has at most t_1 erasures. The final problem is to communicate to the decoder the location of this subsegment.
Before describing the coding scheme in detail, we note one inequality, useful for later analysis, which can be easily verified by using our definitions of l and t_1 and the assumption of the theorem:

                                                                (5)
`
Encoding: We partition S into b/l consecutive subsegments S_1, …, S_{b/l}, each of length l; by (5) this partition is well-defined. As said above, there is an index, say j, such that S_j contains relatively few possible erasures. The subsegment S_j does not carry any part of the message. Let t' be the number of possible erasures in the remaining part of the segment. We use the code C_0 to encode the message bits and transmit them on S \ S_j.
It remains to specify the use of S_j to accomplish the goals described above. A codeword of C_0 written on this subsegment is calculated from the codeword of C_0 written on S \ S_j, the number t', and the starting bit position of S_j in S. Let x be a binary vector, the first bits of which carry this position and the last bits represent t'.
The codeword written on S_j is calculated as follows. First, we apply C_0-decoding to each subsegment S_i, i ≠ j, of S. This is possible by the remark after Theorem 3.1. Denote the result of the ith decoding by z_i. It is clear that each such decoding gives a bit sequence of a fixed length. Next, compute

    w = x + Σ_{i ≠ j} z_i   (mod 2)                             (6)

and, finally, we encode w with C_0 and write the result on S_j.
Decoding: The decoding begins upon receipt of the first b-bit segment S. If after replacing the erased bits by zeros the decoder observes an all-zero word, the segment is assumed not to carry any information and is discarded. By the remark after Theorem 3.1, the only case when this can occur is when the encoder has filled S with zeros, i.e., this decision coincides with the intentions of the encoder.
Otherwise, we decompose S into b/l subsegments S_1, …, S_{b/l}, each of length l. Then we apply the C_0-decoding on each S_i. Let z_i be the output of the decoding on S_i. Next compute x = Σ_i z_i (mod 2). By (6), the first bits of this vector represent the (binary) number of the starting position of S_j and the last bits contain the number t' of erasures in S \ S_j. So what remains is to isolate the subsegment S_j and apply the C_0-decoding on the rest of the segment. This gives the transmitted message.
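The role of (6), as we read it, is to hide the index j from the decoder: summing the decodings of all subsegments cancels everything except x. A toy demonstration of ours, with stand-in 8-bit "decodings" (all concrete values below are arbitrary illustrations):

```python
# Toy demo (ours) of the index-hiding computation in (6): the encoder
# writes w = x XOR (sum of the decodings of the other subsegments) on the
# special subsegment S_j, so the decoder, XOR-ing the decodings of ALL
# subsegments, recovers x without ever learning j.
z = [0b00110101, 0b11100001, 0, 0b01010101, 0b10000011]  # stand-in decodings
x = 0b10110001            # position of S_j and erasure count, packed
j = 2                     # index of the special subsegment

w = x
for i, zi in enumerate(z):
    if i != j:
        w ^= zi
z[j] = w                  # S_j now decodes to w

recovered = 0
for zi in z:
    recovered ^= zi       # decoder's sum over all subsegments
assert recovered == x
```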
Let us estimate the overall redundancy of the code. Let B be the collection of segments S with the number of possible erasures exceeding t_1. The following parts of the n-bit word do not carry information: all the segments in B, the subsegments S_j of the segments not in B, and the redundant bits of the codewords of C_0 on these segments. Let β = |B|. Clearly, β ≤ t/t_1. The total number of segments is n/b and the number of erasures in all the segments is at most t. Therefore, for the overall redundancy r we get

                                                                (7)

where the last inequality follows from (5). We need to show that r ≤ t(1 + ε). Since (7) is linear in t, it suffices to show that the last expression in (7) is at most t(1 + ε) for both extreme values of t.
Case 1:

                                                                (7)
`
`
Case 2:

                                                                (7)

where the last step makes use of (4).
To complete the proof it remains to estimate the complexity of encoding and decoding. By Theorem 3.1, the time needed for encoding and for decoding each segment S is a constant depending only on ε. Therefore, the overall encoding time (on n/b segments) and the overall decoding time are linear in n.
`
`C. Recursive Construction
As we mentioned in Section I, Dumer [7] gave a nearly optimal construction of binary codes of length n correcting t defects with close to t redundancy and polynomial encoding and decoding time. This construction can be easily modified to give binary codes correcting localized erasures with the same amount of redundancy and the same encoding/decoding complexity. In this section, we outline an extension of that construction and derive a coding scheme that satisfies the following properties.
Theorem 3.3: Given any ε > 0, there is an n_0 = n_0(ε) such that for any integer n ≥ n_0 there exists an easily constructible binary code of length n that corrects t localized erasures and has redundancy at most t(1 + ε). The encoding and the decoding complexity of the code are linear in n.
Here we point out that this construction is recursive and the coding scheme cannot be implemented on-line.
Proof: We assume the existence of an easily constructible binary code of any shorter length correcting any number of localized erasures that satisfies the properties required in the theorem. The construction proceeds recursively and relies on this assumption as an induction hypothesis.
As in Theorem 3.2, we present a constructive definition of the code by specifying the encoding procedure. Then we describe the decoding, compute the code redundancy, and estimate the necessary complexities. The general encoding scheme is based on isolating a large (linear-size) segment U in which the fraction of possible erasures is at most the same as in the entire block. This segment does not carry message bits; instead, we encode on it the number of possible erasures in the remaining part of the block. Encoding and decoding on U are performed with a code whose existence is guaranteed by the induction hypothesis. The rest of the block length is used to transmit message bits with codes of Theorem 3.1 and the information about the location of U. This is done similarly to Theorem 3.2.
Let us introduce notation for the partition described, specify the parameters, and give a formal description of encoding and decoding. Let U be a segment of length u. In the remaining part we isolate an auxiliary segment A of length a, which is used to transmit the location of U. Finally, the rest of the coordinate set is partitioned into subsets B_1, B_2, …, each of length b, called blocks. Here

                                                                (8)

All the parameters are known both to the encoder and decoder and form a part of the code description (transmission protocol).
`
Encoding: We are given a set E of possible erasures and a message of k bits. Let U be a segment in which the fraction of possible erasures is at most t/n. To find it, we partition the coordinate set into consecutive segments of length u each and fix one such segment U with this property. (This can be done even if the erasures are distributed adversarially.) Finally, we partition the remaining part of the coordinate set into consecutive blocks B_1, B_2, …, each of length b.
Let t_i be the number of possible erasures in block B_i. Block B_i is used to transmit message bits, protected against erasures with the code C_0. We also need to transmit to the decoder the numbers t_i and the starting position of the auxiliary segment A. In total this is on the order of (n/b) log b + log n bits. These bits are encoded with the code whose existence constitutes the induction hypothesis. The encoding result is written in U. Note that this encoding fits in U, which follows by (8) and the assumption of the theorem and holds for n sufficiently large. Thus this step of the encoding procedure is well-defined.
The only thing the encoder still needs to transmit to the decoder is the starting position of the segment U. Let us assume without loss of generality that both n and the starting position of U are multiples of a. We partition the set of coordinates into segments of length a each. Then, by the construction, A coincides with one of these segments, say segment number s. For every other segment, let z_i be the result of the C_0-decoding on it. Since U is one of the n/u consecutive segments of length u, to locate it we need log(n/u) bits. Let p be the binary representation of the segment number of U, let w = p + Σ_i z_i (mod 2), and let A carry the codeword obtained by C_0-encoding of w.
Decoding: The decoding begins upon receipt of all n bits of the codeword. First we replace all the erased bits by 0. Then we decompose the word into segments of length a each and apply the C_0-decoding on each segment. Each decoding yields a bit vector z_i. Let x = Σ_i z_i (mod 2). By the above, this is the binary number of the segment U in the partition of the coordinate set into consecutive segments of length u. Upon having located U, we apply the decoding recursively on U to obtain the numbers t_i of possible erasures in the blocks and the starting bit position of segment A. Now we can recover all the segments which carry the message. Decoding each of them with the code C_0, we get the transmitted message bits. This completes the decoding.
Let us show that the amount of redundancy satisfies the requirement of the theorem. The redundancy of the code consists of the segments U and A and the redundancy needed for each C_0-encoding on the blocks. Summing these contributions and using the obvious bound on the total number of erasures in the blocks shows that the total is at most t(1 + ε); the required inequality holds for n sufficiently large.
`
`
It remains to analyze the complexity of the coding scheme. By Theorem 3.1, the time needed for the encoding and the decoding of each block B_i is bounded by a quantity depending only on the block length b. Thus the overall encoding and decoding time on the blocks is linear in n. Similarly, we can show that the time needed for the encoding and the decoding of segment A is linear in n. Thus the overall encoding and decoding time at this recursive level (excluding the time needed for the recursion on segment U) is linear in n.
Now let us denote by T_e(n) (resp., T_d(n)) the overall time needed for the encoding (resp., decoding) of a segment of length n. Then we have the following recurrent equations:

    T_e(n) = T_e(u) + O(n),        T_d(n) = T_d(u) + O(n).

Solving the recurrences, we have that T_e(n) = O(n) and T_d(n) = O(n). Indeed, for instance, for the first equation, the contributions of the successive recursion levels form a geometrically decreasing series, since by (8) the segment length u is a constant fraction of n. This completes the proof.
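Our reading of the recurrences: each recursion level does work linear in the current length and recurses on a constant fraction of it, so the total cost is a geometric series. A numeric check of ours, under the illustrative assumption u = n/2:

```python
# Numeric check (ours) that T(n) = T(u) + c*n with u = n/2 solves to O(n):
# the per-level costs form a geometric series summing to less than 2*c*n.
def total_work(n, c=1.0):
    return 0.0 if n <= 1 else c * n + total_work(n // 2, c)

assert total_work(2 ** 20) < 2.0 * (2 ** 20)
```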
`
IV. CONCLUDING REMARKS
This correspondence introduces binary codes correcting localized erasures. The redundancy of the codes is about t(1 + ε) and is thus very close to the minimal possible. The complexity of the simplest construction is linear in n and inversely proportional to ε. The number ε can take almost any positive value, contrary to some other recursive constructions in which it has to be small, which results in high encoding/decoding complexity. We note that it is possible to construct almost optimal low-complexity binary codes that correct localized errors and erasures at the same time. This could be the subject of future work. Another interesting problem would be to construct low-complexity codes correcting t localized erasures with redundancy close to t, as for codes correcting defects [7].

REFERENCES
[1] R. Ahlswede, L. A. Bassalygo, and M. S. Pinsker, "Asymptotically optimal binary codes of polynomial complexity correcting localized errors," Probl. Pered. Inform., vol. 31, no. 2, pp. 76–83 (in Russian); English translation in Probl. Inform. Transm., pp. 162–168, 1995.
[2] N. Alon and M. Luby, "A linear time erasure resilient code with nearly optimal recovery," IEEE Trans. Inform. Theory, vol. 42, pp. 1732–1736, Nov. 1996.
[3] C. A. Asmuth and G. R. Blakley, "Pooling, splitting and restituting information to overcome total failure of some channels of communication," in Proc. 1982 Symp. Security and Privacy. New York: IEEE, 1982, pp. 156–169.
[4] A. Barg, "Complexity issues in coding theory," in Handbook of Coding Theory, V. Pless and W. C. Huffman, Eds. Amsterdam, The Netherlands: Elsevier, 1998, pp. 649–754.
[5] L. A. Bassalygo, S. I. Gelfand, and M. S. Pinsker, "Codes for channels with localized errors," in Proc. 4th Swedish–Soviet Workshop on Information Theory, 1989, pp. 95–99.
[6] I. I. Dumer, "Asymptotically optimal codes correcting memory defects of fixed multiplicity," Probl. Pered. Inform., vol. 25, no. 4, pp. 3–10 (in Russian); English translation in Probl. Inform. Transm., pp. 259–264, 1989.
[7] ——, "Asymptotically optimal linear codes correcting defects of linearly increasing multiplicity," Probl. Pered. Inform., vol. 26, no. 2, pp. 3–17 (in Russian); English translation in Probl. Inform. Transm., pp. 93–104, 1990.
[8] G. A. Kabatianskii and E. A. Krouk, "Coding decreases delay of messages in networks," in Proc. IEEE Int. Symp. Information Theory (San Antonio, TX, 1993), p. 255.
[9] A. V. Kuznetsov and B. S. Tsybakov, "Coding for memory with defective cells," Probl. Pered. Inform., vol. 10, no. 2, pp. 52–60 (in Russian); English translation in Probl. Inform. Transm., pp. 132–138, 1974.
[10] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, D. A. Spielman, and V. Stemann, "Practical loss-resilient codes," in Proc. 29th Annu. ACM Symp. Theory of Computing (STOC'97). ACM, 1997, pp. 150–159.
[11] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1977.
[12] A. J. McAuley, "Reliable broadband communication using a burst erasure correcting code," in Proc. SIGCOMM'90. Philadelphia, PA: ACM, 1990, pp. 287–306.
[13] M. O. Rabin, "Efficient dispersal of information for security, load balancing, and fault tolerance," J. Assoc. Comput. Mach., vol. 36, no. 2, pp. 335–348, 1989.
[14] D. Spielman, "Linear-time encodable and decodable error-correcting codes," IEEE Trans. Inform. Theory, vol. 42, pp. 1723–1731, Nov. 1996.
[15] R. Urbanke and A. Wyner, "Packetizing for the erasure broadcast channel with an Internet application," preprint, 1997.
`
`Asymptotically Good Codes Correcting
`Insertions, Deletions, and Transpositions
`Leonard J. Schulman and David Zuckerman
`
`Abstract—We present simple, polynomial-time encodable and decodable
`codes which are asymptotically good for channels allowing insertions,
`deletions, and transpositions. As a corollary, they achieve exponential
`error probability in a stochastic model of insertion–deletion.
`Index Terms— Asymptotically good, asynchronous communication,
`deletion, edit distance, error-correcting codes, insertion, transposition.
`
I. INTRODUCTION
In an asynchronous noisy channel, characters of the received message are not definitively identified with antecedents in the transmitted message. We describe a code which allows for correction of data modified in the following ways:
Insertion and deletion of characters. (Note that this implies also alteration of characters.)
`Manuscript received April 13, 1997; revised October 16, 1998. This work
`was supported in part by NSF NYI under Grant CCR-9457799, a David and
`Lucile Packard Fellowship for Science and Engineering, and an Alfred P.
`Sloan Research Fellowship.
`L. J. Schulman is with the College of Computing, Georgia Institute of Tech-
`nology, Atlanta, GA 30332-0280 USA (e-mail: schulman@cc.gatech.edu).
`D. Zuckerman is with the Computer Science Division, University of Cali-
`fornia at Berkeley, Berkeley, CA 9472