The Complexity of Error-Correcting Codes

Daniel A. Spielman

Massachusetts Institute of Technology
Abstract. By concatenating linear-time codes with small, good codes, it is possible to construct in polynomial time a family of asymptotically good codes that approach the Shannon bound and can be encoded and decoded in linear time. Moreover, their probability of decoder error is exponentially small in the block length of the codes. In this survey, we will explain exactly what this statement means, how it is derived, and what problems in the complexity of error-correcting codes remain open. Along the way, we will survey some key developments in the complexity of error-correcting codes.

1 Introduction

Error-correcting codes are the means by which we compensate for the corruption that occurs in communication over imperfect channels. In a coding system, a message first passes through an encoder, which transforms it into a codeword; this codeword is then transmitted over the channel. The channel modifies the codeword by adding noise, so that the word received by the receiver may differ from the codeword that was transmitted. The received word is processed by a decoder, which uses the received word to guess which codeword was transmitted, and outputs its guess. Much of the research on error-correcting codes is devoted to improving the trade-off between the probability that the decoder's guess is correct and the complexity of the encoders and decoders.
In this survey, we examine the software complexity¹ of this problem from an asymptotic perspective. We will restrict our attention to communication over the binary symmetric channel (BSC), as it seems natural to most computer scientists.² A channel is binary if it allows the transmission of only two symbols, which we take to be 0 and 1. The binary symmetric channel with error probability p (BSC_p) is the channel that transmits one bit at a time and flips the value of that bit with probability p. If it does not flip a bit, then it transmits the bit correctly. For each bit transmitted by the channel, the probability that it is flipped is independent of the others.

¹ We should point out that, at the present time, almost every implementation of error-correcting codes uses special-purpose hardware. Moreover, coding schemes that are efficient in software can be inefficient in hardware, and vice versa.
² Results similar to those in this survey may be obtained for any reasonable memoryless channel.
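To make the channel model concrete, the following small Python sketch simulates transmission of a word over the BSC_p. The function name transmit_bsc and the use of Python's random module are illustrative choices of ours, not part of any coding standard.

```python
import random

def transmit_bsc(codeword, p):
    """Simulate the binary symmetric channel BSC_p: each bit of the
    codeword (a list of 0s and 1s) is flipped independently with
    probability p."""
    return [(bit ^ 1) if random.random() < p else bit for bit in codeword]

# Example: send an 8-bit word over a channel with error probability 0.1.
received = transmit_bsc([0, 1, 1, 0, 0, 1, 0, 1], 0.1)
```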
In the rest of this introduction, we will review the definitions needed to describe the types of error-correcting codes we will use and then state the complexity and coding problems we will consider. We will conclude the introduction with an overview of the rest of the paper.

The framework presented in this survey builds on those developed in [4] and [29, 30]. Most of the material in Sections 2 and 3 can be found in those papers. The material in Section 2 is standard, and we refer the reader to a reference such as [21], [40], or [3] for a more thorough treatment. We also recommend the recent survey by Vardy [41]. For references on error-correcting codes that emphasize an engineer's perspective, we recommend [26], [42], and [6].

1.1 Coding Definitions

A code is an injective mapping from strings to strings. We will consider families of block codes over the alphabet {0, 1}. A binary block code is one whose messages are strings in {0, 1}^m and whose codewords are strings in {0, 1}^n, for some n ≥ m; the length of the code is n, and its rate is m/n. The strings of {0, 1}^m are the possible messages, and we assume that each is equally likely.³ Each image of a string from {0, 1}^m is a codeword, and the word "code" is sometimes used to refer just to the set of codewords. An encoder maps a message to its codeword, and a decoder maps a word in {0, 1}^n to either a codeword or a message indicating that it cannot decide on a codeword.

A family of codes can be defined as an infinite sequence of codes, each of a different length, indexed by their length. A code constructor for a family of codes is a device that takes as input a block length, and outputs a description of an encoder and a decoder for the code in the family of that length. We do not insist that this decoder be the best possible decoder for the resulting code, as we will measure the quality of the coding system by the number of errors that the decoder can actually correct. By measuring the complexity of the code constructor, we allow ourselves to consider the non-uniform complexity of the encoder and decoder. We separate the complexity of constructing the encoders and decoders from the complexities of these devices themselves because this is how they are used: the same encoders and decoders are used many times to transmit messages of the same block length. In particular, a long message is usually broken up into a sequence of blocks which are then transmitted separately. It would be reasonable to change the definition of a family of codes so that it allows additional parameters, such as the rate of the code and perhaps a quality parameter; in this case, the complexity of the constructor should also be given in terms of these parameters.

³ The task of adjusting one's message space so that each message is equally likely is that of compression. Compression schemes need to take advantage of the special structure of the data to which they are applied. We would like to avoid such concerns. In situations in which error-free communication is desired, it is reasonable to assume that data has been compressed before it is encoded with an error-correcting code.

Systematic codes are a particularly convenient class of codes. In a systematic code, the message appears as a substring of the codeword. Thus, we can divide the bits of a codeword into message bits and check bits, where the message bits contain the message and the encoder need only compute the check bits from the message bits. All the codes that we consider will be linear codes, and we will see that all such codes can easily be made systematic. Linear codes are codes in which the alphabet is a field, such as the field GF(2), and the encoding function is a linear map. Thus, the codewords form a vector space.

1.2 Coding Problems

Intuitively, if an error-correcting code is going to provide strong protection against errors, then most codewords should be far from every other codeword. Our notion of distance between two words is the Hamming distance: the number of bits in which the words differ. The decoding algorithm for transmission over the BSC that achieves the minimal probability of error will map a received word to the codeword to which it is closest. This approach is known as maximum likelihood decoding, as it selects the codeword that maximizes the probability that the channel will output the received word. If there are codewords w1 and w2 that differ in only k places, and if w1 is transmitted and more than k/2 of those bits are flipped, then the optimal decoder will return w2. On a BSC with error probability p, the probability of this happening is at least

    \binom{k}{(k+1)/2} p^{(k+1)/2} (1-p)^{(k-1)/2},

for k odd.
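As an illustration of maximum likelihood decoding over the BSC, the following Python sketch performs the brute-force version: it simply returns the codeword nearest in Hamming distance to the received word. The helper names are our own, and the exponential-time enumeration over all codewords is exactly the inefficiency discussed later in this survey.

```python
def hamming_distance(u, v):
    """Number of positions in which the words u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def ml_decode(received, codewords):
    """Maximum likelihood decoding over the BSC: return the codeword
    closest to the received word in Hamming distance.  Exponential time,
    since every codeword is examined."""
    return min(codewords, key=lambda c: hamming_distance(received, c))
```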

This observation led to the definition of the minimum distance of a code: the minimum distance achieved by a pair of distinct codewords in the code.
If w1, w2, and v are codewords in a linear code, then

    d(w1, w2) = d(w1 - v, w2 - v) = d(0, w2 - w1),

and w2 - w1 is a codeword; so, the minimum distance of the code is equal to the minimum weight of a non-zero codeword, where the weight of a codeword is equal to its distance from the all-0 word. It is possible to construct families of codes in which the rate remains constant while the minimum distance grows linearly with the block length. Such codes are called asymptotically good, and we measure their minimum relative distance: their minimum distance divided by their block length.
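The equivalence of minimum distance and minimum non-zero codeword weight suggests the obvious (and, as discussed below, essentially the best known) way to compute the minimum distance of a small linear code. The following Python sketch, with names of our own choosing, enumerates all messages against a generator matrix.

```python
from itertools import product

def minimum_distance(generator_rows):
    """Minimum distance of a binary linear code, computed as the minimum
    weight of a non-zero codeword.  generator_rows is a list of generator
    matrix rows, each a list of 0s and 1s.  Exponential in the number of
    rows, so only usable for very small codes."""
    k = len(generator_rows)
    n = len(generator_rows[0])
    best = n
    for message in product([0, 1], repeat=k):
        if not any(message):
            continue                          # skip the all-zero message
        codeword = [sum(m * row[j] for m, row in zip(message, generator_rows)) % 2
                    for j in range(n)]
        best = min(best, sum(codeword))
    return best
```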
Sufficiently long codes from an asymptotically good family can provide excellent error-protection: the maximum likelihood decoder can correct any number of errors up to half the minimum distance. Moreover, if one uses an asymptotically good code to transmit over a channel whose error probability is less than the relative minimum distance of the code, then the probability of decoding error decreases exponentially in the block length of the code. Since the rate of the code remains constant, the communication overhead of the code does not change; one just divides the communication into longer blocks. Of course, the computation time of most decoders will increase with the block length. These observations lead to a natural complexity of coding problem: to design asymptotically good codes that have fast encoding, decoding, and constructing algorithms, where the complexity is measured in terms of the number of message bits in the codes. In this case, we consider the number of errors that the proposed decoder can correct rather than the number of errors that the maximum likelihood decoder can correct, as this provides a more accurate measure of the quality of the resulting coding system. Of course, one should consider the tradeoff between the rate and minimum distance of the codes as well, although this affects the length of the transmission more than the complexity of the algorithms. Bassalygo, Zyablov, and Pinsker [4] approached versions of this problem using random linear codes, Reed-Solomon codes, and some graph-theoretic codes (from [12]).

A more natural problem is to measure the average-case performance of a coding system. Rather than measure the maximum number of errors that the decoder can correct, we will measure the number of errors that the decoder can usually correct when the errors are chosen at random.

In his paper, "A Mathematical Theory of Communication", Shannon [32] proved that every channel has a fixed capacity, which is a rate beyond which it is not possible to communicate reliably. Moreover, he demonstrated that for every rate less than the capacity of the channel, it is possible to transmit information with arbitrarily small error if one uses codes of sufficiently long block length. Thus, we are presented with a natural complexity problem, first posed by Savage [31]: given a binary symmetric channel with error probability p, and an error-tolerance ε, find the coding system⁴ of least complexity that enables one to communicate over the channel so that the probability that the output of the decoder is incorrect is at most ε.

Neither of the previous problem statements has compared the rate of the code produced with the optimal rate possible for the channel. The rate of the code is important, as it dictates how much redundancy occurs in the transmission. As the time required to transmit the message across the channel may dominate the time of the computation, such factors cannot be ignored. Moreover, part of the channel may be a network, in which case an unnecessarily long transmission may slow the transmissions of others using the network. Thus, our analyses should include a comparison of the rate of our codes with the best rate possible. The existence of the Shannon bound makes this reasonable in the analysis of the average-case performance of a coding system. Shannon's bound states that the capacity of the BSC_p is 1 - H(p), where H(x) = -x log2 x - (1 - x) log2(1 - x) is the binary entropy function. It is trickier to examine the tradeoff between the rate and minimum relative distance of a code because the best tradeoff achieved by known codes, the Gilbert-Varshamov bound, does not meet the best known upper bound, the Linear Programming Bound [24]. Most linear codes meet the Gilbert-Varshamov bound, which means that their rate r and relative minimum distance δ satisfy r ≥ 1 - H(δ). As we are unaware of any recent advances in the construction of binary codes that approach the Gilbert-Varshamov bound, we will not have to worry about this problem in this survey.
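For quick numerical reference, the following short Python sketch evaluates the binary entropy function, the capacity 1 - H(p) of the BSC_p, and the rate guaranteed by the Gilbert-Varshamov bound at a given relative distance. The helper names are our own.

```python
from math import log2

def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def bsc_capacity(p):
    """Shannon capacity of the binary symmetric channel BSC_p."""
    return 1 - binary_entropy(p)

def gilbert_varshamov_rate(delta):
    """Rate achievable at relative minimum distance delta by the
    Gilbert-Varshamov bound: 1 - H(delta)."""
    return 1 - binary_entropy(delta)
```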

All the complexity statements in this paper are designed to hold for the unit-cost model of a RAM (see [1]). We have verified that the statements of Section 5 also hold for Pointer Machines (see [16]) and, with minor modifications of the coding system such as those outlined in [34], for the log-cost model of a RAM [1] as well.

1.3 Overview

In Section 2, we examine the random linear codes introduced by Elias [7] and point out that they meet both the Gilbert-Varshamov and Shannon bounds. We then consider the Wozencraft ensemble, a smaller family of linear codes with similar performance, and show that its complexity is polynomially related to its probability of decoder error. In Section 3, we introduce Forney's [9] method of concatenating codes to obtain low-complexity codes that approach the Shannon bound and have a probability of decoder error that is almost exponential in their complexity. Section 4 provides a very brief overview of constructions of codes based on sparse bipartite graphs. These have been used to obtain coding systems in which the probability of decoder error is exponential in the complexity of encoding and decoding. In Section 5, we concatenate the codes of Section 2 with the codes of Section 4 to obtain codes that do meet the Shannon bound and have a probability of decoder error that is exponential in the complexity of encoding and decoding. We conclude by pointing out that we have not analyzed the complexities of these codes in terms of how much computation is required to obtain a given probability of decoder error at a given distance from the Shannon bound. We argue that this is an important measure of the performance of a coding system, and that more work on this problem is needed.

⁴ Actually, Savage just considered the complexity of the decoder.

2 Random Linear Codes

A linear code is a subspace of GF(2)^n. One can show that, with high probability, a linear code chosen uniformly at random meets the Gilbert-Varshamov bound. Similarly, for any rate r < 1 - H(p), most linear codes of rate r enable the maximum likelihood decoder to achieve an error probability that decreases exponentially in the block length when communicating over the BSC_p.

A linear code is usually described by its generator matrix or its check matrix. The generator matrix of a linear code of length n and rate r is a matrix over GF(2) with rn rows and n columns such that the words of the code are precisely the linear combinations of the rows of the matrix. The check matrix of such a code is a matrix of rank (1 - r)n with (1 - r)n rows and n columns such that the codewords are exactly those words that have inner product zero with each row of this matrix. The check matrix is sometimes called the parity check matrix, since each row indicates a subset of the message bits whose parity must be zero in order for a word to be a codeword. Note that elementary row operations do not change the code defined by a generator or check matrix. Since the choices for these matrices are not unique, one should presume that some choices will lead to simpler representations of the code than others.
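The two descriptions can be illustrated directly in Python with numpy arrays over GF(2); the function names below are our own, and the exhaustive enumeration is meant only to show the definitions, not to be efficient.

```python
import numpy as np

def codewords_from_generator(G):
    """All codewords of the binary linear code with generator matrix G
    (an rn-by-n 0/1 numpy array): every GF(2) linear combination of rows."""
    k, n = G.shape
    words = []
    for m in range(2 ** k):
        message = np.array([(m >> i) & 1 for i in range(k)])
        words.append(message @ G % 2)
    return words

def is_codeword(H, w):
    """A word w is a codeword iff it has inner product 0 with every row of
    the check matrix H (all arithmetic over GF(2))."""
    return not np.any(H @ w % 2)
```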
Encoding: By performing row operations on the check matrix of a linear code, one can obtain a check matrix that has the (1-r)n-by-(1-r)n identity matrix as a submatrix. Using this check matrix, one can see that a linear code can be made systematic: the bits corresponding to the columns in the identity matrix can be labeled the check bits, and the other rn bits can be used as the message bits. For any setting of the rn message bits, there is exactly one setting of the check bits that produces a word that has inner product 0 with each row of this check matrix; moreover, these check bits can be computed by multiplying the vector of rn message bits by the (1-r)n-by-rn submatrix of the check matrix corresponding to the columns of the message bits. Thus, the code constructor could construct⁵ a matrix from which the encoder could encode a message in quadratic time.
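A sketch of this systematic encoding, with helper names of our own and the simplifying assumption that the last (1-r)n columns of the check matrix already form an invertible submatrix (otherwise columns would first have to be permuted), is shown below.

```python
import numpy as np

def systematic_form(H):
    """Row-reduce a full-rank check matrix H over GF(2) so that its last
    (1-r)n columns form an identity matrix.  Assumes those columns form an
    invertible submatrix."""
    H = H.copy() % 2
    rows, n = H.shape
    for i in range(rows):
        col = n - rows + i
        pivot = next(j for j in range(i, rows) if H[j, col])
        H[[i, pivot]] = H[[pivot, i]]         # bring a 1 into the pivot spot
        for j in range(rows):
            if j != i and H[j, col]:
                H[j] ^= H[i]                  # clear the column elsewhere
    return H

def systematic_encode(H_sys, message):
    """The check bits are the product of the rn message bits with the
    submatrix of the systematic check matrix over the message columns."""
    rows, n = H_sys.shape
    A = H_sys[:, : n - rows]                  # columns of the message bits
    checks = A @ np.array(message) % 2
    return list(message) + list(checks)
```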

Code Construction: If we choose a linear code at random, it is very unlikely that its performance will be much worse than that of most linear codes. However, we know of no efficient algorithm for certifying that a particular linear code will perform well, even assuming that we are going to use a maximum likelihood decoder. In general, computing the minimum distance of a binary code is NP-hard (see [41]), and we see no reason that it should be easier for a randomly chosen code. In fact, we know of no algorithm for approximating the minimum distance of a linear code that is substantially faster than enumerating all low-weight words and then checking whether they are codewords. However, we can find explicit constructions of linear codes that meet the Gilbert-Varshamov bound in less time than that taken by the naive algorithm that enumerates all linear codes. An algorithm for doing this was presented in [4]. We will present a different approach in Section 2.1.
Decoding: The best known decoding algorithms for arbitrary linear codes require time exponential in the lengths of the codewords. One algorithm is to enumerate all codewords, compute the distance of each from the received word, and output a word closest to the received word.

If one is willing to make a decoder that uses a lot of space, then one can speed up this algorithm by constructing a giant look-up table indexed by every possible received word. The entry corresponding to a received word would be the codeword closest to it. This results in a table with 2^n entries.

To obtain a smaller but almost as useful table, one can use syndrome decoding. The syndrome of a received word w is the vector obtained by computing the inner product of w with each row of the check matrix. Note that an error pattern produces the same syndrome regardless of the codeword to which it is added. Thus, if one associates to each syndrome a minimal weight error-pattern that produces this syndrome, then one can perform maximum-likelihood decoding of a linear code by mapping each received word to the codeword obtained by adding the error pattern associated with the received word's syndrome to the received word. Such a list of error patterns associated with syndromes has 2^{(1-r)n} entries of length at most n, and can be produced in time at most 2^n n^2 by enumerating all possible error-patterns. If one just desires an exponentially decreasing probability of decoding error when using a BSC with error-probability p, then one can speed up this algorithm by only enumerating over error-patterns of weight at most (p + ε)n, for some small ε > 0.

⁵ A naive algorithm will produce this matrix in time O(n^3). An asymptotic advantage can be obtained by observing that the time required is asymptotically the same as the time required for matrix multiplication (see [15] or [5, Section 16.4]).
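The table-building and table-lookup steps of syndrome decoding can be sketched as follows in Python; the helper names are ours, and enumerating error patterns in increasing weight guarantees that each syndrome keeps a minimum-weight representative.

```python
import numpy as np
from itertools import combinations

def build_syndrome_table(H, max_weight):
    """Map each syndrome (as a tuple) to a minimum-weight error pattern
    producing it, enumerating error patterns of weight at most max_weight."""
    rows, n = H.shape
    table = {tuple([0] * rows): np.zeros(n, dtype=int)}
    for w in range(1, max_weight + 1):
        for positions in combinations(range(n), w):
            e = np.zeros(n, dtype=int)
            e[list(positions)] = 1
            s = tuple(map(int, H @ e % 2))
            table.setdefault(s, e)            # keep the lowest-weight pattern
    return table

def syndrome_decode(H, table, received):
    """Add (XOR) the stored error pattern for the received word's syndrome;
    return None if the syndrome is not in the table."""
    s = tuple(map(int, H @ np.array(received) % 2))
    if s not in table:
        return None
    return (np.array(received) ^ table[s]).tolist()
```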

Using syndrome decoding, one can check whether a given code meets the Shannon bound in time O(2^{(1-r+ε)n}), for any ε > 0. First, construct the syndrome table for all error-patterns of weight at most (p + ε')n in time 2^{(1-r+ε)n} n^2. Then, for each error pattern of weight at most (p + ε')n, compute its syndrome and check whether the syndrome decoding algorithm would decode it to the 0 word. Using this information, one can estimate the probability of decoding error of the syndrome decoding algorithm.

2.1 The Wozencraft Ensemble

The Wozencraft ensemble⁶ is a set of codes much smaller than the set of linear codes, but from which a randomly chosen code has an error tolerance similar to that of a linear code chosen uniformly at random. The property of the Wozencraft ensemble that gives its codes their power is that every word appears as a codeword in an equal number of codes in the ensemble. The proof that this condition suffices is almost identical to the proofs that a random linear code meets the Gilbert-Varshamov and Shannon bounds.

Theorem 1. Let S be a set of codes of length n and rate r such that |{C ∈ S : w ∈ C}| does not depend on w. Then,

1. a code chosen at random from S probably meets the Gilbert-Varshamov bound, and
2. a code chosen at random from S probably approaches the Shannon bound.

⁶ The Wozencraft ensemble was presented by Massey [23] and attributed to Wozencraft.

Proof. [Sketch] The probability that a randomly chosen code from S contains a given non-zero word is 2^{-(1-r)n}. Thus, the probability that a randomly chosen code from S contains a non-zero codeword of weight less than δn is at most

    2^{-(1-r)n} \sum_{i=1}^{\delta n} \binom{n}{i},

which becomes less than any constant as n grows large, provided that r ≤ 1 - H(δ).

To prove that a code chosen at random from S probably meets the Shannon bound, we assume, without loss of generality, that the 0 word is transmitted over the BSC_p. The expected number of errors produced by the channel is pn, and Chebyshev's inequality implies that the probability that it produces more than pn + \sqrt{cn} errors is less than 1/c, for any c ≥ 1. Assuming that at most pn + \sqrt{cn} errors are produced by the channel, the probability that there is another codeword closer than the 0 word to the received word is at most

    2^{-(1-r)n} \sum_{i=1}^{pn + \sqrt{cn}} \binom{n}{i},

which becomes less than any constant as n grows large, provided that r < 1 - H(p). Thus, the bound is obtained by letting c grow large and then letting n grow much more quickly.

The 1/2-rate codes of length n from the Wozencraft ensemble can be indexed by elements of the field GF(2^{n/2}). For each element α ∈ GF(2^{n/2}), the code W_α will consist of the binary representations of all pairs (x, αx), where x ∈ GF(2^{n/2}), and by αx we mean α multiplied by x. Thus, x represents the message bits, and αx its check bits. It is clear that each word appears as a codeword exactly once; in fact, the word (x, y) appears in the code W_{y/x}. To produce codes of shorter length, we can puncture the code by ignoring some check bits. If we ignore c check bits, then each word will appear as a codeword 2^c times. To produce longer codes,⁷ we can append more check bits by constructions such as setting W_{α,β} to be the set of words of the form (x, αx, βx), where α and β are chosen from GF(2^{n/2}). A simple analysis shows that these codes also meet the requirements of Theorem 1.

⁷ These shorter and longer codes are different from those described by Wozencraft. We have presented them because they are easier to describe in this framework. If we use Wozencraft's original codes, we should be able to replace the r' with r in what follows.
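A small Python sketch of a codeword of W_α follows, using the standard shift-and-reduce multiplication in GF(2^k); the function names and the toy field GF(2^4) with irreducible polynomial x^4 + x + 1 are our own illustrative choices.

```python
def gf2k_mul(a, b, k, modulus):
    """Multiply two elements of GF(2^k), represented as integers, modulo
    the given irreducible polynomial (also an integer bit-pattern)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> k:                 # reduce when the degree reaches k
            a ^= modulus
    return result

def wozencraft_codeword(alpha, x, k, modulus):
    """Codeword (x, alpha * x) of the rate-1/2 code W_alpha of length 2k."""
    ax = gf2k_mul(alpha, x, k, modulus)
    bits = lambda v: [(v >> i) & 1 for i in range(k)]
    return bits(x) + bits(ax)

# Example over GF(2^4) with modulus x^4 + x + 1 (binary 10011):
cw = wozencraft_codeword(alpha=7, x=5, k=4, modulus=0b10011)
```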
Encoding: The codes can be encoded using Discrete Fourier Transform based algorithms for multiplication over finite fields (see [27] and [1]). These algorithms take time O(n log n).

Decoding: We would decode these codes with the syndrome decoding algorithm presented in Section 2. This algorithm takes time O(n^2) and space O(2^{(1-r)n} n), provided that the code constructor spends time O(2^{(1-r+ε)n} n^2), for ε > 0, to build the syndrome table.
Construction: One can find a code in these ensembles that meets the Gilbert-Varshamov bound in time O(2^{(1-r')n} n^2), where r' is the greatest reciprocal of an integer that is less than r. To do this, enumerate all words of weight at most δn, where H(δ) = 1 - r'. For each of these words, determine the codes in which it appears. Then, choose a code in which none of these words appeared.

Finding a code in these ensembles that meets the Shannon bound does not seem to be as easy. One approach would be to enumerate over the codes, and then choose one that meets the Shannon bound. If we test the quality of a code using the syndrome decoding approach outlined at the end of Section 2, then this algorithm takes time 2^{(2-r')n} n, where r' is the greatest reciprocal of an integer that is less than r.

3 Concatenation of Codes

Forney [9] introduced the method of concatenating codes to obtain low-complexity codes that approach the Shannon bound. Concatenation is a means of combining two codes, referred to as the inner code and the outer code. We'll assume that the inner code is a binary code with m1 message bits, rate r1, and length n1. The outer code will be a code over an alphabet with at most 2^{m1} symbols. If the outer code has length n2 and rate r2, then the concatenation of the inner code with the outer code will be a binary code of rate r1r2 with r2 n2 m1 message bits and length n1 n2.

To encode the concatenated code, one collects the message bits into r2 n2 symbols in the alphabet of the outer code. These symbols are treated as message symbols and encoded using the encoding algorithm for the outer code, resulting in n2 symbols. Each of these n2 symbols of m1 bits each is then treated as a set of message bits and encoded using the inner code. The code is then decoded in the reverse order. Thus, the time required to encode (decode) the concatenated code is the sum of the time required to encode (decode) the outer code and n2 times the time required to encode (decode) the inner code. The idea behind the construction is that the length of the inner code should be small so that its complexity is small relative to n2. On the other hand, the outer code should be a code for which correction of errors in its symbols is very simple.
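The encode-then-encode and decode-then-decode structure can be sketched as follows; outer_encode, inner_encode, inner_decode, and outer_decode are hypothetical callables standing in for whatever outer and inner codes are chosen.

```python
def concat_encode(message_symbols, outer_encode, inner_encode):
    """Encode with a concatenated code: the outer encoder maps r2*n2 message
    symbols to n2 symbols, and each outer symbol (viewed as m1 message bits)
    is then encoded with the inner binary code."""
    outer_codeword = outer_encode(message_symbols)       # n2 symbols
    return [bit for symbol in outer_codeword for bit in inner_encode(symbol)]

def concat_decode(received_bits, n1, inner_decode, outer_decode):
    """Decode in the reverse order: split the received word into n2 blocks
    of n1 bits, decode each with the inner decoder to recover (possibly
    wrong) outer symbols, then run the outer decoder on that sequence."""
    blocks = [received_bits[i:i + n1] for i in range(0, len(received_bits), n1)]
    return outer_decode([inner_decode(block) for block in blocks])
```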

As an outer code, Forney suggested using a Reed-Solomon code. Reed-Solomon codes can be built with any field as their alphabet, and have length one less than the number of elements in the field. Using algorithms based on the Discrete Fourier Transform, Reed-Solomon codes can be encoded in time O(n log n log log n) and decoded in time O(n log^2 n log log n).⁸ Using a new result of Pan [25], one can decode Reed-Solomon codes over prime fields in time O(n log n log log n). The decoding algorithm for a Reed-Solomon code of length n and rate r can correct up to (n - rn)/2 errors, which is the best one could hope for.

We now sketch how one can use concatenated codes to approach the Shannon bound with low complexity. For any p, find a binary code C1 of rate r1 < 1 - H(p) and length n1 such that the probability of decoding error using the maximum likelihood decoder is at most ε. Concatenate this binary code with a Reed-Solomon code C2 of rate 1 - 3ε and length n2 over the field GF(2^{r1 n1}). If fewer than a 3ε/2 fraction of the inner code decoding operations fail, then the decoder for the Reed-Solomon code will correctly decode the entire codeword. One can use a Chernoff bound to show that the probability that at least a 3ε/2 fraction of the inner code decoding operations fail is exponentially small in n2. As the rate of the concatenated code is (1 - 3ε)r1, we can decrease ε to obtain codes that approach the Shannon bound. A more careful analysis reveals that, for any rate r < 1 - H(p) and for sufficiently long block lengths,⁹ the probability that an inner code decoding operation fails will be c1^{-n1} for some c1 > 1. Thus, the probability that the concatenated decoder will fail will be c2^{-n1 n2} for some c2 > 1.
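To make the Chernoff-bound step explicit, here is a standard multiplicative form of the bound, stated in our notation under the simplifying assumption that each of the n2 inner decodings fails independently with probability at most ε:

```latex
% Let X be the number of inner decoding failures, so E[X] <= \epsilon n_2.
% The multiplicative Chernoff bound with deviation \delta = 1/2 gives
\Pr\!\left[ X \ge \tfrac{3}{2}\,\epsilon n_2 \right]
    \;\le\; \exp\!\left( -\tfrac{\epsilon n_2}{12} \right),
% which is exponentially small in n_2, as claimed.
```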

Encoding: If we use a code from the Wozencraft ensemble as the inner code, then the total time of the inner code encoding operations will be O(n2 n1 log n1). As r1 n1 = ⌈log n2⌉, the encoding time will be dominated by the encoding time of the outer Reed-Solomon code.

⁸ See [14], [28], [1], [38] and [5, Chapter 3, Sections 1 and 2].
⁹ Note that we must have \sqrt{p/n1} < 1 - H(p) - r before this analysis becomes applicable. Without this restriction, there is a constant probability that the number of errors will be more than the inner decoder can handle.
Decoding: Assuming that the code constructor has created look-up tables to aid the decoding of the inner code, the decoding time of the concatenated code will be dominated by the time required to encode and decode the outer Reed-Solomon code. For sufficiently large m1, we can find primes close to and less than 2^{m1}, so we can actually use a Reed-Solomon code over a prime field as the outer code and make use of Pan's [25] improved decoding algorithm.
Construction: As the total block length is n = n1 n2 and r1 n1 = ⌈log n2⌉, the discussion in Section 2.1 yields an algorithm for choosing the inner code that takes time O(n2^{(2-r')/r'}), where r' is the greatest reciprocal of an integer such that r' < r. This factor will dominate, as it is larger than the time required to build the decoder and the time required to describe the Reed-Solomon codes.

While concatenation allows us to construct codes with good average-case performance, it does not seem to help much with worst-case performance. It would be difficult to construct new codes that beat the Gilbert-Varshamov bound by concatenation: if the inner code has minimum relative distance δ1 and the outer code has minimum relative distance δ2, then their concatenation could have minimum relative distance δ1δ2.

3.1 Justesen Codes

Justesen [13] was the first to find an explicit construction of an asymptotically good family of error-correcting codes. Exactly what is meant by explicit construction is not completely clear, but it certainly excludes any construction that involves a search over small codes. Ideally, it should mean that there is a "closed form" description of the code. One might like to formalize the notion of an explicit construction by defining it in terms of the complexity of the code constructor. However, no such definition will avoid the possibility of performing a constant-size search.

Justesen's main idea was to concatenate an outer Reed-Solomon code with many different inner codes. That is, instead of using just one inner code, he would use a different inner code for each outer code symbol. In particular, he used a Reed-Solomon code constructed over a field of the form GF(2^l) and the 1/2-rate Wozencraft ensemble indexed by elements of GF(2^l) as the inner codes. The previous analysis proving that concatenated codes can be asymptotically good can easily be modified to show that the concatenated code will be asymptotically good provided that almost all of the inner codes are asymptotically good. While it was not Justesen's main objective, one can also show that his codes approach the Shannon bound for rate 1/2 by using as an outer code a Reed-Solomon code of rate 1 - ε, and letting ε slowly go to zero as the block length grows large.
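A simplified sketch of the "different inner code per outer symbol" idea, building on the wozencraft_codeword helper sketched in Section 2.1 above, is given below. The choice alpha_i = i + 1 is our own simplification (it assumes the outer length is less than 2^k so that every index is a valid non-zero field element); the real construction indexes the inner codes by field elements in a fixed order.

```python
def justesen_encode(outer_codeword, k, modulus):
    """Justesen-style encoding sketch: the i-th outer Reed-Solomon symbol,
    viewed as an element of GF(2^k), is encoded with the rate-1/2
    Wozencraft code W_{alpha_i}, using a different index for each position.
    Relies on wozencraft_codeword from the sketch in Section 2.1."""
    bits = []
    for i, x in enumerate(outer_codeword):
        alpha = i + 1                     # a distinct inner code per symbol
        bits += wozencraft_codeword(alpha, x, k, modulus)
    return bits
```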

4 Graph Theoretic Codes

A major advance in the development of low-complexity codes was introduced in Gallager's [11] thesis, "Low Density Parity-Check Codes". This led to the development of codes that we will refer to as graph-theoretic codes. We do not have the time or space to provide a good explanation of the development of graph-theoretic coding; so, we will just mention a few of the points that are relevant to this work.

Gallager [11] defined a family of linear codes by setting their check matrix to be the bipartite adjacency matrix of a high-girth sparse bipartite graph. If this graph is chosen at random and the codes are decoded by a maximum likelihood decoder, then they will approach the Shannon bound. Gallager proved that there was a linear-time algorithm for decoding these codes that would correct some constant fraction of randomly chosen errors with probability 1 - 2^{-Ω(√n)}. As we are now beginning to use codes of sufficiently long block lengths to make Gallager's codes useful, they have become the subject of many recent papers (see, for example, [10, 8, 20, 43, 44]). Tanner [39] presented an important generalization of Gallager's construction, along with generalizations of his decoding algorithms.
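To give a feel for the low-complexity decoders these constructions admit, the following Python sketch implements a simple bit-flipping decoder for a code given by a sparse check matrix. It is in the spirit of the Gallager-style decoders discussed above, not his exact algorithm; the function name and the flipping rule (flip a bit if it lies in more violated checks than satisfied ones) are our own illustrative choices.

```python
import numpy as np

def bit_flip_decode(H, received, max_rounds=100):
    """Bit-flipping decoding for a code defined by a sparse check matrix H
    (a 0/1 numpy array): while some parity check is violated, flip every
    bit that lies in more violated checks than satisfied ones."""
    word = np.array(received)
    degree = H.sum(axis=0)                    # number of checks per bit
    for _ in range(max_rounds):
        violated = H @ word % 2               # 1 for each unsatisfied check
        if not violated.any():
            return word.tolist()              # all checks satisfied
        unsat_per_bit = H.T @ violated
        flip = unsat_per_bit > degree - unsat_per_bit
        if not flip.any():
            break                             # decoder is stuck
        word = (word + flip.astype(int)) % 2
    return None                               # report decoding failure
```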

Zyablov and Pinsker [45] showed that, with high probability over the choice of graph, Gallager's decoder would always correct some constant fraction of errors, regardless of where they appeared. Sipser and