Good Error-Correcting Codes based on Very Sparse Matrices

David J. C. MacKay
Cavendish Laboratory, Cambridge, CB3 0HE,
United Kingdom. mackay@mrao.cam.ac.uk

Submitted to IEEE-IT, June.

Abstract

We present two families of error-correcting codes defined in terms of very sparse matrices. 'MN codes' are new, and 'GL codes' were first investigated by Gallager in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical belief propagation algorithm.

We prove that these codes are 'very good', in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary symmetric channel but also for any channel with symmetric stationary ergodic noise.

We give experimental results for binary symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed the performance of GL codes is almost as close to the Shannon limit as that of Turbo codes.

Keywords

Error-correction codes, iterative probabilistic decoding, Shannon limit.

1 Introduction

For a glossary of symbols used in this paper, please see appendix A.

1.1 Background

In 1948, Shannon proved that for any channel there exist block codes that achieve arbitrarily small probability of error at any communication rate up to the capacity of the channel. We will refer to such code families as 'very good' codes. By 'good' codes we mean code families that achieve arbitrarily small probability of error at non-zero communication rates up to some maximum rate that may be less than the capacity of the given channel. By 'bad' codes we mean code families that can only achieve arbitrarily small probability of error by decreasing the information rate to zero. (Bad codes are not necessarily useless for practical purposes.) By 'practical' codes we mean code families which can be encoded and decoded in time and space polynomial in the block length.

Shannon's proof was non-constructive and employed random codes for which there is no practical encoding or decoding algorithm. Since 1948 it has been proved that there exist very good linear codes (non-constructively); that there exist very good cyclic codes (non-constructively); that there exist very good codes that have structure (non-constructively); and that very good codes can be produced with a short description in terms of permutations. But no practical decoding algorithm is known for any of these codes, and it is known that the general linear decoding problem
(find the maximum likelihood s in the equation G^T s + n = r) is NP-complete. Convolutional codes (which can be viewed as block codes with memory) can approach the Shannon limit as their constraint length increases, but the complexity of their decoding grows exponentially with the constraint length. For a long time a generally held view was that, for practical purposes, a channel's effective capacity was a rate R_0 smaller than the Shannon capacity if convolutional codes were used; and many believed this conjecture applied to all codes, speculating that practical communication beyond R_0 was impossible. Forney proved that there do exist very good 'concatenated' codes that are practical, but the proof was also non-constructive.

When it comes to practical, constructive codes, constructions based on concatenation have been demonstrated that are good, though not very good, but most known practical codes are asymptotically bad. Goppa's recent algebraic geometry codes appear to be both practical and good (with practical decoding proven possible up to the Gilbert bound), but we believe that the literature has not established whether they are very good. The most practical decoding algorithm appears to be prohibitively costly to implement, and algebraic geometry codes do not appear to be destined for practical use.

Thus the conventional view is that there are few known constructive codes that are good, fewer still that are practical, and none at all that are both practical and very good. It seems to be widely believed that whereas almost any random linear code is good, codes with structure that allows practical coding are likely to be bad. Battail expresses an alternative view, however, that 'we can think of good codes, and we can decode them'. This statement is supported by the results of the present paper.

In this paper we present two code families. Gallager's low-density parity-check codes ('GL codes') are defined in terms of a very sparse random parity check matrix. 'MN codes' are also defined in terms of very sparse random matrices, and were first presented in our earlier work. (We generalized MN codes to GL codes, then realised that we had rediscovered Gallager's work.) MN codes are unconventional in that redundancy can be incorporated in the transmitted codewords not only by using a K × N generator matrix with transmitted block length N greater than the source block length K, but also by using a source that is itself redundant.

These code families both have two important properties. First, because of the construction in terms of sparse matrices, practical decoding seems to be possible at good communication rates. Second, we prove in section 2 that in spite of their simple construction these codes are very good; that is, sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit of the binary symmetric channel. We further prove that the same codes are in fact good for any ergodic symmetric channel model. Our proof may be viewed as a semi-constructive proof of Shannon's noisy channel coding theorem. It is indeed easy to think of good codes.

In section 3 we present a 'belief propagation' algorithm, first presented by Gallager, for solving the decoding problem. We give an analysis of the decoding algorithm in section 4. These results lead us to conjecture that there exist GL and MN codes which are not only good but which also achieve error rates approaching zero at a non-zero information rate when decoded using a practical algorithm. In sections 5 and 6 we describe empirical results of computer experiments using the belief propagation algorithm to decode GL and MN codes. Our experiments show that practical performance significantly superior to that of textbook codes can be achieved by these codes. Section 7 contains discussion specific to MN codes.

1.2 Definitions

The input and output alphabets of the binary symmetric channel (BSC) will be denoted {0, 1}. We will denote the error probability of the BSC by f_n, where f_n < 0.5.

Definition 1. The binary entropy functions H_2(f) and H_2^e(f) are

    H_2(f) = f log_2(1/f) + (1 - f) log_2(1/(1 - f))                      (1)

    H_2^e(f) = f log_e(1/f) + (1 - f) log_e(1/(1 - f)).                   (2)

Definition 2. The weight of a binary vector or matrix is the number of 1s in it. We denote the weight of a vector x by w(x). The density of a source of random bits is the expected fraction of 1 bits. A source is sparse if its density is less than 0.5. A vector v is very sparse if its density vanishes as its length increases, for example, if a constant number t of its bits are 1s. The overlap between two vectors is the number of 1s in common between them.

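These quantities are straightforward to compute; the following lines of Python (an illustration we add, assuming numpy, and not part of the original paper) make the definitions concrete.

import numpy as np

def weight(x):
    """Number of 1s in a binary vector or matrix (the weight of definition 2)."""
    return int(np.sum(x))

def density(x):
    """Fraction of bits that are 1."""
    return weight(x) / np.asarray(x).size

def overlap(u, v):
    """Number of positions in which both binary vectors contain a 1."""
    return int(np.sum(np.asarray(u) & np.asarray(v)))
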
Definition 3. The capacity C(f_n) of a binary symmetric channel with noise density f_n is, in bits per cycle,

    C(f_n) = 1 - H_2(f_n).                                                (3)

The rate R_0(f_n) is

    R_0(f_n) = 1 - log_2 [ 1 + 2 sqrt( f_n (1 - f_n) ) ].                 (4)

This is the computational cutoff of sequential decoding for convolutional codes, the rate beyond which the expected cost of achieving vanishing error probability using sequential decoding becomes infinite.

The Gilbert bound GV(f_n) is

    GV(f_n) = 1 - H_2(2 f_n)   for f_n < 1/4;     GV(f_n) = 0   for f_n ≥ 1/4.      (5)

This is the maximum rate at which one can communicate with a code whose codewords satisfy the Gilbert-Varshamov minimum distance bound, assuming bounded distance decoding.

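For reference, here is a short Python sketch (ours, not part of the paper; the function names are our own) that evaluates these three rates for a sample noise level.

import math

def H2(f):
    """Binary entropy in bits, equation (1)."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)

def capacity(fn):
    """BSC capacity, equation (3)."""
    return 1 - H2(fn)

def R0(fn):
    """Computational cutoff rate of sequential decoding, equation (4)."""
    return 1 - math.log2(1 + 2 * math.sqrt(fn * (1 - fn)))

def gilbert_bound(fn):
    """Gilbert-Varshamov rate under bounded distance decoding, equation (5)."""
    return 1 - H2(2 * fn) if fn < 0.25 else 0.0

print(capacity(0.1), R0(0.1), gilbert_bound(0.1))
# approximately 0.531, 0.322, 0.278
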
Definition 4. A model that defines a probability distribution over strings x of any length N, P(x|N), has mean entropy H_x if for any ε > 0 and η > 0 there exists an N_0 such that for all N > N_0,

    P( | (1/N) log_2 (1/P(x|N)) - H_x | > ε ) < η.                        (6)

For example, a memoryless binary symmetric channel's noise has mean entropy H_n = H_2(f_n), where f_n is the density of the noise; the proof of this statement, by the law of large numbers, is well known. We will prove that the codes presented in this paper are good codes not only for the binary symmetric channel but also for a wide class of channels with memory.

Definition 5. A binary channel with symmetric stationary ergodic noise is a channel whose output in response to a transmitted binary vector t is given by r = t + n mod 2, where n, the noise vector, has a probability distribution that is (a) independent of t and (b) stationary and ergodic.

For example, burst noise might be modelled by a stationary and ergodic Markov process. Such a process has a mean entropy, though the evaluation of this quantity may be challenging. The standard Gaussian channel with binary inputs is also equivalent to a binary channel with symmetric stationary ergodic noise.

We will concentrate on the case of a binary channel with symmetric noise (see definition 5) in the body of this paper. Channels with memory whose inputs are binary and whose outputs are in some more general alphabet are addressed in appendix H.

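To make the last remark concrete (our illustration, not from the paper): if the receiver takes hard decisions on the outputs of a binary-input Gaussian channel, the resulting channel is a memoryless BSC; the sketch below computes its crossover probability from the symbol signal-to-noise ratio under this standard hard-decision assumption.

import math

def hard_decision_crossover(es_over_n0):
    """Crossover probability of the BSC obtained by thresholding the output of a
    binary-input Gaussian channel with symbol SNR Es/N0 (hard-decision assumption)."""
    return 0.5 * math.erfc(math.sqrt(es_over_n0))

print(hard_decision_crossover(1.0))   # Es/N0 of 0 dB gives a crossover of about 0.079
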
1.2.1 Linear codes

A linear error correcting code can be represented by an N by K binary matrix G^T (the generator matrix), such that a K-bit binary message s is encoded as the N-bit vector t = G^T s mod 2. (Note that we have chosen to use column vectors, so the generator matrices act to the right rather than the left.) The generator matrix is in 'systematic form' if it can be written as

    G^T = [ I_K ]
          [  P  ]                                                         (7)

where I_K is the K × K identity matrix and P is a binary matrix. The channel adds noise n to the vector t, with the resulting received signal r being given by:

    r = (G^T s + n) mod 2.                                                (8)

The decoder's task is to infer s given the received message r and the assumed noise properties of the channel. The optimal decoder returns the message s that maximizes the posterior probability

    P(s | r, G) = P(r | s, G) P(s) / P(r | G).                            (9)

It is often not practical to implement the optimal decoder.

If the prior probability of s is assumed uniform, and the probability of n is assumed to be independent of s (c.f. definition 5), then it is convenient to introduce the (N - K) × N parity check matrix, H, which in systematic form is [P I_{N-K}]. The parity check matrix has the property H G^T = 0 mod 2, so that, applying H to equation (8),

    H n = H r mod 2.                                                      (10)

Any other (N - K) × N matrix A whose rows span the same space as H is a valid parity check matrix.

The decoding problem thus reduces, given the above assumptions, to the task of finding the most probable noise vector n such that

    H n mod 2 = z,                                                        (11)

where the syndrome vector z = H r mod 2.

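The following small Python sketch (our illustration, assuming numpy; the toy matrices and sizes are invented for the example) traces these definitions: systematic encoding, addition of channel noise, and computation of the syndrome.

import numpy as np

# Toy systematic code with K = 3, N = 6:  G^T = [I_K; P],  H = [P I_{N-K}].
K, N = 3, 6
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=np.uint8)            # (N-K) x K binary matrix
G_T = np.vstack([np.eye(K, dtype=np.uint8), P])      # N x K generator, acting to the right
H = np.hstack([P, np.eye(N - K, dtype=np.uint8)])    # (N-K) x N parity check, H G^T = 0 mod 2

s = np.array([1, 0, 1], dtype=np.uint8)              # K-bit message
t = G_T.dot(s) % 2                                   # transmitted vector
n = np.array([0, 0, 0, 1, 0, 0], dtype=np.uint8)     # a single channel error
r = (t + n) % 2                                      # received vector, equation (8)
z = H.dot(r) % 2                                     # syndrome; equals H n mod 2, equation (10)

assert np.array_equal(H.dot(G_T) % 2, np.zeros((N - K, K), dtype=np.uint8))
assert np.array_equal(z, H.dot(n) % 2)
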
1.3 Description of the two code families

We define two code families. We explain the more conventional GL codes first.

1.3.1 The idea behind GL codes

We construct a linear code by first making a very sparse random parity check matrix A. (Very sparse, but not systematic.) We then use linear algebra to obtain a corresponding generator matrix. This can be done by first putting A into systematic form H, then deducing a systematic G. This simple construction is complemented by a straightforward decoding algorithm which implements an approximation to the ideal Bayesian inference of equation (9).

1.3.2 Construction of GL codes

The parity check matrix A can be constructed as follows. We will describe variations on this construction later.

A transmitted block length N and a source block length K are selected. We define M = N - K to be the number of parity checks. We select a column weight t, which is an integer greater than or equal to 3. We create a rectangular M × N matrix (M rows and N columns) A at random with exactly weight t per column and a weight per row as uniform as possible. If N/M is chosen to be an appropriate ratio of integers, then the number of 1s per row can be constrained to be exactly tN/M. We then use Gaussian elimination and reordering of columns to derive an equivalent parity check matrix in systematic form [P I_M]. There is a possibility that the rows of A are not independent (though for odd t, this has small probability); in this case, A is a parity check matrix for a code with the same N and with smaller M, that is, a code with greater rate than assumed in the following sections. Redefining A to be the original matrix with its columns reordered as in the Gaussian elimination, we have the following situation.
The matrix A = [C_1 C_2] is composed of two very sparse matrices C_1 and C_2, as follows. The matrix C_2 is a square M × M matrix that is very sparse and invertible. The inverse C_2^{-1} of this matrix has been computed during the Gaussian elimination which produced the matrix P = C_2^{-1} C_1. This inversion takes of order M^2 N time and is performed once only. The matrix C_1 is a rectangular M × K matrix that is very sparse.

Encoding. We define the generator matrix of the GL code to be

    G^T = [ I_K ]  =  [      I_K     ]
          [  P  ]     [ C_2^{-1} C_1 ]   mod 2,                           (12)

where I_K is the K × K identity matrix.

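A minimal sketch of this construction (our illustrative code, assuming numpy; the sizes, seed, and function names are invented for the example, and a practical implementation would use sparse data structures) generates a column-weight-t matrix and reduces it to systematic form by Gaussian elimination over GF(2).

import numpy as np

rng = np.random.default_rng(0)

def make_sparse_parity_check(M, N, t):
    """Random M x N binary matrix with exactly t ones per column;
    row weights are left only roughly uniform in this simplified sketch."""
    A = np.zeros((M, N), dtype=np.uint8)
    for j in range(N):
        rows = rng.choice(M, size=t, replace=False)
        A[rows, j] = 1
    return A

def systematic_form(A):
    """Gaussian elimination over GF(2), with column reordering, aiming for [P | I_M].

    Returns the reduced matrix H and the column permutation applied to A.
    Raises ValueError if the rows of A are not independent."""
    M, N = A.shape
    H = A.copy()
    perm = np.arange(N)
    for i in range(M):
        col = N - M + i                                   # pivot column for row i
        pivot = None                                      # find a usable column with a 1 in row i
        for c in list(range(col, N)) + list(range(0, N - M)):
            if H[i, c] == 1:
                pivot = c
                break
        if pivot is None:
            raise ValueError("rows of A are not independent")
        if pivot != col:
            H[:, [col, pivot]] = H[:, [pivot, col]]       # reorder columns
            perm[[col, pivot]] = perm[[pivot, col]]
        for r in range(M):                                # clear the pivot column elsewhere
            if r != i and H[r, col] == 1:
                H[r, :] ^= H[i, :]
    return H, perm

# Toy sizes chosen for illustration; a real code would use much larger M and N.
M, N, t = 6, 12, 3
for _ in range(100):                                      # retry if the random rows happen to be dependent
    A = make_sparse_parity_check(M, N, t)
    try:
        H, perm = systematic_form(A)
        break
    except ValueError:
        continue
P = H[:, :N - M]   # the row operations amount to left-multiplying by C_2^{-1}, so P = C_2^{-1} C_1
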
1.3.3 Variations

1. When generating the matrix A, one can constrain all pairs of columns in the matrix to have an overlap of at most 1. This is expected to improve the properties of the ensemble of codes, for reasons that will become apparent in section 4; an illustrative check of this constraint is sketched after this list.

2. One can further constrain the matrix A so that the topology of the corresponding bipartite graph does not contain short cycles. This is discussed further in section 4.

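The overlap condition of variation 1 is easy to test; this short Python check (ours, not from the paper) computes the largest overlap between any two columns of a candidate matrix A.

import numpy as np

def max_column_overlap(A):
    """Largest number of 1s shared by any pair of distinct columns of A."""
    inner = A.T.astype(np.int64) @ A.astype(np.int64)   # column inner products
    np.fill_diagonal(inner, 0)
    return int(inner.max())

# A matrix satisfying variation 1 has max_column_overlap(A) <= 1,
# which rules out cycles of length 4 in the corresponding bipartite graph.
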
1.3.4 The decoding problem for GL codes

A source vector s of length K is encoded into a transmitted vector t defined by:

    t = G^T s mod 2.                                                      (13)

If the generator matrix has been computed explicitly (which takes of order M^2 N time), then the transmitted vector can be computed by explicit multiplication in N K time. However, encoding might be possible in less time using sparse matrix methods.

The received vector is

    r = t + n mod 2,                                                      (14)

where the noise is n. In the case of a binary symmetric channel, n is assumed to be a sparse random vector with independent, identically-distributed bits of density f_n. We will discuss more general channels below.

By construction, A is a parity check matrix for G, that is, A G^T = 0 mod 2, so the decoding problem is to recover t by finding the most probable n that satisfies the equation:

    A n = z mod 2,                                                        (15)

where z is the syndrome vector:

    z = A r mod 2,                                                        (16)

computation of which takes time of order N t, if the sparseness of A is exploited.

The optimal decoder, in the case of a binary symmetric channel, is an algorithm that finds the sparsest vector n^ that satisfies A n^ = z. From n^ we obtain our guess for the transmitted signal t^ = r + n^, the first K bits of which are the optimal guess s^ for the original message.

Both the matrix A and the unknown vector n are sparse. One might therefore hope that it is practical to solve this decoding problem (though perhaps not right up to the theoretical limits of the optimal decoder). We will demonstrate a practical decoder later.

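The practical decoder used in this paper is the belief propagation algorithm presented later; purely as an illustration of how the sparse system A n = z can be attacked, the following Python sketch (ours, not the paper's algorithm) implements a much cruder greedy bit-flipping heuristic.

import numpy as np

def bit_flip_decode(A, z, max_iters=100):
    """Greedy bit-flipping heuristic for A n = z (mod 2): repeatedly flip the
    bit of n involved in the largest number of currently unsatisfied checks."""
    M, L = A.shape
    n = np.zeros(L, dtype=np.uint8)
    for _ in range(max_iters):
        unsatisfied = (A.astype(int).dot(n) % 2) != z    # parity checks that currently fail
        if not unsatisfied.any():
            return n                                     # all checks satisfied
        votes = A[unsatisfied].sum(axis=0)               # per-bit count of failed checks it touches
        n[int(np.argmax(votes))] ^= 1                    # flip the most-blamed bit
    return n                                             # may still violate some checks
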
1.3.5 Construction of MN codes

Having created the matrices C_2^{-1} and C_1, we can define the generator matrix G of an MN code by the non-systematic matrix G^T = C_2^{-1} C_1. The novel idea behind MN codes is that we can constrain the source vectors to be sparse and exploit this unconventional form of redundancy in the decoder. We will discuss properties and possible applications of MN codes in section 7. The decoding of these codes involves a problem isomorphic to the GL codes' decoding problem A x = z.

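The isomorphism can be seen in one line (a reasoning step we add here for clarity): multiplying the received vector by the sparse matrix C_2 turns MN decoding into the same sparse syndrome problem.

    r = C_2^{-1} C_1 s + n  (mod 2)
    =>  C_2 r = C_1 s + C_2 n = [C_1 C_2] [s; n] = A x  (mod 2),

so the sparse unknown x = (s, n) satisfies A x = z with z = C_2 r, exactly as in the GL case.
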
1.3.6 Overview

The theoretical effectiveness of GL and MN codes as error correcting codes depends on the properties of very sparse parity check matrices A in relation to the solvability of the decoding problem (15). We address the question: how well would these codes work if we had the best possible algorithm for solving the decoding problem?

The practical effectiveness of GL and MN codes depends on our finding a practical algorithm for solving equation (15) that is close enough to the optimal decoder that the desirable theoretical properties are not lost.

We now show theoretically that there exist GL and MN codes for which the optimal decoder would achieve information rates arbitrarily close to the Shannon limit for a wide variety of channels. We then describe empirical results with a decoding algorithm which, although suboptimal, gives excellent results. The decoding takes of order N t computational operations.

2 Limits of Optimal Decoding

We prove properties of GL and MN codes by studying properties of the decoding problem A x = z, where the unknown vector x is sparse and A is very sparse. We make use of two standard tools: we prove properties of the optimal decoder by proving properties of a slightly sub-optimal 'typical set decoder', which is easier to analyse; and we average the performance of this decoder over an ensemble of random very sparse matrices A. A 'good' average performance proves that there exist 'good' matrices A, and indeed that any random matrix A from the ensemble is likely to be 'good'. As in all proofs of goodness of coding systems, we employ a transmission block length that can be increased to a sufficiently large value that an error probability smaller than a desired ε is achieved. To prove that GL and MN codes are very good we will also increase the weight per column, t, of the matrix A, but only in such a way as to keep the matrix very sparse, i.e., t/M → 0.

Previous work on low density parity check codes has already established some good properties of GL codes. Gallager proved some good properties of GL codes in terms of their distance properties.
Zyablov and Pinsker proved that GL codes are good and gave a practical decoder, but only for communication rates substantially below the Gilbert bound. Our approach in terms of an ideal decoder allows us to prove that the codes are good not only for the binary symmetric channel but also for arbitrary ergodic symmetric channel models; we also prove that GL codes are very good, a result not explicitly proven in these earlier works.

The properties that we prove depend on the details of the ensemble of matrices A that is used. We find it easiest to prove the desired properties by weakening the ensemble of matrices from that described in section 1.3.2. We introduce the following ensembles, which we believe are ordered such that the later ensembles define GL and MN codes that have smaller average probability of error, though we do not have a proof of this statement.

1. Matrix A generated by starting from an all-zero matrix and randomly flipping t not necessarily distinct bits in each column.

2. Matrix A generated by randomly creating weight-t columns.

3. Matrix A generated with weight t per column and (as near as possible) uniform weight per row.

4. Matrix A generated with weight t per column and uniform weight per row, and no two columns having overlap greater than 1.

5. Matrix A further constrained so that its bipartite graph has large girth.

6. Matrix A = [C_1 C_2] further constrained or slightly modified so that C_2 is an invertible matrix.

Our proofs are for the first ensemble. Our demonstrations use matrices drawn from the later, more constrained ensembles.

The properties also depend on the assumed noise model. We will give theoretical results for three cases. First, we give a general theorem for a broad class of symmetric noise models with memory (definition 5). Second, we discuss a popular special case, the memoryless binary symmetric channel, corresponding to a vector x of uniform density f. Third, the generalization to channels with continuous outputs will be discussed in appendix H.

2.1 Decodability for arbitrary P(x)

To avoid confusion between GL and MN codes when discussing their common decoding problem A x = z, we refer to the number of columns in A as L and the number of rows as M. [For a glossary of all symbols used in this paper, see appendix A.] Let the ratio of the length of x to the length of z be γ ≡ L/M. The decoding problem is equivalent to playing a game in which a vector x is drawn from a probability distribution P(x), and the vector z = A x mod 2 is revealed; the player's task is then to identify the original vector x, given z, A, and the known distribution P. The optimal decoder is an algorithm that identifies the vector x that maximizes P(x) subject to the constraint z = A x mod 2.

There is a Shannon limit for this game beyond which we clearly cannot hope to recover x from z reliably. The maximum information content of z is clearly M bits. We assume that the probability P is stationary and ergodic so that a mean entropy H_x can be defined. Then the Shannon limit says reliable recovery of x from z is only possible if L H_x ≤ M, i.e.:

    γ H_x ≤ 1.                                                            (17)

If there are matrices A for which we can play the decoding game well at a value of γ H_x close to this limit, then there exist GL codes which can communicate correspondingly close to the Shannon limit of a noisy channel whose noise distribution is given by P(x).

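As a sanity check (added here, not part of the original argument): for a GL code used on the memoryless binary symmetric channel, x is the length-L = N noise vector with mean entropy H_x = H_2(f_n), and z has M = N - K bits, so condition (17) reads

    N H_2(f_n) ≤ N - K,   i.e.   R = K/N ≤ 1 - H_2(f_n) = C(f_n),

which is exactly the Shannon limit of equation (3).
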
Theorem 1. Given an integer t ≥ 3 and a ratio γ > 1, there exists an entropy H_x(γ, t) > 0 such that, for any P(x) of mean entropy H_x < H_x(γ, t), and a desired block error probability ε > 0, there exists an integer M, an integer L ≥ γM and a matrix A having M rows and L columns with weight t or less per column, with the following property: if x is generated from P(x), the optimal decoder from z = A x back to x^ achieves a probability of block error less than ε.

Theorem 2. Given a distribution P(x) of mean entropy H_x < 1 (bits) and a desired γ < 1/H_x, there exists an integer t(H_x, γ) ≥ 3 such that, for any desired block error probability ε > 0, there is an integer M_min such that for any M > M_min there is a matrix A having M rows and L ≥ γM columns with weight t(H_x, γ) or less per column, with the following property: when x is generated from P(x), the optimal decoder from z = A x back to x^ achieves a probability of error less than ε.

Implications of theorems 1 and 2 for GL codes. The first theorem effectively states that GL codes with any value of t ≥ 3 are good, i.e., for any channel with appropriate entropy, there are GL codes which can achieve virtually error-free transmission at rates up to some non-zero rate 1 - 1/γ, if the block length L is made sufficiently large.

The second theorem effectively states that GL codes are very good: if we are allowed to choose t, then we can get arbitrarily close to capacity, still using very sparse matrices with t/M arbitrarily small.

In section 2.2 we prove these theorems, that is, we derive expressions for a function H_x(γ, t) > 0 satisfying theorem 1, and a function t(H_x, γ) satisfying theorem 2. Let the largest function for which the first theorem is true be H_x^max(γ, t). In a subsequent subsection we evaluate a tighter numerical lower bound for H_x^max(γ, t). These are worst case results, true for any source of mean entropy H_x. We then give numerical results for the case of the binary symmetric channel, where considerably more optimistic bounds can be derived.

We also prove the following minimum distance theorem for GL codes, which uses the function H_x(γ, t) of theorem 1.

Theorem 3. Given an integer t ≥ 3, a fraction δ < 1/2, and a γ such that H_2(δ) < H_x(γ, t), there exist integers M and L ≥ γM and a matrix A having M rows and L columns with weight t or less per column, such that the GL code whose parity check matrix is A has minimum distance at least δL.

We can also prove that the Gilbert bound can be attained as t → ∞, still with A very sparse.

Theorem 4. Given a fraction δ < 1/2 and a γ such that H_2(δ) < 1/γ, there exist a t and an M_min such that for any M > M_min there is a matrix A having M rows and L ≥ γM columns with weight t or less per column, such that the GL code whose parity check matrix is A has minimum distance at least δL.

Implication of theorem 3 contrasted with theorem 1. If one only aims to decode noise patterns of weight up to half of the minimum distance d = δL (as is conventional in much of coding theory), then one can only handle noise levels up to (d/L)/2. In fact the optimal decoder can decode (with vanishing probability of error) at noise levels up to (d/L). Thus GL codes can serve as good codes at noise levels twice as great as the maximum noise level that is attainable if one restricts attention to bounded distance decoding. The intuition for this result is that in a very high dimensional binary space, two spheres of radius r whose centres are a distance d apart have a vanishingly small volume of intersection if r is less than d, even though they have non-zero overlap for any r greater than d/2.

GL codes, as Gallager showed, and as we will show later, can in practice be decoded beyond their minimum distance.

2.2 Proof of theorem 1

Consider the problem, given A and z, of inferring x, where A x = z and x has probability distribution P(x) with mean entropy H_x. We consider the probability of error of a typical set decoder, averaging it over all very sparse random matrices A. We establish the existence of a function H_x(γ, t) > 0 such that the probability of error can be bounded by a sum of terms which decrease as inverse powers of L or exponentially with L, if H_x < H_x(γ, t).

Typical set decoder. We consider the typical set:

    T = T_{L,β} = { x in {0, 1}^L : | (1/L) log_2 (1/P(x)) - H_x | < β }