throbber
B(cid:2) J(cid:2) Frey and D(cid:2) J(cid:2) C(cid:2) MacKay (cid:3) (cid:7) In Proceedings of the th Allerton Conference
`on Communication(cid:2) Control and Computing (cid:10) Champaign(cid:11)Urbana(cid:10) Illinois(cid:2)
`
`Trellis(cid:2)Constrained Codes
`
`Brendan J(cid:2) Frey
`Beckman Institute for Advanced Science and Technology
`University of Illinois at Urbana(cid:2)Champaign
`bfrey(cid:2)turbo(cid:3)beckman(cid:3)uiuc(cid:3)edu
`
`David J(cid:2) C(cid:2) MacKay
`Department of Physics(cid:3) Cavendish Laboratories
`Cambridge University
`mackay(cid:2)mrao(cid:3)cam(cid:3)ac(cid:3)uk
`
`Abstract
`
`We introduce a class of iteratively decodable trellis(cid:2)constrained codes as a gener(cid:2)
`alization of turbocodes(cid:3) low(cid:2)density parity(cid:2)check codes(cid:3) serially(cid:2)concatenated con(cid:2)
`volutional codes(cid:3) and product codes(cid:4) In a trellis(cid:2)constrained code(cid:3) multiple trellises
`interact to de(cid:5)ne the allowed set of codewords(cid:4) As a result of these interactions(cid:3) the
`minimum(cid:2)complexity single trellis for the code can have a state space that grows
`exponentially with block length(cid:4) However(cid:3) as with turbocodes and low(cid:2)density
`parity(cid:2)check codes(cid:3) a decoder can approximate bit(cid:2)wise maximum a posteriori de(cid:2)
`coding by using the sum(cid:2)product algorithm on the factor graph that describes the
`code(cid:4) We present two new families of codes(cid:3) homogenous trellis(cid:2)constrained codes
`and ring(cid:2)connected trellis(cid:2)constrained codes(cid:3) and give results that show these codes
`perform in the same regime as do turbo(cid:2)codes and low(cid:2)density parity(cid:2)check codes(cid:4)
`
` Introduction
`
`Recent interest in the impressive performances of turbocodes and low(cid:11)density parity(cid:11)
`check codes has led to several attempts at generalizing these codes in ways that lead
`to e(cid:12)cient iterative decoders(cid:2) In one view that is now quickly propagating across the
`research network a code is described by graphical constraints on a system of variables(cid:10)
`and the iterative decoder is based on a decade(cid:11)old expert systems algorithm applied to the
`graph which describes the constraints (cid:13) (cid:14)(cid:16)(cid:2) This probability propagation algorithm (cid:13)(cid:10) (cid:16) is
`exact only in cycle(cid:11)free graphs(cid:2) However(cid:10) as evidenced by the excellent error(cid:11)correcting
`capabilities of the iterative decoders for turbocodes (cid:13) (cid:16) and low(cid:11)density parity(cid:11)check
`codes (cid:13)(cid:16)(cid:10) the algorithm works impressively well in the graphs that describe these codes(cid:10)
`even though they contain many cycles(cid:2) See (cid:13) (cid:16) and (cid:13)(cid:16) for extensive dissertations on this
`subject(cid:2)
`
`
`
`Hughes, Exh. 1055, p. 1
`
`

`

`(cid:3)a(cid:7)
`
`(cid:3)b(cid:7)
`
`(cid:3)c(cid:7)
`
`R (cid:2) (cid:2)
`n (cid:2) (cid:2)
`
`R (cid:2) (cid:2)
`n (cid:2) (cid:2)
`
`(cid:3)d(cid:7)
`
`R (cid:2) (cid:2)(cid:6) n (cid:2)
`
`R (cid:2) (cid:2)(cid:6) n (cid:2)
`
`(cid:3)e(cid:7)
`
`Rt (cid:2) (cid:2)(cid:6) nt (cid:2) (cid:2) (cid:6) t (cid:2) (cid:3) (cid:3)
`
`Rt (cid:2) (cid:2)(cid:6) nt (cid:2) (cid:2) (cid:6) t (cid:2) (cid:3) (cid:4) (cid:4) (cid:4) (cid:3)
`
`Figure (cid:19) A codeword in a trellis(cid:6)constrained code must simultaneously be a codeword
`of all the constituent trellises(cid:10) with the codeword bits reordered(cid:2) The factor graphs are
`shown for (cid:3)a(cid:7) a single convolutional code(cid:20) (cid:3)b(cid:7) a turbocode(cid:10) where each parity bit is a
`codeword bit of only one trellis(cid:20) (cid:3)c(cid:7) a low(cid:11)density parity(cid:11)check code(cid:20) (cid:3)d(cid:7) a homogenous
`trellis(cid:11)constrained code(cid:20) and (cid:3)e(cid:7) a ring(cid:11)connected trellis(cid:11)constrained code(cid:2) The small
`un(cid:21)lled discs represent codeword bits(cid:2)
`
`Fig(cid:2) a shows the factor graph (cid:3)see (cid:13) (cid:16) in these proceedings(cid:7) for a trellis(cid:2) Unlike in a
`trellis(cid:10) in a factor graph the values that each state variable (cid:3)large white discs(cid:7) can take
`on are not explicitly shown(cid:2) Any two adjacent state variables and the corresponding
`codeword bits (cid:3)small white discs(cid:7) must satisfy a linear set of equations (cid:3)represented by
`the small black discs with a (cid:22)(cid:23)(cid:24) inside(cid:7)(cid:2) This representation is called a (cid:22)factor graph(cid:24)
`because it shows how the indicator function for allowed trellis behaviors factors into a
`product of local functions(cid:2) Associated with each black disc is a local function of its
`neighboring variables(cid:2) Each function evaluates to if its neighboring variables satisfy
`the local set of linear equations(cid:10) and to otherwise(cid:2) The global function is equal to the
`product of the local functions(cid:2) A given con(cid:21)guration of the codeword bits is a codeword if
`the global function evaluates to for some con(cid:21)guration of the state variables(cid:2) In general(cid:10)
`the local functions may be nonlinear(cid:10) the factor graph variables may be real(cid:11)valued(cid:10) and
`the local functions may evaluate to elements of a semi(cid:11)ring(cid:2)
`
`Although factor graphs are less explicit about local relationships than are trellises(cid:10)
`factor graphs allow us to represent a richer set of systems(cid:2) Fig(cid:2) b shows the factor graph
`for a simple turbocode(cid:2) A single trellis for the same turbocode would have an unwieldly
`large number of states(cid:2) More important than representation(cid:10) a factor graph provides
`a framework for iterative decoding via message passing on the graph(cid:2) The probability
`propagation algorithm (cid:13)(cid:10) (cid:16)(cid:10) a(cid:2)k(cid:2)a(cid:2) the sum(cid:6)product algorithm (cid:13) (cid:10) (cid:16)(cid:10) can be applied to
`a factor graph to approximate bit(cid:11)wise maximum a posteriori (cid:3)MAP(cid:7) decoding(cid:2) (cid:3)In the
`
`
`
`Hughes, Exh. 1055, p. 2
`
`

`

`special case of a turbocode(cid:10) this general algorithm reduces to turbodecoding (cid:13)(cid:10) (cid:16)(cid:2)(cid:7) Two
`di(cid:27)erent factor graphs for the same code may give decoders with di(cid:27)erent performances(cid:2)
`
`As another example(cid:10) Fig(cid:2) c shows the factor graph for a simple low(cid:11)density parity(cid:11)
`check code(cid:2) Each of the six trellises is a simple parity(cid:11)check trellis that enforces even
`parity on its six codeword bits (cid:13) (cid:16)(cid:2)
`
`In a sense(cid:10) whereas the trellis assisted in the design of low(cid:11)complexity codes and ex(cid:11)
`act linear(cid:11)time probabilistic decoders (cid:3)the Viterbi algorithm and the forward(cid:11)backward
`algorithm(cid:7)(cid:10) the factor graph assists in the design of high(cid:11)complexity codes and approx(cid:11)
`imate linear(cid:11)time probabilistic decoders(cid:2)
`In this paper(cid:10) we present a general class of
`high(cid:11)complexity(cid:10) linear(cid:11)time decodable codes that retain the chain(cid:11)type structure of trel(cid:11)
`lises locally(cid:10) as do turbocodes and to a lesser degree low(cid:11)density parity(cid:11)check codes(cid:2) A
`codeword in a trellis(cid:6)constrained code (cid:3)TCC(cid:7) must simultaneously be a codeword of mul(cid:11)
`tiple constituent trellises(cid:2) So(cid:10) if ft(cid:3)x(cid:7) is the constituent codeword indicator function for
`trellis t  f (cid:2) (cid:3) (cid:3) (cid:3) (cid:2) T g(cid:10) the global codeword indicator function is
`
`(cid:3) (cid:7)
`
`ft(cid:3)x(cid:7)(cid:3)
`
`TY i
`
`(cid:2)
`
`f (cid:3)x(cid:7) (cid:28)
`
`Each constituent indicator function is given by a product of the local functions within the
`corresponding trellis(cid:2) Usually(cid:10) the codeword bits interact with the constituent trellises
`through permuters that rearrange the order of the codeword bits(cid:2)
`
`For the turbocode in Fig(cid:2) b(cid:10) there are two constituent functions(cid:10) f (cid:3)(cid:2)(cid:7) and f(cid:3)(cid:2)(cid:7)(cid:2)
`f (cid:3)(cid:2)(cid:7) indicates that the upper row of codeword bits are valid output from a convolutional
`encoder with the middle row of codeword bits as input(cid:2) f (cid:3)(cid:2)(cid:7) doesnot directly place any
`restrictions on the lower row of codeword bits(cid:10) so it e(cid:27)ectively only checks (cid:29) of the
`codeword bits(cid:2) f(cid:3)(cid:2)(cid:7) indicates that the lower row of codeword bits are valid output from
`a convolutional encoder with the middle row of codeword bits as input(cid:2) In contrast to
`f (cid:3)(cid:2)(cid:7)(cid:10) f(cid:3)(cid:2)(cid:7) does not place any restrictions on the upper row of codeword bits(cid:2)
`The rate R of a TCC is related to the rates of the constituent trellises Rt(cid:10) t (cid:28)
` (cid:2) (cid:3) (cid:3) (cid:3) (cid:2) T in a simple way(cid:2) If nt is the fraction of codeword bits that trellis t checks(cid:10) then
`trellis t removes at most (cid:3) (cid:3) Rt(cid:7)ntN binary degrees of freedom from the code(cid:2) It may
`remove a small number less if some of its constraint equations are linearly dependent on
`those given by other constituent trellises(cid:2) For large(cid:10) randomly generated permuters this
`e(cid:27)ect is quite small(cid:10) so we will ignore it when computing rates in the remainder of this
`paper(cid:2) (cid:3)As a result(cid:10) the actual rates may be slightly higher than the given rates(cid:2)(cid:7) The
`total number of binary degrees of freedom left after all trellis constraints are satis(cid:21)ed is
`N (cid:3) PT
`t(cid:2) (cid:3) (cid:3) Rt(cid:7)ntN(cid:10) so the rate of the TCC is
`
`(cid:3)(cid:7)
`
`TX t
`
`(cid:2)
`
`(cid:3) (cid:3) Rt(cid:7)nt(cid:3)
`
`R (cid:28) (cid:3)
`
`From this equation(cid:10) it is easy to verify that the turbocode in Fig(cid:2) b has rate (cid:29) and
`that the low(cid:11)density parity(cid:11)check code in Fig(cid:2) c also has rate (cid:29)(cid:2) (cid:3)Note that a k(cid:11)bit
`parity(cid:11)check trellis has rate (cid:3)k (cid:3) (cid:7)(cid:4)k(cid:2)(cid:7)
`
`Unlike encoding turbocodes and serially(cid:11)concatenated convolutional codes(cid:10) encoding
`a general TCC takes quadratic time in N(cid:2) In a general TCC(cid:10) we can designate a subset
`of the codeword bits as the systematic bits of the entire code and then use Gaussian
`elimination to compute a generator matrix (cid:3)once only(cid:7)(cid:2) Using such a systematic generator
`matrix for encoding requires R(cid:3) (cid:3) R(cid:7)N  binary operations(cid:2)
`
`
`
`Hughes, Exh. 1055, p. 3
`
`

`

`Decoding a TCC involves performing the forward(cid:11)backward algorithm for each trellis
`and exchanging information between trellises in the fashion speci(cid:21)ed by the sum(cid:11)product
`algorithm(cid:2) The constituent trellises may be processed in parallel or sequentially(cid:2)
`
`In the following two sections(cid:10) we present two new families of TCC(cid:30)s and show that
`they perform in the same regime as do turbocodes and low(cid:11)density parity(cid:11)check codes(cid:2)
`
` Homogenous Trellis(cid:2)Constrained Codes
`
`In a turbocode(cid:10) the constituent trellises share only a systematic subset of their codeword
`bits(cid:2) The other parity bits of each constituent encoder are not constrained by the other
`encoders(cid:2) Fig(cid:2) d shows the factor graph for a simple homogenous TCC with T (cid:28) (cid:10) in
`which all of the bits are constrained by each constituent trellises(cid:2) From the general rate
`formula in (cid:3)(cid:7)(cid:10) we see that the rate for a homegenous turbocode is
`
`R (cid:28) (cid:3) T (cid:3) (cid:3) Ravg(cid:7)(cid:2)
`
`(cid:3) (cid:7)
`
`where Ravg (cid:28) (cid:2)PT
`t(cid:2) Rt(cid:3)(cid:4)T (cid:2)
`One di(cid:27)erence between the homogenous TCC and the turbocode is that the rate of
`a homogenous TCC decreases directly with the number of trellises T (cid:2) In the simulations
`discussed below(cid:10) we used T (cid:28)  and R (cid:28) R (cid:28) Ravg (cid:28) (cid:4) to get R (cid:28) (cid:4)(cid:2) To obtain
`the same rate with T (cid:28) would require Ravg (cid:28) (cid:4)(cid:2) In contrast(cid:10) the rate for a turbocode
`varies roughly inversely with T (cid:2) A rate (cid:29) turbocode with T (cid:28) can be obtained with
`R (cid:28) R (cid:28) R (cid:28) (cid:4)(cid:2) Another di(cid:27)erence is that the permuter length of a homogenous
`TCC is N (cid:10) whereas for a turbocode(cid:10) the permuter length is RN(cid:2)
`
`(cid:2) Encoding and Decoding
`
`The trellises in a homogenous TCC share all their bits(cid:10) so we can(cid:30)t simply encode by
`dividing the bits in each constituent trellis into a systematic set and a parity set and
`running a linear(cid:11)time encoding method for each trellis(cid:10) as is possible in a turbocode(cid:2)
`Instead(cid:10) we apply a previously computed generator matrix to a previously selected sys(cid:11)
`tematic subset of codeword bits(cid:10) which takes R(cid:3) (cid:3) R(cid:7)N  binary operations(cid:2)
`
`The iterative decoder processes each constituent trellis using the forward(cid:11)backward
`algorithm(cid:10) and passes (cid:22)extrinsic information(cid:24) between the trellises in the manner spec(cid:11)
`i(cid:21)ed by the sum(cid:11)product algorithm(cid:2) For two trellises(cid:10) the decoding schedule is straight(cid:11)
`forward(cid:2) For T (cid:5) (cid:10) di(cid:27)erent decoding schedules are possible(cid:2) The trellises may be
`processed sequentially(cid:10) in which case the current trellis uses the most recently computed
`probabilities produced by the other trellises(cid:2) Alternatively(cid:10) the trellises may be processed
`in parallel(cid:10) in which case the current trellis uses the probabilities produced by the other
`trellises in the previous decoding iteration(cid:2)
`
`For the sake of gaining insight into these new compound codes and the behavior of
`their iterative decoders(cid:10) we prefer to decode until a codeword is found or we are sure
`the iterative decoder has failed to (cid:21)nd a codeword(cid:2) After each decoding iteration(cid:10) the
`current bit(cid:11)wise MAP estimates are used to determine whether a valid codeword has
`been found(cid:10) in which case the iterative procedure is terminated(cid:2) If iterations are
`completed without (cid:21)nding a codeword(cid:10) we label the block a decoding failure(cid:2) Notice that
`given the factor graph of a code(cid:10) determining that a codeword is valid is simply a matter
`of checking that all the local functions evaluate to (cid:2)
`
`
`
`Hughes, Exh. 1055, p. 4
`
`

`

`(cid:2) Performance on an AWGN Channel
`
`Using Monte Carlo(cid:10) we estimated the performance of an N (cid:28) (cid:2) (cid:10) T (cid:28)  homogenous
`TCC with R (cid:28) R (cid:28) (cid:4)(cid:10) giving R (cid:28) (cid:4)(cid:2) (cid:3)See App(cid:2) A for a description of how the BER
`con(cid:21)dence intervals were computed(cid:2)(cid:7) Each rate (cid:29) trellis was obtained by shortening
`every (cid:21)fth bit of a rate (cid:29) nonsystematic convolutional code with maximum dmin(cid:2) (cid:3)The
`generator polynomials for this code are given in (cid:13) (cid:16) and are (cid:3) (cid:2) (cid:2) (cid:2) (cid:2) (cid:7)octal(cid:2)(cid:7) Fig(cid:2) 
`shows the performance of this homogenous TCC(cid:10) relative to the turbocode introduced
`by Berrou et(cid:7) al(cid:7) (cid:13) (cid:16) and the best rate (cid:29)(cid:10) N (cid:28) (cid:2)  low(cid:11)density parity(cid:11)check code
`published to date (cid:13)(cid:16)(cid:2) Although it does not perform as well as the turbocode(cid:10) it performs
`signi(cid:21)cantly better than the low(cid:11)density parity(cid:11)check code(cid:2) We believe there is room for
`improvement here(cid:10) since we chose the set of generator polynomials that gave maximum
`(cid:3)Keep in mind(cid:10) however(cid:10) that the
`dmin and this is quite likely not the best choice(cid:2)
`performance of a homogenous TCC is not necessarily governed by the same constituent
`trellis properties that govern the performance of a turbocode(cid:2)(cid:7) Of signi(cid:21)cant importance
`compared to turbocodes(cid:10) we have observed that this homogenous TCC does not have
`low(cid:11)weight codewords(cid:2)
`
`Turbocode
`Ring-connected TCC
`Homogenous TCC
`Low-density parity-check code
`
`Uncoded
`
`Shannon(cid:30)s limit
`
`0
`
`0.2
`
`0.4
`
`0.8
`0.6
`Eb(cid:4)N (cid:3)dB(cid:7)
`
`1
`
`1.2
`
`1.4
`
`1.6
`
`1
`
`1e-1
`
`1e-2
`
`1e-3
`
`BER
`
`1e-4
`
`1e-5
`
`1e-6
`
`Figure (cid:19) The performances of a homogenous TCC and a ring(cid:11)connected TCC compared
`to the best rate (cid:29) turbocode and low(cid:11)density parity(cid:11)check code performances published
`to date (cid:13)(cid:10) (cid:16)(cid:2)
`
` Ring(cid:2)Connected Trellis(cid:2)Constrained Codes
`
`Fig(cid:2) e shows the factor graph for a simple ring(cid:6)connected TCC with T (cid:28) (cid:2) This code
`can be viewed as a serially(cid:11)concatenated convolutional code (cid:13) (cid:10) (cid:16) in which some of the
`
`
`
`Hughes, Exh. 1055, p. 5
`
`

`

`output bits are constrained to be equal to some of the input bits(cid:2) The factor graph thus
`forms a ring of connected trellises(cid:2) In the ring(cid:11)connected TCC shown(cid:10) each constituent
`trellis checks exactly (cid:4)T of the codeword bits(cid:2) From the general rate formula in (cid:3)(cid:7)(cid:10) the
`rate for such a ring(cid:11)connected turbocode is
`
`R (cid:28) (cid:3)Ravg (cid:3) (cid:4)(cid:7)(cid:3)
`
`(cid:3)(cid:7)
`
`Unlike turbocodes and homogenous TCC(cid:30)s(cid:10) for such a ring(cid:11)connected TCC the rate does
`not depend on the number of constituent trellises T (cid:2) However(cid:10) the permuter lengths are
` (cid:4)T relative to the block length(cid:2)
`
` (cid:2) Encoding and Decoding
`
`For ring(cid:11)connected TCC(cid:30)s(cid:10) we cannot use the same type of encoding procedure that is
`used for serially(cid:11)concatenated convolutional codes(cid:10) since some of the (cid:22)output(cid:24) bits must
`match some of the (cid:22)input(cid:24) bits(cid:2) As with the homogenous TCC(cid:10) we can encode with ap(cid:11)
`proximately R(cid:3) (cid:3) R(cid:7)N  binary operations by applying a previously computed generator
`matrix to a previously selected systematic subset of codeword bits(cid:2) However(cid:10) if Rt (cid:28) (cid:4)(cid:10)
`t (cid:28) (cid:2) (cid:3) (cid:3) (cid:3) (cid:2) T (cid:3)giving R (cid:28) (cid:4)(cid:7)(cid:10) encoding computations can be saved in the following way(cid:2)
`First(cid:10) we pick N(cid:4)T systematic bits for trellis and generate the corresponding N(cid:4)T
`parity bits(cid:10) using a number of computations that is linear in N(cid:2) Then(cid:10) we work around
`the ring and for each trellis pick N(cid:4)T systematic bits and generate the corresponding
`N(cid:4)T parity bits(cid:2) When all but two trellises have been encoded(cid:10) the last two trellises are
`used to form a binary vector that when multiplied by a previously computed N(cid:4)T (cid:4) N(cid:4)T
`binary matrix yields the (cid:21)nal set of N(cid:4)T parity bits(cid:2) The computations for the last
`step dominate and require approximately N (cid:4)T  binary operations(cid:2) For Ravg (cid:28) (cid:4)(cid:10)
`R (cid:28) (cid:4)(cid:10) T (cid:28) (cid:10) the (cid:21)rst method described above takes N (cid:4) operations(cid:10) whereas the
`second method takes N (cid:4) operations(cid:2)
`
`As with homogenous TCC(cid:30)s(cid:10) when T (cid:5)  ring(cid:11)connected TCC(cid:30)s can be iteratively
`decoded using a variety of schedules(cid:2) In our simulations(cid:10) we process the trellises sequen(cid:11)
`Iterative decoding
`tially while passing probabilities around the ring in one direction(cid:2)
`continues until a valid codeword is found or until iterations are completed(cid:2)
`
` (cid:2) Performance on an AWGN Channel
`
`We estimated the performance of an N (cid:28) (cid:2) (cid:10) T (cid:28) ring(cid:11)connected TCC with
`n (cid:28) n (cid:28) n (cid:28) (cid:4) and R (cid:28) R (cid:28) R (cid:28) (cid:4)(cid:10) giving R (cid:28) (cid:4)(cid:2) Each rate (cid:29) trellis
`used the generator polynomials (cid:3) (cid:2) (cid:2) (cid:2) (cid:7)octal(cid:10) which we found by trial and error(cid:2) The
`constituent codeword bits were shared alternately with the two neighboring trellises(cid:2)
`Fig(cid:2)  shows the performance of this ring(cid:11)connected TCC(cid:2) It performs sign(cid:21)cantly better
`than the homogenous TCC and the low(cid:11)density parity(cid:11)check code and only (cid:2) dB worse
`than the turbocde(cid:2)
`
` Discussion
`
`There is some literature on the properties of the constituent trellises that make good tur(cid:11)
`bocodes (cid:13) (cid:10) (cid:10) (cid:16) and serially(cid:11)concatenated convolutional codes (cid:13) (cid:16)(cid:2) We are currently
`exploring similar arguments for choosing the properties of the constituent trellises that
`
`
`
`Hughes, Exh. 1055, p. 6
`
`

`

`will make good homogenous TCC(cid:30)s and ring(cid:11)connected TCC(cid:30)s(cid:2) The behavior of these
`two new families of TCC(cid:30)s is quite di(cid:27)erent from that of turbocodes(cid:10) so we expect dif(cid:11)
`ferent properties to be relevant(cid:2) For example(cid:10) in order to avoid low(cid:11)weight codewords in
`a turbocode(cid:10) we try to avoid permuters that keep a pair of bits the same distance apart
`in the two constituent trellises (cid:13) (cid:16)(cid:2) This degenerate e(cid:27)ect is (cid:22)broken(cid:24) by the ring of a
`ring(cid:11)connected TCC(cid:10) which requires not only that two neighboring trellises have an equal
`set of shared (cid:22)input(cid:24) bits(cid:10) but also that their (cid:22)output(cid:24) bits must satisfy the constraints
`given by the ramainder of the ring(cid:2) However(cid:10) in a ring(cid:11)connected TCC low(cid:11)weight code(cid:11)
`words are introduced by permuters that preserve a degenerate pattern around the entire
`ring(cid:2) Another important question is what properties of a TCC lead to the initial drop
`(cid:3)say down to a BER of (cid:2) (cid:7)(cid:31) These properties may very well be independent (cid:3)or even
`contrary to(cid:7) those properties that give high minimum distance(cid:2)
`
`We believe that the general class of (cid:22)trellis(cid:11)constrained codes(cid:24) presented in this paper
`is a fruitful generalization of several other iteratively decodable codes(cid:2) If we think of entire
`trellises as nodes in a graph whose edges represent bundles of codeword bits(cid:10) it is evident
`that there are a variety of other interesting new TCC(cid:30)s aside from those shown in Fig(cid:2)
`that have yet to be explored(cid:2)
`
`Acknowledgements
`
`We appreciate discussions we had with members of the (cid:22)Ulm group(cid:24)(cid:10) Dave Forney(cid:10) Ralph
`Koetter(cid:10) Frank Kschischang(cid:10) Andi Loeliger(cid:10) Bob McEliece(cid:10) Michael Tanner(cid:10) and Niclas
`Wiberg(cid:10) as well as discussions we had with Rob Calderbank and Alexander Vardy(cid:2)
`
`Appendix A(cid:7) Computing BER Con(cid:8)dence Intervals
`
`When analytic methods are not available for computing bit error rates in error(cid:11)correcting
`coding systems(cid:10) we must resort to simulation(cid:2) Estimated BER(cid:30)s can vary signi(cid:21)cantly
`from experiment to experiment(cid:10) and so it is often desirable to include con(cid:21)dence intervals(cid:2)
`This is especially important for the long block length codes discussed in this paper(cid:10) since
`signi(cid:21)cant variability can be introduced by our inability to simulate enough blocks to
`pin down the word error rate(cid:2) Also(cid:10) for low bit error rates (cid:3)e(cid:7)g(cid:7)(cid:10) (cid:0)(cid:7) we may not be
`able to measure the distribution of bit errors within erroneously decoded words(cid:2) In this
`section(cid:10) we present a Monte Carlo approach for estimating the median and a (cid:2) (cid:29) (cid:2)
`con(cid:21)dence interval for the BER(cid:10) based on errors measured by Monte Carlo simulation(cid:2)
`
`The error model contains two parameters(cid:19) the probability pw of word error(cid:10) and the
`probability pb of bit error within erroneous words(cid:2) This is a rather crude approximation(cid:10)
`since in practice we expect there to be more than one failure mode(cid:10) i(cid:7)e(cid:7)(cid:10) there ought to
`be several pb(cid:30)s corresponding to di(cid:27)erent failure modes(cid:2)
`Let M be the number of words transmitted and let nw be the number of measured
`word errors(cid:2) Let K be the number of information bits per word(cid:10) and let nb be the
`total number of bit errors measured while transmitting all M blocks(cid:2) We will refer to
`the measured values as the data(cid:10) D (cid:28) fnw(cid:2) nbg(cid:2) From the Bayesian perspective(cid:10) before
`observing D(cid:10) we place a prior distribution p(cid:3)pw(cid:2) pb(cid:7) on the error model parameters(cid:2) After
`observing D(cid:10) we draw conclusions (cid:3)e(cid:7)g(cid:7)(cid:10) compute a con(cid:21)dence interval(cid:7) from the posterior
`distribution p(cid:3)pw(cid:2) pbjD(cid:7)(cid:10) where
`
`p(cid:3)pw(cid:2) pbjD(cid:7) (cid:5) p(cid:3)pw(cid:2) pb(cid:7)P (cid:3)Djpw(cid:2) pb(cid:7)(cid:3)
`
`(cid:3)(cid:7)
`
`
`
`Hughes, Exh. 1055, p. 7
`
`

`

`In this equation(cid:10) the constant of proportionality does not depend on pw or pb(cid:2) The last
`factor P (cid:3)Djpw(cid:2) pb(cid:7) is called the likelihood(cid:2)
`We let pw and pb be independent beta(cid:11)distributed random variables under the prior(cid:19)
`p(cid:3)pw(cid:2) pb(cid:7) (cid:28)p(cid:3)p w(cid:7)p(cid:3)pb(cid:7)(cid:10) where
`
`p(cid:3)pw(cid:7) (cid:5) p(cid:5)w (cid:0)
`w
`
`(cid:3) (cid:3) pw(cid:7)(cid:6)w (cid:0) (cid:2)
`
`and
`
`p(cid:3)pb(cid:7) (cid:5) p(cid:5)b(cid:0)
`b
`
`(cid:3) (cid:3) pb(cid:7)(cid:6)b(cid:0) (cid:3)
`
`(cid:3)(cid:7)
`
`In frequentist terms(cid:10) (cid:6)w and (cid:7)w have the e(cid:27)ect of shrinking our measurements toward
`a word error rate of (cid:6)w(cid:4)(cid:3)(cid:6)w (cid:23) (cid:7)w(cid:7)(cid:10) where the in!uence of this shrinkage grows with
`(cid:6)w (cid:23) (cid:7)w(cid:2) Typically(cid:10) we choose (cid:6)w (cid:28) (cid:7)w (cid:28) (cid:10) which gives a uniform prior over pw as
`shown in Fig(cid:2) a(cid:2)
`
`As for the prior over pb(cid:10) it should be chosen while keeping in mind the behavior of the
`decoder(cid:2) If the main source of bit errors is a failure to decode(cid:10) and if we believe that for
`failures the decoder will produce a probability of bit error that is roughly equal to the
`probability pu of bit error for uncoded transmission(cid:10) then the prior should place weight
`on pb (cid:28) pu(cid:2) In this case(cid:10) we choose (cid:6)b (cid:28)  and (cid:7)b (cid:28) (cid:4)pu(cid:10) which ensures that the mode of
`the prior occurs at pu and that the prior is relatively broad(cid:2) For example(cid:10) for Eb(cid:4)N (cid:28)
`dB we have pu (cid:28) (cid:3) (cid:10) and so We choose (cid:6)b (cid:28)  and (cid:7)b (cid:28) (cid:4)(cid:3) (cid:28) (cid:3)(cid:10) giving
`the prior distribution for pb shown in Fig(cid:2) b(cid:2)
`
`0
`
`0.2
`
`0.4
`pb
`
`0.6
`
`0.8
`
`1
`
`012345678
`
`(cid:3)b(cid:7)
`
`p(cid:3)pb(cid:7)
`
`(cid:3)a(cid:7)
`
`1
`
`p(cid:3)pw(cid:7)
`
`0.8
`
`0.6
`
`0.4
`
`0.2
`
`0
`
`0
`
`0.2
`
`0.4
`
`0.8
`
`1
`
`0.6
`pw
`
`Figure (cid:19) (cid:3)a(cid:7) The prior distribution over the probability of word error pw(cid:2) (cid:3)b(cid:7) The prior
`distribution over the probability of bit error pb within erroneous words(cid:2) This distribu(cid:11)
`tion is designed so that its median is equal to the probability of bit error for uncoded
`transmission(cid:2)
`
`It is straightforward to show that the likelihood is
`
`w (cid:3) (cid:3) pw(cid:7)M (cid:0)nwpnb
`P (cid:3)Djpw(cid:2) pb(cid:7) (cid:28) P (cid:3)nw(cid:2) nbjpw(cid:2) pb(cid:7) (cid:5) pnw
`b (cid:3) (cid:3) pb(cid:7)nwK(cid:0)nb(cid:3)
`
`(cid:3)(cid:7)
`
`This distribution is the product of a binomial distribution for the number of word errors
`and a binomial distribution for the number of bit errors(cid:2) Combining this likelihood with
`the prior(cid:10) we obtain the posterior(cid:10)
`
`p(cid:3)pw(cid:2) pbjD(cid:7) (cid:5) p(cid:5)w (cid:0) (cid:10)nw
`w
`
`(cid:3) (cid:3) pw(cid:7)(cid:6)w(cid:0) (cid:10)M (cid:0)nw p(cid:5)b(cid:0) (cid:10)nb
`b
`
`(cid:3) (cid:3) pb(cid:7)(cid:6)b(cid:0) (cid:10)nwK(cid:0)nb(cid:2)
`
`(cid:3)(cid:7)
`
`
`
`Hughes, Exh. 1055, p. 8
`
`

`

`which is just the product of a beta distribution over pw and a separate beta distribution
`over pb(cid:2) Of course(cid:10) we are actually interested in the posterior distribution p(cid:3)pwpbjD(cid:7) over
`the total probability of a bit error pwpb(cid:2) A sample is obtained from p(cid:3)pwpbjD(cid:7) by drawing
`pw " pb pairs from the posterior in (cid:3)(cid:7) and taking the product of pw and pb in each pair(cid:2)
`This sample is sorted in ascending order(cid:10) and the value of pwpb occuring half(cid:11)way through
`the sorted list is taken as an estimate of the median of p(cid:3)pwpbjD(cid:7)(cid:2) Similarly(cid:10) the values
`of pwpb occuring (cid:2) and (cid:2) through the sorted list are taken as the  con(cid:21)dence
`interval(cid:2)
`
`For the homogenous TCC(cid:10) we simulated the transmission of M (cid:28)  blocks at
`Eb(cid:4)N (cid:28) (cid:3)  dB using a block length of N (cid:28) (cid:2) (cid:2) We measured nw (cid:28)  and
`nb (cid:28) (cid:2) (cid:2) Using the prior presented above for the slightly higher value of Eb(cid:4)N (cid:28)
`dB(cid:10) a sample of points from the posterior over pw and pb was obtained and is shown
`in Fig(cid:2) a(cid:2) As described above(cid:10) for (cid:8) (cid:28) (cid:2)(cid:10) (cid:2) and (cid:2) (cid:10) we found the values for p(cid:7)
`
`Samples
`= 9.9e-4
`= 1.7e-3
`= 2.6e-3
`
`pwpb
`pwpb
`pwpb
`
`(cid:3)a(cid:7)
`
`pb
`
`0.4
`
`0.35
`
`0.3
`
`0.25
`
`0.2
`
`0.15
`
`0.1
`
`0.05
`
`(cid:3)b(cid:7)
`
`pb
`
`0.4
`
`0.35
`
`0.3
`
`0.25
`
`0.2
`
`0.15
`
`0.1
`
`0.05
`
`Samples
`= 1.6e-7
`= 5.1e-6
`= 4.8e-5
`
`pwpb
`pwpb
`pwpb
`
`0
`1e-7
`
`1e-6
`
`1e-5
`
`1e-4
`
`1e-2
`
`1e-1
`
`1e-0
`
`1e-3
`pw
`
`0
`1e-7
`
`1e-6
`
`1e-5
`
`1e-4
`
`1e-2
`
`1e-1
`
`1e-0
`
`1e-3
`pw
`
`Figure (cid:19) (cid:3)a(cid:7) A (cid:11)point sample from p(cid:3)pw(cid:2) pbjD(cid:7) for M (cid:28) (cid:10) nw (cid:28) (cid:10) K (cid:28) (cid:2)  
`and nb (cid:28) (cid:2) (cid:10) for the prior described in the main text(cid:2) (cid:3)b(cid:7) A (cid:11)point sample from
`p(cid:3)pw(cid:2) pbjD(cid:7) for M (cid:28) (cid:2)  (cid:10) nw (cid:28) (cid:10) K (cid:28) (cid:2)   and nb (cid:28) (cid:10) for the same prior(cid:2)
`
`such that #p(cid:3)pwpb (cid:9) p(cid:7)jD(cid:7) (cid:28) (cid:8)(cid:10) where #p is the sample distribution(cid:2) The corresponding
`three curves of the form pwpb (cid:28) p(cid:7) are shown in Fig(cid:2) a(cid:10) and the corresponding values of
`p(cid:7) give a median of (cid:3) (cid:4) (cid:0) and a  con(cid:21)dence interval of (cid:3) (cid:3) (cid:4) (cid:0)(cid:2) (cid:3) (cid:4) (cid:0) (cid:7)(cid:2)
`Clearly(cid:10) in this case it is the values for pw that determine the p(cid:7)(cid:30)s for these curves(cid:10) whereas
`the values for pb are well(cid:11)determined by the measurements(cid:2) We could have assumed that
`pb took on its measured value instead of sampling from the posterior(cid:2)
`For the homogenous TCC described above(cid:10) we also simulated the transmission of
`M (cid:28) (cid:2)   blocks at Eb(cid:4)N (cid:28) (cid:3) dB(cid:2) In this case(cid:10) we measured nw (cid:28) and nb (cid:28) (cid:2)
`Using naive methods(cid:10) we might conclude that the bit error rate is and that there isn(cid:30)t
`any variation in this value(cid:2) However(cid:10) the Bayesian technique gives the sample from the
`posterior shown in Fig(cid:2) b(cid:2)
`In this case(cid:10) the values of both pw and pb play a role in
`determining the p(cid:7)(cid:30)s for the three curves(cid:2) The median is (cid:3) (cid:4) (cid:0) and the con(cid:21)dence
`interval is (cid:3) (cid:3) (cid:4) (cid:0)(cid:2) (cid:3) (cid:4) (cid:0)(cid:7)(cid:2)
`
`
`
`Hughes, Exh. 1055, p. 9
`
`

`

`References
`
`(cid:0) (cid:3) R(cid:4) M(cid:4) Tanner(cid:5) (cid:6)A recursive approach to low complexity codes(cid:5)(cid:7) IEEE Transactions on Information
`Theory(cid:5) vol(cid:4) (cid:5) pp(cid:4)  (cid:12)(cid:5)  (cid:4)
`
`(cid:0)(cid:3) N(cid:4) Wiberg(cid:5) H(cid:4)(cid:16)A(cid:4) Loeliger(cid:5) and R(cid:4) K(cid:17)otter(cid:5) (cid:6)Codes and iterative decoding on general graphs(cid:5)(cid:7)
`Eur

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket