A PUBLICATION OF THE IEEE COMMUNICATIONS SOCIETY

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS

FEBRUARY 1998        VOLUME 16        NUMBER 2        ISACEM

CONCATENATED CODING TECHNIQUES AND ITERATIVE DECODING:
SAILING TOWARD CHANNEL CAPACITY
Guest Editors: S. Benedetto, D. Divsalar, and J. Hagenauer

Guest Editorial ................ S. Benedetto, D. Divsalar, and J. Hagenauer    137
PAPERS

Turbo Decoding as an Instance of Pearl's "Belief Propagation" Algorithm ................ R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng    140
Early Detection and Trellis Splicing: Reduced-Complexity Iterative Decoding ................ B. J. Frey and F. R. Kschischang    153
Design and Analysis of Turbo Codes on Rayleigh Fading Channels ................ E. K. Hall and S. G. Wilson    160
Symbol-by-Symbol MAP Decoding Algorithm for High-Rate Convolutional Codes That Use Reciprocal Dual Codes ................ S. Riedel    175
Concatenated Decoding with a Reduced-Search BCJR Algorithm ................ V. Franz and J. B. Anderson    186
Performance Evaluation of Superorthogonal Turbo Codes in AWGN and Flat Rayleigh Fading Channels ................ P. Komulainen and K. Pehkonen    196
Bandwidth-Efficient Turbo Trellis-Coded Modulation Using Punctured Component Codes ................ P. Robertson and T. Wörz    206
Iterative Decoding of Compound Codes by Probability Propagation in Graphical Models ................ F. R. Kschischang and B. J. Frey    219
Analysis, Design, and Iterative Decoding of Double Serially Concatenated Codes with Interleavers ................ S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara    231
A Conceptual Framework for Understanding Turbo Codes ................ G. Battail    245
Mismatched Decoding of Intersymbol Interference Using a Parallel Concatenated Scheme ................ K. Balachandran and J. B. Anderson    255
An Intuitive Justification and a Simplified Implementation of the MAP Decoder for Convolutional Codes ................ A. J. Viterbi    260
Multiple Differential Detection of Parallel Concatenated Convolutional (Turbo) Codes in Correlated Fast Rayleigh Fading ................ I. D. Marsland and P. T. Mathiopoulos    265
On Iterative Soft-Decision Decoding of Linear Binary Block Codes and Product Codes ................ R. Lucas, M. Bossert, and M. Breitbach    276

(Continued on Back Cover)
Page 1 of 15

SAMSUNG EXHIBIT 1006
IEEE COMMUNICATIONS SOCIETY

The field of interest of the IEEE Communications Society consists of all telecommunications, including telephone, telegraphy, facsimile, and point-to-point television, by electromagnetic propagation, including radio; wire, aerial, underground, coaxial, and submarine cables; waveguides, communication satellites, and lasers; in marine, aeronautical, space, and fixed station services; repeaters, radio relaying, signal storage, and regeneration; telecommunication error detection and correction; multiplexing and carrier techniques; communication switching systems; data communications; and communication theory.

In addition to the above, this JOURNAL and the IEEE TRANSACTIONS ON COMMUNICATIONS contain papers pertaining to analog and digital signal processing and modulation, audio and video encoding techniques, the theory and design of transmitters, receivers, and repeaters for communications via optical and sonic media, the design and analysis of computer communication systems, and the development of communication software. Contributions of theory enhancing the understanding of communication systems and techniques are included, as are discussions of the social implications of the development of communication technology. All members of the IEEE are eligible for membership in the Society upon payment of the annual Society membership fee; members may receive this JOURNAL, the IEEE TRANSACTIONS ON COMMUNICATIONS, or both upon payment of the additional subscription fees. For information on joining, write to the IEEE at the address below. Member copies of Transactions/Journals are for personal use only.
IEEE COMMUNICATIONS SOCIETY
JOURNAL Editorial Board 1998
J-SAC Home Page URL: http://gump.bellcore.com:5000

W. H. TRANTER, Director of Journals
Virginia Tech
432 New Engineering Bldg.
Blacksburg, VA 24061-0350
tranter@vt.edu

L. B. MILSTEIN, Editor-in-Chief
Dept. Elect. and Comput. Eng.
Mail Code 0407
Univ. of California, SD
La Jolla, CA 92093
milstein@ece.ucsd.edu

SUE L. McDONALD, Executive Editor
Bellcore, Rm. 1A308R
445 South Street
Morristown, NJ 07960-1910
sue@bellcore.com

J. F. HAYES
Dept. Elec. Eng.
Concordia Univ.
1455 De Maisonneuve
Montreal, Quebec
Canada H3G 1M8

N. MAXEMCHUK
AT&T Laboratories
Bldg. 101, Room A113
180 Park Ave.
Florham Park, NJ 07932

Senior Editors

R. RAMASWAMI
Tellabs
15 Skyline Drive
Hawthorne, NY 10532

W. H. TRANTER
Dept. Elec. Eng.
Univ. of Missouri-Rolla
Rolla, MO 65401-0249

T. S. RAPPAPORT
The Bradley Dept. Elec. Eng.
615 Whittemore Hall
Virginia Polytech. Inst. & State Univ.
Blacksburg, VA 24061-0111

D. P. TAYLOR
Dept. E & E Engineering
Univ. of Canterbury
Private Bag 4800
Christchurch, New Zealand
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (ISSN 0733-8716) is published nine times a year in January, February, April, May, June, August, September, October, and December by The Institute of Electrical and Electronics Engineers, Inc. Responsibility for the contents rests upon the authors and not upon the IEEE, the Society, or its members. IEEE Corporate Office: 345 East 47 Street, New York, NY 10017-2394. IEEE Operations Center: 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. Price/Publication Information: Individual copies: IEEE Members $10.00 (first copy only), nonmembers $20.00 per copy. (Note: Add $4.00 postage and handling charge to any order from $1.00 to $50.00, including prepaid orders.) Member and nonmember subscription prices available upon request. Available in microfiche and microfilm. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For all other copying, reprint, or republication permission, write to Copyrights and Permissions Department, IEEE Publications Administration, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. Copyright © 1998 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Periodicals Postage Paid at New York, NY and at additional mailing offices. Postmaster: Send address changes to IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, IEEE, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. GST Registration No. 125634188. Printed in U.S.A.
140

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 16, NO. 2, FEBRUARY 1998

Turbo Decoding as an Instance of
Pearl's "Belief Propagation" Algorithm

Robert J. McEliece, Fellow, IEEE, David J. C. MacKay, and Jung-Fu Cheng
Abstract-In this paper, we will describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pearl's belief propagation algorithm. We shall see that if Pearl's algorithm is applied to the "belief network" of a parallel concatenation of two or more codes, the turbo decoding algorithm immediately results. Unfortunately, however, this belief diagram has loops, and Pearl only proved that his algorithm works when there are no loops, so an explanation of the excellent experimental performance of turbo decoding is still lacking. However, we shall also show that Pearl's algorithm can be used to routinely derive previously known iterative, but suboptimal, decoding algorithms for a number of other error-control systems, including Gallager's low-density parity-check codes, serially concatenated codes, and product codes. Thus, belief propagation provides a very attractive general methodology for devising low-complexity iterative decoding algorithms for hybrid coded systems.

Index Terms-Belief propagation, error-correcting codes, iterative decoding, Pearl's Algorithm, probabilistic inference, turbo codes.
I. INTRODUCTION AND SUMMARY

TURBO codes, which were introduced in 1993 by Berrou et al. [10], are the most exciting and potentially important development in coding theory in many years. Many of the structural properties of turbo codes have now been put on a firm theoretical footing [7], [18], [20], [21], [27], [45], and several innovative variations on the turbo theme have appeared [5], [8], [9], [12], [27], [48].
What is still lacking, however, is a satisfactory theoretical explanation of why the turbo decoding algorithm performs as well as it does. While we cannot yet announce a solution to this problem, we believe that the answer may come from a close study of Pearl's belief propagation algorithm, which is largely unknown to information theorists, but well known in the artificial intelligence community. (The first mention of belief propagation in a communications paper, and indeed the paper that motivated this one, is that of MacKay and Neal [37]. See also [38] and [39].)

Manuscript received September 27, 1996; revised May 3, 1997. This work was supported by NSF Grant NCR-9505975, AFOSR Grant 5F49620-97-1-0313, and a grant from Qualcomm, Inc. A portion of R. J. McEliece's contribution was done while he was visiting the Sony Corporation in Tokyo. The collaboration between D. J. C. MacKay and R. J. McEliece was begun at, and partially supported by, the Newton Institute for Mathematical Sciences, Cambridge, U.K.

R. J. McEliece is with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA.
D. J. C. MacKay is with the Cavendish Laboratory, Department of Physics, Darwin College, Cambridge University, Cambridge CB3 0HE U.K.
J.-F. Cheng is with Salomon Brothers Inc., New York, NY 10048 USA.
Publisher Item Identifier S 0733-8716(98)00170-X.
In this paper, we will review the turbo decoding algorithm as originally expounded by Berrou et al. [10], but which was perhaps explained more lucidly in [3], [18], or [50]. We will then describe Pearl's algorithm, first in its natural "AI" setting, and then show that if it is applied to the "belief network" of a turbo code, the turbo decoding algorithm immediately results. Unfortunately, however, this belief network has loops, and Pearl's algorithm only gives exact answers when there are no loops, so the existing body of knowledge about Pearl's algorithm does not solve the central problem of turbo decoding. Still, it is interesting and suggestive that Pearl's algorithm yields the turbo decoding algorithm so easily. Furthermore, we shall show that Pearl's algorithm can also be used to derive effective iterative decoding algorithms for a number of other error-control systems, including Gallager's low-density parity-check codes, the recently introduced low-density generator matrix codes, serially concatenated codes, and product codes. Some of these "BP" decoding algorithms agree with the ones previously derived by ad hoc methods, and some are new, but all prove to be remarkably effective. In short, belief propagation provides an attractive general method for devising low-complexity iterative decoding algorithms for hybrid coded systems. This is the message of the paper. (A similar message is given in the paper by Kschischang and Frey [33] in this issue.)
Here is an outline of the paper. In Section II, we derive some simple but important results about, and introduce some compact notation for, "optimal symbol decision" decoding algorithms. In Section III, we define what we mean by a turbo code, and review the turbo decoding algorithm. Our definitions are deliberately more general than what has previously appeared in the literature. In particular, our transmitted information is not binary, but rather comes from a q-ary alphabet, which means that we must deal with q-ary probability distributions instead of the traditional "log-likelihood ratios." Furthermore, the reader may be surprised to find no discussion of "interleavers," which are an essential component of all turbo-coding systems. This is because, as we will articulate fully in our concluding remarks, we believe that the interleaver's contribution is to make the turbo code a "good" code, but it has nothing directly to do with the fact that the turbo decoding algorithm is a good approximation to an optimal decoder. In Section IV, we change gears, and give a tutorial overview of the general probabilistic inference problem, with special reference to Bayesian belief networks. In Section V,
0733-8716/98$10.00 © 1998 IEEE
McELIECE et al.: TURBO DECODING AS PEARL'S ALGORITHM
Fig. 1. Codeword X = (U, X_1) is transmitted over a memoryless channel and received as Y = (Y_s, Y_1).
we describe Pearl's BP algorithm, which can be defined on any belief network, and which gives an exact solution to the probabilistic inference problem when the belief network has no loops. In Section VI, we show that the turbo decoding algorithm follows from a routine application of Pearl's algorithm to the appropriate (loopy) belief network. In Section VII, we briefly sketch some other decoding algorithms that can be derived from BP considerations. Finally, in Section VIII, we summarize our findings and venture some conclusions.
II. PRELIMINARIES

In this section, we will describe a general class of q-ary systematic encoders, and derive the optimal symbol-by-symbol decoding rule for a memoryless channel.

Let U = (U_1, ···, U_k) be a k-dimensional random vector of independent, but not necessarily equiprobable, symbols from a q-letter alphabet A, with Pr{U_i = a} = π_i(a), for a ∈ A. The vector U represents information to be transmitted reliably over an unreliable channel. We suppose that U is encoded systematically, i.e., mapped into a codeword X of the form

    X = (U, X_1)                                        (2.1)

where U is the "systematic" part and X_1 is the "nonsystematic" part of the codeword X. In the rest of the paper, we will sometimes call X_1 a codeword fragment.

We assume that the codeword X is transmitted over a noisy channel with transition probabilities p(y|x) ≜ Pr{Y = y | X = x}, and received as Y = (Y_s, Y_1), where Y_s is the portion of Y corresponding to the systematic part of the codeword U, and Y_1 is the portion corresponding to the codeword fragment X_1. We assume further that the channel is memoryless, which implies that the conditional density factors according to the rule

    p(y|x) = p(y_s, y_1 | u, x_1)
           = p(y_s | u) p(y_1 | x_1)                    (2.2)
           = (Π_{i=1}^{k} p(y_si | u_i)) · p(y_1 | x_1) (2.3)

where y_si denotes the ith component of y_s. The situation is depicted in Fig. 1.

The decoding problem is to "infer" the values of the hidden variables U_i based on the "evidence," viz. the observed values y_s and y_1 of the variables Y_s and Y_1. The optimal decision, i.e., the one that minimizes the probability of inferring an incorrect value for U_i, is the one based on the conditional probability, or "belief," that the information symbol in question has a given value

    BEL_i(a) ≜ Pr{U_i = a | Y_s = y_s, Y_1 = y_1}.      (2.4)

(A communication theorist would use the term "a posteriori probability," rather than "belief.") If a_0 is such that BEL_i(a_0) > BEL_i(a), for all a ≠ a_0, the decoder infers that U_i = a_0. The following straightforward computation is central to our results. In this computation, and for the rest of the paper, we will use Pearl's α notation [44].

Definition 2.1: If x = (x_1, ···, x_m) and y = (y_1, ···, y_m) are vectors of nonnegative real numbers, the notation

    x = α y

means that x_i = y_i / (Σ_{k=1}^{m} y_k), for i = 1, ···, m. In other words, x is a probability vector whose components are proportional to those of y. (If f(x) and g(x) are nonnegative real-valued functions defined on a finite set, the notation f(x) = α g(x) is defined similarly.)

Lemma 2.2: If the likelihood p(y_si | u_i)¹ is denoted by λ_i(u_i), then the belief BEL_i(a) defined in (2.4) is given by

    BEL_i(a) = α Σ_{u: u_i = a} p(y_1 | x_1) Π_{j=1}^{k} λ_j(u_j) π_j(u_j)
             = α λ_i(a) π_i(a) Σ_{u: u_i = a} p(y_1 | x_1) Π_{j=1, j≠i}^{k} λ_j(u_j) π_j(u_j).   (2.5)

Proof: We have, by the definition (2.4), BEL_i(a) = Pr{U_i = a | Y = y}. Then

    Pr{U_i = a | Y = y} = Pr{Y = y, U_i = a} / Pr{Y = y}
      = α Pr{Y = y, U_i = a}                                     (using the α notation)
      = α Σ_{u: u_i = a} p(y, u)
      = α Σ_{u: u_i = a} p(y | u) · p(u)
      = α Σ_{u: u_i = a} p(y_1 | x_1) p(y_s | u) · Π_{j=1}^{k} π_j(u_j)          (by (2.2))
      = α Σ_{u: u_i = a} p(y_1 | x_1) · Π_{j=1}^{k} λ_j(u_j) π_j(u_j)            (by (2.3))
      = α λ_i(a) π_i(a) Σ_{u: u_i = a} p(y_1 | x_1) Π_{j=1, j≠i}^{k} λ_j(u_j) π_j(u_j).

The last two lines of the above calculation are the assertions of the lemma. ∎

We see from (2.5) that BEL_i(a) is the product of three terms. The first term, λ_i(a), might be called the systematic evidence term. The second term, π_i(a), takes into account the a priori distribution of U_i. Note that the effect of the systematic evidence is, in effect, to change the prior distribution of U_i from π_i(a) to α π_i(a) λ_i(a). The third term, which is more

¹ If the encoder is not systematic, i.e., if the uncoded information symbols U_i are not transmitted, these likelihoods should all be set equal to one.
TABLE I
UPDATE RULES FOR PEARL'S ALGORITHM (HERE, ⟨v⟩ = v_1 v_2 ··· v_n, IF v = (v_1, ···, v_n) IS A VECTOR OF REAL NUMBERS)

    quantity (at X)   Type                              Update Rule
1.  µ_X(u)            A_{U_1} × ··· × A_{U_M} → R^M     ⟨π_{U_i,X}(u)⟩  (1 if X has no parents)
2.  λ_X(x)            A_X → R                           ⟨λ_{Y_j,X}(x)⟩  (1 if X has no children)
3.  π_X(x)            A_X → R                           Σ_u p(x|u) µ_X(u)  (p(x) if X has no parents)
4.  γ_X(u)            A_{U_1} × ··· × A_{U_M} → R       Σ_x λ_X(x) p(x|u)
5.  BEL_X(x)          A_X → R                           α · λ_X(x) π_X(x)
6.  λ_{X,U_j}(u)      A_{U_1} × ··· × A_{U_M} → R^M     π_{U,X} ∘ γ_X, i ≠ j
7.  π_{X,Y_j}(x)      A_X → R^N                         π_X(x) · Π_{i≠j} λ_{Y_i,X}(x)
complicated, takes into account the geometry of the code. Following [10], we will call this term the extrinsic term, and denote it by E_i(a). The extrinsic term is so important to what follows that we shall introduce a special notation for it. (This notation will also prove useful in Section V, where we shall use it to describe Pearl's algorithm; see Table I, line 6.)

Thus, let A_1, ···, A_k be finite alphabets, let U ≜ A_1 × ··· × A_k, and let R denote the set of real numbers. Let g = (g_1, ···, g_k) be a function mapping U into R^k. In other words, g is a vector of k real-valued functions, and if u = (u_1, ···, u_k) ∈ U, then

    g(u) = (g_1(u_1), ···, g_k(u_k)).

Now, suppose that K(u) is a real-valued function defined on the set U, which we call a kernel. The K transform of g is the vector g' = (g'_1, ···, g'_k), where g'_i is defined by

    g'_i(a) = Σ_{u: u_i = a} K(u) Π_{j=1, j≠i}^{k} g_j(u_j).   (2.6)

We summarize (2.6) by writing

    g' = g ∘ K.                                                (2.7)

Next, if f and g are vector-valued functions as above, we define their adjacent product h = fg as a simple componentwise product, i.e., h = (h_1, ···, h_k), where

    h_i(a) = f_i(a) g_i(a).                                    (2.8)

Using the circle and adjacent notation,² we can express the result of Lemma 2.2 compactly. To do so, we take U = A^k, and define a kernel p(u) as

    p(u) ≜ p(y_1 | x_1)

where the codeword fragment x_1 = x_1(u) is a deterministic function of u. Then Lemma 2.2 can be summarized as follows:

    BEL = α λπ(λπ ∘ p)                                         (2.9)

where λ(u) = (λ_1(u_1), ···, λ_k(u_k)) and π(u) = (π_1(u_1), ···, π_k(u_k)).

² We assume that "adjacent" takes precedence over "circle" in order to minimize the use of parentheses.

III. SYSTEMATIC PARALLEL CONCATENATED (TURBO) CODES

In this section, we will define what we mean by a turbo code, and present a general version of the turbo decoding algorithm. With the same setup as in Section II, suppose we have two systematic encodings of U

    C_1: U → (U, X_1)
    C_2: U → (U, X_2).

One way to combine C_1 and C_2 into a single code is via the mapping

    U → (U, X_1, X_2)

which is called the parallel concatenation of C_1 and C_2, or the turbo code formed by combining C_1 and C_2.

Once again, we assume that the codeword X is transmitted through a noisy channel with transition probabilities p(y|x). It is received as Y = (Y_s, Y_1, Y_2), where Y_s is the component of Y corresponding to U, Y_1 is the component of Y corresponding to X_1, and Y_2 is the component of Y corresponding to X_2. We assume again that the channel is memoryless, which implies that the conditional density factors according to the rule

    p(y|x) = p(y_s, y_1, y_2 | u, x_1, x_2)
           = p(y_s | u) p(y_1 | x_1) p(y_2 | x_2)                        (3.1)
           = (Π_{i=1}^{k} p(y_si | u_i)) p(y_1 | x_1) p(y_2 | x_2).      (3.2)

The situation is as depicted in Fig. 2.
Fig. 2. Generic "turbo code." The codeword X = (U, X_1, X_2) is transmitted over a memoryless channel and received as Y = (Y_s, Y_1, Y_2).

Fig. 3. Block diagram of turbo decoding procedure. (Decoders D_1 and D_2 process (Y_s, Y_1) and (Y_s, Y_2), exchanging the messages π^(1), π^(3), π^(5), ··· and π^(2), π^(4), π^(6), ···.)
By Lemma 2.2, the optimal decisions for the turbo code are based on the beliefs

    BEL_i(a) = α Σ_{u: u_i = a} p(y_1|x_1) p(y_2|x_2) Π_{j=1}^{k} λ_j(u_j) π_j(u_j)
             = α λ_i(a) π_i(a) Σ_{u: u_i = a} p(y_1|x_1) p(y_2|x_2) Π_{j=1, j≠i}^{k} λ_j(u_j) π_j(u_j).   (3.3)

For simplicity, and in accordance with engineering practice, from now on we will assume that the a priori probability density of the U_i's is uniform, i.e., π = (α1, ···, α1). With this assumption, using the notation introduced in Section II, (3.3) becomes³

    BEL = α λ(λ ∘ p_1 p_2)                                     (3.4)

where the kernels p_1 and p_2 are defined by

    p_1(u) = p(y_1 | x_1)
    p_2(u) = p(y_2 | x_2).                                     (3.5)

The celebrated "turbo decoding algorithm" [10], [50], [3] is an iterative approximation to the optimal beliefs in (3.3) or (3.4), whose performance, while demonstrably suboptimal [41], has nevertheless proved to be "nearly optimal" in an impressive array of experiments. The heart of the turbo algorithm is an iteratively defined sequence π^(m) of product probability densities on A^k defined by

    π^(0) = (α1, ···, α1)                                      (3.6)

i.e., π^(0) is a list of k uniform densities on A, and for m ≥ 1

    π^(m) = { α λπ^(m-1) ∘ p_1,   if m is odd
            { α λπ^(m-1) ∘ p_2,   if m is even.                (3.7)

Then the mth turbo belief vector is defined by

    BEL^(m) = α λπ^(m) π^(m-1).                                (3.8)

The general form of (3.7) is shown in Fig. 3.

In a "practical" decoder, the decision about the information bits is usually made after a fixed number of iterations. (The hope that the limit of (3.8) will exist is, in general, a vain one since, in [41], several examples of nonconvergence are given.) If the decision is made after m iterations, the mth turbo decision is defined as

    û_i^(m) = arg max_{a ∈ A} BEL_i^(m)(a).                    (3.9)

³ As we observed earlier, the effect of λ is to change the prior distribution from π to λπ. It follows that if there is a nonuniform prior π, it can be accounted for by replacing every occurrence of "λ" in our formulas with λπ.
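The iteration (3.6)-(3.9) can be rendered directly in code if the kernels are evaluated by brute-force enumeration rather than by a trellis-based soft decoder. In the sketch below, the two component codes, the channel, and the received values are all invented for illustration:

```python
import itertools

q, k, eps = 2, 3, 0.1   # toy parameters (illustrative)

def bsc(y, x):
    return 1 - eps if y == x else eps

# Hypothetical component codes: two parity bits for code 1, one for code 2.
def x1(u): return ((u[0] + u[1]) % 2, (u[1] + u[2]) % 2)
def x2(u): return ((u[0] + u[2]) % 2,)

ys, y1, y2 = (0, 0, 1), (0, 1), (1,)        # invented received values
lam = [[bsc(ys[i], a) for a in range(q)] for i in range(k)]

def kernel(y, frag):
    """Kernel p_c(u) = p(y_c | x_c(u)), evaluated bit by bit."""
    def K(u):
        p = 1.0
        for yb, xb in zip(y, frag(u)):
            p *= bsc(yb, xb)
        return p
    return K

def circle(g, K):   # the K-transform (2.6)
    out = [[0.0] * q for _ in range(k)]
    for u in itertools.product(range(q), repeat=k):
        Ku = K(u)
        for i in range(k):
            t = Ku
            for j in range(k):
                if j != i:
                    t *= g[j][u[j]]
            out[i][u[i]] += t
    return out

def adj(f, g):      # adjacent (componentwise) product (2.8)
    return [[f[i][a] * g[i][a] for a in range(q)] for i in range(k)]

def norm(v):        # Pearl's alpha normalization, row by row
    return [[x / sum(r) for x in r] for r in v]

K1, K2 = kernel(y1, x1), kernel(y2, x2)
pi_m = [[1.0 / q] * q for _ in range(k)]    # (3.6): uniform start
for m in range(1, 9):                       # a fixed number of half-iterations
    K = K1 if m % 2 == 1 else K2            # alternate the two kernels, per (3.7)
    pi_prev, pi_m = pi_m, norm(circle(adj(lam, pi_m), K))
bel = norm(adj(adj(lam, pi_m), pi_prev))    # (3.8): BEL = alpha*lam*pi^(m)*pi^(m-1)
decisions = [max(range(q), key=lambda a: bel[i][a]) for i in range(k)]   # (3.9)
print(decisions)
```

Here the received values are fully consistent with the information word (0, 0, 1), so the iteration settles quickly; in a practical decoder the enumeration over all q^k words would of course be replaced by a BCJR-style decoder, as discussed below (3.13).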
We conclude this section by observing that, as we have stated it, the turbo algorithm [(3.7) and (3.9)] does not appear to be significantly simpler than the optimal algorithm (3.4) since (for example) (λ ∘ p_1) is not, in general, much easier to compute than (λ ∘ p_1 p_2). The following theorem, and the discussion that follows, shed light on this problem.

Theorem 3.1: If the components of U are assumed to be independent, with Pr{U_i = u_i} = π_i^(m-1)(u_i), for i = 1, ···, k, then

    π_i^(m)(a) = { α Pr{U_i = a | y_s, y_1} / (λ_i(a) π_i^(m-1)(a)),   if m is odd
               { α Pr{U_i = a | y_s, y_2} / (λ_i(a) π_i^(m-1)(a)),   if m is even.   (3.10)

Proof: We consider the case m odd, the proof for even m being essentially the same. By reasoning similar to that in Lemma 2.2, we find that

    Pr{U_i = a | y_s, y_1} = α Σ_{u: u_i = a} p(y_1|u) Π_{j=1}^{k} λ_j(u_j) π_j^(m-1)(u_j).   (3.11)

If we divide both sides of (3.11) by λ_i(a) π_i^(m-1)(a), we obtain

    Pr{U_i = a | y_s, y_1} / (λ_i(a) π_i^(m-1)(a)) = α Σ_{u: u_i = a} p(y_1|u) Π_{j=1, j≠i}^{k} λ_j(u_j) π_j^(m-1)(u_j)
                                                   = α λπ^(m-1) ∘ p_1.                        (3.12)

Since by (3.7), π^(m) = α λπ^(m-1) ∘ p_1, the theorem follows. ∎

The significance of Theorem 3.1 is that it tells us that the appropriate components of the vectors π^(m) can be computed by a decoder for C_1 (or C_2) which is capable of computing the probabilities Pr{U_i = a | y_s, y_1}, based on an observation of the noisy codeword Y = (Y_s, Y_1), i.e., an optimal "soft" symbol decision decoder. The ith component of the message passed to the second decoder module is then

    π_i^(m)(a) = α Pr{U_i = a | y_s, y_1} / (λ_i(a) π_i^(m-1)(a))                             (3.13)

which is the "extrinsic information" referred to earlier.
One of the keys to the success of turbo codes is to use component codes C_1 and C_2 for which a low-complexity soft bit decision algorithm exists. For example, the BCJR or "APP" decoding algorithm [4] provides such an algorithm for any code, block or convolutional, that can be represented by a trellis.⁴ As far as is known, a code with a low-complexity optimal decoding algorithm cannot achieve high performance, which means that individually, the codes C_1 and C_2 must be relatively weak. The brilliant innovation of Berrou et al. [10] was to devise a code of the type shown in Fig. 2, in which the individual codes C_1 and C_2 are indeed relatively weak (but have a low-complexity decoding algorithm), in such a way that the overall code is very powerful. Roughly speaking, they accomplished this by making the encoder E_2 identical to E_1, except for a random permutation (accomplished by the "interleaver") of the inputs. (The encoders were short-constraint-length systematic convolutional encoders with feedback.) However, since it is the object of this paper to study the decoding algorithm without regard to the resulting performance, we shall not discuss the constructive aspect of turbo codes further.
IV. BACKGROUND ON PROBABILISTIC INFERENCE, BAYESIAN BELIEF NETWORKS, AND PEARL'S ALGORITHM

In this section, we will give a tutorial overview of the so-called probabilistic inference problem of the artificial intelligence community, as well as a brief discussion of Pearl's algorithm, which solves the probabilistic inference problem in many important special cases.

Thus, let X = {X_1, X_2, ···, X_N}⁵ be a set of N discrete random variables, where X_i assumes values in the finite alphabet A_i. The joint density function

    p(x) = p(x_1, x_2, ···, x_N) ≜ Pr{X_1 = x_1, ···, X_N = x_N}

is then a mapping from A_1 × ··· × A_N into the set of real numbers R. We assume that the marginal densities p(x_i) ≜ Pr{X_i = x_i} are also known. The marginal density function p(x_i) represents our a priori "belief" about the random variable X_i. Now, suppose that one or more of these random variables is measured or "observed." This means that there is a subset J ⊆ {1, 2, ···, N} (the evidence set) such that, for all j ∈ J, the random variable X_j is known to have a particular value, say a_j. The