`
`PROCEEDINGS OF THE TWENTY-NINTH
`
`ANNUAL ACM SYMPOSIUM ON
`
`THEORY OF COMPUTING
`
`El Paso, Texas
`
`May 4-6, 1997
`
`SPONSORED BY
`
`THE ACM SPECIAL INTEREST GROUP FOR
`
`ALGORITHMS AND COMPUTATION THEORY
`
`
`
`
`The Association for Computing Machinery
`1515 Broadway
New York, New York 10036
`
Copyright 1997 by the Association for Computing Machinery, Inc. (ACM). Permission to
make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that the copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on the first page. Copyrights
for components of this work owned by others than ACM must be honored. Abstracting with
credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to
lists requires prior specific permission and/or a fee. Request permission to republish from:
Publications Dept. ACM, Inc. Fax +1 (212) 869-0481 or <permissions@acm.org>. For other
copying of articles that carry a code at the bottom of the first or last page, copying is permitted
provided that the per-copy fee indicated in the code is paid through the Copyright Clearance
Center, 222 Rosewood Drive, Danvers, MA 01923.
`
`ACM ISBN: 0-89791-888-6
`
Additional copies may be ordered prepaid from:

ACM Order Department
PO Box 12114
Church Street Station
New York, NY 10257

Phone: 1-800-342-6626 (US and Canada)
+1-212-626-0500 (all other countries)
Fax: +1-212-944-1318
E-mail: acmpubs@acm.org

ACM European Service Center
108 Cowley Road
Oxford OX4 1JF
UK

Phone: +44-1-865-382338
Fax: +44-1-865-381338
E-mail: acm_europe@acm.org
URL: http://www.acm.org
`
`ACM Order Number: 508970
`
`Printed in the USA
`
`
`
`Proceedings of the Twenty-Ninth Annual
`ACM Symposium on Theory of Computing
May 4-6, 1997, El Paso, Texas
`
`TABLE OF CONTENTS
`
`Sunday, May 4
`
`Session 1A
`
Some Optimal Inapproximability Results .......................................................... 1
Johan Håstad
A Complete Classification of the Approximability of Maximization Problems Derived from Boolean
Constraint Satisfaction ........................................................................ 11
Sanjeev Khanna, Madhu Sudan and David P. Williamson
When Hamming Meets Euclid: The Approximability of Geometric TSP and MST .................... 21
Luca Trevisan
`
`Session 1B
Approximate Complex Polynomial Evaluation in Near Constant Work per Point ..................... 30
John H. Reif
Fast and Precise Computations of Discrete Fourier Transforms Using Cyclotomic Integers ............ 40
Joe Buhler, M. Amin Shokrollahi and Volker Stemann
Quantum Computation of Fourier Transforms over Symmetric Groups .............................. 48
Robert Beals
`
`Session 2A
General Techniques for Comparing Unrooted Evolutionary Trees .................................. 54
Ming-Yang Kao, Tak-Wah Lam, Teresa M. Przytycka, Wing-Kin Sung and Hing-Fung Ting
Tree Pattern Matching and Subset Matching in Randomized O(n log³ m) Time ...................... 66
Richard Cole and Ramesh Hariharan
`
`Session 2B
Randomized Ω(n²) Lower Bound for Knapsack .................................................... 76
Dima Grigoriev and Marek Karpinski
Exponential Lower Bounds for Depth 3 Boolean Circuits .......................................... 86
Ramamohan Paturi, Michael E. Saks and Francis Zane
`
`Invited Session I
`Algorithmic Complexity in Coding Theory and the Minimum Distance Problem ........................ 92
`Alexander Vardy
`
`Session 3A
`
Approximating Total Flow Time on Parallel Machines ........................................... 110
`Stefano Leonardi and Danny Raz
`Non-Clairvoyant Multiprocessor Scheduling of Jobs with Changing Execution Characteristics .......... 120
`Jeff Edmonds, Donald D. Chinn, Tim Brecht and Xiaotie Deng
`
`
`
Better Bounds for Online Scheduling .......................................................... 130
Susanne Albers
Optimal Time-Critical Scheduling via Resource Augmentation .................................... 140
Cynthia A. Phillips, Cliff Stein, Eric Torng and Joel Wein
`
`Session 3B
`
Practical Loss-Resilient Codes ............................................................... 150
Michael G. Luby, Michael Mitzenmacher, M. Amin Shokrollahi, Daniel A. Spielman and
Volker Stemann
Spectral Techniques for Expander Codes ....................................................... 160
John D. Lafferty and Daniel N. Rockmore
Faster Solution of the Key Equation for Decoding BCH Error-Correcting Codes .................... 168
Victor Y. Pan
Fault Tolerant Quantum Computation with Constant Error ....................................... 176
Dorit Aharonov and Michael Ben-Or
`
`Session 4A
On the Construction of Pseudo-Random Permutations: Luby-Rackoff Revisited ..................... 189
Moni Naor and Omer Reingold
Reducing Randomness via Irrational Numbers ................................................... 200
Zhi-Zhong Chen and Ming-Yang Kao
Is There an Algebraic Proof for P ≠ NC? ....................................................... 210
Ketan Mulmuley
P = BPP if E Requires Exponential Circuits: Derandomizing the XOR Lemma ...................... 220
Russell Impagliazzo and Avi Wigderson
SL ⊆ L^(4/3) ................................................................................ 230
Roy Armoni, Amnon Ta-Shma, Avi Wigderson and Shiyu Zhou
`
`Session 4B
`
Using Random Sampling to Find Maximum Flows in Uncapacitated Undirected Graphs ............... 240
David Karger
Combinatorial Complexity of the Central Curve ................................................. 250
Peter A. Beling and Sushil Verma
Approximation of k-Set Cover by Semi-Local Optimization ....................................... 256
Rong-chii Duh and Martin Fürer
Approximation Algorithms for Facility Location Problems ........................................ 265
David B. Shmoys, Éva Tardos and Karen Aardal
Covering Points in the Plane by k-Tours: Towards a Polynomial Time Approximation Scheme for
General k .................................................................................... 275
Tetsuo Asano, Naoki Katoh, Hisao Tamaki and Takeshi Tokuyama
`
`Monday, May 5
`
`Session 5A
`
A Public-Key Cryptosystem with Worst-Case/Average-Case Equivalence ........................... 284
Miklós Ajtai and Cynthia Dwork
Private Information Storage .................................................................. 294
Rafail Ostrovsky and Victor Shoup
`
`
`
`
Computationally Private Information Retrieval ................................................. 304
Benny Chor and Niv Gilboa
`
`Session 5B
Approximating Hyper-Rectangles: Learning and Pseudo-Random Sets .............................. 314
Peter Auer, Philip M. Long and Aravind Srinivasan
A Composition Theorem for Learning Algorithms with Applications to Geometric Concept Classes .... 324
Shai Ben-David, Nader H. Bshouty and Eyal Kushilevitz
Using and Combining Predictors that Specialize ................................................ 334
Yoav Freund, Robert E. Schapire, Yoram Singer and Manfred K. Warmuth
`
`Session 6A
On-Line Algorithms for Steiner Tree Problems .................................................. 344
Piotr Berman and Chris Coulston
Online Algorithms for Selective Multicast and Maximal Dense Trees .............................. 354
Baruch Awerbuch and Tripurari Singh
`
`Session 6B
Direct Product Results and the GCD Problem, in Old and New Communication Models .............. 363
Itzhak Parnafes, Ran Raz and Avi Wigderson
The Linear-Array Problem in Communication Complexity Resolved ................................ 373
Martin Dietzfelbinger
`
`Invited Session II
Paul Erdős (1913-1996): His Influence on the Theory of Computing ............................... 383
László Babai
`
`Session 7A
Permanents, Pfaffian Orientations, and Even Directed Circuits ................................... 402
William McCuaig, Neil Robertson, P. D. Seymour and Robin Thomas
Property Testing in Bounded Degree Graphs .................................................... 406
Oded Goldreich and Dana Ron
Exploring Unknown Environments .............................................................. 416
Susanne Albers and Monika R. Henzinger
On Floorplans of Planar Graphs ............................................................... 426
Xin He
`
`Session 7B
Linear Zero-Knowledge — A Note on Efficient Zero-Knowledge Proofs and Arguments .............. 436
Ronald Cramer and Ivan Damgård
Commodity-Based Cryptography ................................................................ 446
Donald Beaver
Oblivious Data Structures: Applications to Cryptography ........................................ 456
Daniele Micciancio
Is Linear Hashing Good? ..................................................................... 465
Noga Alon, Martin Dietzfelbinger, Peter Bro Miltersen, Erez Petrank and Gábor Tardos
`
`
`
`
`Session 8A
`
A Sub-Constant Error-Probability Low-Degree Test, and a Sub-Constant Error-Probability PCP
Characterization of NP ....................................................................... 475
Ran Raz and Shmuel Safra
Improved Low Degree Testing and its Applications .............................................. 485
Sanjeev Arora and Madhu Sudan
Probabilistically Checkable Proofs with Zero Knowledge ......................................... 496
Joe Kilian, Erez Petrank and Gábor Tardos
Making Games Short ......................................................................... 506
Uriel Feige and Joe Kilian
`
`Session 8B
`
Improved Routing and Sorting on Multibutterflies ............................................... 517
Bruce M. Maggs and Berthold Vöcking
Static and Dynamic Path Selection on Expander Graphs: A Random Walk Approach ................. 531
Andrei Z. Broder, Alan M. Frieze and Eli Upfal
On Sorting Strings in External Memory ........................................................ 540
Lars Arge, Paolo Ferragina, Roberto Grossi and Jeffrey Scott Vitter
Pointer Jumping Requires Concurrent Read ..................................................... 549
Noam Nisan and Ziv Bar-Yossef
`
`Tuesday, May 6
`
`Session 9A
`
Lower Bounds for Distributed Coin-Flipping and Randomized Consensus .......................... 559
James Aspnes
Byzantine Quorum Systems .................................................................... 569
Dahlia Malkhi and Michael Reiter
All of Us Are Smarter Than Any of Us: Wait-Free Hierarchies Are Not Robust ..................... 579
Wai-Kau Lo and Vassos Hadzilacos
The Decidability of Distributed Decision Tasks ................................................. 589
Maurice Herlihy and Sergio Rajsbaum
`
`Session 9B
`
Two Algorithms for Nearest-Neighbor Search in High Dimensions ................................. 599
Jon M. Kleinberg
Nearest Neighbor Queries in Metric Spaces ..................................................... 609
Kenneth L. Clarkson
Locality-Preserving Hashing in Multidimensional Spaces ......................................... 618
Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan and Santosh Vempala
Incremental Clustering and Dynamic Information Retrieval ....................................... 626
Moses Charikar, Chandra Chekuri, Tomás Feder and Rajeev Motwani
`
`Session 10A
A Constant-Factor Approximation Algorithm for Packet Routing, and Balancing Local vs. Global
Criteria ..................................................................................... 636
Aravind Srinivasan and Chung-Piaw Teo
`
`
`
`
Universal O(congestion + dilation + log^(1+ε) N) Local Control Packet Switching Algorithms ....... 644
Rafail Ostrovsky and Yuval Rabani
Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the
World Wide Web .............................................................................. 654
David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin and
Rina Panigrahy
Allocating Bandwidth for Bursty Connections ................................................... 664
Jon Kleinberg, Yuval Rabani and Éva Tardos
`
`Session 10B
The Swendsen-Wang Process Does Not Always Mix Rapidly ....................................... 674
Vivek K. Gore and Mark R. Jerrum
Approximately Counting up to Four ............................................................ 682
Michael Luby and Eric Vigoda
An Interruptible Algorithm for Perfect Sampling via Markov Chains ............................... 688
James Allen Fill
Sampling Lattice Points ...................................................................... 696
Ravi Kannan and Santosh Vempala
`
`Session 11A
Page Replacement with Multi-Size Pages and Applications to Web Caching ........................ 701
Sandy Irani
A Polylog(n)-Competitive Algorithm for Metrical Task Systems ................................... 711
Yair Bartal, Avrim Blum, Carl Burch and Andrew Tomkins
`
`Session 11B
On ACC⁰[pᵏ] Frege Proofs .................................................................... 720
Alexis Maciel and Toniann Pitassi
Reducing the Complexity of Reductions ........................................................ 730
Manindra Agrawal, Eric Allender, Russell Impagliazzo, Toniann Pitassi and Steven Rudich
Read-Once Branching Programs, Rectangular Proofs of the Pigeonhole Principle and the Transversal
Calculus ..................................................................................... 739
Alexander Razborov, Avi Wigderson and Andrew Yao
`
`Errata
Eigenvalues, Flows and Separators of Graphs ................................................... 749
F. R. K. Chung and S.-T. Yau
Retraction of “Probabilistic Computation and Linear Time” ...................................... 750
Lance Fortnow and Michael Sipser

Author Index ................................................................................. 751
`
`
`
`Practical Loss-Resilient Codes
`
Michael G. Luby∗

Michael Mitzenmacher†

M. Amin Shokrollahi‡

Daniel A. Spielman§

Volker Stemann¶
`
`Abstract
`
We present randomized constructions of linear-time encodable and decodable codes that can transmit over lossy channels at rates extremely close to capacity. The encoding and decoding algorithms for these codes have fast and simple software implementations. Partial implementations of our algorithms are faster by orders of magnitude than the best software implementations of any previous algorithm for this problem. We expect these codes will be extremely useful for applications such as real-time audio and video transmission over the Internet, where lossy channels are common and fast decoding is a requirement.

Despite the simplicity of the algorithms, their design and analysis are mathematically intricate. The design requires the careful choice of a random irregular bipartite graph, where the structure of the irregular graph is extremely important. We model the progress of the decoding algorithm by a set of differential equations. The solution to these equations can then be expressed as polynomials in one variable with coef-
∗ Digital Equipment Corporation, Systems Research Center, Palo Alto, CA, and Computer Science Division, University of California at Berkeley. A substantial portion of this research done while at the International Computer Science Institute. Research supported in part by National Science Foundation operating grant NCR-9416101, and United States-Israel Binational Science Foundation grant No. 92-00226.
† Digital Equipment Corporation, Systems Research Center, Palo Alto, CA. A substantial portion of this research done while at the Computer Science Department, UC Berkeley, under National Science Foundation grant No. CCR-9505448.
‡ International Computer Science Institute Berkeley, and Institut für Informatik der Universität Bonn, Germany. Research supported by a Habilitationsstipendium of the Deutsche Forschungsgemeinschaft, Grant Sh 57/I-1.
§ Department of Mathematics, M.I.T. Supported by an NSF mathematical sciences postdoc. A substantial portion of this research done while visiting U.C. Berkeley.
¶ Research done while at the International Computer Science Institute.
`
Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee.
STOC '97 El Paso, Texas USA
Copyright 1997 ACM 0-89791-888-6/97/05 ..$3.50
`
ficients determined by the graph structure. Based on these polynomials, we design a graph structure that guarantees successful decoding with high probability.
`
`1
`
`Introduction
`
Studies show that the Internet exhibits packet loss, and the measurements in [10] show that the situation has become worse over the past few years. A standard solution to this problem is to request retransmission of data that is not received. When some of this retransmission is lost, another request is made, and so on. In some applications, this introduces technical difficulties. For real-time transmission this solution can lead to unacceptable delays caused by several rounds of communication between sender and receiver. For a multicast protocol with one sender and many receivers, different sets of receivers can lose different sets of packets, and this solution can add significant overhead to the protocol.
An alternative solution, often called forward error-correction in the networking literature, is sometimes desirable. Consider an application that sends a real-time stream of data symbols that is partitioned and transmitted in logical units of blocks.¹ Suppose the network experiences transient and unpredictable losses of at most a p fraction of symbols out of each block. The following insurance policy can be used to trade off the effects of such uncontrollable losses on the receiver for controllable degradation in quality. Let n be the block length. Instead of sending blocks of n data symbols each, place (1 − p)n data symbols in each block, by either selecting the most important parts from the original data stream and omitting the remainder, or by generating a slightly lower quality stream at a (1 − p) fraction of the original rate. Fill out the block to its original length of n with pn redundant (check) symbols. This scheme provides optimal loss protection if the (1 − p)n symbols in the message can all be recovered from any set of (1 − p)n

¹ An example of this is an MPEG stream, where a group of pictures can constitute such a block, and where each symbol corresponds to the contents of one packet in the block. The latency incurred by the application is proportional to the time it takes between when the first and last packet of the block is sent, plus the one-way travel time through the network.
`
`150
`
`
`
received symbols from the block. Such a scheme can be used as the basic building block for the more robust and general protection scheme described in [1].

To demonstrate the benefits of forward error-correction in a simplified setting, consider the Internet loss measurements performed by [14] involving a multicast by one sender to a number of geographically distributed receivers. In one typical transmission to eleven receivers for a period of an hour, the average packet loss rate per receiver was 9.3%, yet 46.5% of the packets were lost by at least one of the receivers. To simplify the example, suppose that the maximum loss rate per block of 1000 packets for any receiver is 20%.² Then, the sender could use forward error-correction by adding 200 redundant packets to each 800 data packets to form blocks consisting of 1000 packets each. This approach sends 25% more packets total than are in the original data stream, whereas a naive scheme where the sender retransmits lost packets would send about 50% more.
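For concreteness, the overhead comparison in this example can be checked with a few lines of arithmetic (a sketch under a simplified retransmission model of our own; the numbers are the ones quoted above):

```python
# Overhead arithmetic for the multicast example above.
# Numbers come from the text; the retransmission model is simplified.

data, check = 800, 200        # packets per block: 800 data + 200 check
lost_by_any = 0.465           # fraction lost by at least one receiver

# Forward error-correction: every block carries 200 extra packets.
fec_overhead = check / data   # 0.25, i.e., 25% more packets sent

# Naive multicast retransmission: a packet lost by *any* receiver is
# resent to everyone, so roughly 46.5% of packets are sent again
# (slightly more once retransmissions are themselves lost), which is
# the "about 50% more" figure in the text.
retx_overhead = lost_by_any

assert fec_overhead == 0.25
assert retx_overhead > fec_overhead
```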
`It is a challenge to design fast enough encoding and de-
`coding algorithms to make forward error-correction feasible
`in real-time for high bandwidth applications.
`In this paper,
`we present codes that can be encoded and decoded in linear
`time while providing near optimal loss protection. Moreover,
`these linear time algorithms can be implemented to run very
`quickly in software.
Our results hold whether each symbol is a single bit or a packet of many bits. We assume that the receiver knows the position of each received symbol within the stream of all encoding symbols. This is appropriate for the Internet, where packets are indexed. We adopt as our model of losses the erasure channel, introduced by Elias [6], in which each encoding symbol is lost with a fixed constant probability p in transit, independent of all the other symbols. This assumption is not appropriate for the Internet, where losses can be highly correlated and bursty. However, losses on the Internet in general are not sensitive to the actual contents of each packet, and thus if we place the encoding into the packets in a random order then the independent loss assumption is valid.
Elias [6] showed that the capacity of the erasure channel is 1 − p and that a random linear code can be used to transmit over the erasure channel at any rate R < 1 − p. Furthermore, a standard linear MDS code can be used to convert a message of length Rn into a transmission of length n from which the message can be recovered from any portion of length greater than Rn. Moreover, general linear codes have quadratic time encoding algorithms and cubic time decoding algorithms. One cannot hope for better information recovery, but faster encoding and decoding times are desirable, especially for real-time applications.
Reed-Solomon codes can be used to transmit at the capacity of the erasure channel with n log n encoding time and quadratic decoding time. These codes have recently

² This is twice the average loss rate, but due to the bursty nature of losses in the Internet it is still likely that the maximum loss rate per block exceeds 20%. One can use the results of [1] to simultaneously protect against various levels of loss while still keeping the overall redundancy modest.
`
`been customized to compensate for Internet packet loss in
`real-time transmission of moderate-quality video [1]. Even
`this optimized implementation required the use of dedicated
`workstations. Transmission of significantly higher quality
`video requires faster coding algorithms.
In theory, it is possible to decode Reed-Solomon codes in time O(n log² n log log n) (see [4, Chapter 11.7] and [9, p. 369]). However, for small values of n, quadratic time algorithms are faster than the fast algorithms for the Reed-Solomon based codes, and for larger values of n the O(log² n log log n) multiplicative overhead in the running time of the fast algorithms (with a moderate sized constant hidden by the big-Oh notation) is large, i.e., in the hundreds or larger.
We obtain very fast linear-time algorithms by transmitting just below channel capacity. We produce rate R = 1 − p(1 + ε) codes along with decoding algorithms that recover from the random loss of a p fraction of the transmitted symbols in time proportional to n ln(1/ε) with high probability, where n is the length of the encoding. They can also be encoded in time proportional to n ln(1/ε). In Section 7, we do this for all ε > 0. The fastest previously known encoding and decoding algorithms [2] with such a performance guarantee have run times proportional to n ln(1/ε)/ε. (See also [3] for related work.)
The overall structure of our codes is related to codes introduced in [13] for error-correction. We explain the general construction along with the encoding and decoding algorithms in Section 2.
Our encoding and decoding algorithms are almost symmetrical. Both are extremely simple, computing exactly one exclusive-or operation for each edge in a randomly chosen bipartite graph. As in many similar applications, the graph is chosen to be sparse, which immediately implies that the encoding and decoding algorithms are fast. Unlike many similar applications, the graph is not regular; instead it is quite irregular with a carefully chosen degree sequence. We describe the decoding algorithm as a process on the graph in Section 3. Our main tool is a model that characterizes almost exactly the performance of the decoding algorithm as a function of the degree sequence of the graph. In Section 4, we use this tool to model the progress of the decoding algorithm by a set of differential equations. As shown in Lemma 1, the solution to these equations can then be expressed as polynomials in one variable with coefficients determined by the degree sequence. The positivity of one of these polynomials on the interval (0, 1] with respect to a parameter δ guarantees that, with high probability, the decoding algorithm can recover almost all the message symbols from a loss of up to a δ fraction of the encoding symbols. The complete success of the decoding algorithm can then be demonstrated by combinatorial arguments such as Lemma 3.
Our analytical tools allow us to almost exactly characterize the performance of the decoding algorithm for any given degree sequence. Using these tools, we analyze regular graphs in Section 6, and conclude that they cannot yield codes that are close to optimal. Hence irregular graphs are a necessary component of our design.
Not only do our tools allow us to analyze a given degree sequence, but they also help us to design good irregular degree sequences. In Section 7 we describe, given a parameter ε > 0, a degree sequence for which the decoding is successful with high probability for a loss fraction δ that is within ε of 1 − R. Although these graphs are irregular, with some nodes of degree 1/ε, the average node degree is only ln(1/ε). This is the main result of the paper, i.e., a code with encoding and decoding times proportional to ln(1/ε) that can recover from a loss fraction that is within ε of optimal.

In Section 9, we show how linear programming techniques can be used to find good degree sequences for the nodes on the right given a degree sequence for the left nodes. We demonstrate these techniques by finding right degree sequences that are close to optimal for a series of example left degree sequences.
`
Figure 1: (a) A graph defines a mapping from message bits to check bits. (b) Bits x1, x2, and c1 are used to solve for x3.
`
Decoding Operation

Given the value of a check bit and all but one of the message bits on which it depends, set the missing message bit to be the ⊕ of the check bit and its known message bits.
`
See Figure 1(b) for an example of this operation. The advantage of relying solely on this recovery operation is that the total decoding time is (at most) proportional to the number of edges in the graph. Our main technical innovation is in the design of sparse random graphs where repetition of this operation is guaranteed to usually recover all the message bits if at most (1 − ε)βn of the message bits have been lost from C(B).
To produce codes that can recover from losses regardless of their location, we cascade codes of the form C(B): we first use C(B) to produce βn check bits for the original n message bits, we then use a similar code to produce β²n check bits for the βn check bits of C(B), and so on (see Figure 2). At the last level, we use a more conventional loss-resilient code. Formally, we construct a sequence of codes C(B₀), ..., C(Bₘ) from a sequence of graphs B₀, ..., Bₘ, where Bᵢ has β^i n left nodes and β^(i+1) n right nodes. We select m so that β^(m+1) n is roughly √n and we end the cascade with a loss-resilient code C of rate 1 − β with β^(m+1) n message bits for which we know how to recover from the random loss of a β fraction of its bits with high probability. We then define the code C(B₀, B₁, ..., Bₘ, C) to be a code with n message bits and

∑_{i=1}^{m+1} β^i n + β^(m+2) n/(1 − β) = nβ/(1 − β)

check bits formed by using C(B₀) to produce βn check bits for the n message bits, using C(Bᵢ) to form β^(i+1) n check bits
1.1 Terminology

The block length of a code is the number of symbols in the transmission. In a systematic code, the transmitted symbols can be divided into message symbols and check symbols. We take the symbols to be bits, and write a ⊕ b to denote the exclusive-or of bits a and b. It is easy to extend our constructions to work with symbols that are packets of bits: where we would take the ⊕ of two bits, just take the bit-wise ⊕ of two packets. The message symbols can be chosen freely, and the check symbols are computed from the message symbols. The rate of a code is the ratio of the number of message symbols to the block length. In a code of block length n and rate R, the encoder takes as input Rn message symbols and produces n symbols to be transmitted. In all of our constructions, we assume that the symbols are bits.
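The packet extension mentioned above is mechanical; a minimal sketch (the helper name is ours, for illustration):

```python
def xor_packets(a: bytes, b: bytes) -> bytes:
    """Bit-wise XOR of two equal-length packets: where the construction
    takes the XOR of two bits, packets are XORed byte by byte."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

# XOR is its own inverse, which is what recovery relies on.
p = xor_packets(b"\x0f\xf0", b"\xff\x00")
assert xor_packets(p, b"\xff\x00") == b"\x0f\xf0"
```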
`
2 The Codes

In this section, we explain the overall construction, as well as the encoding and decoding algorithms. We begin by defining a code C(B) with n message bits and βn check bits, by associating these bits with a bipartite graph B. The graph B has n left nodes and βn right nodes, corresponding to the message bits and the check bits, respectively. C(B) is encoded by setting each check bit to be the ⊕ of its neighboring message bits in B (see Figure 1(a)). Thus, the encoding time is proportional to the number of edges in B.

The main contribution of our work is the design and analysis of the bipartite graph B so that the repetition of the following simplistic decoding operation recovers all the missing message bits.
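The encoding of C(B) just described can be sketched in a few lines (an illustration, not the authors' implementation; here the graph is given as an adjacency list from check bits to message bits):

```python
def encode(message_bits, check_neighbors):
    """Each check bit is the XOR of its neighboring message bits in B,
    so encoding costs exactly one XOR per edge of the graph."""
    checks = []
    for neighbors in check_neighbors:
        c = 0
        for i in neighbors:
            c ^= message_bits[i]
        checks.append(c)
    return checks

# Toy graph B: 3 message bits on the left, 2 check bits on the right.
bits = [1, 0, 1]
B = [[0, 1], [1, 2]]   # check 0 sees bits 0,1; check 1 sees bits 1,2
assert encode(bits, B) == [1, 1]
```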
`
`
`
for the β^i n bits produced by C(Bᵢ₋₁), and finally using C to produce an additional nβ^(m+2)/(1 − β) check bits for the β^(m+1) n bits output by C(Bₘ). As C(B₀, B₁, ..., Bₘ, C) has n message bits and nβ/(1 − β) check bits, it is a code of rate 1 − β.
`
`
Figure 2: The code levels.
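The check-bit accounting for the cascade can be verified numerically (the values of n, β, and m below are illustrative, not from the paper):

```python
# Summing beta^i * n check bits over levels i = 1..m+1, plus the final
# conventional code's beta^(m+2) * n / (1 - beta) check bits, gives
# n * beta / (1 - beta) in total, so the cascade has rate 1 - beta.

n, beta, m = 10000, 0.5, 3

level_checks = [beta**i * n for i in range(1, m + 2)]
final_checks = beta**(m + 2) * n / (1 - beta)
total_checks = sum(level_checks) + final_checks

assert abs(total_checks - n * beta / (1 - beta)) < 1e-6
rate = n / (n + total_checks)
assert abs(rate - (1 - beta)) < 1e-9
```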
`
consists of all nodes on the left that were lost but have not been decoded thus far, all the nodes on the right, and all the edges between these nodes. Recall that the decoding process requires finding a check bit on the right such that only one adjacent message bit is missing; this adjacent bit can then be recovered. In terms of the subgraph, this is equivalent to finding a node of degree one on the right, and removing it, its neighbor, and all edges adjacent to its neighbor from the subgraph. We refer to this entire sequence of events hereafter as one step of the decoding process. We repeat this step until there are no nodes of degree one available on the right. The entire process is successful if it does not halt until all nodes on the left are removed, or equivalently, until all edges are removed.
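A compact sketch of this repeated degree-one step (our illustration, not the paper's optimized implementation; a direct scan stands in for the subgraph bookkeeping):

```python
def decode(message, checks, check_neighbors, missing):
    """message: list with None at lost positions; checks: check-bit
    values; check_neighbors[j]: message indices adjacent to check j.
    Repeatedly find a right node with one missing neighbor, recover
    that message bit, and continue until stuck or done."""
    missing = set(missing)
    progress = True
    while missing and progress:
        progress = False
        for j, nbrs in enumerate(check_neighbors):
            lost = [i for i in nbrs if i in missing]
            if len(lost) == 1:              # degree-one right node
                i = lost[0]
                val = checks[j]
                for k in nbrs:
                    if k != i:
                        val ^= message[k]
                message[i] = val
                missing.remove(i)
                progress = True
    return not missing                      # True iff fully decoded

# Toy run: lose message bit 2, recover it from check bit 1.
msg = [1, 0, None]
ok = decode(msg, [1, 1], [[0, 1], [1, 2]], [2])
assert ok and msg == [1, 0, 1]
```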
The graphs that we use are chosen at random from carefully chosen degree sequences on sparse bipartite graphs. In contrast with many applications of random graphs in computer science, our graphs are not regular. Indeed, the analysis in Section 6 shows that it is not possible to approach channel capacity with regular graphs.

We refer to edges that are adjacent to a node of degree i on the left (right) as edges of degree i on the left (right). Each of our degree sequences is specified by a pair of vectors (λ₁, ..., λₘ) and (ρ₁, ..., ρₘ) where λᵢ is the initial fraction of edges on the left of degree i and ρⱼ is the initial fraction of edges on the right of degree j. Note that our graphs are specified in terms of fractions of edges, and not nodes, of each degree; this form is more convenient for much of our work.
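Since a node of degree i accounts for i edge endpoints, edge fractions are just degree-weighted node fractions. A small helper (ours, for illustration) makes the conversion concrete:

```python
def edge_fractions(node_frac):
    """Convert node-degree fractions to edge-degree fractions:
    node_frac[i] = fraction of nodes of degree i. A degree-i node
    contributes i edge endpoints, hence the weighting by i."""
    weighted = {i: i * f for i, f in node_frac.items()}
    total = sum(weighted.values())
    return {i: w / total for i, w in weighted.items()}

# Half the nodes of degree 1 and half of degree 3: the degree-3 nodes
# carry three quarters of the edges.
lam = edge_fractions({1: 0.5, 3: 0.5})
assert abs(lam[1] - 0.25) < 1e-9 and abs(lam[3] - 0.75) < 1e-9
```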
To choose an appropriate random bipartite graph B with E edges, n nodes on the left, and βn nodes on the right, we begin with a bipartite graph B′ with E nodes on both the left and right hand sides, with each node of B′ representing an edge slot. Each node on the left hand side of B′ is associated with a node on the left side of B, so that the distribution of degrees is given by (λ₁, ..., λₘ), and similarly for the right. We then choose a random matching (i.e., a random permutation) between the two sets of E nodes of B′. This induces a random bipartite graph on B (perhaps with multi-edges) in the obvious manner with the desired degree structure.
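The sampling procedure just described can be sketched as follows (an illustration under the stated conventions; per-node degree lists are assumed given):

```python
import random

def random_bipartite(left_degrees, right_degrees, seed=None):
    """Sample B via the edge-slot construction: one slot per edge
    endpoint on each side, matched by a uniform random permutation.
    Returns the edge list of B (multi-edges are possible)."""
    assert sum(left_degrees) == sum(right_degrees)  # both equal E
    left_slots = [u for u, d in enumerate(left_degrees) for _ in range(d)]
    right_slots = [v for v, d in enumerate(right_degrees) for _ in range(d)]
    rng = random.Random(seed)
    rng.shuffle(right_slots)       # random matching of the E slot pairs
    return list(zip(left_slots, right_slots))

edges = random_bipartite([2, 1, 1], [2, 2], seed=0)
assert len(edges) == 4             # E = 4 edges, degrees preserved
```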