throbber
Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 416 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 417 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 418 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 419 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 420 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 421 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 422 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 423 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 424 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 425 of 787
`
`

`
`Attachment 12a: Copy of Document 12
`from the University of Illinois
` at Urbana-Champaign Library
`
`ACTIVISION, EA, TAKE-TWO, 2K, ROCKSTAR, Ex. 1004, p. 426 of 787
`
`

`
`Attachment 12b: University of Illinois at Urbana-Champaign Library catalog record for
` ACM Transactions on Computer Systems
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:21)(cid:26)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12b: University of Illinois at Urbana-Champaign Library catalog record for
` ACM Transactions on Computer Systems
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:21)(cid:27)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`Bimodal Multicast
`
`KENNETH P. BIRMAN
`Cornell University
`MARK HAYDEN
`Digital Equipment Corporation/Compaq
`OZNUR OZKASAP and ZHEN XIAO
`Cornell University
`MIHAI BUDIU
`Carnegie Mellon University
`and
`YARON MINSKY
`Cornell University
`
`There are many methods for making a multicast protocol “reliable.” At one end of the
`spectrum, a reliable multicast protocol might offer atomicity guarantees, such as all-or-
`nothing delivery, delivery ordering, and perhaps additional properties such as virtually
`synchronous addressing. At the other are protocols that use local repair to overcome transient
`packet loss in the network, offering “best effort” reliability. Yet none of this prior work has
`treated stability of multicast delivery as a basic reliability property, such as might be needed
`in an internet radio, television, or conferencing application. This article looks at reliability
`with a new goal: development of a multicast protocol which is reliable in a sense that can be
`rigorously quantified and includes throughput stability guarantees. We characterize this new
`protocol as a “bimodal multicast” in reference to its reliability model, which corresponds to a
`family of bimodal probability distributions. Here, we introduce the protocol, provide a
`theoretical analysis of its behavior, review experimental results, and discuss some candidate
`applications. These confirm that bimodal multicast is reliable, scalable, and that the protocol
`provides remarkably stable delivery throughput.
`Categories and Subject Descriptors: C.2.1 [Computer-Communication Networks]: Network
`
`This work was supported by DARPA/ONR contracts N0014-96-1-10014 and ARPA/RADC
`F30602-96-1-0317, the Cornell Theory Center, and the Turkish Research Foundation.
`Authors’ addresses: K. P. Birman, Department of Computer Science, Cornell University, 4126
`Upson Hall, Ithaca, NY 14853; email: ken@cs.cornell.edu; M. Hayden, Systems Research
`Center, Digital Equipment Corporation/Compaq, 130 Lytton Avenue, Palo Alto, CA 94301;
`email: hayden@src.dec.com; O. Ozkasap and Z. Xiao, Department of Computer Science, Cornell
`University,
`4126 Upson Hall,
`Ithaca, NY 14853;
`email:
`ozkasap@cs.cornell.edu;
`xiao@cs.cornell.edu; M. Budiu, Department of Computer Science, Carnegie Mellon University,
`Ithaca, NY 14853; email: mihaib@cs.cmu.edu; Y. Minsky, Department of Computer Science,
`Cornell University, 4126 Upson Hall, Ithaca, NY 14853.
`Permission to make digital / hard copy of part or all of this work for personal or classroom use
`is granted without fee provided that the copies are not made or distributed for profit or
`commercial advantage, the copyright notice, the title of the publication, and its date appear,
`and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to
`republish, to post on servers, or to redistribute to lists, requires prior specific permission
`and / or a fee.
`© 1999 ACM 0734-2071/99/0500 –0041 $5.00
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999, Pages 41–88.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:21)(cid:28)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`42
`
`•
`
`K. P. Birman et al.
`
`Architecture and Design; C.2.2 [Computer-Communication Networks]: Network Protocols;
`C.2.4 [Computer-Communication Networks]: Distributed Systems; D.4.1 [Operating
`Systems]: Process Management; D.4.4 [Operating Systems]: Communications Management;
`D.4.7 [Operating Systems]: Organization and Design
`Additional Key Words and Phrases: Bimodal Multicast, internet media transmission, isochro-
`nous protocols, probabilistic multicast reliability, scalable group communications, Scalable
`Reliable Multicast
`
`PREFACE
`Encamped on the hilltops overlooking the enemy fortress, the commanding
`General prepared for the final battle of the campaign. Given the information
`he was gathering about enemy positions, his forces could prevail. Indeed, if
`most of his observations could be communicated to most of his forces the
`battle could be won even if some reports reached none or very few of his
`troops. But if many reports failed to get through, or reached many but not
`most of his commanders, their attack would be uncoordinated and the battle
`lost, for only he was within direct sight of the enemy, and in the coming
`battle strategy would depend critically upon the quality of the information
`at hand.
`Although the General had anticipated such a possibility, his situation was
`delicate. As the night wore on, he dispatched wave upon wave of updates on
`the enemy troop placements. Some couriers perished in the dark, wet forests
`separating the camps. Worse still, some of his camps were beset by the
`disease that had ravaged the allies since the start of the campaign. They
`could not be relied upon, as chaos and death ruled there.
`With the approach of dawn, the General sat sipping coffee—rotgut stuff—
`reflectively. In the night, couriers came and went, following secret protocols
`worked out during the summer. At the appointed hour, he rose to lead the
`attack. The General was not one to shirk a calculated risk.
`
`1. INTRODUCTION
`Although many communication systems provide software support for reli-
`able multicast communication, the meaning given to “reliability” splits
`them into two broad classes. One class of definitions corresponds to
`“strong” reliability properties. These typically include atomicity, which is
`the guarantee that if a multicast is delivered to any destination that
`remains operational
`it will eventually be delivered to all operational
`destinations. An atomic multicast may also provide message delivery
`ordering, support for virtual synchrony (an execution model used by many
`group communication systems), security properties, real-time guarantees,
`or special behavior if a network partitioning occurs [Birman 1997]. A
`criticism is that to obtain these strong reliability properties one employs
`costly protocols, accepts the possibility of unstable or unpredictable perfor-
`mance under stress, and tolerates limited scalability [Cheriton and Skeen
`1993] (but see also Birman [1994], Cooper [1994], and van Renesse [1994]).
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:19)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`Bimodal Multicast
`
`•
`
`43
`
`As we will see shortly, transient performance problems can cause these
`protocols to exhibit degraded throughput. Even with a very stable network,
`it is hard to scale these protocols beyond several hundred participants
`[Piantoni and Stancescu 1997].
`Protocols belonging to the second class of “reliable multicast” solutions
`focus upon best-effort reliability in very large scale settings. Examples
`include the Internet MUSE protocol (for network news distribution) [Lidl et
`al. 1994], the Scalable Reliable Multicast protocol (SRM) [Floyd et al.
`1995], the XPress Transfer Protocol [XTP Forum 1995], and the Reliable
`Message Transport Protocol (RMTP) [Lin and Paul 1997; Paul et al. 1997].
`These systems include scalable multicast protocols which overcome mes-
`sage loss or failures, but are not provided with an “end to end” reliability
`guarantee. Indeed, as these protocols are implemented, “end to end”
`reliability may not be a well-defined concept. There is no core system to
`track membership in the group of participants; hence it may not be clear
`what processes belong to the destination set for a multicast, or even
`whether the set is small or large. Typically, processes join anonymously by
`linking themselves to a multicast forwarding tree, and subsequently inter-
`act only with their immediate neighbors. Similarly, a member may drop out
`or fail without first informing its neighbors.
`The reliability of such protocols is usually expressed in “best effort”
`terminology: if a participating process discovers a failure, a reasonable
`effort is made to overcome it. But it may not always be possible to do so.
`For example, in SRM (the most carefully studied among the protocols in
`this class) a router overload may disrupt the forwarding of multicast
`messages to processes downstream from the router. If this overload also
`prevents negative acknowledgments and retransmissions from getting
`through for long enough, gaps in the message delivery sequence may not be
`repaired. Liu and Lucas report conditions under which SRM can behave
`pathologically, remulticasting each message a number of times that rises
`with system scale [Liu 1997; Lucas 1998]. Here, we present additional data
`of a similar nature. (Liu also suggests a technique to improve SRM so as to
`partially overcome the problem.) The problematic behavior is triggered by
`low levels of systemwide noise or by transient elevated rates of message
`loss, phenomena known to be common in Internet protocols [Labovitz et al.
`1997; Paxson 1997]. Yet SRM and similar protocols do scale beyond the
`limits of the virtual synchrony protocols, and when message loss is suffi-
`ciently uncommon, they can give a very high degree of reliability.
`In effect, the developer of a critical application is forced to choose
`between reduced scalability but stronger notions of reliability in the first
`class of reliable multicast protocol, and weaker guarantees but better
`normal-case scalability afforded by the second class. For critical uses, the
`former option may be unacceptable because of the risk of a throughput
`collapse under unusual but not “exceptionally rare” conditions. Yet the
`latter option may be equally unacceptable, because it is impossible to
`reason about the behavior of the system when things go wrong.
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:20)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`44
`
`•
`
`K. P. Birman et al.
`
`The present article introduces a new option: a bimodal multicast protocol
`that scales well, and provides predictable reliability even under highly
`perturbed conditions. For example, the reliability and throughput of our
`new protocol remain steady even as the network packet loss rate rises to
`20% and even when 25% of the participating processes experience transient
`performance failures. We also present data showing that the LAN imple-
`mentation of our protocol overcomes bursts of packet loss with minimal
`disruption of throughput.
`The sections that follow start by presenting the protocol itself and some
`of the results of an analytic study (the details of the analysis are included
`as an Appendix). We show that the behavior of our protocol can be
`predicted given simple information about how processes and the network
`behave most of the time, and that the reliability prediction is strong
`enough to support a development methodology that would make sense in
`critical settings. Next, we present a variety of data comparing our new
`protocol with prior protocols, notably a virtually synchronous reliable
`multicast protocol, also developed by our group, and the SRM protocol. In
`each case we use implementations believed to be the best available in terms
`of performance and tuned to match the environment. Our studies include a
`mixture of experimental work on an SP2, simulation, and experiments with
`a bimodal multicast implementation for LANs (possibly connected by WAN
`gateways). Under conditions that cause other reliable protocols to exhibit
`unstable throughput, bimodal multicast remains stable. Moreover, we will
`show that although our model makes simplifying assumptions,
`it still
`makes accurate predictions about real-world behavior. Finally, the article
`examines some critical reliable multicast applications, identifying roles for
`protocols with strong properties and roles for bimodal multicast.
`Bimodal multicast is not a panacea: the protocol offers a new approach to
`reliability, but uses a model that is weaker in some ways than virtual
`synchrony, despite its stronger throughput guarantees. We see it as a tool
`to offer side by side with other reliability tools, but not as a solution that
`“competes” with previous protocols.
`
`2. MULTICAST THROUGHPUT STABILITY
`In our work with reliable multicast we participated in the development of
`communication infrastructures for applications such as stock markets (the
`New York and Swiss Stock Exchanges) [Birman 1999; Piantoni and Stanc-
`escu 1997] and air traffic control (the French console replication system1
`called PHIDIAS). The critical nature of such applications means that
`developers will have to know exactly how their systems behave under
`expected operational conditions, and to do so they need detailed informa-
`tion about how the reliable communication primitives they use will behave.
`These applications also demand high performance and scalability. In par-
`ticular, they often have a data transport subsystem that will produce a
`
`1http://www.stna.dgac.fr/projects/phidias/
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:21)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`Bimodal Multicast
`
`•
`
`45
`
`sustained, fairly high volume of data considered critical for safe operation.
`In the past, such subsystems were often identified as critical real-time
`applications, but today’s computers and networks are so fast that the real
`need is for stable throughput.
`Our new protocol permits designers of such applications to factor out the
`soft real-time data stream. Bimodal multicast will handle this high-volume
`workload, leaving a less demanding lower-volume residual communication
`task for protocols like the virtual synchrony ones, which work well in less
`stressful settings. Because the communication demands of bimodal multi-
`cast can be predicted from the multicast rate and a small set of parameters,
`a designer can anticipate that bimodal multicast will consume a fixed
`percentage of available bandwidth and memory resources, and configure
`the system with adequate time for the virtual synchrony mechanisms.
`Bimodal multicast is a good choice for this purpose, for several reasons.
`First, as just noted, the load associated with the protocol is predictable and
`largely independent of scale. The protocol can be shown to have a bimodal
`delivery guarantee: given information about the environment—information
`that we believe is reasonable for typical networks running standard Inter-
`net protocols— our protocol can be configured to have a very small proba-
`bility of delivering to a small number of destinations (counting failed ones),
`an insignificant risk of delivering to “many” but not “most” destinations,
`and a very high probability of delivering the message to all or almost all
`destinations. Our model lets us tune the actual probabilities to match the
`intended use. And we will show how to use the model to evaluate the safety
`of applications, such as the ones mentioned above.
`Secondly, our protocol has stable throughput. Traditional reliable multi-
`cast protocols—atomic broadcast in its various incarnations—suffer from a
`form of interference between flow control and reliability mechanisms. This
`can trigger unstable throughput when the network is scaled up, and some
`application programs exhibit erratic behavior. We are able to demonstrate
`the stability of our new protocol both theoretically and experimentally. For
`the types of applications that motivate our work, this stability guarantee is
`extremely important: one needs to know that the basic data stream is
`delivered at a steady rate and that it is delivered reliably.
`To give some sense of where the article is headed, consider Figure 1. Here
`we measured throughput at a healthy process in virtually synchronous
`multicast groups of various sizes: 32, 64, and 96 members. One of these
`members attempts to inject 7KB multicast messages at a rate of 200 per
`second. Ideally, 200 messages per second emerge. But the graph shows that
`as we “perturb” even a single group member by causing it to sleep for the
`percentage of each second shown on the x-axis, throughput collapses for the
`unperturbed group members. The problem becomes worse as the group
`grows larger (it would also be worse if we increase the percentage of
`perturbed members). In the experimental sections of this article, we will
`see that the bimodal multicast achieves the “ideal” output rate of 200
`messages per sec. under the same conditions, even with 25% of the
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:22)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`46
`
`•
`
`K. P. Birman et al.
`
`Virtually synchronous Ensemble multicast protocols
`
`group size: 32
`group size: 64
`group size: 96
`
`250
`
`200
`
`150
`
`100
`
`50
`
`average throughput on nonperturbed members
`
`0
`
`0
`
`0.1
`
`0.2
`
`0.3
`
`0.5
`0.4
`perturb rate
`
`0.6
`
`0.7
`
`0.8
`
`0.9
`
`Fig. 1. Throughput as one member of a multicast group is “perturbed” by forcing it to sleep
`for varying amounts of time.
`
`members perturbed. Details of the experiment used to produce Figure 1
`appear in the experimental section of this article.
`As mentioned earlier, studies of SRM have identified similar problems.
`In the case of SRM, networkwide noise and routing problems represent the
`worst case. For example, Lucas, in his doctoral dissertation [Lucas 1998],
`shows that even low levels of network noise can cause SRM to broadcast
`high rates of retransmission requests and retransmitted data messages, so
`that each multicast triggers multiple messages on the wire. Lucas finds
`that the rate of retransmissions rises in proportion to the SRM group size.
`Liu, studying other problems (but of a similar nature) proposes a number of
`changes to SRM that might improve its behavior in noisy networks. Our
`own simulations,
`included in the experimental section of this article,
`confirm these problems and make it clear that as SRM is scaled up, the
`protocol will eventually collapse, much as does the virtually synchronous
`protocol shown in Figure 1.
`What causes these problems? In the case of the virtually synchronous
`protocols, a perturbed process is particularly difficult to accommodate. On
`the one hand, the process is not considered to have failed, since it is
`sending and receiving messages. Yet it is slow to acknowledge messages
`and may experience high loss rates, particularly if operating systems
`buffers fill up. The sender and healthy receivers keep copies of unacknowl-
`edged messages until they get through, exhausting available buffering
`space and causing flow control to kick in. One could imagine setting failure
`detection parameters more aggressively (this is what Piantoni and Stanc-
`escu [1997] recommend), but now the risk of an erroneous failure classifi-
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:23)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`Bimodal Multicast
`
`•
`
`47
`
`cation will rise roughly as the square of the group size. The problem is that
`all group members can be understood as monitoring one another; hence, the
`more aggressive the failure detector, the more likely that a paging or
`scheduling delay will be interpreted as a crash. Thus, as one scales these
`protocols beyond a group size of about 50 –100 members, the tension
`between throughput stability and failure detection accuracy becomes a
`significant problem. Not surprisingly, most successes with virtual syn-
`chrony use fairly small groups, sometimes structured hierarchically. The
`largest systems are typically ones where performance demands are limited
`to short bursts of multicasts, far from the rates seen in Figure 1 [Birman
`1999].
`Turning to SRM, one can understand the problem as emerging from a
`form of stochastic attack on the probabilistic assumptions built into the
`protocol. Readers familiar with SRM will know that the protocol includes
`many such assumptions: they are used to prevent duplicated multicasts of
`requests for retransmission and duplicated retransmissions of data, and to
`estimate the appropriate time-to-live (TTL) value to use for each multicast.
`Such assumptions have a small probability of being incorrect, and in the
`case of SRM, as the size of the system rises, the absolute likelihood of
`mistakes rises, causing the background overhead to rise. Eventually, these
`forms of overhead interfere with normal system function, causing through-
`put to become unstable. For sufficiently large configurations or loads, they
`can trigger a form of “meltdown.”
`We believe that our article is among the first to focus on stable through-
`put in reliable multicast settings. Historically, reliable multicast split early
`into two “camps.” One camp focused on performance and scalability,
`emphasizing peak performance under ideal situations. The other camp
`focused on providing rigorous definitions for reliability and protocols that
`could be proved to implement their reliability specifications. These proto-
`cols tended to be fairly heavyweight, but performance studies also empha-
`sized their best-case performance. Neither body of work viewed stable
`scalable throughput as a fundamental reliability goal, and as we have seen,
`stable throughput is not so easily achieved.
`The properties of the bimodal multicast protocol seem to be ideal for
`many of the applications where virtual synchrony encounters its limits.
`These include internet media distribution (such for radio and television
`broadcasts, or conferencing), distribution of stock prices and trade informa-
`tion on the floor of all-electronic stock exchanges, distribution of flight
`telemetry data in air traffic control systems and military theater of
`operations systems, replication of medical telemetry data in critical care
`systems, and communication within factory automation settings. Each of
`these is a setting within which the highest-volume data sources have an
`associated notion of “freshness,” and the importance of delivering individ-
`ual messages decreases with time. Yet several are also “safety critical”
`applications for which every component must be amenable to analysis.
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:24)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`48
`
`•
`
`K. P. Birman et al.
`
`3. A BIMODAL MULTICAST PROTOCOL
`Our protocol is an outgrowth of work which originated in studies of gossip
`protocols done at Xerox [Demers et al. 1988], the Internet MUSE protocol
`[Lidl et al. 1994], the SRM protocol of Floyd et al. [1995], the NAK-only
`protocols used in XTP [XTP Forum 1995], and the lazy transactional
`replication method of Ladin et al. [1992]. Our protocol can be understood as
`offering a form of weak real-time guarantee; relevant prior work includes
`Cristian et al. [1985; 1990] and Baldoni et al. [1996a; 1996b].
`The idea underlying gossip protocols dates back to the original USENET
`news protocol, NNTP, developed in the early 1980’s. In this protocol, a
`communications graph is superimposed on a set of processes, and neighbors
`gossip to diffuse news postings in a reliable manner over the links. For
`example, if process A receives a news posting and then establishes commu-
`nication with process B, A would offer B a copy of that news message, and
`B solicits the copy if it has not already seen the message.
`The Xerox work considered gossip communication in the context of a
`project developing wide-area database systems [Demers et al. 1988]. They
`showed how gossip communication is related to the mathematics underly-
`ing the propagation of epidemics, and developed a family of gossip-based
`multicast protocols. However, the frequency of database updates was low
`(at most, a few per second); hence, the question of stable throughput did not
`arise. The model itself considered communication failures but not process
`failures. Our work addresses both aspects.
`In this article, we actually report on multiple implementations of our new
`protocol. The version we study most carefully was implemented within
`Cornell University’s Ensemble system [Hayden 1998], which offers a mod-
`ular plug-and-play framework that includes some of the standard reliable
`multicast protocols, and can easily be extended with new ones. This
`plug-and-play architecture was important both because it lets our new
`work sit next to other protocols, and because it facilitated controlled
`experiments. Ensemble supports group-communication protocol stacks that
`are constructed by composing microprotocols, an idea that originated in the
`Horus project [Birman 1997; van Renesse et al. 1996].
`Because we implemented this version of our protocol within Ensemble,
`the system model is greatly simplified. An Ensemble process group exe-
`cutes as a series of execution periods during each of which group member-
`ship is static, and known to all members (see Figure 2, where time
`advances from left to right, and each time-line corresponds to the execution
`of an individual process). One execution period ends and a new one begins
`when membership changes by the addition of new processes, or the depar-
`ture (failure) of old ones. Below, we will discuss our new protocol for just a
`single execution period; this restriction is especially important in the
`formal analysis we present. The mechanisms whereby Ensemble switches
`from one period to the next depend only on knowledge of when the
`currently active set of multicast instantiations has terminated, i.e., stabi-
`
`ACM Transactions on Computer Systems, Vol. 17, No. 2, May 1999.
`
`(cid:36)(cid:38)(cid:55)(cid:44)(cid:57)(cid:44)(cid:54)(cid:44)(cid:50)(cid:49)(cid:15)(cid:3)(cid:40)(cid:36)(cid:15)(cid:3)(cid:55)(cid:36)(cid:46)(cid:40)(cid:16)(cid:55)(cid:58)(cid:50)(cid:15)(cid:3)(cid:21)(cid:46)(cid:15)(cid:3)(cid:53)(cid:50)(cid:38)(cid:46)(cid:54)(cid:55)(cid:36)(cid:53)(cid:15)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:19)(cid:23)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)(cid:22)(cid:25)(cid:3)(cid:82)(cid:73)(cid:3)(cid:26)(cid:27)(cid:26)
`
`

`
`Attachment 12c: Copy of Document 12 from the ACM Digital Library
`
`Bimodal Multicast
`
`•
`
`49
`
`p q r s
`
`Fig. 2. Multicast execution periods in Ensemble. Initially, the group consist of p and q;
`multicasts sents by p are delivered to q (and vice versa). R then joins and state is transferred
`to it. After a period of additional multicasting, q fails; s later joins and receives an additional
`state transfer. The period between two consecutive membership lists is denoted an “execution
`period”.
`
`lized. In our new protocol, this

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket