Pastry: Scalable, Decentralized Object Location, and
Routing for Large-Scale Peer-to-Peer Systems

Antony Rowstron¹ and Peter Druschel²*

¹ Microsoft Research Ltd, St. George House,
Guildhall Street, Cambridge, CB2 3NH, UK.
antr@microsoft.com
² Rice University MS-132, 6100 Main Street,
Houston, TX 77005-1892, USA.
druschel@cs.rice.edu

* Work done in part while visiting Microsoft Research, Cambridge, UK.

R. Guerraoui (Ed.): Middleware 2001, LNCS 2218, pp. 329–350, 2001.
© Springer-Verlag Berlin Heidelberg 2001

`Abstract. This paper presents the design and evaluation of Pastry, a scalable, dis-
`tributed object location and routing substrate for wide-area peer-to-peer applica-
`tions. Pastry performs application-level routing and object location in a potentially
`very large overlay network of nodes connected via the Internet. It can be used to
`support a variety of peer-to-peer applications, including global data storage, data
`sharing, group communication and naming.
`Each node in the Pastry network has a unique identifier (nodeId). When presented
`with a message and a key, a Pastry node efficiently routes the message to the
`node with a nodeId that is numerically closest to the key, among all currently
`live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in
`the nodeId space, and notifies applications of new node arrivals, node failures
`and recoveries. Pastry takes into account network locality; it seeks to minimize
the distance messages travel, according to a scalar proximity metric like the
`number of IP routing hops.
`Pastry is completely decentralized, scalable, and self-organizing; it automatically
`adapts to the arrival, departure and failure of nodes. Experimental results obtained
`with a prototype implementation on an emulated network of up to 100,000 nodes
`confirm Pastry’s scalability and efficiency, its ability to self-organize and adapt to
`node failures, and its good network locality properties.
`
`1 Introduction
`
`Peer-to-peer Internet applications have recently been popularized through file sharing
`applications like Napster, Gnutella and FreeNet [1,2,8]. While much of the attention
`has been focused on the copyright issues raised by these particular applications, peer-
`to-peer systems have many interesting technical aspects like decentralized control, self-
`organization, adaptation and scalability. Peer-to-peer systems can be characterized as
`distributed systems in which all nodes have identical capabilities and responsibilities
`and all communication is symmetric.
`There are currently many projects aimed at constructing peer-to-peer applications
`and understanding more of the issues and requirements of such applications and sys-
`tems [1,2,5,8,10,15]. One of the key problems in large-scale peer-to-peer applications
`is to provide efficient algorithms for object location and routing within the network.
`This paper presents Pastry, a generic peer-to-peer object location and routing scheme,
`based on a self-organizing overlay network of nodes connected to the Internet. Pastry
`is completely decentralized, fault-resilient, scalable, and reliable. Moreover, Pastry has
`good route locality properties.
Pastry is intended as a general substrate for the construction of a variety of peer-to-peer Internet applications like global file sharing, file storage, group communication and naming systems. Several applications have been built on top of Pastry to date, including
`a global, persistent storage utility called PAST [11,21] and a scalable publish/subscribe
`system called SCRIBE [22]. Other applications are under development.
`Pastry provides the following capability. Each node in the Pastry network has a
`unique numeric identifier (nodeId). When presented with a message and a numeric key,
`a Pastry node efficiently routes the message to the node with a nodeId that is numerically
`closest to the key, among all currently live Pastry nodes. The expected number of routing
`steps is O(log N), where N is the number of Pastry nodes in the network. At each Pastry
`node along the route that a message takes, the application is notified and may perform
`application-specific computations related to the message.
`Pastry takes into account network locality; it seeks to minimize the distance messages
`travel, according to a scalar proximity metric like the number of IP routing hops. Each
`Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies
`applications of new node arrivals, node failures and recoveries. Because nodeIds are
randomly assigned, with high probability, the set of nodes with adjacent nodeIds is diverse
`in geography, ownership, jurisdiction, etc. Applications can leverage this, as Pastry can
`route to one of k nodes that are numerically closest to the key. A heuristic ensures that
`among a set of nodes with the k closest nodeIds to the key, the message is likely to
`first reach a node “near” the node from which the message originates, in terms of the
`proximity metric.
`Applications use these capabilities in different ways. PAST, for instance, uses a fileId,
`computed as the hash of the file’s name and owner, as a Pastry key for a file. Replicas of
`the file are stored on the k Pastry nodes with nodeIds numerically closest to the fileId.
`A file can be looked up by sending a message via Pastry, using the fileId as the key. By
`definition, the lookup is guaranteed to reach a node that stores the file as long as one
`of the k nodes is live. Moreover, it follows that the message is likely to first reach a
`node near the client, among the k nodes; that node delivers the file and consumes the
`message. Pastry’s notification mechanisms allow PAST to maintain replicas of a file on
`the k nodes closest to the key, despite node failure and node arrivals, and using only
`local coordination among nodes with adjacent nodeIds. Details on PAST’s use of Pastry
`can be found in [11,21].
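To make this concrete, the sketch below is not PAST's actual code; the hash choice, message format and route() signature are assumptions based on the API described later in Section 2.3. It derives a fileId and issues insert and lookup requests through Pastry.

import hashlib

def file_id(file_name: str, owner: str) -> int:
    # Hypothetical fileId: a 128-bit Pastry key derived from the file's name and owner.
    digest = hashlib.sha1((file_name + owner).encode()).digest()
    return int.from_bytes(digest[:16], "big")          # truncate to the 128-bit key space

def insert_file(pastry_node, file_name, owner, data):
    # Replicas end up on the k nodes whose nodeIds are numerically closest to the fileId.
    pastry_node.route({"op": "insert", "data": data}, file_id(file_name, owner))

def lookup_file(pastry_node, file_name, owner):
    # The first replica holder reached along the route answers and consumes the message.
    pastry_node.route({"op": "lookup"}, file_id(file_name, owner))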
As another sample application, in the SCRIBE publish/subscribe system, a list of
`subscribers is stored on the node with nodeId numerically closest to the topicId of a
`topic, where the topicId is a hash of the topic name. That node forms a rendez-vous
`point for publishers and subscribers. Subscribers send a message via Pastry using the
`topicId as the key; the registration is recorded at each node along the path. A publisher
`sends data to the rendez-vous point via Pastry, again using the topicId as the key. The
`rendez-vous point forwards the data along the multicast tree formed by the reverse paths
`from the rendez-vous point to all subscribers. Full details of Scribe’s use of Pastry can
`be found in [22].
`These and other applications currently under development were all built with little
`effort on top of the basic capability provided by Pastry. The rest of this paper is orga-
`nized as follows. Section 2 presents the design of Pastry, including a description of the
`API. Experimental results with a prototype implementation of Pastry are presented in
`Section 3. Related work is discussed in Section 4 and Section 5 concludes.
`
`2 Design of Pastry
`
`A Pastry system is a self-organizing overlay network of nodes, where each node routes
`client requests and interacts with local instances of one or more applications. Any com-
`puter that is connected to the Internet and runs the Pastry node software can act as a
`Pastry node, subject only to application-specific security policies.
`Each node in the Pastry peer-to-peer overlay network is assigned a 128-bit node
`identifier (nodeId). The nodeId is used to indicate a node’s position in a circular nodeId
space, which ranges from 0 to 2^128 − 1. The nodeId is assigned randomly when a node
`joins the system. It is assumed that nodeIds are generated such that the resulting set
`of nodeIds is uniformly distributed in the 128-bit nodeId space. For instance, nodeIds
`could be generated by computing a cryptographic hash of the node’s public key or its
`IP address. As a result of this random assignment of nodeIds, with high probability,
`nodes with adjacent nodeIds are diverse in geography, ownership, jurisdiction, network
`attachment, etc.
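For illustration only, a nodeId could be derived as follows; the paper does not prescribe a particular hash, so SHA-1 truncated to 128 bits stands in here for any suitable cryptographic hash.

import hashlib

def make_node_id(seed: str) -> int:
    # seed could be the node's public key or its IP address, as suggested above.
    digest = hashlib.sha1(seed.encode()).digest()
    return int.from_bytes(digest[:16], "big")   # uniform 128-bit nodeId in [0, 2^128 - 1]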
`Assuming a network consisting of N nodes, Pastry can route to the numerically
closest node to a given key in less than ⌈log_{2^b} N⌉ steps under normal operation (b is a configuration parameter with typical value 4). Despite concurrent node failures, eventual delivery is guaranteed unless ⌊|L|/2⌋ nodes with adjacent nodeIds fail simultaneously (|L| is a configuration parameter with a typical value of 16 or 32). In the following, we
`present the Pastry scheme.
`For the purpose of routing, nodeIds and keys are thought of as a sequence of digits
with base 2^b. Pastry routes messages to the node whose nodeId is numerically closest
`to the given key. This is accomplished as follows. In each routing step, a node normally
`forwards the message to a node whose nodeId shares with the key a prefix that is at least
`one digit (or b bits) longer than the prefix that the key shares with the present node’s
`id. If no such node is known, the message is forwarded to a node whose nodeId shares
`a prefix with the key as long as the current node, but is numerically closer to the key
`than the present node’s id. To support this routing procedure, each node maintains some
`routing state, which we describe next.
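As a concrete illustration of this digit view (a sketch; the digit size b and the 128-bit id length come from the text, everything else is an assumption), the shared-prefix length that drives each routing step can be computed as follows.

B = 4                      # bits per digit (the configuration parameter b)
DIGITS = 128 // B          # number of base-2^b digits in a nodeId or key

def digit(x: int, l: int) -> int:
    # The l-th most significant base-2^B digit of the 128-bit id x.
    return (x >> (B * (DIGITS - 1 - l))) & ((1 << B) - 1)

def shl(a: int, b_: int) -> int:
    # Length, in digits, of the prefix shared by the ids a and b_.
    n = 0
    while n < DIGITS and digit(a, n) == digit(b_, n):
        n += 1
    return n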
`
`2.1 Pastry Node State
`
Each Pastry node maintains a routing table, a neighborhood set and a leaf set. We begin with a description of the routing table. A node's routing table, R, is organized into ⌈log_{2^b} N⌉ rows with 2^b − 1 entries each. The 2^b − 1 entries at row n of the routing table each refer to a node whose nodeId shares the present node's nodeId in the first n digits, but whose (n + 1)th digit has one of the 2^b − 1 possible values other than the (n + 1)th digit in the present node's id.
[Figure 1 appears here: a table showing the example node's leaf set (smaller and larger halves), its routing table, and its neighborhood set; the individual entries are omitted from this text version.]

Fig. 1. State of a hypothetical Pastry node with nodeId 10233102, b = 2, and |L| = 8. All numbers are in base 4. The top row of the routing table is row zero. The shaded cell in each row of the routing table shows the corresponding digit of the present node's nodeId. The nodeIds in each entry have been split to show the common prefix with 10233102 - next digit - rest of nodeId. The associated IP addresses are not shown.
`
`Each entry in the routing table contains the IP address of one of potentially many
`nodes whose nodeId have the appropriate prefix; in practice, a node is chosen that is
`close to the present node, according to the proximity metric. We will show in Section 2.5
`that this choice provides good locality properties. If no node is known with a suitable
`nodeId, then the routing table entry is left empty. The uniform distribution of nodeIds
ensures an even population of the nodeId space; thus, on average, only ⌈log_{2^b} N⌉ rows
`are populated in the routing table.
`
The choice of b involves a trade-off between the size of the populated portion of the routing table (approximately ⌈log_{2^b} N⌉ × (2^b − 1) entries) and the maximum number of hops required to route between any pair of nodes (⌈log_{2^b} N⌉). With a value of b = 4 and 10^6 nodes, a routing table contains on average 75 entries and the expected number of routing hops is 5, whilst with 10^9 nodes, the routing table contains on average 105 entries, and the expected number of routing hops is 7.
The neighborhood set M contains the nodeIds and IP addresses of the |M| nodes that are closest (according to the proximity metric) to the local node. The neighborhood set is not normally used in routing messages; it is useful in maintaining locality properties, as discussed in Section 2.5. The leaf set L is the set of nodes with the |L|/2 numerically closest larger nodeIds, and the |L|/2 nodes with numerically closest smaller nodeIds, relative to the present node's nodeId. The leaf set is used during message routing, as described below. Typical values for |L| and |M| are 2^b or 2 × 2^b.
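A minimal sketch of the per-node state just described, with field names of our own choosing (real entries would also carry the nodes' IP addresses):

from dataclasses import dataclass, field
from typing import List, Optional

B = 4                       # digit size in bits
ROWS = 128 // B             # only about log_{2^b} N of these rows are populated in practice
COLS = 1 << B               # 2^b columns; the column matching our own digit stays unused

@dataclass
class NodeState:
    node_id: int
    # routing_table[l][i]: a nearby node whose id shares the first l digits with ours and
    # whose next digit is i, or None if no such node is known.
    routing_table: List[List[Optional[int]]] = field(
        default_factory=lambda: [[None] * COLS for _ in range(ROWS)])
    leaf_smaller: List[int] = field(default_factory=list)   # |L|/2 closest smaller nodeIds
    leaf_larger: List[int] = field(default_factory=list)    # |L|/2 closest larger nodeIds
    neighborhood: List[int] = field(default_factory=list)   # |M| closest nodes by proximity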
`How the various tables of a Pastry node are initialized and maintained is the subject
`of Section 2.4. Figure 1 depicts the state of a hypothetical Pastry node with the nodeId
10233102 (base 4), in a system that uses 16-bit nodeIds and a value of b = 2.
`
`2.2 Routing
`
`The Pastry routing procedure is shown in pseudo code form in Table 1. The procedure
`is executed whenever a message with key D arrives at a node with nodeId A. We begin
`by defining some notation.
R_l^i: the entry in the routing table R at column i, 0 ≤ i < 2^b, and row l, 0 ≤ l < ⌊128/b⌋.
L_i: the i-th closest nodeId in the leaf set L, −⌊|L|/2⌋ ≤ i ≤ ⌊|L|/2⌋, where negative/positive indices indicate nodeIds smaller/larger than the present nodeId, respectively.
D_l: the value of the l-th digit in the key D.
shl(A, B): the length of the prefix shared among A and B, in digits.
`
Table 1. Pseudo code for Pastry core routing algorithm.

(1)  if (L_{−⌊|L|/2⌋} ≤ D ≤ L_{⌊|L|/2⌋}) {
(2)      // D is within range of our leaf set
(3)      forward to L_i, s.th. |D − L_i| is minimal;
(4)  } else {
(5)      // use the routing table
(6)      Let l = shl(D, A);
(7)      if (R_l^{D_l} ≠ null) {
(8)          forward to R_l^{D_l};
(9)      }
(10)     else {
(11)         // rare case
(12)         forward to T ∈ L ∪ R ∪ M, s.th.
(13)             shl(T, D) ≥ l,
(14)             |T − D| < |A − D|
(15)     }
(16) }
`
`Given a message, the node first checks to see if the key falls within the range of
`nodeIds covered by its leaf set (line 1). If so, the message is forwarded directly to the
`destination node, namely the node in the leaf set whose nodeId is closest to the key
`(possibly the present node) (line 3).
`If the key is not covered by the leaf set, then the routing table is used and the message
`is forwarded to a node that shares a common prefix with the key by at least one more digit
`(lines 6–8). In certain cases, it is possible that the appropriate entry in the routing table
is empty or the associated node is not reachable (lines 11–14), in which case the message
`is forwarded to a node that shares a prefix with the key at least as long as the local node,
`
`and is numerically closer to the key than the present node’s id. Such a node must be in
`the leaf set unless the message has already arrived at the node with numerically closest
nodeId. And, unless ⌊|L|/2⌋ adjacent nodes in the leaf set have failed simultaneously,
`at least one of those nodes must be live.
`This simple routing procedure always converges, because each step takes the message
`to a node that either (1) shares a longer prefix with the key than the local node, or (2)
`shares as long a prefix with, but is numerically closer to the key than the local node.
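The following sketch restates the pseudo code of Table 1 in Python, reusing shl(), digit() and the NodeState layout from the earlier sketches; the dist() argument, the single-node special case and the omission of actual message forwarding and failure handling are assumptions.

def route(state, D: int, dist):
    # dist(x) should give |x - D| in the id space; state is a NodeState as sketched above.
    A = state.node_id
    if D == A:
        return A                                        # we are the destination
    leaves = state.leaf_smaller + [A] + state.leaf_larger
    # Lines (1)-(3): D lies within the leaf set's range -- go to the numerically closest node.
    if state.leaf_smaller and state.leaf_larger and \
            min(state.leaf_smaller) <= D <= max(state.leaf_larger):
        return min(leaves, key=dist)
    # Lines (5)-(8): use the routing table entry that extends the shared prefix by one digit.
    l = shl(D, A)
    T = state.routing_table[l][digit(D, l)]
    if T is not None:
        return T
    # Lines (11)-(14): rare case -- any known node with an equally long prefix match that is
    # numerically closer to the key than we are.
    known = leaves + state.neighborhood + \
        [e for row in state.routing_table for e in row if e is not None]
    candidates = [n for n in known if shl(n, D) >= l and dist(n) < dist(A)]
    return min(candidates, key=dist) if candidates else A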
`
`Routing performance. It can be shown that the expected number of routing steps is
⌈log_{2^b} N⌉ steps, assuming accurate routing tables and no recent node failures. Briefly,
`consider the three cases in the routing procedure. If a message is forwarded using the
`routing table (lines 6–8), then the set of nodes whose ids have a longer prefix match
with the key is reduced by a factor of 2^b in each step, which means the destination is reached in ⌈log_{2^b} N⌉ steps. If the key is within range of the leaf set (lines 2–3), then the
`destination node is at most one hop away.
`The third case arises when the key is not covered by the leaf set (i.e., it is still more
`than one hop away from the destination), but there is no routing table entry. Assuming
`accurate routing tables and no recent node failures, this means that a node with the
`appropriate prefix does not exist (lines 11–14). The likelihood of this case, given the
uniform distribution of nodeIds, depends on |L|. Analysis shows that with |L| = 2^b and |L| = 2 × 2^b, the probability that this case arises during a given message transmission is less than 0.02 and 0.006, respectively. When it happens, no more than one additional
`routing step results with high probability.
`In the event of many simultaneous node failures, the number of routing steps required
`may be at worst linear in N , while the nodes are updating their state. This is a loose upper
`bound; in practice, routing performance degrades gradually with the number of recent
`node failures, as we will show experimentally in Section 3.1. Eventual message delivery
is guaranteed unless ⌊|L|/2⌋ nodes with consecutive nodeIds fail simultaneously. Due to the expected diversity of nodes with adjacent nodeIds, and with a reasonable choice for |L| (e.g. 2^b), the probability of such a failure can be made very low.
`
`2.3 Pastry API
`
`Next, we briefly outline Pastry’s application programming interface (API). The presented
`API is slightly simplified for clarity. Pastry exports the following operations:
`
`nodeId = pastryInit(Credentials, Application) causes the local node to join an ex-
`isting Pastry network (or start a new one), initialize all relevant state, and return
`the local node’s nodeId. The application-specific credentials contain information
`needed to authenticate the local node. The application argument is a handle to the
`application object that provides the Pastry node with the procedures to invoke when
`certain events happen, e.g., a message arrival.
`route(msg,key) causes Pastry to route the given message to the node with nodeId nu-
`merically closest to the key, among all live Pastry nodes.
`Applications layered on top of Pastry must export the following operations:
`
`deliver(msg,key) called by Pastry when a message is received and the local node’s
`nodeId is numerically closest to key, among all live nodes.
`forward(msg,key,nextId) called by Pastry just before a message is forwarded to the
`node with nodeId = nextId. The application may change the contents of the message
`or the value of nextId. Setting the nextId to NULL terminates the message at the
`local node.
`newLeafs(leafSet) called by Pastry whenever there is a change in the local node’s leaf
`set. This provides the application with an opportunity to adjust application-specific
`invariants based on the leaf set.
`
`Several applications have been built on top of Pastry using this simple API, including
`PAST [11,21] and SCRIBE [22], and several applications are under development.
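In an object-oriented setting the API might be rendered roughly as follows; this is a sketch, and any names beyond those listed above are assumptions.

from abc import ABC, abstractmethod

class Application(ABC):
    # Callbacks exported by the application and invoked by its local Pastry node.
    @abstractmethod
    def deliver(self, msg, key): ...            # this node is numerically closest to key
    @abstractmethod
    def forward(self, msg, key, next_id): ...   # may alter msg or next_id; None terminates routing
    @abstractmethod
    def new_leafs(self, leaf_set): ...          # the local leaf set has changed

class PastryNode:
    def __init__(self, credentials, application: Application):
        # pastryInit: join an existing Pastry network (or start a new one) and set up the
        # routing table, leaf set and neighborhood set; credentials authenticate the node.
        self.application = application
        self.node_id = None                     # assigned during the join, as in Section 2.4

    def route(self, msg, key):
        # Route msg to the live node whose nodeId is numerically closest to key.
        raise NotImplementedError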
`
`2.4 Self-Organization and Adaptation
`
`In this section, we describe Pastry’s protocols for handling the arrival and departure
`of nodes in the Pastry network. We begin with the arrival of a new node that joins the
`system. Aspects of this process pertaining to the locality properties of the routing tables
`are discussed in Section 2.5.
`
`Node arrival. When a new node arrives, it needs to initialize its state tables, and then
`inform other nodes of its presence. We assume the new node knows initially about a
`nearby Pastry node A, according to the proximity metric, that is already part of the
`system. Such a node can be located automatically, for instance, using “expanding ring”
`IP multicast, or be obtained by the system administrator through outside channels.
`Let us assume the new node’s nodeId is X. (The assignment of nodeIds is application-
`specific; typically it is computed as the SHA-1 hash of its IP address or its public key).
`Node X then asks A to route a special “join” message with the key equal to X. Like any
`message, Pastry routes the join message to the existing node Z whose id is numerically
`closest to X.
`In response to receiving the “join” request, nodes A, Z, and all nodes encountered
`on the path from A to Z send their state tables to X. The new node X inspects this
`information, may request state from additional nodes, and then initializes its own state
tables, using a procedure described below. Finally, X informs any nodes that need to be
`aware of its arrival. This procedure ensures that X initializes its state with appropriate
`values, and that the state in all other affected nodes is updated.
Since node A is assumed to be in proximity to the new node X, A's neighborhood set is used to initialize X's neighborhood set. Moreover, Z has the closest existing nodeId to X,
`thus its leaf set is the basis for X’s leaf set. Next, we consider the routing table, starting
`at row zero. We consider the most general case, where the nodeIds of A and X share
`no common prefix. Let Ai denote node A’s row of the routing table at level i. Note that
`the entries in row zero of the routing table are independent of a node’s nodeId. Thus, A0
`contains appropriate values for X0. Other levels of A’s routing table are of no use to X,
`since A’s and X’s ids share no common prefix.
`However, appropriate values for X1 can be taken from B1, where B is the first node
`encountered along the route from A to Z. To see this, observe that entries in B1 and
`X1 share the same prefix, because X and B have the same first digit in their nodeId.
`Similarly, X obtains appropriate entries for X2 from node C, the next node encountered
`along the route from A to Z, and so on.
`Finally, X transmits a copy of its resulting state to each of the nodes found in its
`neighborhood set, leaf set, and routing table. Those nodes in turn update their own state
`based on the information received. One can show that at this stage, the new node X
`is able to route and receive messages, and participate in the Pastry network. The total
cost for a node join, in terms of the number of messages exchanged, is O(log_{2^b} N). The constant is about 3 × 2^b.
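A sketch of how the arriving node X might assemble its state from the replies it receives, reusing the NodeState layout from Section 2.1; the function and parameter names are assumptions.

def build_tables(x_id: int, join_path_states, z_state):
    # join_path_states: states of A, B, C, ... in the order the "join" message visited them;
    # the i-th node on this path shares at least the first i digits with X. z_state is Z's state.
    new = NodeState(node_id=x_id)
    new.neighborhood = list(join_path_states[0].neighborhood)   # A is assumed to be near X
    new.leaf_smaller = list(z_state.leaf_smaller)               # Z has the numerically closest nodeId
    new.leaf_larger = list(z_state.leaf_larger)
    for i, st in enumerate(join_path_states):
        new.routing_table[i] = list(st.routing_table[i])        # row i is taken from the i-th node
    return new    # X would then send its resulting state to every node it now references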
`Pastry uses an optimistic approach to controlling concurrent node arrivals and de-
`partures. Since the arrival/departure of a node affects only a small number of existing
`nodes in the system, contention is rare and an optimistic approach is appropriate. Briefly,
`whenever a node A provides state information to a node B, it attaches a timestamp to
`the message. B adjusts its own state based on this information and eventually sends an
`update message to A (e.g., notifying A of its arrival). B attaches the original timestamp,
`which allows A to check if its state has since changed. In the event that its state has
`changed, it responds with its updated state and B restarts its operation.
`
`Node departure. Nodes in the Pastry network may fail or depart without warning. In this
`section, we discuss how the Pastry network handles such node departures. A Pastry node
`is considered failed when its immediate neighbors in the nodeId space can no longer
`communicate with the node.
`To replace a failed node in the leaf set, its neighbor in the nodeId space contacts the
live node with the largest index on the side of the failed node, and asks that node for its leaf table. For instance, if L_i failed for −⌊|L|/2⌋ ≤ i < 0, it requests the leaf set from L_{−⌊|L|/2⌋}. Let the received leaf set be L′. This set partly overlaps the present node's leaf set L, and it contains nodes with nearby ids not presently in L. Among these new nodes, the appropriate one is then chosen to insert into L, verifying that the node is actually alive by contacting it. This procedure guarantees that each node lazily repairs its leaf set unless ⌊|L|/2⌋ nodes with adjacent nodeIds have failed simultaneously. Due to the diversity of nodes with adjacent nodeIds, such a failure is very unlikely even for modest values of |L|.
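A sketch of this leaf-set repair step; request_leaf_set() and is_alive() are assumed helpers, the choice among the new candidates is simplified, and wrap-around of the circular nodeId space is ignored.

def repair_leaf_set(state, failed_id: int, request_leaf_set, is_alive):
    # Ask the live node with the largest index on the failed node's side for its leaf set L',
    # then insert the closest live candidate that is not already known.
    smaller_side = failed_id < state.node_id
    side = state.leaf_smaller if smaller_side else state.leaf_larger
    side.remove(failed_id)
    donor = min(side) if smaller_side else max(side)     # most distant live node on that side
    for cand in sorted(request_leaf_set(donor), key=lambda c: abs(c - state.node_id)):
        on_side = cand < state.node_id if smaller_side else cand > state.node_id
        if on_side and cand not in side and is_alive(cand):
            side.append(cand)                            # leaf set is back to |L|/2 entries
            break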
`The failure of a node that appears in the routing table of another node is detected
`when that node attempts to contact the failed node and there is no response. As explained
`in Section 2.2, this event does not normally delay the routing of a message, since the
`message can be forwarded to another node. However, a replacement entry must be found
`to preserve the integrity of the routing table.
To repair a failed routing table entry R_l^d, a node first contacts the node referred to by another entry R_l^i, i ≠ d, of the same row, and asks for that node's entry for R_l^d. In the event that none of the entries in row l have a pointer to a live node with the appropriate prefix, the node next contacts an entry R_{l+1}^i, i ≠ d, thereby casting a wider net. This procedure is highly likely to eventually find an appropriate node if one exists.
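The corresponding routing-table repair might look as follows; request_entry() is an assumed RPC that returns the queried node's entry for row l, column d (or None), and is_alive() is an assumed liveness probe.

def repair_routing_entry(state, l: int, d: int, request_entry, is_alive):
    # First ask the other entries of row l for their entry R_l^d, then widen to row l+1.
    for row in (l, l + 1):
        if row >= len(state.routing_table):
            break
        for i, peer in enumerate(state.routing_table[row]):
            if peer is None or i == d:
                continue
            candidate = request_entry(peer, l, d)        # peer's entry for row l, column d
            if candidate is not None and is_alive(candidate):
                state.routing_table[l][d] = candidate
                return candidate
    return None                                          # no appropriate node could be found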
`The neighborhood set is not normally used in the routing of messages, yet it is impor-
`tant to keep it current, because the set plays an important role in exchanging information
`about nearby nodes. For this purpose, a node attempts to contact each member of the
`neighborhood set periodically to see if it is still alive. If a member is not responding, the
`node asks other members for their neighborhood tables, checks the distance of each of
the newly discovered nodes, and updates its own neighborhood set accordingly.
`Experimental results in Section 3.2 demonstrate Pastry’s effectiveness in repairing
the node state in the presence of node failures, and quantify the cost of this repair in
`terms of the number of messages exchanged.
`
`2.5 Locality
`
In the previous sections, we discussed Pastry's basic routing properties and analyzed
`its performance in terms of the expected number of routing hops and the number of
`messages exchanged as part of a node join operation. This section focuses on another
`aspect of Pastry’s routing performance, namely its properties with respect to locality.
`We will show that the route chosen for a message is likely to be “good” with respect to
`the proximity metric.
`Pastry’s notion of network proximity is based on a scalar proximity metric, such as
`the number of IP routing hops or geographic distance. It is assumed that the application
`provides a function that allows each Pastry node to determine the “distance” of a node
`with a given IP address to itself. A node with a lower distance value is assumed to be
more desirable. An application is expected to implement this function depending on its
`choice of a proximity metric, using network services like traceroute or Internet subnet
`maps, and appropriate caching and approximation techniques to minimize overhead.
`Throughout this discussion, we assume that the proximity space defined by the chosen
`proximity metric is Euclidean; that is, the triangulation inequality holds for distances
`among Pastry nodes. This assumption does not hold in practice for some proximity
`metrics, such as the number of IP routing hops in the Internet. If the triangulation
`inequality does not hold, Pastry’s basic routing is not affected; however, the locality
`properties of Pastry routes may suffer. Quantifying the impact of such deviations is the
`subject of ongoing work.
`We begin by describing how the previously described procedure for node arrival is
`augmented with a heuristic that ensures that routing table entries are chosen to provide
`good locality properties.
`
`Locality in the routing table. In Section 2.4, we described how a newly joining node
`initializes its routing table. Recall that a newly joining node X asks an existing node A
to route a join message using X as the key. The message follows a path through nodes A, B, etc., and eventually reaches node Z, which is the live node with the numerically closest nodeId to X. Node X initializes its routing table by obtaining the i-th row of its
`routing table from the i-th node encountered along the route from A to Z.
`The property we wish to maintain is that all routing table entries refer to a node that
`is near the present node, according to the proximity metric, among all live nodes with a
`prefix appropriate for the entry. Let us assume that this property holds prior to node X’s
joining the system, and show how we can maintain the property as node X joins.
`First, we require that node A is near X, according to the proximity metric. Since the
`entries in row zero of A’s routing table are close to A, A is close to X, and we assume
`that the triangulation inequality holds in the proximity space, it follows that the entries
`are relatively near A. Therefore, the desired property is preserved. Likewise, obtaining
`X’s neighborhood set from A is appropriate.
`Let us next consider row one of X’s routing table, which is obtained from node B.
`The entries in this row are near B, however, it is not clear how close B is to X. Intuitively,
`it would appear that for X to take row one of its routing table from node B does not
`preserve the desired property, since the entries are close to B, but not necessarily to
`X. In reality, the entries tend to be reasonably close to X. Recall that the entries in
`each successive row are chosen from an exponentially decreasing set size. Therefore,
`the expected distance from B to its row one entries (B1) is much larger than the expected
`distance traveled from node A to B. As a result, B1 is a reasonable choice for X1. This
`same argument applies for each successive level and routing step, as depicted in Figure 2.
`
`Level 0
`
`X
`
`A
`
`Z
`
`Level 2
`
`Level 1
`
`Level 0
`
`Fig. 2. Routing step distance versus distance of the representatives at each level (based on exper-
`imental data). The circles around the n-th node along the route from A to Z indicate the average
`distance of the node’s representatives at level n. Note that X lies within each circle.
`
`After X has initialized its state in this fashion, its routing table and neighborhood set
`approximate the desired locality property. However, the quality of this approximation
`must be improved to avoid cascading errors that could eventually lead to poor route
`locality. For this purpose, there is a second stage in which X requests the state from
`each of the nodes in its routing table and neighborhood set. It then compares the distance
`of corresponding entries found in those nodes’ routing tables and neighborhood sets,
`respectively, and updates its own state with any closer nodes it finds. The neighborhood
`set contributes valuable information in this process, because it maintains and propagates
`information about nearby nodes regardless of their nodeId prefix.
`Intuitively, a look at Figure 2 illuminates why incorporating the state of nodes men-
`tioned in the routing and neighborhood tables from stage one provides good represen-
`tatives for X. The circles show the average distance of the entry from each node along
`the route, corresponding to the rows in the routing table. Observe that X lies within
`each circle, albeit off-center. In the second stage, X obtains the state from the entries
`discovered in stage one, which are located at an average distance equal to the perimeter
`of each respective circle. These states must include entries that are appropriate for X,
`but were not discovered by X in stage one, due to its off-center location.
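A sketch of this second stage, reusing shl(), digit() and ROWS from the earlier sketches; request_state() and dist() are assumed helpers (dist(n) being the proximity-metric distance from the local node to n), and the rule for adopting a closer candidate is our reading of "corresponding entries".

def improve_locality(state, request_state, dist):
    # Pull the state of every node we currently reference and keep any closer candidates.
    contacts = {n for row in state.routing_table for n in row if n is not None}
    contacts |= set(state.neighborhood)
    for peer in contacts:
        ps = request_state(peer)
        seen = {c for row in ps.routing_table for c in row if c is not None}
        seen |= set(ps.neighborhood) | set(ps.leaf_smaller) | set(ps.leaf_larger)
        for cand in seen:
            l = shl(cand, state.node_id)          # the row this candidate would occupy for us
            if l >= ROWS:
                continue                          # cand is our own nodeId
            i = digit(cand, l)
            cur = state.routing_table[l][i]
            if cur is None or dist(cand) < dist(cur):
                state.routing_table[l][i] = cand  # adopt the closer representative
    # The neighborhood set would be refreshed in the same way, keeping the |M| closest nodes.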
`Experimental results in Section 3.2 show that this procedure maintains the locality
`property in the routing table and neighborhood sets with high fidelity. Next, we discuss
`how the locality in Pastry’s routing tables affects Pastry routes.
`
`Route locality. The entries in the routing table of each Pastry node are chosen to be close
`to the present node, according to the proximity metric, among all nodes with the desired
`nodeId prefix. As a result, in each routing step, a message is forwarded to a relatively
`close node with a nodeId that shares a longer common prefix or is numerically closer to
`the key than the local node. That is, each step moves the message closer to the destination
`in the nodeId space, while traveling the least possible distance in the proximity space.
`Since only local information is us
