(12) United States Patent
Dubnicki et al.

(10) Patent No.: US 8,140,625 B2
(45) Date of Patent: Mar. 20, 2012

(54) METHOD FOR OPERATING A FIXED PREFIX PEER TO PEER NETWORK

(75) Inventors: Cezary Dubnicki, Monmouth Junction, NJ (US); Leszek Gryz, Princeton, NJ (US); Krzysztof Lichota, Warszawa (PL); Cristian Ungureanu, Princeton, NJ (US)

(73) Assignee: NEC Laboratories America, Inc., Princeton, NJ (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 440 days.

(21) Appl. No.: 12/023,141

(22) Filed: Jan. 31, 2008

(65) Prior Publication Data

Related U.S. Application Data

(60) Provisional application No. 60/890,661, filed on Feb. 20, 2007.

(51) Int. Cl.
    G06F 15/16 (2006.01)
    G06F 7/00 (2006.01)
(52) U.S. Cl. ............... 709/205; 709/206; 707/E17.032
(58) Field of Classification Search ........ 709/205-206; 707/10, 100, 395.32, E17.032
    See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

2004/0215622 A1   10/2004  Dubnicki et al.
2005/0135381 A1*   6/2005  Dubnicki et al. .............. 370/395.32
2007/0050590 A1*   3/2007  Syed et al. ................... 711/170

OTHER PUBLICATIONS

Ben-Or, M., "Another Advantage of Free Choice: Completely Asynchronous Agreement Protocols", Proc. 2nd ACM Symposium Principles Distributed Computing, p. 27-30, 1983.
Dubnicki, C., et al., "FPN: A Distributed Hash Table for Commercial Applications", 13th IEEE Int'l. Symposium High Perf. Dist. Computing, p. 120-128, 2004.
Geels, D., "Data Replication in OceanStore", University of California, (EECS) Report No. UCB/CSD-02-1217, 2002.
. . ., Architectures, p. 41-52, 2002.
Leslie, M., et al., "Replication Strategies for Reliable Decentralised Storage", IEEE Computer Society, Proc. 1st Int'l. Conf., ARES, 2006.
Lynch, N., et al., "Atomic Data Access in Distributed Hash Tables", Lecture Notes in Comp. Sci., 2429, p. 295-305, IPTPS, 2002.
Schneider, F., "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial", ACM Comp. Surveys, 22(4), p. 299-319, 1990.

Primary Examiner — Peling Shaw
(74) Attorney, Agent, or Firm — James Bitetto; Joseph Kolodka

(57) ABSTRACT

A fixed prefix peer to peer network has a number of physical nodes. The nodes are logically divided into a number of storage slots. Blocks of data are erasure coded into original and redundant data fragments and the resultant fragments of data are stored in slots on separate physical nodes such that no physical node has more than one original and/or redundant fragment. The storage locations of all of the fragments are organized into a logical virtual node (e.g., a supernode). Thus, the supernode and the original block of data can be recovered even if some of the physical nodes are lost.

20 Claims, 8 Drawing Sheets
`
`
`
`
[FIG. 1 (Sheet 1 of 8): exemplary peer to peer network 100 with physical nodes 102, 104, 106, and 108.]
`
`
`
[FIG. 2 (Sheet 2 of 8): portion of an exemplary peer to peer network 200 with physical nodes 202, 204, 206, 208, 210, and 212.]
`
`
`
[FIG. 3 (Sheet 3 of 8): exemplary peer to peer network 300 with physical nodes 302, 304, 306, and 308.]
`
`
`
[FIG. 4 (Sheet 4 of 8): supernode composition and component description table 400 for supernode 226 (prefix 01101, cardinality = 6, version); component array 404 lists components 214b, 216c, 218b, 220d, 222a, and 224a, each with prefix 01101, together with component description fields 408 and 410.]
`
`
`
[FIG. 5 (Sheet 5 of 8): flowchart of method 500; steps include receiving a message from peers, rejecting the message, storing information at the node, determining whether a composition activates a component, activating the component, determining whether known compositions obsolesce a skeleton, and removing the skeleton.]
`
`
`
`
`
`
`
`
[FIG. 6 (Sheet 6 of 8): flowchart of method 600 (steps 602-622) - START; ADD NEW PHYSICAL NODE (604); LOCATE LIVE PHYSICAL NODE (606); PERFORM EMPTY PHYSICAL NODE ALGORITHM (608); CREATE COMPONENTS; ROUTE MESSAGES THROUGH PEER TO PEER NETWORK; UPDATE PEER TO PEER NETWORK; MONITOR STATUS OF PEERS; IS PEER REACHABLE?; RECOVER DATA FROM UNREACHABLE PEERS; END.]
`
`
`
[FIG. 7 (Sheet 7 of 8): flowchart of a generic change method 700 (steps 702-716) - START; COMPUTE ALL POSSIBLE CHANGES TO THE NETWORK (704); REQUEST INFORMATION FROM A POTENTIAL NODE (706); IS CURRENT STATE CLOSE TO KNOWN STATISTICS? (708); VOTE ON NEW COMPONENT LOCATION (710); IS VOTE WON? (712); SEND MESSAGE TO DESTINATION NODE (714); END (716).]
`
`
`
[FIG. 8 (Sheet 8 of 8): schematic of a controller 800 comprising a network interface, processor, storage, and memory.]
`
`
`
`METHOD FOR OPERATING A FIXED
`PREFIX PEER TO PEER NETWORK
`
This application claims the benefit of U.S. Provisional Application No. 60/890,661, filed Feb. 20, 2007, which is incorporated herein by reference.
`
`
`BACKGROUND OF THE INVENTION
`
`
The present invention relates generally to peer to peer networking and more particularly to storing data in peer to peer networks.

Peer to peer networks for storing data may be overlay networks that allow data to be distributively stored in the network (e.g., at nodes). In peer to peer networks, there are links between any two peers (e.g., nodes) that know each other. That is, nodes in the peer to peer network may be considered as being connected by virtual or logical links, each of which corresponds to a path in the underlying network (e.g., a path of physical links). Such a structured peer to peer network employs a globally consistent protocol to ensure that any node can efficiently route a search to some peer that has a desired file or piece of data. A common type of structured peer to peer network uses a distributed hash table (DHT) in which a variant of consistent hashing is used to assign ownership of each file to a particular peer in a way analogous to a traditional hash table's assignment of each key to a particular array slot.

However, traditional DHTs do not readily support data redundancy and may compromise the integrity of data stored in systems using DHTs. To overcome these obstacles, data items are N-way replicated, but this results in high storage overhead and often requires multiple hashing functions to locate copies of the data. Further, it is difficult to add support for monitoring data resiliency and automatic rebuilding of missing data.

Accordingly, improved systems and methods of organizing and storing data in peer to peer networks are required.
`
BRIEF SUMMARY OF THE INVENTION

The present invention generally provides a method of operating a fixed prefix peer to peer network having a plurality of physical nodes logically divided into storage slots. A set of the storage slots hosts components and is logically organized into a virtual node. A physical node receives a message with identification information indicative of a version of a component hosted on another physical node, determines a relative age of the component based on the identification information, and, if the version of the component hosted on the second physical node is newer than a version of the component hosted on the node, stores the identification information as a component skeleton at a storage slot on the node. The node also determines if the identification information stored at the storage slot on the first physical node activates a component and, if the identification information activates the component, activates the component by hosting the component on the node.

The present invention further provides a method of operating a node of a peer to peer network having a plurality of physical nodes each having a plurality of storage slots. The peer to peer network stores data in the plurality of storage slots, associates a plurality of components with a plurality of the storage slots, associates each component with the physical node that has the storage slot that hosts the component, associates a set of the components logically as a virtual node where the virtual node comprises storage slots from a plurality of the physical nodes, facilitates communication between the physical nodes associated with the components to determine if a component is lost, and, if a component is lost, initiates a recovery procedure. The recovery procedure involves evaluating a plurality of new host locations for a replacement component, voting on the plurality of replacement locations, and creating a replacement component from data in the original components.

These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an exemplary peer to peer network according to an embodiment of the invention;
FIG. 2 is a diagram of an exemplary peer to peer network according to an embodiment of the invention;
FIG. 3 is a diagram of an exemplary peer to peer network according to an embodiment of the invention;
FIG. 4 is a depiction of data to be stored in a peer to peer network;
FIG. 5 is a flowchart of a method of storing data in a fixed prefix peer to peer network according to an embodiment of the present invention;
FIG. 6 shows a flowchart of a method of operation of a node in a fixed prefix peer to peer network according to an embodiment of the present invention;
FIG. 7 shows a flowchart of a generic change method; and
FIG. 8 is a schematic drawing of a controller.

DETAILED DESCRIPTION

The present invention extends the concept of Distributed Hash Tables (DHTs) to create a more robust peer to peer network. The improved methods of storing data described herein allow for a simple DHT organization with built-in support for multiple classes of data redundancy which have a smaller storage overhead than previous DHTs. Embodiments of the invention also support automatic monitoring of data resilience and automatic reconstruction of lost and/or damaged data.

The present invention provides greater robustness and resiliency to the DHT-based peer to peer network known as a Fixed Prefix Network (FPN) disclosed in U.S. patent application Ser. No. 10/813,504, filed Mar. 30, 2004, now U.S. Pat. No. 7,304,994, issued on Dec. 4, 2007, and incorporated herein by reference. Unlike traditional peer to peer networks, FPNs and networks according to the present invention, known as FPNs with Supernodes (FPN/SN), are constructed such that the contributed resources (e.g., nodes) are dedicated to the peer to peer system and the systems are accordingly significantly more stable and scalable.

FIGS. 1-3 depict various illustrative embodiments of peer to peer networks utilizing FPN/SNs. FIGS. 1-3 are exemplary diagrams to illustrate the various structures and relationships described below and are not meant to limit the invention to the specific network layouts shown.
`FIG. 1 is a diagram of an exemplary peer to peer network
100 for use with an embodiment of the present invention. The
peer to peer network 100 has a plurality of physical nodes
102, 104, 106, and 108 that communicate with each other
through an underlying transport network 110 as is known.
There is no restriction on the location, grouping, or number of
the physical nodes 102-108 with regard to the present inven-
tion. Though depicted in FIG. 1 as four physical nodes 102-
108, it is understood that any number of nodes in any arrange-
`ment may be utilized. Similarly, the physical nodes 102-108
`may vary in actual storage space, processing power, and/or
`other resources.
`
`Physical nodes 102-108 each have associated memories
`and/or storage areas (not shown) as is known. The memories
`and/or storage areas of physical nodes 102-108 are each logi-
`cally divided into a plurality of slots approximately propor-
`tional to the amount of storage available to each physical
`node. That is, the memory and/or storage area of physical
`node 102 is logically divided into approximately equivalent-
`sized slots 112a, 112b, 112c, and 112d, the memory and/or
`storage area of physical node 104 is logically divided into
`approximately equivalent-sized slots 114a, 114b, 114c, and
114d, the memory and/or storage area of physical node 106 is
`logically divided into approximately equivalent-sized slots
`116a, 116b, 116c, and 116d, and the memory and/or storage
`area of physical node 108 is logically divided into approxi-
`mately equivalent-sized (e.g., in terms of storage capacity)
slots 118a, 118b, 118c, and 118d. A physical node may be
`logically divided in that its memory and/or storage allocation
`may be allocated as different storage areas (e.g., slots). Physi-
`cal nodes 102-108 may be divided into any appropriate num-
`ber of slots, the slots being representative of an amount of
`storage space in the node. In other words, data may be stored
`in the nodes 102-108 in a sectorized or otherwise compart-
mentalized manner. Of course, any appropriate division of the
`storage and/or memory of physical nodes 102-108 may be
used and slots 112a-d, 114a-d, 116a-d, and 118a-d may be of
`unequal size. Further, slot size may not be static and may
`grow or shrink and slots may be split and/or may be merged
`with other slots.
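As a rough illustration of the logical division just described, the following Python sketch divides a physical node's storage into approximately equivalent-sized slots. The class names, the slot-naming scheme, and the four-slot count are illustrative assumptions, not part of the disclosure.

    from dataclasses import dataclass, field

    @dataclass
    class Slot:
        slot_id: str          # e.g., "112a"
        capacity: int         # bytes of storage reserved for this slot
        data: dict = field(default_factory=dict)  # hash key -> stored fragment

    @dataclass
    class PhysicalNode:
        node_id: str
        capacity: int
        slots: list = field(default_factory=list)

        def divide_into_slots(self, num_slots):
            # Logically divide the node's storage into approximately
            # equivalent-sized slots; slot sizes may later grow, shrink,
            # split, or merge as described above.
            per_slot = self.capacity // num_slots
            self.slots = [Slot(f"{self.node_id}{chr(ord('a') + i)}", per_slot)
                          for i in range(num_slots)]
            return self.slots

    # Physical node 102 of FIG. 1, whose slots are labeled 112a-112d.
    node_102 = PhysicalNode("112", capacity=4 * 10**9)
    node_102.divide_into_slots(4)   # slots 112a, 112b, 112c, 112d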
`
`Each physical node 102-108 is responsible for the storage
`and retrieval of one or more objects (e.g., files, data, pieces of
`data, data fragments, etc.) in the slots 112a-d, 114a-d, 116a-
`d, and 118a-d, respectively. Each object may be associated
`with a preferably fixed-size hash key of a hash function. In
`operation, one or more clients 120 may communicate with
one or more of physical nodes 102-108 and issue a request for
`a particular object using a hash key.
Slots 112a-d, 114a-d, 116a-d, and 118a-d may also each be
`associated with a component of a virtual (e.g., logical) node
`(discussed in further detail below with respect to FIGS. 2 and
`3). Herein, components are not physical entities, but repre-
`sentations of a portion of a virtual node. That is, components
`may be logical representations of and/or directions to or
`addresses for a set or subset of data that is hosted in a particu-
`lar location in a node (e.g., hosted in a slot). Storage locations
`of data fragments (e.g., data fragments discussed below) are
`logically organized into a virtual node.
`FIG. 2 is a diagram of a portion of an exemplary peer to
`peer network 200 for use with an embodiment of the present
`invention. The peer to peer network 200 is similar to peer to
`peer network 100 and has a plurality of physical nodes 202,
`204, 206, 208, 210, and 212 similar to physical nodes 102-
`108. Physical nodes 202-212 are each logically divided into a
`plurality of slots approximately proportional to the amount of
`storage available to each physical node. That is, physical node
202 is divided logically into slots 214a, 214b, 214c, and 214d,
physical node 204 is divided logically into slots 216a, 216b,
`216c, and 216d, physical node 206 is divided logically into
`slots 218a, 218b, 218c, and 218d, physical node 208 is
`divided logically into slots 220a, 220b, 220c, and 220d,
`physical node 210 is divided logically into slots 222a, 222b,
`222c, and 222d, and physical node 212 is divided logically
into slots 224a, 224b, 224c, and 224d. For simplicity of
discussion and depiction in FIG. 2, since each slot 214a-d,
`216a-d, 218a-d, 220a-d, 222a-d, and 224a-d hosts a compo-
nent, the component corresponding to its host slot is referred
`to herein with the same reference numeral. For example, the
`component hosted in slot 214c of physical node 202 is
`referred to as component 214c.
A grouping of multiple components is referred to as a virtual node (e.g., a "supernode"). In the example of FIG. 2, supernode 226 comprises components 214b, 216c, 218b, 220d, 222a, and 224a. A virtual node (e.g., a supernode) is thus a logical grouping of a plurality of storage locations on multiple physical nodes. The supernode may have any number of components, where the number of components is the supernode cardinality (e.g., the number of components in a supernode), associated with any number of physical nodes in a network, and a supernode need not have components from every physical node. However, each component of a supernode must be hosted in a slot on a different physical node. That is, no two components in a supernode should be hosted at the same physical node. The total number of components in a supernode may be given by a predetermined constant, the supernode cardinality. In some embodiments, the supernode cardinality may be in the range of 4-32. The supernode cardinality may be a predetermined (e.g., desired, designed, etc.) number of data fragments.
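The constraint that a supernode groups exactly (supernode cardinality) components, no two of which reside on the same physical node, can be captured in a few lines. A minimal sketch, assuming illustrative class and function names:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Component:
        slot_id: str       # e.g., "214b"
        node_id: str       # physical node hosting the slot

    def form_supernode(components, cardinality):
        # A supernode is a logical grouping of exactly `cardinality` components.
        if len(components) != cardinality:
            raise ValueError("component count must equal the supernode cardinality")
        # No two components of a supernode may be hosted on the same physical node.
        hosts = [c.node_id for c in components]
        if len(set(hosts)) != len(hosts):
            raise ValueError("each component must reside on a different physical node")
        return tuple(components)

    # Supernode 226 of FIG. 2: six components on six distinct physical nodes.
    supernode_226 = form_supernode(
        [Component("214b", "202"), Component("216c", "204"),
         Component("218b", "206"), Component("220d", "208"),
         Component("222a", "210"), Component("224a", "212")],
        cardinality=6)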
`In some embodiments, a larger supernode cardinality is
`chosen to increase flexibility in choosing data classes. In
`alternative embodiments, a smaller supernode cardinality is
`chosen to provide greater access to storage locations (e.g.,
`disks) in read/write operations. Here, data classes define a
`level of redundancy where lower data classes (e.g., data class
`low) have less redundancy and higher data classes (e.g., data
`class high) have more redundancy. There may be a number of
`data classes equal to the predetermined supernode cardinality.
`The lowest data class is defined as having no redundant frag-
ment and the highest class is defined as having (supernode
cardinality-1) redundant fragments.
In an exemplary embodiment, data class low may refer to a single redundant fragment and data class high may refer to four redundant fragments. Of course, any appropriate number of data fragments may be set for data class low and/or data class high. In this exemplary embodiment, data blocks that are classified by a user as data class low will be divided into a number of fragments equal to the supernode cardinality, where there are (supernode cardinality-1) original fragments and one redundant fragment. Accordingly, one fragment may be lost and the data block may be recreated. Using data class high (e.g., four redundant fragments), a block of data will be divided into fragments such that four of them will be redundant. Thus, four fragments may be lost and the original block of data may be recreated. Fragmentation, especially redundant fragments, is described in concurrently filed, commonly assigned and co-pending U.S. patent application Ser. No. 12/023,133, filed on Jan. 31, 2008, entitled "Method and Apparatus for Storing Data in a Peer to Peer Network," incorporated by reference herein.
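A short sketch of the arithmetic implied by the data classes, assuming a supernode cardinality of 6 as in FIG. 2; the function name is an assumption, and the erasure coding itself is described in the co-pending application referenced above.

    SUPERNODE_CARDINALITY = 6   # illustrative; embodiments may use, e.g., 4-32

    def fragment_counts(redundant_fragments, cardinality=SUPERNODE_CARDINALITY):
        # A block is always cut into `cardinality` fragments in total; the data
        # class fixes how many of those fragments are redundant.
        if not 0 <= redundant_fragments <= cardinality - 1:
            raise ValueError("redundant fragments must be between 0 and cardinality-1")
        original = cardinality - redundant_fragments
        return original, redundant_fragments

    print(fragment_counts(1))   # data class low: (5, 1), one lost fragment survivable
    print(fragment_counts(4))   # data class high: (2, 4), four lost fragments survivable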
Components of the supernode may be considered peers and
may be similarly associated (e.g., in a hash table, etc.),
`addressed, and/or contacted as peer nodes in a traditional peer
`to peer network.
`FIG. 3 depicts a high level abstraction of an exemplary peer
`to peer network 300 according to an embodiment of the
`invention. Peer to peer network 300 is similar to peer to peer
`networks 100 and 200 and has multiple physical nodes 302,
`304, 306, and 308. Each of the physical nodes 302-308 is
`divided into multiple slots as described above. In the particu-
lar example of FIG. 3, each of the physical nodes 302-308 has
eight slots. As in FIG. 2, each slot 310, 312, 314, 316, 318,
320, 322, or 324 hosts a component 310, 312, 314, 316, 318, 320, 322, or 324. Components 310-324 are each associated with a corresponding supernode and are distributed among the physical nodes 302-308. In this way, eight supernodes are formed, each with one component 310-324 on each of the four physical nodes 302-308. For example, a first supernode is formed with four components: component 310 hosted on physical node 302 (e.g., in a slot 310), component 310 hosted in physical node 304 (e.g., in a slot 310), component 310 hosted in physical node 306 (e.g., in a slot 310), and component 310 hosted in physical node 308 (e.g., in a slot 310). The first supernode, comprising components 310, is shown as dashed boxes. A second supernode comprises the four components 312 hosted in physical nodes 302-308 and is shown as a trapezoid. Of course, these are merely graphical representations to highlight the different components comprising different supernodes and are not meant to be literal representations of what a slot, component, node, or supernode might look like. The remaining six supernodes are formed similarly.
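The arrangement of FIG. 3, in which slot i of every physical node hosts the component of supernode i, might be sketched as follows; the function name and the bare (node, slot index) tuples are illustrative assumptions standing in for real components.

    def build_layout(node_ids, slots_per_node):
        # Each physical node exposes `slots_per_node` slots; slot i of every
        # node hosts the component of supernode i, giving `slots_per_node`
        # supernodes whose cardinality equals the number of physical nodes.
        supernodes = {i: [] for i in range(slots_per_node)}
        for node in node_ids:
            for i in range(slots_per_node):
                supernodes[i].append((node, i))   # (hosting node, slot index)
        return supernodes

    layout = build_layout(["302", "304", "306", "308"], slots_per_node=8)
    assert len(layout) == 8                            # eight supernodes
    assert all(len(c) == 4 for c in layout.values())   # cardinality 4 each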
To facilitate data storage using the supernodes as described and shown in FIGS. 1-3, the fixed prefix network model of DHTs (e.g., FPN) may be extended to use supernodes. Any advantageous hashing function that maps data (e.g., objects, files, etc.) to a fixed-size hash key may be utilized in the context of the present invention. The hash keys may be understood to be fixed-size bit strings (e.g., 5 bits, 6 bits, etc.) in the space containing all possible combinations of such strings. A subspace of the hash key space is associated with a group of bits of the larger bit string as is known. For example, a group of hash keys beginning with 110 in a 5 bit string would include all hash keys except those beginning with 000, 001, 010, 011, 100, 101, and 111. That is, the prefix is 110. Such a subspace of the hash key space may be a supernode and a further specification may be a component of the supernode. The prefix may be fixed for the life of a supernode and/or component. In such embodiments, the peer to peer network is referred to as a fixed-prefix peer to peer network. Other methods of hashing may be used as appropriate.
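A minimal sketch of prefix-based key assignment, under the assumptions that hash keys are 5-bit strings and that the supernode prefixes shown partition the hash key space; truncated SHA-256 is only one possible choice of hashing function.

    import hashlib

    def hash_key(data: bytes, bits: int = 5) -> str:
        # Map data to a fixed-size hash key (here, the top `bits` bits of SHA-256,
        # rendered as a bit string); any suitable hashing function may be used.
        digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
        return format(digest >> (256 - bits), f"0{bits}b")

    def find_supernode(key: str, prefixes):
        # The supernode responsible for a key is the one whose fixed prefix
        # the key begins with (the prefixes partition the hash key space).
        for prefix in prefixes:
            if key.startswith(prefix):
                return prefix
        raise LookupError("no supernode covers this key")

    prefixes = ["000", "001", "010", "0110", "0111", "10", "110", "111"]
    key = hash_key(b"some block of data")
    print(key, "->", find_supernode(key, prefixes))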
FIG. 4 is an exemplary supernode composition and component description table 400 according to an embodiment of the present invention. The supernode composition and component description table 400 may be used in conjunction with the peer to peer network 200, for example. Each supernode (e.g., supernode 226) is described by a supernode composition (e.g., with supernode composition and component description table 400) comprising the supernode prefix 402, an array 404 of the component descriptions, and a supernode version 406. Since each component has a description as described below, the size of the array 404 is equal to the supernode cardinality. The supernode version 406 is a sequence number corresponding to the current incarnation of the supernode. Each supernode is identified by a fixed prefix 402 as described above and in U.S. patent application Ser. No. 10/813,504, now U.S. Pat. No. 7,304,994, issued on Dec. 4, 2007. For example, in hashing and/or storing data in peer to peer network 200 according to supernode composition and component description table 400 in which hash keys are fixed size bit strings, the supernode 226 has a fixed prefix of 01101. Therefore, any data that has a hash key beginning with 01101 will be associated with supernode 226.
`In operation, each component (e.g., 214b, 216c, 218b,
`220d, 222a, 224a, etc.) in the component array 404 is
`described by a component description comprising a fixed
`prefix 408, a component index 410, and a component version
`412. All components of the supernode (e.g., in array 404) are
`also assigned the same fixed prefix for their lifetime. The
component index 410 of each component corresponds to a
location in the supernode array. A component's index is fixed
`for the component’s lifetime and is an identification number
`pointing to the particular component. A component index is a
number between 0 and (supernode cardinality-1). A com-
`ponent’s version is a version number sequentially increased
`whenever the component changes hosts (e.g., nodes). In some
`embodiments, described in detail below, a component may be
`split or moved from one physical node to another and its
`version is increased in such instances.
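The composition record of FIG. 4 and its per-component descriptions might be represented as follows; the field names are assumptions, with the reference numerals in the comments mapping them back to the figure, and the structure is only a sketch.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ComponentDescription:
        prefix: str      # fixed for the component's lifetime (408)
        index: int       # 0 .. cardinality-1, fixed for the lifetime (410)
        version: int     # increased whenever the component changes hosts (412)

    @dataclass
    class SupernodeComposition:
        prefix: str                              # fixed supernode prefix (402)
        components: List[ComponentDescription]   # array 404, length = cardinality
        version: int                             # supernode incarnation (406)

    composition_226 = SupernodeComposition(
        prefix="01101",
        components=[ComponentDescription("01101", i, 1) for i in range(6)],
        version=1)

    def move_component(comp: ComponentDescription):
        # When a component is split or moved to another physical node,
        # its version is sequentially increased.
        comp.version += 1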
`
`Supernode composition and component description table
`400 is an example of an organization of the information
`related to physical nodes, supernodes, and their respective
`components. Of course, one skilled in the art would recognize
`other methods of organizing and providing such information,
`such as storing the information locally on physical nodes in a
`database, storing the information at a remote location in a
`communal database, etc.
`Updated indications of the supernode composition are
maintained (e.g., in supernode composition and component
`description table 400, etc.)
`to facilitate communication
`amongst peers. Further, physical nodes associated with the
`components maintain compositions of neighboring physical
`and/or virtual nodes. To maintain such compositions, physi-
`cal nodes associated with components ping peers and neigh-
`bors as is known. In this way, a physical node associated with
`a component may internally ping physical nodes associated
`with peers in the component’s supernode to determine virtual
`node health and/or current composition. Further, a physical
`node associated with a component may externally ping physi-
`cal nodes associated with neighbors (e.g., components with
the same index, but belonging to a different supernode) to
`propagate and/or collect composition information. Of course,
`other systems and methods of organizing and/or keeping
track of supernodes and their components, including version/
`incarnation information may be used as appropriate.
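A sketch of the internal and external pinging just described, assuming the SupernodeComposition structure sketched earlier and a caller-supplied send callback standing in for the underlying transport; both function names are assumptions.

    def ping_peers(my_index, composition, send):
        # Internal pings: contact the physical nodes hosting the other
        # components of this component's own supernode to check virtual node
        # health and current composition.
        return {d.index: send(d, composition)
                for d in composition.components if d.index != my_index}

    def ping_neighbors(my_index, neighbor_compositions, send):
        # External pings: contact the components that carry the same index but
        # belong to different supernodes, to propagate and collect compositions.
        return {c.prefix: send(c.components[my_index], c)
                for c in neighbor_compositions}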
FIG. 5 is a flowchart of a method 500 of organizing a framework (e.g., a skeleton) of a component in a fixed prefix peer to peer network with supernodes. A component skeleton may be organized (e.g., created and/or stored) on a physical node (e.g., physical nodes 202-212) to reserve storage space for a component (e.g., components 214a-d, 216a-d, 218a-d, 220a-d, 222a-d, 224a-d, etc.). That is, the component skeleton may provide an indication that a slot on the physical node is utilized. The method begins at step 502.
`In step 504, a message is received from a remote peer (e.g.,
another component of a supernode and/or physical node) at a
`destination node (e.g., physical nodes 202-212). The message
`may contain information about the description (e.g., compo-
`sition) of the remote peer as described above with respect to
`FIG. 4. The composition may be as discussed above and/or
`may include a composition identification of the remote peer
`including a prefix and composition version number. The
`method may proceed simultaneously and/or in series to
`method steps 506, 514, and/or 518.
`In step 506, a determination is made as to the age (e.g.,
`incarnation, version, etc.) of the composition. If the compo-
`sition is older than the newest composition known by the
destination node (e.g., if the composition has a lower version
`number), the message is rejected at step 508 and the method
`ends at step 522. If the composition is newer than the young-
`est composition known by the destination node (e.g., if the
`composition has a higher version number), the method pro-
`ceeds to step 510. Here, a composition is “known by” a node
if the node has such information stored at the node, if the node
has received information indicative of such compositions,
and/or if the node has access to (e.g., via a controller 800
and/or supernode composition and component description
`table 400) such information.
`In step 510, information about the skeleton (e.g., a compo-
sition and/or component index) is stored at the destination
`node. In this way, a component skeleton is created. A reply
`message is then sent to the originating node (e.g., the remote
`peer) in step 512 indicating success or failure of component
`creation.
`
`Following step 512, the method 500 may return control to
`step 504. That is, after a skeleton is created, the physical node
`hosting the skeleton receives information about each new
`composition and determines if the newly received composi-
`tion is newer than the present composition. If the newly
`received composition is newer, it replaces the older compo-
`sition at the node (e.g., is stored as in step 510).
In step 514, a check is performed to determine if the present
`composition activates the component related to the compo-
`nent skeleton. A composition activates the component if the
`composition is newer than the composition that is part of the
`skeleton identifier, has the same index as the component, the
`same prefix or a longer prefix than the skeleton, and the
`incarnation of the physical node hosting the component is the
`same incarnation as the new component. In step 516, the
`component is activated. That is, the component is hosted on
`the new physical node. If the composition does not activate
`the component, the method returns control to step 504 to
`receive more messages.
In step 518, a check is performed to determine if the known compositions (e.g., all the compositions received in messages in step 504) obsolesce the component skeleton. The component skeleton is determined to be obsolete if a set of the known compositions are newer than the component skeleton's composition and all of the compositions of the set cover the space represented by the skeleton's prefix and index. Here, the space represented by the skeleton's prefix and index is the Cartesian product of the hashing space and the index space spanning from zero to supernode cardinality minus one. If the component skeleton is obsolete, the skeleton is removed in step 520. If it is not, the method returns control to step 504 to receive more messages. The method ends at step 522.
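A consolidated Python sketch of the per-message logic of method 500. It simplifies deliberately: compositions are compared by version only, the incarnation check is a plain field comparison, and the coverage-based obsolescence test of step 518 is omitted; all names are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class Composition:
        prefix: str        # supernode prefix the remote peer belongs to
        index: int         # component index of the remote peer
        version: int       # composition sequence number
        incarnation: int   # incarnation of the physical node it targets

    def handle_peer_message(skeletons, msg, my_incarnation):
        # skeletons: (prefix, index) -> newest composition known by this node
        key = (msg.prefix, msg.index)
        known = skeletons.get(key)

        # Steps 506-508: reject a composition older than the newest known one.
        if known is not None and msg.version <= known.version:
            return "rejected"

        # Steps 510-512: store the newer composition as a component skeleton;
        # a reply reporting success or failure would be sent to the peer.
        skeletons[key] = msg

        # Steps 514-516 (simplified): the composition activates the component
        # if it is newer, carries the same index, has the same or a longer
        # prefix, and targets this physical node's current incarnation.
        if known is not None and msg.prefix.startswith(known.prefix) \
                and msg.incarnation == my_incarnation:
            return "component activated"
        return "skeleton stored"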
`FIGS. 6A and 6B depict a flowchart of a method 600 of
`operation of nodes in a fixed prefix peer to peer network
`according to an embodiment of the present invention. The
`method 600 may be performed in a network as described
`above with respect to FIGS. 1-4 wherein the fixed prefix
`network extends an FPN to include supernodes. Further, con-
ventional functions of peer to peer networks, especially FPNs
as disclosed in U.S. patent application Ser. No. 10/813,504,
filed Mar. 30, 2004, now U.S. Pat. No. 7,304,994, issued on
Dec. 4, 2007, and incorporated herein by reference, may be
modified for use with the present invention. Accordingly, it
should be understood that certain method steps of method 600
`described herein may be performed in other orders and are
`meant to be illustrative. One of skill in the art would recognize
`the necessary modifications to the conventional use of a peer
`to peer network that would be applied with an FPN/SN sys-
`tem described herein. The method starts at step 602 in FIG.
`6A.
`
`In step 604, a new physical node is added. The physical
`node may be a physical node similar to physical nodes 102-
108, 202-212, and 302-308 of FIGS. 1, 2, and 3, respectively.
`In step 606, the new physical node locates a live physical
`node in a peer to peer network (e.g., networks 100, 200, 300,
`etc.). To find a live physical node, the new physical node may
`locate the live node via a predetermined (e.g., user supplied,
system supplied, etc.) list of nodes in the network. In the
`context of the present invention, a live node is a node having
`data currently stored thereon. In an alternative embodiment,
`the new physical node may use a Service Location Protocol
`(SLP) as is known to find one or more live physical nodes in
`the network.
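One way a newly added node might locate a live peer, assuming a predetermined bootstrap list and a caller-supplied liveness probe; Service Location Protocol discovery would replace the list in the alternative embodiment, and all names here are hypothetical.

    def locate_live_node(candidate_nodes, is_live):
        # Walk a predetermined (user- or system-supplied) list of nodes and
        # return the first one that responds and currently stores data.
        for node in candidate_nodes:
            if is_live(node):
                return node
        return None   # fall back to another discovery mechanism (e.g., SLP)

    # Example: is_live would normally issue a network probe; here it is stubbed.
    bootstrap_list = ["10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"]
    live = locate_live_node(bootstrap_list, is_live=lambda n: n.endswith(".2:7000"))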
`
`In step 608, an empty physical node method is performed.
`A physical node may be considered to be “empty” where
`there are no active components (e.g., components 214a-d,
216a-d, 218a-d, 220a-d, 222a-d, and 224a-d), no obsolete
`components, no component skeletons (as described above
`with respect to FIG. 5), and no user supplied data stored at the
`node. Accordingly, method step 608 may be performed when-
`ever a physical node in a fixed prefix network with supernodes
`is found to be empty even if it is not a newly added node from
`step 604. The method may be performed as follows:
`The empty node transmits a number of empty node probes
`to nodes that have been identified as having more than one
`component from their node statistics. Node statistics may be
`a summary of information about a physical node such as the
`number of used slots, the number of free slots, the number of
`hosted components, total storage size, free disk space, etc. If
there are not enough nodes, the empty node transmits the
remainder of the necessary probes by generating random long
`prefixes and transmitting the empty node probes to those
`prefixes. In embodiments having a relatively small number of
`physical nodes, all nodes may be contacted. In embodiments
`having large numbers
