Tarzan: A Peer-to-Peer Anonymizing Network Layer
Michael J. Freedman
NYU Dept of Computer Science
715 Broadway #715
New York, NY 10003 USA
mfreed@cs.nyu.edu

Robert Morris
MIT Lab for Computer Science
200 Technology Sq. #509
Cambridge, MA 02139 USA
rtm@lcs.mit.edu
ABSTRACT
Tarzan is a peer-to-peer anonymous IP network overlay. Because it provides IP service, Tarzan is general-purpose and transparent to applications. Organized as a decentralized peer-to-peer overlay, Tarzan is fault-tolerant, highly scalable, and easy to manage.
Tarzan achieves its anonymity with layered encryption and multi-hop routing, much like a Chaumian mix. A message initiator chooses a path of peers pseudo-randomly through a restricted topology in a way that adversaries cannot easily influence. Cover traffic prevents a global observer from using traffic analysis to identify an initiator. Protocols toward unbiased peer selection offer new directions for distributing trust among untrusted entities.
Tarzan provides anonymity to either clients or servers, without requiring that both participate. In both cases, Tarzan uses a network address translator (NAT) to bridge between Tarzan hosts and oblivious Internet hosts.
Measurements show that Tarzan imposes minimal overhead over a corresponding non-anonymous overlay route.
1. INTRODUCTION
The ultimate goal of Internet anonymization is to allow a host to communicate with an arbitrary server in such a manner that nobody can determine the host's identity. Toward this goal, we envision a system that uses an Internet-wide pool of nodes, numbered in the thousands, to relay each other's traffic to gain anonymity.
Different entities may be interested in exposing the host's identity, each with varying capabilities to do so: curious individuals or groups may run their own participating machines to snoop on traffic; parties skirting legality may break into a limited number of others' machines; and large, powerful organizations may tap and monitor Internet backbones.
Clearly, each type of adversary suggests different design criteria for an anonymizing system. Prior systems have either underestimated the ease of cracking or crashing individual machines, or discounted the prevalence of widespread eavesdropping capabilities, exemplified by the "Great Firewall of China" [30], the FBI's Carnivore system [11], or subpoenas of Tier-1 ISP traffic for copyright-protection compliance [22].
CCS'02, November 18-22, 2002, Washington, DC, USA.
Copyright 2002 ACM 1-58113-612-9/02/0011 ...$5.00.
This paper describes Tarzan, a practical system aimed at realizing anonymity against all three flavors of adversary. First, however, we discuss why less ambitious approaches are not adequate.
In the simplest alternative, a host sends messages to a server through a proxy, such as Anonymizer.com [1]. This system fails if the proxy reveals a user's identity [18] or if an adversary can observe the proxy's traffic. Furthermore, servers can easily block these centralized proxies and adversaries can prevent usage with denial-of-service attacks.
To overcome this single point of failure, a host can connect to a server through a set of mix relays [3]. The anonymous remailer system [10], Onion Routing [26], and Zero-Knowledge's Freedom [13] offer such a model, providing anonymity through a small, fixed core set of relays. However, if a corrupt relay receives traffic from a non-core node, the relay can identify that node as the ultimate origin of the traffic. Colluding entry and exit relays can use timing analysis to determine both source and destination. Even an external adversary can mount the same attack. Therefore, the connecting host remains vulnerable to individual relay failures, and these relays provide obvious targets for attacking or blocking.
Few of these systems attempt to provide anonymity against an adversary that can passively observe all network traffic. Such protection requires fixing traffic patterns or using cover traffic to make such traffic analysis more difficult. Proposals that do exist have several shortcomings, however. Some protect only the core of the static mix network and thus allow traffic analysis on its edges [2, 26]. Some simulate full synchrony, so trivial denial-of-service attacks can halt their operation entirely [7]. And some require central control and knowledge of the entire network [15].
Tarzan, on the other hand, does not suffer from these same weaknesses. Its main contributions are two-fold.
First, Tarzan extends known mix-net designs to a peer-to-peer environment. Tarzan nodes communicate over sequences of mix relays chosen from an open-ended pool of volunteer nodes, without any centralized component. We present techniques to securely discover and select other nodes as communication relays: All peers are potential originators of traffic; all peers are potential relays. Such a scalable design lessens the significance of targeted attacks and inhibits network-edge analysis, as a relay cannot tell if it is the first hop in a mix path. Furthermore, we leverage our new concept of a domain to remove potential adversarial bias: An adversary may run hundreds of virtual machines, yet is unlikely to control hundreds of different IP subnets.
Second, Tarzan introduces a scalable and practical technique for cover traffic that uses a restricted topology for packet routing: Packets can be routed only between mimics, or pairs of nodes assigned by the system in a secure and universally-verifiable manner. This technique is practical in that it does not require network synchrony and consumes only a small factor more bandwidth than the data traffic to be hidden, and it is powerful as it shields all network participants, not only core routers.
Tarzan allows client applications on participating hosts to talk to non-participating Internet servers through special IP tunnels. The two ends of a tunnel are a Tarzan node running a client application and a Tarzan node running a network address translator; the latter forwards the client's traffic to its ultimate Internet destination. Tarzan is transparent to both client applications and servers, though it must be installed and configured on participating nodes.
Tarzan supports a systems-engineering position: anonymity can be built-in at the transport layer, transparent to most systems, trivial to incorporate, and with a tolerable loss of efficiency compared to its non-anonymous counterpart. This approach immediately reduces the effort required for application writers to incorporate anonymity into existing designs, and for users to add anonymity without changing existing non-anonymous applications. In the long term, the ability for individual anonymizing relays to easily participate in multiple kinds of traffic may make it easier to achieve a critical mass of anonymizing relays.
The rest of this paper is structured as follows. Section 2 explains Tarzan's design goals and threat models. Section 3 describes the design of Tarzan: its tunneling architecture, peer discovery and selection protocols, and restricted topology and cover traffic mechanism. Section 4 presents an analysis of Tarzan's anonymity properties. Section 5 describes Tarzan's implementation, and Section 6 evaluates its performance. Section 7 discusses integration transparency, Section 8 describes related work, and Section 9 concludes.
2. DESIGN GOALS AND NETWORK MODEL
This paper uses the following terminology. A node is an Internet host's virtual identity in the system, created by running an instantiation of the Tarzan software on a single IP address. A tunnel is a virtual circuit for communication spread across an ordered sequence of nodes. A relay is a node acting as a packet forwarder as part of a tunnel.
We designed Tarzan to meet a number of goals. Ordered by priority, these goals are the following:
1. Application independence: Tarzan should be transparent to existing applications and allow users to interact with existing services. To achieve this, Tarzan should provide the abstraction of an IP tunnel.

2. Anonymity against malicious nodes: Tarzan should provide sender or recipient anonymity against colluding nodes. That is, a particular host should not be uniquely linkable as the sender (recipient) of any message, or a message should not be linkable to any sender (recipient) [20]. We consider these properties in terms of an anonymity set: the set of possible senders of a message. The larger this set, the "more" anonymous an initiator remains.

These properties imply the weaker relationship anonymity: an adversary should not be able to identify a pair of hosts as communicating with each other, irrespective of which host is running Tarzan.

3. Fault-tolerance and availability: Tarzan should resist an adversary's attempts to overload the entire system or to block system entry or exit points. Tarzan should minimize the damage any one adversary can cause by running a few compromised machines.
Figure 1: Tarzan network model. In relation to node X, adversarial machines can control address spaces and can spoof virtual nodes within corrupted domains. (The figure depicts honest and malicious routers and nodes, spoofed nodes, and corrupted domains behind an unswitched LAN, a local subnet, and a border gateway.)
4. Performance: Tarzan should maximize the performance of tunnel transmission, subject to our anonymity requirements, to make Tarzan a viable IP-level communication channel.

5. Anonymity against a global eavesdropper: An adversary observing the entire network should be unable to determine which Tarzan relay initiates a particular message. Therefore, a node's traffic patterns should be statistically independent of whether it is originating data traffic.
Because anyone can join Tarzan, the system will likely be targeted by misbehaving users. While a correct host runs only one honest node—which forwards packets properly, does not log addressing or timing information, and so on—an adversary can run potentially many malicious nodes or spoof many fake addresses. A node is malicious if it modifies, drops, or records packets, analyzes traffic patterns, returns incorrect network information, or otherwise does not properly follow the protocols.
From a naive viewpoint, the fraction of Tarzan nodes that are malicious determines the probability that a tunnel relay is malicious. Yet, a single compromised computer may operate on multiple IP addresses and thus present multiple Tarzan identities.
To defend against such a situation, we make the observation that a single machine likely controls only a contiguous range of IP addresses, typically by promiscuously receiving packets addressed to any IP address on a particular LAN or by acting as a gateway router. This observation is useful in bounding the damage each malicious node can cause. We will call this subnet controllable by a single malicious machine a domain.1
A node belongs to a /d domain if the node's d-bit IP prefix matches that of the domain. Figure 1 shows the dependence of intra-domain node failure: a malicious machine "owns" all of the address space behind it.
Domains capture some notion of fault-independence: while an adversary can certainly subvert nodes within the same domain in a dependent fashion, nodes in different domains may fail independently.

1Our domain notion is completely unrelated to DNS and applies to both IPv4 and IPv6.
Figure 2: Tarzan Architecture Overview: An IP packet is diverted to the local tunnel initiator, which NATs it to a private address space, wraps it in several layers of encryption, and sends it to the first relay in UDP. Based on the packet's flow tag, the relay decrypts one layer of the encryption and sends the result to the next relay. The PNAT decrypts the last layer, extracts the original IP packet, NATs the packet to its own public address, and writes the raw packet to the Internet. (The figure shows a client application's packet entering the kernel divert socket on the initiator, traversing three relayd processes under flow tags 31, 17, and 59, and exiting through the PNAT to the Internet destination.)
Therefore, when selecting relays, Tarzan should consider the notion of distinct domains, not that of distinct nodes.
Ideally, we would know the actual size of each domain in address space and count all nodes within that address space as a single entity. However, this internetwork topology is non-uniform and difficult to measure. Therefore, Tarzan chooses some fixed IP prefix size as its granularity for counting domains: first among /16 subnet masks, then among /24 masks. We believe that this provides a reasonable notion of distinct physical and administrative control.2
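As a concrete illustration, the /16 and /24 domain keys for an IPv4 address can be computed from its leading octets. The sketch below is ours, not part of the Tarzan specification, and assumes dotted-quad IPv4 addresses.

# Minimal sketch (ours): the /16 or /24 domain of an IPv4 address,
# used when counting distinct domains rather than distinct nodes.
def domain(ip: str, d: int) -> str:
    assert d in (16, 24)
    return ".".join(ip.split(".")[: d // 8]) + f"/{d}"

# domain("18.26.4.9", 16) == "18.26/16"
# domain("18.26.4.9", 24) == "18.26.4/24"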
3. ARCHITECTURE AND DESIGN
This section describes Tarzan's design: its basic tunnel mechanism, its peer-discovery protocol, and its cover-traffic technique.
Figure 2 shows a simple Tarzan overlay network. All participating nodes run software that 1) discovers other participating nodes, 2) intercepts packets generated by local applications that should be anonymized, 3) manages tunnels through chains of other nodes to anonymize these packets, 4) forwards packets to implement other nodes' tunnels, and 5) operates a NAT (network address translator) to forward other participants' packets onto the ordinary Internet.
Typical use proceeds in three stages. First, a node running an application that desires anonymity selects a set of nodes to form a path through the overlay network. Next, this source-routing node establishes a tunnel using these nodes, which includes the distribution of session keys. Finally, it routes data packets through this tunnel. The exit point of the tunnel is a NAT. This NAT forwards the anonymized packets to servers that are not aware of Tarzan, and it receives the response packets from these servers and reroutes the packets over this tunnel.
Tarzan restricts route selection to pairs of nodes that use cover traffic to maintain traffic levels independent of data rates. The system enforces this topology by assigning neighbors in a decentralized yet verifiable manner.
Tarzan operates at the IP (Internet Protocol) level and offers a best-effort delivery model. End hosts must provide functionality like reliability or authentication.
Tarzan uses layered encryption similar to Chaumian mixes [3]: each leg of the tunnel removes or adds a layer of encryption, depending upon the packet's direction of traversal. The tunnel initiator sanitizes IP headers, as well as TCP headers if applicable.
3.1 Packet relay
A Tarzan tunnel passes two distinct types of messages between nodes: data packets, to be relayed through existing tunnels, and control packets, containing commands and responses that establish and maintain these virtual circuits. Tarzan encapsulates both packet types inside UDP.
A flow tag (similar to MPLS [23]) uniquely identifies each link of each tunnel. A relay rapidly determines how to route a packet by its tag. Symmetric encryption hides data, and a MAC protects its integrity, on a per-relay basis. Separate keys are used in each direction of each relay.
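To make the per-link state concrete, the following sketch (ours, with field names that are not from the paper) shows what a relay might keep for each flow tag: the adjacent hop, the outgoing tag, and separate keys for each direction.

# Sketch of a relay's per-flow-tag routing state; names are illustrative only.
from dataclasses import dataclass

@dataclass
class FlowEntry:
    next_hop: tuple       # (ip, udp_port) of the adjacent node for this link
    out_tag: int          # flow tag to apply when forwarding onward
    fwd_enc_key: bytes    # decrypts data arriving on this tag (forward direction)
    fwd_mac_key: bytes    # verifies its MAC
    rev_enc_key: bytes    # encrypts data travelling back toward the origin
    rev_mac_key: bytes

flows: dict[int, FlowEntry] = {}   # keyed by the incoming flow tag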
In the forward path, the tunnel initiator clears each IP packet's source address field, performs a nested encoding for each tunnel relay, and encapsulates the result in a UDP packet. More precisely, consider a tunnel that consists of a sequence of nodes T = (h1, h2, ..., hl, hpnat).3 Let the forward encryption and integrity keys for each node hi be ek_hi and ik_hi, respectively, and let seq be the packet sequence number, initiated to zero at the time of tunnel establishment. Then, an initiator produces the following block Bi for each relay hi in the tunnel, starting with hpnat:

    ci = ENC(ek_hi, {Bi+1})
    ai = MAC(ik_hi, {seq, ci})
    Bi = {seq, ci, ai}
The origin tags block B1 with the first relay's flow identifier and forwards the result to h1. The first relay extracts the packet's payload, determines the relevant keys by its flow identifier, checks the block's integrity, decrypts the block (i.e., strips off one layer of encryption), retags the resulting block B2, encapsulates it in a new UDP packet, and forwards the packet on to the next relay. The node drops any packet that fails its integrity check. This process continues until the packet reaches the last relay, which strips off the innermost layer of encryption, revealing the initiator's IP packet.
On the reverse path, each successive relay performs a single encryption with its appropriate key for the reverse direction, re-tags, and forwards the packet back towards the origin. This process wraps the packet in layers of encryption, which the origin of the tunnel must unwrap by performing l+1 decryptions. This design places the bulk of the encryption workload on the node seeking anonymity. Nodes that are merely relaying perform only a single symmetric key operation per packet that is processed.4
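The nested encoding and per-relay unwrapping can be sketched in a few lines of Python. This is a toy illustration of the structure only: the SHA-256 counter-mode keystream below is a stand-in for a real symmetric cipher, HMAC-SHA256 stands in for the MAC, and the field layout of Bi is our own choice.

# Toy sketch of the nested block construction of Section 3.1 (not secure).
import hashlib, hmac, struct

def xor_keystream(key: bytes, data: bytes) -> bytes:
    # Stand-in cipher: XOR with a SHA-256 keystream in counter mode.
    stream = bytearray()
    ctr = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + struct.pack(">Q", ctr)).digest()
        ctr += 1
    return bytes(x ^ y for x, y in zip(data, stream))

def wrap(ip_packet: bytes, seq: int, keys: list) -> bytes:
    """Initiator: build B1 given keys[i] = (ek_hi, ik_hi) for h1..hpnat."""
    block = ip_packet                          # innermost payload for the PNAT
    for ek, ik in reversed(keys):              # start with hpnat, end with h1
        c = xor_keystream(ek, block)                                            # ci
        a = hmac.new(ik, struct.pack(">Q", seq) + c, hashlib.sha256).digest()   # ai
        block = struct.pack(">Q", seq) + c + a                                  # Bi = {seq, ci, ai}
    return block

def unwrap_one(block: bytes, ek: bytes, ik: bytes) -> bytes:
    """Relay hi: verify the MAC, strip one layer, and reveal Bi+1."""
    seq, c, a = block[:8], block[8:-32], block[-32:]
    if not hmac.compare_digest(a, hmac.new(ik, seq + c, hashlib.sha256).digest()):
        raise ValueError("integrity check failed; drop the packet")
    return xor_keystream(ek, c)

Applying unwrap_one successively with each relay's keys peels the layers in order, so the PNAT's call returns the initiator's sanitized IP packet.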
2Even years since the introduction of CIDR, active Internet addresses still disproportionally belong to network prefixes that reflect classful addressing [17].
3Section 3.7 explains our strange bookkeeping with the last relay.
4Section 3.7 describes an additional encryption-decryption used between immediate nodes.
3.4 Tunnel failure and reconstruction
A tunnel fails if one of its relays stops forwarding packets. To detect failure, the initiator regularly sends ping messages to the PNAT through the tunnel and waits for acknowledgments. Upon multiple unsuccessful retries, the initiator attempts to determine the point-of-failure by sending pings through the tunnel to each relay.
If the PNAT is the point-of-failure, i.e., hl still responds to pings, the initiator selects a new hpnat for the tunnel. Otherwise, it attempts to rebuild the tunnel to the original PNAT, so that higher-level connections, such as TCP, do not die upon tunnel failure.
If the furthest node to reply to the ping is hi, for i < l, intermediate relay hi+1 is unreachable. So, the initiator attempts to rebuild the tunnel from hi forward: T' = (h1, ..., hi, h'i+1, ..., h'l, hpnat). Upon multiple unsuccessful attempts, the initiator decrements i by one and reattempts reconstruction.
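The failure-handling loop can be summarized with the following sketch. It is ours, not the paper's: ping(), choose_new_pnat(), and rebuild_from() are hypothetical stand-ins for the tunnel control messages described above.

# Sketch of the initiator's point-of-failure search and rebuild policy.
def repair_tunnel(initiator, relays, pnat):
    # relays = [h1, ..., hl]; the end-to-end ping has already timed out repeatedly.
    furthest = 0
    for i, relay in enumerate(relays, start=1):      # ping each relay through the tunnel
        if initiator.ping(relay, via=relays[:i - 1]):
            furthest = i
    if furthest == len(relays):                      # hl answers, so only the PNAT failed
        initiator.choose_new_pnat()
        return
    i = furthest                                     # rebuild from hi forward: T' = (h1..hi, h'i+1, ..., hpnat)
    while i >= 0 and not initiator.rebuild_from(i, pnat):
        i -= 1                                       # repeated failures: fall further back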
3.5 Peer discovery
A Tarzan node requires some means to learn about all other nodes in the network, knowing initially only a few other nodes. Anything less than near-complete network information allows an adversary to bias the distribution of a node's neighbor set towards malicious peers, leaks information through combinatorial profiling attacks, and results in inconsistencies during relay selection. Section 4.1 discusses these attacks in more depth.
Tarzan uses a simple gossip-based protocol for peer discovery. Tarzan's goal—to learn about all network resources—differs from recent peer-to-peer lookup protocols [25], which spend great effort to achieve immediate information propagation and load balancing in a flat namespace, often at the cost of security.
Gossiping offers a simple mechanism for nodes to learn about new neighbors.5 A node can prune inactive neighbors lazily when they do not respond to cover traffic establishment requests, which we explain further in Section 3.7.
This problem can be modeled as a directed graph: vertices represent Tarzan nodes; edges correspond to the relation that node a knows about, and thus can communicate with, node b. Edges are added to the graph as nodes discover other peers. We assume that the graph is initially weakly connected; otherwise, nodes in separate network partitions could never learn of one another. Tarzan's peer discovery goal is to make this graph fully connected.
Our technique to grow this network graph is similar to the Name-Dropper resource discovery protocol [16]. In each round of Name-Dropper, node a simply contacts one neighbor at random and transfers its entire neighbor set.
The Tarzan discovery protocol supports three related operations: initialization, redirection, and maintenance. Initialization provides the bulk-transfer functionality of Name-Dropper, which allows fast information propagation. Redirection allows nodes to shed load by redirecting new nodes to random neighbors. As this protocol progresses, nodes sending entire neighbor sets will transmit many elements already known to their recipients, wasting bandwidth. In response, maintenance messages provide an incremental update P of a node's peer database with only new information (P ∩ db = ∅). Tarzan calculates these set differences efficiently by performing k-ary searches on prefix-aggregated hashes of the set elements. This mechanism is briefly described in Section 4.1.

5"Gossiping" is a slight misnomer. Traditional gossiping protocols assume a fully-connected or fixed network and seek to optimize the broadcast of extra information, such as link state.
h0 = initiator;
h1 ∈R {h0.neighbors};
for i = 1 to l
    hi+1 ∈R {hi.neighbors};
    send establish request(hi−1, hi+1) to hi via tunnel;
    rc = wait for establish response;
    if rc ∈ {!ok, timeout}
        i = i − 1;
        while rc ∈ {!ok, timeout}
            if max retries exceeded
                decrement i and break;
            hi+1 ∈R {hi.neighbors};
            send reset forward request(hi+1) to hi;
            rc = wait for reset forward response;
send establish request(hl) to hpnat via tunnel;

Figure 3: Pseudocode for tunnel establishment protocol
3.2 Tunnel setup
When forming a tunnel, a Tarzan node pseudo-randomly selects a series of nodes from the network based on its local topology (see Section 3.7). The initiator is responsible for iteratively setting up the entire tunnel, one relay at a time. This process consists mainly of generating and distributing the symmetric keys, encrypted under the relays' public keys. Section 3.5 describes how an initiator discovers nodes and their corresponding public keys. Each node generates its public key locally the first time it enters the network.
The high-level establishment algorithm is shown in Figure 3. An establish request sent to node hi is relayed as a normal data packet from h1 through hi−1. Node hi cannot distinguish whether the packet originated from node hi−1 or from one of that node's predecessors; node hi−1 cannot distinguish successive establish requests from ordinary tunneled data.
The initiating node creates an establish request by using the public key of node hi to encrypt the initial forward session key, thereafter used to decrypt packets received from hi−1. This session key encrypts the forward integrity key, the subsequent reverse keys for packets from hi+1, the addresses of hi−1 and hi+1, and the flow identifiers that will be used to tag packets going in each direction. When hi has successfully stored the state for this request, it responds to the origin for an end-to-end check of correctness.
For path length l, this algorithm takes O(l) public-key operations and O(l^2) inter-relay messages to complete. This overhead is sufficiently small for realistic choices of l.
3.3 IP packet forwarding
Tarzan provides a client IP forwarder and a server-side pseudonymous network address translator (PNAT) to create a generic anonymizing IP tunnel. The IP forwarder diverts certain packets from the client's network stack that match user-specified IP firewall rules and ships them over a Tarzan tunnel. The client forwarder replaces its real address in the packets with a random address assigned by the PNAT from the reserved private address space. The PNAT translates this private address to one of its real addresses. Remote hosts can communicate with the PNAT normally, as if it originated the traffic. Correspondingly, response packets are deNAT'ed twice, once at each end of the tunnel.
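The two translations can be pictured as two small mapping tables, one at each end of the tunnel. The sketch below is illustrative only; real code rewrites live IP and transport headers rather than tuples.

# Sketch (ours) of the double NAT bookkeeping in Section 3.3.
client_map = {}   # at the initiator: private (addr, port) -> real (addr, port)
pnat_map = {}     # at the PNAT:      public (addr, port)  -> private (addr, port)

def outbound(real_src, private_src, public_src):
    client_map[private_src] = real_src   # initiator hides its real address
    pnat_map[public_src] = private_src   # PNAT maps the private address to its own public one
    return public_src                    # source address the Internet server sees

def inbound(public_dst):
    private_dst = pnat_map[public_dst]   # PNAT de-NATs the response into the tunnel
    return client_map[private_dst]       # initiator de-NATs back to the real client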
The IP forwarder only hides the Internet Protocol address and special header fields, such as origin port numbers, for TCP and UDP packets. Section 7 discusses ways of coping with applications that require more work than this to anonymize.
The pseudonymous NAT also offers port forwarding to allow ordinary Internet hosts to connect through Tarzan tunnels to anonymous servers. In fact, to achieve both sender and recipient anonymity, any two users can communicate by each creating a tunnel to a different PNAT; each user's application connects to the other's PNAT to form a double-blinded channel.
// Let Ua be the set of a's unvalidated known peers
// Let Va be the set of a's validated known peers
a.gossip()
    while true
        if (Ua = ∅), Ua = Va;
        b ∈R Ua;
        if (|Va| < (1/c)|Vb|)
            b.busy ? a.redirect(b) : a.initialize(b);
        else if (|Vb| < (1/c)|Va|)
            a.busy ? b.redirect(a) : b.initialize(a);
        else
            a.maintain(b);
            b.maintain(a);

Figure 4: Pseudocode for the peer discovery protocol
Tarzan differentiates between unvalidated addresses (Ua) and validated addresses (Va) in a node's peer database. A node learns {ipaddr, port, hash(pubkey)} tuples through gossiping: these unvalidated values can easily be forged.
A node validates a tuple once its corresponding peer correctly responds to a discovery request sent directly to its gossiped address. The request includes a random nonce. This two-way network handshake is a weak yet practical authentication mechanism to show a node "speaks for" its address. This validation distinction stops an adversary from injecting arbitrary tuples into a peer database and later impersonating a streak of invalid addresses following it in a tunnel (see Section 4.1).
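A validation check might look like the sketch below. The wire format is not specified here, so the sketch assumes, for illustration only, that the peer echoes the nonce along with its full public key, which the requester compares against the gossiped hash; send_discovery_request() is a hypothetical helper.

# Sketch (ours) of promoting a gossiped tuple from Ua to Va.
import os, hashlib

def validate(node, entry):                 # entry = (ipaddr, port, pubkey_hash)
    ipaddr, port, pubkey_hash = entry
    nonce = os.urandom(16)
    reply = node.send_discovery_request(ipaddr, port, nonce)
    if reply is None:
        return False                       # no response: leave the tuple unvalidated
    if reply.nonce != nonce:
        return False                       # not a fresh reply from that address
    if hashlib.sha256(reply.pubkey).digest() != pubkey_hash:
        return False                       # gossiped tuple was forged
    node.Ua.discard(entry)                 # the peer "speaks for" its address:
    node.Va.add((ipaddr, port, reply.pubkey))   # promote it to the validated set
    return True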
Figure 4 shows the main gossip protocol. To join the system, a new node a contacts some existing node b to discover a new set of unvalidated addresses. Node a validates b once a receives a response. Node a successively contacts the new neighbors in Ua before retrying neighbors in Va.
Running the discovery protocol, a node learns about and validates all other nodes in the network in O(n) connections.
3.6 Peer selection
This section describes Tarzan's method for selecting nodes from this peer database.
One may be tempted to simply choose nodes completely at random from Va. This approach is problematic: if an adversary runs as many Tarzan nodes as IP addresses to which it has access, a user is very likely to select malicious nodes. However, these addresses are rarely scattered uniformly through the IP address space. Instead, they are often located in the same IP prefix space. Thus, we choose among distinct IP prefixes, not among all known IP addresses.
We select nodes by choosing randomly among the populated domains at each level of the table in Figure 5. Tarzan uses a three-level hierarchy: first among all known /16 subnets, then among /24 subnets belonging to this 16-bit address space, then among the relevant IP addresses.
A node generates this table by inserting all peers in Va into their corresponding identifier rings. The leading d bits of a node's IP address are transformed to an identifier via hash(ipaddr/d, date), where hash is a cryptographic hash function and date is the day-of-the-month according to GMT. Identifiers are ordered on their corresponding rings modulo 2^|id|.
Therefore, Tarzan's lookup(key) method selects peers as follows: Node a first generates identifier id16 via hash(key/16, date) and finds the smallest identifier ≥ id16 (with wrap-around) on the /0 identifier ring; it repeats this process recursively with id24 and id32 on their corresponding rings of increased specificity.
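The three-level lookup can be sketched directly from this description. The sketch below is ours: it assumes IPv4 dotted-quad addresses in Va, uses SHA-256 as the hash, and truncates identifiers to 64 bits; none of these parameters are specified by the paper.

# Sketch of lookup(key) over the /16, /24, and /32 identifier rings (Section 3.6).
import hashlib
from bisect import bisect_left

def prefix(ip, d):                       # leading d bits of a dotted-quad address
    return ".".join(ip.split(".")[: d // 8])

def ring_id(prefix_str, date):           # identifier = hash(ipaddr/d, date)
    digest = hashlib.sha256(f"{prefix_str}|{date}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def successor(ids, ident):               # smallest identifier >= ident, with wrap-around
    ids = sorted(ids)
    return ids[bisect_left(ids, ident) % len(ids)]

def lookup(key_ip, validated, date):
    # Level 1: pick a /16 domain on the /0 ring.
    ring16 = {ring_id(prefix(p, 16), date): prefix(p, 16) for p in validated}
    chosen16 = ring16[successor(ring16, ring_id(prefix(key_ip, 16), date))]
    # Level 2: pick a /24 domain within that /16.
    in16 = [p for p in validated if prefix(p, 16) == chosen16]
    ring24 = {ring_id(prefix(p, 24), date): prefix(p, 24) for p in in16}
    chosen24 = ring24[successor(ring24, ring_id(prefix(key_ip, 24), date))]
    # Level 3: pick an individual node within that /24.
    in24 = [p for p in in16 if prefix(p, 24) == chosen24]
    ring32 = {ring_id(p, date): p for p in in24}
    return ring32[successor(ring32, ring_id(key_ip, date))]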
Figure 5: Peer selection on validated nodes. Shown is a lookup(key) with id16 = 541A, id24 = 82F1, and id32 = 261B. This ultimately maps to the hash value 4F9A, which yields a node with IP address 18.26.4.9. (The figure shows three identifier rings: the /0 ring over all known /16 subnets, the ring for the 18.26/16 subnet, and the ring for the 18.26.4/24 subnet.)

Note that a node executes lookup completely locally, based on information already accumulated in its peer database. Therefore, two nodes may have slightly different lookup structure replicas, which can yield temporarily inconsistent results. We return to the impact of inconsistencies in the next section.
Tarzan includes the date in identifier hashes to reorder the ring elements daily. This randomization stops any particular domain or address from owning a larger space in the ring for any duration. Furthermore, this rebalancing reorders the validated set daily, randomizing how nodes propagate their neighbors during maintain.
3.7 Cover traffic and link encoding
If the pattern of inter-node Tarzan traffic varied with usage, a widespread eavesdropper could analyze the patterns to link messages to their initiators. Prior work has suggested the use of cover traffic to provide more time-invariant traffic patterns independent of bandwidth demands [3]. Such traffic provides a node with stronger plausible deniability that it is the actual message initiator.
Our key contributions include introducing the concept of a traffic mimic. We propose traffic invariants between a node and its mimics that protect against information leakage. These invariants require some use of cover traffic and yield an anonymity set exponential in path length.
3.7.1 Selecting mimics
Upon joining the network, node a asks k other nodes to exchange mimic traffic with it. Similarly, an expected k nodes select a as they look for their own mimics. Thus, each node has κ mimics, where E(κ) = 2k for some global parameter k. Mimics are assigned verifiably at random from the set of nodes in the network.
A node establishes a bidirectional, time-invariant packet stream with a mimic node, into which real data can be inserted, indistinguishable from the cover traffic.
This mimic relationship must be symmetric for three reasons. First, a otherwise would send data only on its outgoing links, not trusting its incoming mimic connections. This practice halves a's anonymity set on average. Second, and related, variations in host density behind different IP prefixes may account for some nodes receiving few incoming connections. Third, a otherwise would not be incentivized to provide cover traffic on its incoming links.
Figure 6: Mimic topology and traffic flows for k = 3: Each node has κ ≈ 6 mimics (shown connected by solid lines). A node establishes its tunnel over mimic links (arrows in bold) and completes it with a random PNAT (dotted line). Nodes inject cover traffic with data traffic to yield a uniform total traffic T that flows bidirectionally over the solid lines. All packets on these links are pair-wise encrypted and padded to the same size.
This mimic relationship must be universally verifiable to stop an adversary from selecting more than k mimics. As mimics will be used for tunnel establishment, an adversary could otherwise bias a node's choice of tunnel relays.
Node a chooses its ith mimic, or M_i^a, as the peer returned by lookup^i(a.ipaddr). This function is similar to that in Section 3.6, except that the identifier id_d^i is generated by recursively applying the cryptographic hash function i times to {a.ipaddr/d, date}, i ≤ (k+1). Therefore, we map domains to domains. We explain this design choice in Section 4.1.
Node a sends a mimic request, including the tuple {a.ipaddr, i}, to M_i^a, which we refer to as b. Node b accepts a mimic establishment request from node a if and only if the following hold:
• 1 < i ≤ (k+1)
• b.lookup^i(a.ipaddr) = b
If the lookup^i check fails, we must consider two cases. First, node b can have a different network view than a: its execution of b.lookup^i(a.ipaddr) maps to a different node. This inconsistency occurs when b knows a ring identifier between a.ipaddr/d and b/d, unknown to a. Node b declines the mimic request but returns the identifier's corresponding node, to which a sends a new request.
Second, a may initially contact node c and receive no response, signifying c's failure. Node a removes c from Va, and a.lookup^i now maps to b. However, b is not aware of this failure, thus a's new request includes the message that c has failed. To verify, b pings c and waits for the acknowledgment to timeout, at which time b removes c from Vb and thus accepts a's mimic request.
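Node b's acceptance test can be sketched as follows. This is our illustration, reusing the style of the lookup() sketch from Section 3.6 with iterated hashing; SHA-256, the 64-bit identifier truncation, and the lookup_i helper on b are assumptions rather than specified parameters.

# Sketch (ours) of mimic identifier generation and b's verification check.
import hashlib

def iter_hash(data: bytes, i: int) -> bytes:
    for _ in range(i):                     # apply the cryptographic hash i times
        data = hashlib.sha256(data).digest()
    return data

def mimic_id(a_ip: str, d: int, i: int, date: str) -> int:
    pfx = ".".join(a_ip.split(".")[: d // 8])          # a.ipaddr/d
    return int.from_bytes(iter_hash(f"{pfx}|{date}".encode(), i)[:8], "big")

def accepts_mimic(b_node, a_ip: str, i: int, k: int, date: str) -> bool:
    # b accepts iff 1 < i <= k+1 and b's own lookup^i(a.ipaddr) maps back to b.
    if not (1 < i <= k + 1):
        return False
    return b_node.lookup_i(a_ip, i, date) == b_node    # lookup^i over b's rings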
If a loses connectivity to its ith mimic, a removes it from Va and replaces it with the new node mapped from lookup^i.
