Matchmaking for Online Games and Other Latency-Sensitive P2P Systems

Jacob R. Lorch, Sharad Agarwal
Microsoft Research

ABSTRACT – The latency between machines on the Internet can dramatically affect users' experience for many distributed applications. Particularly, in multiplayer online games, players seek to cluster themselves so that those in the same session have low latency to each other. A system that predicts latencies between machine pairs allows such matchmaking to consider many more machine pairs than can be probed in a scalable fashion while users are waiting. Using a far-reaching trace of latencies between players on over 3.5 million game consoles, we designed Htrae, a latency prediction system for game matchmaking scenarios. One novel feature of Htrae is its synthesis of geolocation with a network coordinate system. It uses geolocation to select reasonable initial network coordinates for new machines joining the system, allowing it to converge more quickly than standard network coordinate systems and produce substantially lower prediction error than state-of-the-art latency prediction systems. For instance, it produces 90th percentile errors less than half those of iPlane and Pyxida. Our design is general enough to make it a good fit for other latency-sensitive peer-to-peer applications besides game matchmaking.

Categories and Subject Descriptors – C.2.4 [Computer Systems Organization]: Computer Communication Networks – Distributed Systems; K.8.0 [Personal Computing]: General – Games
General Terms – Algorithms, design, experimentation, measurement, performance
Keywords – latency estimation, matchmaking, network coordinates, online gaming

1. INTRODUCTION
Online gaming is a rapidly growing industry, with revenues exceeding those of the entire movie business [29]. Appealing to increasingly discriminating game players requires careful attention to their chief concerns, particularly lag, the perceived time between an action and its effect. For the majority of games, the primary contributor to lag is direct communication between participants' game machines over the Internet. To reduce lag and hence improve the game experience, it is critical to perform accurate matchmaking—selecting groups of players with low latency to each other.

In typical matchmaking, each player sends network probes to potential game hosts and selects one host based on latency and bandwidth. Since probing consumes time and bandwidth, and players have limited patience for matchmaking, only a small fraction of potential hosts can be probed. Furthermore, in games where player machines communicate directly with each other rather than only with the host, it is prohibitive to probe all the potential paths traffic will take, whose number can be quadratic in the number of online players. For these reasons, game matchmaking can benefit from latency prediction—determining the latency between machines without any probing.
In this paper, we develop a new latency prediction system, called Htrae, suited to game matchmaking. Using a trace of 50 million latency measurements between 3.5 million players of the popular game Halo 3, we show that our system works better than state-of-the-art latency prediction systems.
A novel feature of Htrae is its merging of two disparate approaches to latency prediction: network coordinate systems (NCS) and geolocation. An NCS assigns each machine a coordinate in a virtual metric space, such that the distance between two machines' coordinates approximates the round-trip time (RTT) between them. Because initial conditions are random, NCSes take time to converge to a reasonable state, and usually converge to a sub-optimal local minimum. Geolocation, on the other hand, determines each machine's location on the planet and uses physical distance as a predictor of network delay. This approach does not model the impact of different routing efficiencies on different paths, and it disadvantages machines whose location is imprecisely known. Htrae combines these approaches by assigning each machine a coordinate on a virtual Earth based on its physical location, but allowing this coordinate to shift based on observations of actual latency. In so doing, we gain the benefits of both approaches: the realistic initial conditions of geolocation, combined with the ability of an NCS to converge to a state with greater predictive power.
Htrae also includes enhancements that are beneficial in game matchmaking. Triangle inequality violations and differing network access latencies across machines traditionally pose problems for an NCS; they occur often enough in our game traces that we use simple heuristics to help combat their effects. We also observe that since reliably measuring RTT for game purposes can consume a fair amount of bandwidth, the overhead of one additional message to notify the probee of the result is low, and quite worthwhile considering its usefulness to the recipient.
Our evaluation for game matchmaking shows that Htrae has substantially better predictive power than other latency prediction systems. For instance, its 90th percentile prediction error is 71 ms, compared to 176 ms for Pyxida [13] and 162 ms for iPlane [18], even when restricting consideration to the 23% of pairs iPlane can make a prediction for. Also, when searching for the best peer game server, Htrae picks the best one 70% of the time, compared to only 35% for Pyxida and 55% for iPlane. Even when it does not find the best one, 95% of the time it picks one at most 47 ms worse than optimal, compared to 184 ms for Pyxida and 131 ms for iPlane. In scenarios allowing a limited amount of probing to select the best server, Htrae finds one with under 75 ms of latency 68% of the time, compared to 44% for Pyxida and 41% when no latency prediction is used. Our results show that Htrae can have a tremendous impact on the playability of games, and may also be useful for other latency-sensitive peer-to-peer applications.
The contributions of our work are as follows:
• We propose a novel latency prediction system, Htrae, a hybrid of geolocation and network coordinate systems that achieves the benefits of both.
• Based on extensive traces of latencies between game consoles, we thoroughly evaluate the accuracy, convergence, and drift of Htrae.
• We also evaluate in detail the effectiveness of various implementation details of Htrae, including a component that corrects for triangle-inequality violations in an NCS.

The rest of this paper is structured as follows. §2 details the design of our system, Htrae, including how we merge geolocation with an NCS. §3 shows how we implemented Htrae. §4 describes the traces of game matchmaking probes we collected and presents the methodology we use to evaluate our system. §5 gives the results of that evaluation. §6 discusses these results as well as avenues for future work. §7 describes related work, and, finally, §8 concludes.

2. DESIGN

Htrae is a hybrid between two approaches to latency prediction: geolocation and network coordinate systems.

Geolocation predicts latency based on the real-world distance between two physical machines, which many researchers have found is a strong predictor of RTT [10, 14, 24]. However, this correlation is weak, especially for the home machines typically found in games; for instance, we found a correlation coefficient of only +0.56 among Halo 3 players. Furthermore, if geolocation inaccurately judges the location of some player's machine, it will consistently give that player poor performance.

An NCS gives each machine a coordinate in a virtual space, such that the distance between two coordinates is an estimate of their RTT. Coordinates are dynamically adjusted based on observations of RTT so as to make estimation more accurate. Unfortunately, accurately embedding the Internet graph in a virtual coordinate space is difficult. One reason is that the Internet has many routing inefficiencies, some of which lead to triangle-inequality violations (TIVs), where two nodes have a greater RTT than the sum of their respective RTTs to some other node. Coordinate spaces cannot have such violations [16, 36]. Furthermore, embeddings are often sensitive to initial conditions [12], since they can fall into one of many imperfect local minima in the space of possible coordinate assignments [28].

Our insight is that these two approaches, NCS and geolocation, are complementary, i.e., we can combine them in a way that mitigates the disadvantages of both. We do this by geographic bootstrapping, i.e., initializing NCS coordinates to correspond to the locations of the nodes in actual space. Our approach improves on an NCS because it provides better initial conditions, and improves on geolocation because its dynamic coordinate refinement can correct inaccurate or missing information. Essentially, our coordinate system is a rough representation of Earth, modified to better predict Internet latencies; the name Htrae comes from a warped version of Earth in a certain comic-book universe.
2.1 Geographic bootstrapping

Geographic bootstrapping requires a virtual space with a known relationship to the real world. Therefore, in Htrae, we use latitude and longitude on a virtual Earth. Also, as in Vivaldi [6], we use a virtual height, which represents the component of latency a machine experiences on all its paths, e.g., due to its Internet access link. The predicted RTT between two machines is the great-circle distance between their virtual locations, times 0.0269 ms/mile, plus the sum of the two machines' heights. We obtained the factor 0.0269 from Figure 1, which shows the relationship between distance and median RTT. It shows that this relationship is strongly linear, with R² = 0.976, slope 0.0269 ms/mile, and y-intercept 63.32 ms. This factor of 0.0269 ms/mile is about five times greater than the inverse of the speed of light. A factor of two is expected since we use round-trip latency, and the remaining factor of 2.5 suggests other causes besides the speed of light.

[Figure 1. Correlation between distance and RTT. Each point is the median RTT among machines with the same distance, rounded to the nearest mile. RTT data is from Halo 3 players from March 1–31, 2008, and distance data is from MaxMind's IP-to-geo database. The least-squares fit weights each point by its number of contributing machine pairs. Axes: distance between machines, 0–8,000 miles; median RTT, 0–400 ms.]

When a machine joins the system, it determines its latitude and longitude in one of various ways, e.g., by looking up its IP address in a database. It then uses this as its initial location, along with a height equal to half the least-squares y-intercept, or 31.66 ms. If it cannot determine its location, it uses latitude and longitude 0. From then on, whenever it determines its RTT to another machine, it applies a virtual force to its coordinates, either toward the other machine if the RTT was unexpectedly small or away from it if the RTT was unexpectedly large, as shown in Figure 2. The magnitude of this force is calculated as in Vivaldi [6], but adapted to spherical coordinates; see Figures 3 and 4 for details. Note that the authors of Vivaldi tried using spherical coordinates in their system but found them not to work well; we will see in §5.2 that they work in Htrae because of geographic bootstrapping.

[Figure 2. Overview of updating coordinates. (1) Node A sends a message to B. (2) B responds, and A measures the RTT. (3) If the distance between their virtual coordinates is too low/high, A applies a virtual force to its coordinate, moving it away from/toward B's.]

As in Vivaldi, each node also maintains an estimate of the uncertainty of its coordinates, a weighted moving average of the error observed between expected RTTs and observed RTTs. When coordinates improve such that distances better predict observed RTTs, uncertainty will decrease. This uncertainty is used to decide how strong a force to apply when updating coordinates: the greater the moving node's uncertainty, and the lower the other node's uncertainty, the stronger the force will be, as shown in Figure 3. However, unlike Vivaldi, we do not always initialize uncertainty to 100%. If a machine initializes its coordinates based on geographic location, it uses an initial uncertainty of 29.2%, the average relative error obtained from using geolocation alone.

RTT measurements are prone to errors, which can harm the correctness and stability of network coordinates [13]. Therefore, as in Pyxida [13], we do not use a single RTT measurement when adjusting coordinates but rather an aggregate of multiple measurements. Specifically, we use the median of five back-to-back RTT measurements; our traces of Xbox LIVE probes report only these medians.
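To make the bootstrapping and prediction steps concrete, the following is a minimal sketch in Python. Only the 0.0269 ms/mile slope, the 31.66 ms default height, and the latitude/longitude-0 fallback come from the text above; the class, the constant names, and the Earth-radius value are our own illustrative choices.

```python
import math

MS_PER_MILE = 0.0269        # slope of the least-squares fit in Figure 1
DEFAULT_HEIGHT_MS = 31.66   # half the fit's 63.32 ms y-intercept
EARTH_RADIUS_MILES = 3958.8

class Node:
    def __init__(self, lat_deg=0.0, lon_deg=0.0, height_ms=DEFAULT_HEIGHT_MS):
        # Geographic bootstrapping: start at the geolocated position;
        # nodes with unknown locations default to latitude/longitude 0.
        self.lat = math.radians(lat_deg)
        self.lon = math.radians(lon_deg)
        self.height = height_ms

def great_circle_miles(a, b):
    # Central angle via the spherical law of cosines, clamped for safety.
    cos_d = (math.sin(a.lat) * math.sin(b.lat) +
             math.cos(a.lat) * math.cos(b.lat) * math.cos(b.lon - a.lon))
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, cos_d)))

def predicted_rtt_ms(a, b):
    # Predicted RTT = great-circle distance x 0.0269 ms/mile + both heights.
    return great_circle_miles(a, b) * MS_PER_MILE + a.height + b.height
```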
$$
\begin{aligned}
w_s &\leftarrow w_A/(w_A + w_B)\\
\varepsilon &\leftarrow \left|\,\lVert\vec{x}_A - \vec{x}_B\rVert - l_{AB}\,\right| / l_{AB}\\
w_A &\leftarrow c_e w_s \varepsilon + (1 - c_e w_s)\,w_A\\
\vec{x}_A &\leftarrow \vec{x}_A + c_c w_s \left(\lVert\vec{x}_A - \vec{x}_B\rVert - l_{AB}\right) u(\vec{x}_B - \vec{x}_A)
\end{aligned}
$$

Figure 3. Vivaldi update algorithm, run after node A learns the RTT $l_{AB}$ to node B. Here, $\vec{x}_N$ is the virtual location of node N, $w_N$ is the uncertainty of node N's coordinates, $u(\vec{y})$ is the unit vector in the direction of $\vec{y}$, and $c_c$ and $c_e$ are algorithmic constants.
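As a companion to Figure 3, here is a sketch of one flat-space Vivaldi step in Python. The values of $c_c$ and $c_e$ are assumed, and Htrae itself uses the spherical variant of Figure 4 rather than this flat version.

```python
import math

C_C = 0.25   # force-scaling constant c_c (assumed value)
C_E = 0.25   # uncertainty-smoothing constant c_e (assumed value)

def vivaldi_update(x_a, w_a, x_b, w_b, rtt):
    """One Vivaldi step in a flat space (Figure 3): returns node A's new
    coordinate and uncertainty after measuring rtt to node B."""
    dist = math.dist(x_a, x_b) or 1e-9               # guard divide-by-zero
    w_s = w_a / (w_a + w_b)                          # confidence weight
    eps = abs(dist - rtt) / rtt                      # relative error
    w_a = C_E * w_s * eps + (1 - C_E * w_s) * w_a    # moving average of error
    # Step along u(x_B - x_A): toward B if the coordinates overestimate the
    # RTT (dist > rtt), away from B if they underestimate it (dist < rtt).
    unit = tuple((b - a) / dist for a, b in zip(x_a, x_b))
    step = C_C * w_s * (dist - rtt)
    new_x_a = tuple(a + step * u for a, u in zip(x_a, unit))
    return new_x_a, w_a
```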
2.2 TIV avoidance and history

Triangle inequality violations in Internet delay are typically caused by inefficient routing between two nodes, resulting in more delay between them than the sum of delays via some other intermediate node. Game consoles on singly-homed residential connections can be particularly susceptible because they are restricted by their ISPs' routing policies, as opposed to a server in a datacenter that has multiple upstream ISPs to choose from. For example, in our dataset, a game console in Vancouver, BC measured a 378 ms median RTT to a console in Tukwila, WA. This is an unusually high delay for a distance of only 141 miles. Indeed, around the same time, the Vancouver console measured a median RTT of only 47 ms to a console in Pleasanton, CA (959 miles away) and the Tukwila console had an RTT of 75 ms to Los Angeles (1,100 miles away).¹ Such violations are relatively infrequent—in this example, the Vancouver and Tukwila consoles were involved in 78 RTT measurements to other nodes, none of which appear to be TIVs.
TIVs are problematic for both NCSes and geolocation-based latency estimation. Geolocation will underestimate the latency between two nodes, as evident from the above example. Similarly, an NCS will underestimate the latency because the coordinates for the two nodes may have converged to stable positions based on the vast majority of non-TIV latency measurements. Further, when a TIV measurement is used to update an NCS, the coordinates of the nodes involved will be pushed further apart by the resulting force, thereby worsening their positions with respect to other nodes.
Some prior solutions to this problem have been shown to either worsen prediction accuracy or not scale in distributed settings [33]. We instead use a heuristic similar to the TIV Alert proposed in [33]. When updating a node's coordinate, if the measured latency exceeds the predicted latency by more than δ, we skip this coordinate update. We apply this heuristic only when both nodes in question have uncertainty lower than the geographic bootstrapping default of 29.2%, and update their uncertainties based on this measurement. This guards against slowing down convergence for nodes with poor initial coordinates. We have empirically determined that a δ of 100 ms works best for our client population.
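A minimal sketch of the skip rule, assuming each node exposes its current uncertainty; the 29.2% bootstrap uncertainty and the 100 ms δ come from the text above, while the names are ours:

```python
BOOTSTRAP_UNCERTAINTY = 0.292   # initial uncertainty from geolocation alone
TIV_DELTA_MS = 100.0            # delta; empirically best for our population

def is_likely_tiv(uncertainty_a, uncertainty_b, measured_ms, predicted_ms):
    """Skip the coordinate update when the measurement looks like a TIV:
    both nodes are already well converged (uncertainty below the bootstrap
    default), yet the measured RTT exceeds the prediction by more than delta."""
    both_converged = (uncertainty_a < BOOTSTRAP_UNCERTAINTY and
                      uncertainty_b < BOOTSTRAP_UNCERTAINTY)
    return both_converged and (measured_ms - predicted_ms > TIV_DELTA_MS)
```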
While this heuristic reduces the impact of TIVs on the quality of node coordinates, it does not help improve latency prediction of TIV edges. To address that problem, we rely on history prioritization. Every time a node learns the RTT to another, it saves this RTT in its history. When a prediction is needed, if there is history for the destination node, we use the most recently measured RTT instead of the RTT predicted by the coordinates. Note that we use the most recent one because we assume robust RTT measurements, as discussed earlier. The past RTT can be a good estimate of future RTT because RTTs on the Internet can be very stable.
¹We found the physical locations of these consoles by using a geolocation database (§4.1), and confirming by looking up the DNS names of nearby routers using traceroute.
$$
\begin{aligned}
r &\leftarrow 1 - c_c w_s \left(\lVert\vec{x}_A - \vec{x}_B\rVert - l_{AB}\right) / \lVert\vec{x}_A - \vec{x}_B\rVert\\
d &\leftarrow \cos^{-1}\!\left[\cos\varphi_A \cos\varphi_B + \sin\varphi_A \sin\varphi_B \cos(\lambda_B - \lambda_A)\right]\\
\gamma &\leftarrow \tan^{-1}\!\left[\frac{\sin\varphi_B \sin\varphi_A \sin(\lambda_B - \lambda_A)}{\cos\varphi_A - \cos d\,\cos\varphi_B}\right]\\
\varphi_A &\leftarrow \cos^{-1}\!\left[\cos(rd)\cos\varphi_B + \sin(rd)\sin\varphi_B \cos\gamma\right]\\
\beta &\leftarrow \tan^{-1}\!\left[\frac{\sin\varphi_B \sin(rd)\sin\gamma}{\cos(rd) - \cos\varphi_A \cos\varphi_B}\right]\\
\lambda_A &\leftarrow \lambda_B - \beta
\end{aligned}
$$

Figure 4. Adaptation of the Vivaldi update algorithm to spherical coordinates. Here $\vec{x}_N$, the coordinates for node N, consist of $\varphi_N$, the latitudinal distance in radians between node N and the North Pole, and $\lambda_N$, the longitude in radians of node N. We compute $r$, the ratio between the desired and current distance between nodes A and B; $d$, the current distance in radians between them; $\gamma$, the angle from the North Pole to B to A (which stays the same as A moves); and $\beta$, the angle from B to the North Pole to A's new location. Note that some special cases are not shown.
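For concreteness, here is a direct Python transcription of Figure 4. It is a sketch: atan2 stands in for the inverse tangents to keep quadrants correct, the function signature is ours, and the default value of $c_c$ is an assumption.

```python
import math

def spherical_move(phi_a, lam_a, phi_b, lam_b, w_s, rtt, dist, c_c=0.25):
    """Move node A on the sphere per Figure 4. phi_* are colatitudes
    (radians from the North Pole), lam_* are longitudes, and dist is the
    coordinate distance currently predicted for this pair."""
    clamp = lambda v: max(-1.0, min(1.0, v))
    r = 1 - c_c * w_s * (dist - rtt) / dist
    # d: current angular separation of A and B.
    d = math.acos(clamp(math.cos(phi_a) * math.cos(phi_b) +
                        math.sin(phi_a) * math.sin(phi_b) * math.cos(lam_b - lam_a)))
    # gamma: angle (North Pole, B, A), which stays fixed as A moves.
    gamma = math.atan2(math.sin(phi_b) * math.sin(phi_a) * math.sin(lam_b - lam_a),
                       math.cos(phi_a) - math.cos(d) * math.cos(phi_b))
    # A's new colatitude after scaling its angular distance from B by r.
    phi_a = math.acos(clamp(math.cos(r * d) * math.cos(phi_b) +
                            math.sin(r * d) * math.sin(phi_b) * math.cos(gamma)))
    # beta: angle at the Pole between B and A's new position.
    beta = math.atan2(math.sin(phi_b) * math.sin(r * d) * math.sin(gamma),
                      math.cos(r * d) - math.cos(phi_a) * math.cos(phi_b))
    return phi_a, lam_b - beta
```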
In our data, for about 95% of node pairs that measured the RTT between themselves multiple times over spans as long as 50 days, the coefficient of variation in RTT was under 0.2. History prioritization is particularly useful for node pairs that cause TIVs, because latency estimates from their coordinates will be inaccurate.

It might seem unlikely that among millions of players, a player will ever probe the same player twice, but it does happen. This is presumably because the factors that make two players good candidates for matchmaking, such as liking the same type of game, having a similar skill level, and liking to play at the same time of day or week, change infrequently.
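A minimal sketch of history prioritization; the class and method names are illustrative:

```python
class RttHistory:
    """History prioritization: prefer the most recently measured RTT to a
    peer over the coordinate-based estimate."""
    def __init__(self):
        self._last = {}                    # peer id -> last median RTT (ms)

    def record(self, peer, rtt_ms):
        self._last[peer] = rtt_ms          # keep only the most recent value

    def predict(self, peer, coordinate_estimate_ms):
        # Fall back to the coordinates only if we never measured this peer.
        return self._last.get(peer, coordinate_estimate_ms)
```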
2.3 AS correction

There are many autonomous systems (ASes) serving game players, so most pairs of players will belong to different ASes. Thus, we can expect our coordinate system to automatically tune itself to predicting latencies assuming endpoints are in different ASes. Unfortunately, this means that when endpoints are in the same AS, the faster resulting routing will not be reflected in the coordinate system, leading to systematic overestimation of the RTT of such paths. This can be particularly frustrating for game players, since it may lead to unnecessary exclusion of some of the closest players.

To understand our solution, recall that Htrae, like other NCSes, augments coordinates with a virtual height reflecting the latency a machine incurs on essentially all of its paths [6]. For home machines, we can loosely consider this latency to have two main components. First, a packet incurs latency in the so-called "last mile." In the case of a machine with a broadband DSL connection, this is the latency between the machine and the next IP-level device, such as the broadband remote access server (BRAS). The second component is the remaining latency through the high-speed network core of the AS the machine resides in. Typically, this would be a set of core routers where that AS peers with others.

As shown in Figure 5, for a node A to reach node B, packets would traverse the last mile, reach the core, traverse inter-AS links to the second core, and then traverse the last mile. However, a packet from A to C will traverse a shorter path that skips some of the second component of the height. This can occur when two nodes are part of the same AS and are "close-by" in the network.
[Figure 5. Simple model of two ASes with broadband machines on the Internet: machines A, B, and C each traverse a last-mile link to a BRAS and then their AS's core, with A and C sharing one AS and B in another.]

Using BGP routing tables, we are able to determine whether two nodes belong to the same AS, and using geolocation we are able to determine their physical distance. Given that it is difficult to break a height down into its two components based solely on end-to-end RTT measurements, we rely on a heuristic: we ignore a portion of the sum of heights when predicting RTT or updating coordinates for such nodes. We have empirically determined that ignoring 20% of the heights for nodes in the same AS and under 225 miles apart produces the best results.
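The heuristic reduces to a small adjustment when summing heights. A sketch, with the 20% discount and 225-mile cutoff taken from the text and everything else assumed:

```python
SAME_AS_HEIGHT_DISCOUNT = 0.20   # fraction of summed heights to ignore
SAME_AS_MAX_MILES = 225          # empirically determined distance cutoff

def height_sum(height_a_ms, height_b_ms, same_as, miles_apart):
    """AS correction: two nodes in the same AS that are physically close
    likely skip part of the network core, so discount their combined height."""
    total = height_a_ms + height_b_ms
    if same_as and miles_apart < SAME_AS_MAX_MILES:
        total *= 1.0 - SAME_AS_HEIGHT_DISCOUNT
    return total
```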
2.4 Symmetric updates

As we have observed earlier, RTT measurement between game machines requires multiple round trips and typically is coupled with bandwidth measurements. Therefore, it does not add substantially to traffic to send one additional message, at the conclusion of the measurement, to notify the other machine of the RTT. Since the RTT is the same in both directions, the recipient can use the reported RTT to update its own coordinates, saving it the time of performing its own RTT measurement. Note that the overhead is even less when the matchmaking service is centralized. In that case, the measuring machine must expend network traffic to notify the central service of the RTT anyway, at which point the service can update the coordinates of both endpoints. For these reasons, Htrae adjusts both machines' coordinates after an RTT measurement. It adjusts them effectively simultaneously, i.e., it uses the pre-update coordinates for one in the computation of the other.
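A sketch of the symmetric update, with the single-node update rule passed in as a parameter; the point is that each endpoint's new state is computed from the other's pre-update snapshot:

```python
def symmetric_update(node_a, node_b, rtt_ms, update):
    """Update both endpoints 'effectively simultaneously'. `update` is
    whatever single-node rule is in use (e.g., the Figure 4 move); the
    node fields here are illustrative."""
    snap_a = (node_a.coord, node_a.uncertainty)
    snap_b = (node_b.coord, node_b.uncertainty)
    node_a.coord, node_a.uncertainty = update(snap_a, snap_b, rtt_ms)
    node_b.coord, node_b.uncertainty = update(snap_b, snap_a, rtt_ms)
```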
3. IMPLEMENTATION

Our Htrae implementation is based on the publicly available code for Pyxida, a practical implementation of the Vivaldi algorithm for the Azureus BitTorrent client [13]. We ported this code to C# and added our algorithmic modifications as well as support for experimentation on home machines. Our system is approximately 3,600 lines of code, about a third of which is the direct port of Pyxida.

To enable geographic bootstrapping, Htrae relies on a location service. For simplicity, we built a centralized server that loads the GeoIP City Database from MaxMind [20] into memory and responds to IP address lookups with the corresponding latitude and longitude. The 01 January 2009 version of the database that we use has 4,100,436 entries, each with an IP address range, latitude, and longitude. §5 discusses how well this database covers the IP addresses in our traces and deployment.
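A sketch of the lookup such a location service performs, assuming the database rows have been parsed into (range start, range end, latitude, longitude) tuples with addresses represented as integers:

```python
import bisect

class LocationService:
    """Centralized location service sketch: binary search over sorted
    IP-range rows from an IP-to-geo database such as MaxMind's."""
    def __init__(self, rows):
        self._rows = sorted(rows)                    # sorted by range start
        self._starts = [row[0] for row in self._rows]

    def lookup(self, ip_int):
        i = bisect.bisect_right(self._starts, ip_int) - 1
        if i >= 0:
            start, end, lat, lon = self._rows[i]
            if ip_int <= end:
                return lat, lon
        return None   # caller falls back to latitude/longitude 0
```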
To enable AS correction, Htrae uses a routing table service. For simplicity, we built a centralized server that loads a routing table into a PATRICIA trie [22] in memory and responds to IP address lookups with the origin AS. The routing table we use is from the Route Views Project [27], in particular from the routeviews.oregon-ix.net router, which peers with 43 different routers across 37 ASes on the Internet. We use a snapshot from 04:00 on 01 January 2009, with entries for 277,383 prefixes.
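The routing table service's lookup is a longest-prefix match. The following sketch substitutes a dictionary per prefix length for the PATRICIA trie (IPv4 only, names our own); the semantics are the same, though the trie is far more efficient:

```python
import ipaddress

class RoutingTableService:
    """Longest-prefix match over BGP routes, mapping an IP to its origin AS."""
    def __init__(self, routes):
        # routes: iterable of ("192.0.2.0/24", origin_as) pairs
        self._by_len = {}
        for prefix, asn in routes:
            net = ipaddress.ip_network(prefix)
            self._by_len.setdefault(net.prefixlen, {})[int(net.network_address)] = asn

    def origin_as(self, ip):
        addr = int(ipaddress.ip_address(ip))
        for plen in sorted(self._by_len, reverse=True):   # most specific first
            masked = addr & ((0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF)
            asn = self._by_len[plen].get(masked)
            if asn is not None:
                return asn
        return None
```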
Home networks typically use network address translation (NAT), preventing straightforward direct communication between them [11]. To enable experiments in which home machines communicate with each other, we implemented a simple approach similar to STUN [26].
`

`

4. METHODOLOGY

In this section, we describe our methodology for the experiments in §5. We describe our data set, how we use it to evaluate Htrae, and how we use it to compare to other systems. We also describe how we deployed Htrae to conduct tests in a real-world environment.

4.1 Traces

To evaluate the techniques behind Htrae in the context of game matchmaking, we use traces of Xbox 360 consoles participating in matchmaking for the popular game Halo 3. For Halo 3, matchmaking involves choosing one console to be a server, then choosing up to 16 consoles to be clients. When a player starts matchmaking, his console chooses whether to be a client or server and notifies the Xbox LIVE matchmaking service. Xbox LIVE replies with a set of potential servers, and the client probes them all. Each probe measures RTT by taking the median value of multiple ping-like measurements. Based on the probe results, the client chooses a server.

For each session, i.e., instance of one client probing one or more servers to find a game, we log: UTC time, the client's and servers' public IP addresses, and the RTT from the client to each server. Due to the enormous volume of online game play, the logging system cannot record every probe, so for each session it chooses randomly with probability 10% whether to log all measurements in that session.

In this paper, we focus on the two time periods described in Table 1. Our results from several other time periods are similar, and we focus on these two for conciseness. With almost 50 million RTT measurements across over 3.5 million unique IPs in trace A, our evaluation data set is still among the largest ever used.

trace | start time | end time | training end | session count | distinct IPs | total probes | post-training probes
A | 11/07/2008 12am | 12/10/2008 12am | 11/10/2008 12am | 15,630,101 | 3,534,120 | 49,946,991 | 44,227,511
B | 01/14/2009 12am | 01/24/2009 12am | 01/17/2009 12am | 5,859,214 | 1,700,547 | 20,300,141 | 14,810,694
Table 1. Halo 3 matchmaking traces

Figure 6 shows the geographic locations of the nodes in trace A. While there is extensive coverage across North America and Europe, several other major regions such as Japan and eastern Australia are also well represented. We believe such geographic spread is common for large distributed systems.

[Figure 6. Geographic spread of nodes in trace A]

Two important characteristics of our traces are the rate at which new nodes are seen, and the rate at which any node probes another it has not probed before. Figure 7 shows that after the first three days of trace A, there is an almost constant diurnal arrival of new nodes through the end of the month. The high rate of new probe pairs indicates the need for efficient latency estimation between nodes that have not previously directly communicated.

[Figure 7. Arrival rate of nodes and node pairs in trace A: hourly counts of new nodes and new node pairs, 11/7–12/9]
4.2 Trace replay

To evaluate a latency predictor, we replay each session from a trace in timestamp order. Nodes are initialized using geographic bootstrapping at the time they first appear in the trace. Subsequently, if a node is absent and then later returns, it rejoins using the coordinates it had the last time it was in the trace. Thus our evaluation reflects the churn that is present in online gaming.

During the training period, we simply feed each probe's source IP address, destination IP address, and RTT to the predictor. After training ends, evaluation begins. For each session, for each of its probes, we ask the predictor to predict the RTT from the source IP address to the destination IP address. We evaluate these predictions, then feed the predictor the session's RTT measurements.

We use two main metrics to quantify the quality of a prediction. The first, prediction error, is the absolute difference between the actual median RTT and the prediction. This is relevant to applications, like games, that care about absolute RTT magnitude, e.g., to decide whether a pairing meets some QoS requirement. The second, best-server error, reflects the need to choose, from a set of candidates, the one with the lowest latency, as is the approach of current matchmaking systems. It is computed on a per-session basis, as the additional RTT a client would experience if it sought the lowest-RTT server based on the predictor's output instead of a perfect oracle. In best-server experiments, we ignore sessions with only one probe; Table 2 shows that this still leaves a large number of sessions for evaluation.

num. of sessions with choice of:
trace | 1 server | 2 servers | 3 servers | > 3 servers
A | 9,351,950 | 3,043,439 | 2,096,555 | 1,138,157
B | 3,888,694 | 1,005,266 | 620,834 | 344,420
Table 2. Server counts per session

Given the enormous size of trace A, we did not have time to evaluate every aspect of Htrae on it. We instead focus on trace B for most of our evaluation, and use trace A for overall results and evaluations of long-term properties such as convergence and drift.
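A sketch of the replay loop and both metrics; the predictor's predict/update interface and the session layout are illustrative:

```python
def replay(sessions, predictor):
    """Trace replay per section 4.2: score each session's probes first,
    then train on them."""
    prediction_errors, best_server_errors = [], []
    for session in sessions:                         # in timestamp order
        probes = session.probes                      # (src_ip, dst_ip, rtt_ms)
        preds = [predictor.predict(s, d) for s, d, _ in probes]
        prediction_errors += [abs(p - rtt)
                              for p, (_, _, rtt) in zip(preds, probes)]
        if len(probes) > 1:                          # best-server error needs a choice
            chosen = min(zip(preds, probes), key=lambda t: t[0])[1][2]
            optimal = min(rtt for _, _, rtt in probes)
            best_server_errors.append(chosen - optimal)
        for s, d, rtt in probes:                     # feed measurements afterward
            predictor.update(s, d, rtt)
    return prediction_errors, best_server_errors
```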
4.3 Comparison to other systems

We compare Htrae to various other latency prediction systems, including Pyxida [13], Hyperbolic Vivaldi [17], iPlane [18], and OASIS [10]. Pyxida is the implementation of the Vivaldi algorithm, with improvements, in the Azureus BitTorrent client. We turned off its recent neighbor set after finding it unhelpful and after discussions with an author indicating it was chiefly meant to deal with an artifact of the Azureus deployment. Hyperbolic Vivaldi is an adaptation of Pyxida that uses hyperbolic coordinates, as described by Lumezanu et al. [17]. iPlane uses multiple vantage points on the Internet to create an atlas with information about every link; it uses the atlas to predict the path, and the path's RTT, between a given pair of nodes. OASIS's main purpose is to select, for a given client, the lowest-RTT server providing a given service. It uses infrastructure nodes to periodically probe ad[...]

[...]flect the current state of the Internet, we do not use historical data. We use the last day of trace B, and query the systems on January 25, 2009, shortly after that trace was captured. To avoid overwhelming the services, we randomly sample 0.5% of the sessions from the last day and use only that subsample for evaluation. This subsample contains 10,869 probes in 2,648 sessions.

For illustration, we sometimes also compare to Oracle and Naive. Oracle always predicts perfectly, i.e., it always returns the actual RTT. Naive always guesses 85 ms, the average seen in Figure 1.

4.4 Deployment

To demonstrate the effectiveness of our Htrae implementation, we deployed it on several friends' home computers running Windows at 23:00 PST on October 7, 2008. This deployment used an older version of Htrae with a 3-dimensional virtual coordinate system rather than the spherical coordinate system we later found to be more accurate. We also used earlier versions of the MaxMind database and Route Views BGP table. Eleven people participated, with locations in CA, IL, MA, NY, WA, and the U.K. We ran several experiments serially, each lasting approximately 30 minutes. Everyone's state was reset at the start of each experiment. In each experiment, every three seconds, each node calculated its RTT to another node and compared the result to what would have been predicted; it then incorporated the RTT measurement into its predictor. For measurement stability, it calculated RTT as the minimum of ten probes sent 100 ms apart. The first five minutes of each experiment are considered training; we report results for only the remaining time.

5. EVALUATION

We begin our evaluation by comparing Htrae to other latency prediction systems. We then examine in detail the impact of some of Htrae's components, followed by analyses of convergence, drift, and a hybrid of probing and prediction. Finally, we summarize results from our modest deployment.

5.1 Htrae

Figure 8 shows the prediction error of Htrae on the month-long trace A, comparing it to Pyxida, Geolocation, and Naive. We see that Htrae substantially outperforms the other three. For 50% of predictions, Htrae is off by under 15 ms, compared to 24 ms, 44 ms, and 47 ms for Geolocation, Pyxida, and Naive respectively. At the 95th percentile, Htrae is at 138 ms, while Geolocation is at 208 ms, Pyxida at 244 ms, and Naive at 285 ms. Figure 9 shows the results from the week-long trace B; as expected, they are similar.

[Figure 8. CDF of prediction error for Htrae, Geolocation, Pyxida, and Naive on trace A. Horizontal axis limited to 150 ms.]
