Differential Congestion Notification: Taming the Elephants
Long Le, Jay Aikat, Kevin Jeffay, F. Donelson Smith
Department of Computer Science
University of North Carolina at Chapel Hill
http://www.cs.unc.edu/Research/dirt

In: Proceedings of the 12th IEEE International Conference on Network Protocols (ICNP), Berlin, Germany, October 2004, pages 118-128.

Abstract — Active queue management (AQM) in routers has been proposed as a solution to some of the scalability issues associated with TCP's pure end-to-end approach to congestion control. A recent study of AQM demonstrated its effectiveness in reducing the response times of web request/response exchanges as well as increasing link throughput and reducing loss rates [10]. However, use of the ECN (explicit congestion notification) signaling protocol was required to outperform drop-tail queuing. Since ECN is not currently widely deployed on end-systems, we investigate an alternative to ECN, namely applying AQM differentially to flows based on a heuristic classification of the flow's transmission rate. Our approach, called differential congestion notification (DCN), distinguishes between "small" flows and "large" high-bandwidth flows and only provides congestion notification to large high-bandwidth flows. We compare DCN to other prominent AQM schemes and demonstrate that for web and general TCP traffic, DCN outperforms all the other AQM designs, including those previously designed to differentiate between flows based on their size and rate.
1. Introduction
Congestion control on the Internet has historically been performed end-to-end, with end-systems assuming the responsibility for detecting congestion and reacting to it appropriately. Currently, TCP implementations detect instances of packet loss, interpret these events as indicators of congestion, and reduce the rate at which they are transmitting data by reducing the connection's window size. This congestion reaction (combined with a linear-probing congestion avoidance mechanism) successfully eliminated the occurrence of congestion collapse events on the Internet and has enabled the growth of the Internet to its current size.

However, despite this success, concerns have been raised about the future of pure end-to-end approaches to congestion control [1, 5]. In response to these concerns, router-based congestion control schemes known as active queue management (AQM) have been developed and proposed for deployment on the Internet [1]. With AQM, it is now possible for end-systems to receive a signal of incipient congestion prior to the actual occurrence of congestion. The signal can be implicit, realized by a router dropping a packet from a connection even though resources exist to enqueue and forward the packet, or the signal can be explicit, realized by a router setting an explicit congestion notification (ECN) bit in the packet's header and forwarding the packet.
In a previous study of the effects of prominent AQM designs on web performance, we argued that ECN was required in order to realize the promise of AQM [10]. This was a positive result that showed a tangible benefit, to both users and service providers, of deploying AQM with ECN. When compared to drop-tail routers, the deployment of particular AQM schemes with ECN (and with ECN support in end-system protocol stacks) allowed users to experience significantly reduced response times for web request/response exchanges, and allowed service providers to realize higher link utilization and lower loss rates. Without ECN, certain AQM schemes could realize modest performance improvements over simple drop-tail queue management, but the gains were small compared to those achievable with ECN.

The positive ECN results, however, raise the question of whether all AQM inherently requires ECN in order to be effective, or whether only existing AQM designs require ECN to be effective. This is a significant issue because ECN deployment requires the participation of both routers and end-systems and hence raises a number of issues, including the cost and complexity of implementing and deploying ECN, the incremental deployability of ECN, and the (largely unstudied) issue of dealing with malicious end-systems that advertise ECN support but in fact ignore ECN signals or simply have not been configured appropriately. Other deployment issues include the fact that many firewalls and network address translators intentionally or unintentionally drop all ECN packets or clear ECN bits. In a study of TCP behavior, Padhye and Floyd found that less than 10% of the 24,030 web servers tested had ECN enabled, and of those, less than 1% had a compliant implementation of ECN [14]. More recent results (August 2003) showed that only 1.1% of 441 web servers tested had correctly deployed ECN [15]. This points to clear difficulties in deploying and properly using ECN on end-systems. Thus, AQM could be significantly more appealing if ECN were not required for effective operation.
In this paper, we present an AQM design that signals congestion based on the size and rate of the flow and does not require ECN for good performance. Our approach is to differentially signal congestion to flows (through the dropping of packets) based upon a heuristic classification of the length and rate of the flow. We classify traffic into "mice," short connections that dominate on many Internet links (more than 84% of all flows in some cases [21]), and "elephants," long connections that, while relatively rare, account for the majority of bytes transferred on most links (more than 80% of all bytes [21]). Our AQM design attempts to notify only high-bandwidth "elephants" of congestion while allowing "slower" (and typically shorter) connections to remain unaware of incipient congestion. The motivation for this approach, borne out by our analysis of AQM schemes, is that providing early congestion notifications to mice only hurts their performance by forcing these short TCP connections to simply wait longer to transmit their last few segments. These short flows are often too short to have a meaningful transmission rate and are so short that slowing them down does not significantly reduce congestion. In contrast, providing early congestion notification to elephants can lead to abatement of congestion and more efficient use of the network. Our form of differential congestion notification, called DCN, significantly improves the performance of the vast majority of TCP transfers and provides response times, link utilizations, and loss ratios that are better than those of existing AQM schemes, including those that also attempt to differentiate between flows based on their size and rate.

The remainder of the paper makes the case for differential congestion notification based on classification of flow rate. Section 2 discusses previous related work, in AQM schemes in general and in differential AQM specifically. Section 3 presents our DCN scheme. Section 4 explains our experimental evaluation methodology, and Section 5 presents the results of a performance study of DCN and several prominent AQM schemes from the literature. The results are discussed in Section 6.
2. Background and Related Work
Several AQM designs have attempted to achieve fairness among flows or to control high-bandwidth flows. Here we describe the AQM designs most closely related to ours.

The Flow Random Early Drop (FRED) algorithm protects adaptive flows from unresponsive and greedy flows by implementing per-flow queueing limits [11]. The algorithm maintains state for each flow that currently has packets queued in the router. Each flow is allowed to have up to minq, but never more than maxq, packets in the queue. Packets of flows that have more than minq but fewer than maxq packets in the queue are probabilistically dropped. When the number of flows is large and a high-bandwidth flow consumes only a small fraction of the link capacity, FRED maintains a large queue. However, a large queue results in high delay. Furthermore, the algorithm becomes less efficient with a large queue since the search time for flow state is proportional to queue length.
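As a concrete illustration of this per-flow accounting, the following is a minimal sketch only, not the FRED authors' implementation; the class name, the fixed drop probability between the two limits, and the default values of minq and maxq are our own assumptions.

    import random

    class FredSketch:
        """Illustrative per-flow queue limits in the spirit of FRED [11]."""
        def __init__(self, minq=2, maxq=8, drop_prob=0.1):
            self.minq = minq            # packets a flow may always have queued
            self.maxq = maxq            # hard per-flow cap
            self.drop_prob = drop_prob  # probabilistic drop between the two limits
            self.qlen = {}              # per-flow count of packets currently queued

        def on_arrival(self, flow_id):
            n = self.qlen.get(flow_id, 0)
            if n >= self.maxq:
                return "drop"           # flow exceeds its hard limit
            if n >= self.minq and random.random() < self.drop_prob:
                return "drop"           # probabilistic drop between minq and maxq
            self.qlen[flow_id] = n + 1
            return "enqueue"

        def on_departure(self, flow_id):
            n = self.qlen.get(flow_id, 0)
            if n <= 1:
                self.qlen.pop(flow_id, None)  # state kept only while packets are queued
            else:
                self.qlen[flow_id] = n - 1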
The Stabilized Random Early Drop (SRED) algorithm controls the queue length around a queue threshold independent of the number of active connections [13]. The algorithm keeps the headers of recent packet arrivals in a "zombie list." When a packet arrives, it is compared with a randomly chosen packet from the zombie list. If the two packets belong to the same flow, a "hit" is declared. Otherwise, the packet header in the zombie list is probabilistically replaced by the header of the new packet. The number of active connections is estimated as the reciprocal of the average number of hits in a given interval (a large number of active connections results in a low probability of hits, and vice versa). The drop probability of a packet is a function of the instantaneous queue length and the estimated number of active connections. Hits are also used to identify high-bandwidth flows: packets of high-bandwidth flows are dropped with a higher probability than other packets.
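A rough sketch of the zombie-list hit mechanism is shown below; the list size, replacement probability, and the smoothing used to average hits are illustrative assumptions rather than SRED's published constants.

    import random

    class ZombieList:
        """Illustrative SRED-style hit estimation [13]."""
        def __init__(self, size=1000, replace_prob=0.25, alpha=0.001):
            self.zombies = [None] * size   # recently seen flow IDs
            self.replace_prob = replace_prob
            self.alpha = alpha             # smoothing factor for the hit average
            self.hit_avg = 0.0

        def on_arrival(self, flow_id):
            i = random.randrange(len(self.zombies))
            hit = (self.zombies[i] == flow_id)
            if not hit and random.random() < self.replace_prob:
                self.zombies[i] = flow_id  # probabilistically replace the zombie
            # exponentially weighted average of the hit indicator
            self.hit_avg = (1 - self.alpha) * self.hit_avg + self.alpha * (1.0 if hit else 0.0)
            return hit

        def estimated_active_flows(self):
            # the number of active connections is roughly the reciprocal of the hit rate
            return float("inf") if self.hit_avg == 0 else 1.0 / self.hit_avg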
The CHOKe algorithm heuristically detects and discriminates against high-bandwidth flows without maintaining per-flow state [17]. The algorithm is based on the assumption that a high-bandwidth flow is likely to occupy a large amount of buffer space in the router. When a new packet arrives, CHOKe picks a random packet in the queue and compares that packet's header with the new packet's header. If both packets belong to the same flow, both are dropped; otherwise, the new packet is enqueued. As with FRED, CHOKe is not likely to work well on a high-speed link carrying a large aggregate of flows.
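Because the CHOKe comparison needs no per-flow state, it can be expressed in a few lines. The sketch below assumes the queue is a simple list of (flow ID, packet) pairs and omits the RED thresholds that determine when the comparison is applied.

    import random

    def choke_on_arrival(queue, new_flow_id, new_packet):
        """Illustrative CHOKe comparison [17]: queue is a list of (flow_id, packet)."""
        if queue:
            i = random.randrange(len(queue))
            victim_flow, _ = queue[i]
            if victim_flow == new_flow_id:
                del queue[i]          # drop the matching packet already in the queue
                return "drop both"    # and drop the arriving packet as well
        queue.append((new_flow_id, new_packet))
        return "enqueue"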
The Stochastic Fair BLUE (SFB) algorithm detects and rate-limits unresponsive flows using accounting bins that are organized hierarchically [4]. The bins are indexed by hash keys computed from a packet's IP addresses and port numbers and are used to keep track of queue-occupancy statistics for the packets belonging to each bin. High-bandwidth flows can be easily identified because their bins' occupancy is always high. These high-bandwidth flows are then rate-limited. SFB works well when the number of high-bandwidth flows is small. When the number of high-bandwidth flows increases, more bins become occupied, and low-bandwidth flows that hash to these bins are incorrectly identified as high-bandwidth and penalized.
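The following simplified sketch captures only the bin-accounting idea described above; the number of hash levels, the bin counts, and the occupancy limit are illustrative assumptions, and real SFB maintains per-bin marking probabilities rather than raw occupancy thresholds.

    import zlib

    class SfbBinsSketch:
        """Simplified SFB-style accounting [4]: L hash levels of N bins each."""
        def __init__(self, levels=2, bins_per_level=16, occupancy_limit=50):
            self.levels = levels
            self.bins_per_level = bins_per_level
            self.occupancy_limit = occupancy_limit
            self.occupancy = [[0] * bins_per_level for _ in range(levels)]

        def _bins(self, flow_id):
            # one bin per level, chosen by independent hashes of the flow ID
            for level in range(self.levels):
                key = f"{level}:{flow_id}".encode()
                yield level, zlib.crc32(key) % self.bins_per_level

        def on_enqueue(self, flow_id):
            for level, b in self._bins(flow_id):
                self.occupancy[level][b] += 1

        def on_dequeue(self, flow_id):
            for level, b in self._bins(flow_id):
                self.occupancy[level][b] -= 1

        def looks_high_bandwidth(self, flow_id):
            # a flow is suspect only if every bin it maps to is heavily occupied;
            # a low-bandwidth flow sharing one crowded bin is usually exonerated
            # by its other, lightly occupied bins
            return all(self.occupancy[lvl][b] >= self.occupancy_limit
                       for lvl, b in self._bins(flow_id))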
The Approximate Fairness through Differential Dropping (AFD) algorithm approximates fair bandwidth allocation by using a history of recent packet arrivals to estimate a flow's transmission rate [16]. AFD uses a control-theoretic algorithm borrowed from PI [7] to estimate the "fair share" of bandwidth that a flow is allowed to send. Packets of a flow are marked or dropped with a probability that is a function of the flow's estimated sending rate. The algorithm uses a shadow buffer to store recent packet headers and uses these to estimate a flow's rate. The estimated rate of a flow is proportional to the number of that flow's headers in the shadow buffer. When a packet arrives, its header is copied to the shadow buffer with probability 1/s, where s is the sampling interval, and another header is removed randomly from the shadow buffer. Note that while sampling reduces implementation overhead, it also reduces the accuracy of estimating flows' sending rates. This problem can be severe when most flows send only a few packets per RTT.
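The shadow-buffer sampling just described can be sketched as follows; the buffer size, sampling interval, and helper names are illustrative, and the drop-probability function shows the general "reduce to fair share" idea rather than AFD's exact marking rule.

    import random

    class ShadowBuffer:
        """Illustrative AFD-style rate estimation [16]."""
        def __init__(self, size=2000, sampling_interval=10):
            self.size = size
            self.s = sampling_interval     # sample roughly one in s arriving packets
            self.headers = []              # recently sampled flow IDs

        def on_arrival(self, flow_id):
            if random.random() < 1.0 / self.s:
                if len(self.headers) >= self.size:
                    # remove a random header to make room for the new sample
                    self.headers.pop(random.randrange(len(self.headers)))
                self.headers.append(flow_id)

        def estimated_share(self, flow_id):
            # a flow's estimated rate is proportional to its share of the buffer
            return self.headers.count(flow_id) / max(len(self.headers), 1)

    def afd_drop_prob(flow_share, fair_share):
        # drop just enough of the flow's packets to bring it down to its fair share
        if flow_share <= fair_share:
            return 0.0
        return 1.0 - fair_share / flow_share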
The RED with Preferential Dropping (RED-PD) algorithm provides protection for responsive flows by keeping state for just the high-bandwidth flows and preferentially dropping packets of these flows [12]. RED-PD uses the history of recent packet drops to identify and monitor high-bandwidth flows. The algorithm is based on the assumption that high-bandwidth flows also have a high number of packet drops in the drop history. Packets of a high-bandwidth flow are dropped with a higher probability than other packets. After being identified as high-bandwidth, a flow is monitored until it does not experience any packet drops for a certain period of time. The absence of packet drops for a high-bandwidth flow in the drop history indicates that the flow has likely reduced its sending rate. In this case, the flow is deleted from the list of monitored flows.

The RIO-PS scheme (RED with In and Out with Preferential treatment to Short flows) gives preferential treatment to short flows at bottleneck links [6]. With preferential treatment, short flows experience a lower drop rate than long flows and can thus avoid timeouts. In RIO-PS, edge routers maintain per-flow state for flows entering the network. The first few packets of a flow are marked as "short" or "in." Subsequent packets of that flow are marked as "long" or "out." Core routers use the standard RIO algorithm [2] and drop "long" or "out" packets with a higher probability than "short" or "in" packets.

Our DCN algorithm, described next, is an amalgam of existing AQM mechanisms. Like AFD it uses a control-theoretic algorithm for selecting packets to drop, and like RED-PD it maintains state for only the suspected high-bandwidth flows. However, we show empirically that our particular choice and construction of mechanisms results in better application and network performance than is possible with existing differential and non-differential AQM designs.

3. The DCN Algorithm
The design of DCN is based on the observation that on many networks, a small number of flows produce a large percentage of the traffic. For example, for the web traffic we have used to evaluate AQM designs, Figure 1 shows the cumulative distribution function (CDF) of the empirical distribution of HTTP response sizes [18]. Figure 2 shows a CDF of the percentage of total bytes transferred in an hour-long experiment as a function of HTTP response size. Together, these figures show that while approximately 90% of web responses are 10,000 bytes or less, these responses account for less than 25% of the bytes transferred during an experiment. Moreover, responses greater than 1 megabyte make up less than 1% of all responses, but account for 25% of the total bytes.

Figure 1: CDF of generated response sizes (x-axis: response size in bytes; y-axis: cumulative probability).

Figure 2: CDF of percentage of total bytes transferred as a function of response sizes (x-axis: response size in bytes; y-axis: percentage of bytes transferred).

These data suggest that providing early congestion notification to flows carrying responses consisting of a few TCP segments (e.g., flows of 2,000-3,000 bytes, approximately 70% of all flows) would have little effect on congestion. This is because these flows comprise only 6-8% of the total bytes and because these flows are too short to have a transmission rate that is adaptable. By the time they receive a congestion notification signal they have either already completed or have only one segment remaining to be sent. Furthermore, since short flows have a small congestion window, they have to resort to timeouts when experiencing a packet loss. Thus, giving these flows a congestion signal does not significantly reduce congestion and can only hurt the flows' performance by delaying their completion. In contrast, high-bandwidth flows carrying large responses are capable of reducing their transmission rate and hence can have an impact on congestion. Unlike short flows, high-bandwidth flows do not have to resort to timeouts; instead they can use TCP's fast retransmission and fast recovery mechanisms to recover from packet losses. Our approach also polices high-bandwidth non-TCP or non-compliant TCP flows that do not reduce their transmission rate when congestion occurs.

Our observation about traffic characteristics is also confirmed by other studies of Internet traffic. For example, Zhang et al. found that small flows (100 KB or less) accounted for at least 84% of all flows, but carried less than 15% of all bytes [21]. They also found that large flows accounted for a small fraction of the number of flows, but carried most of the bytes. Moreover, the flows that are "large" and "fast" (i.e., high-bandwidth, greater than 10 KB/sec) account for less than 10% of all flows, but carry more than 80% of all the bytes.
The Differential Congestion Notification (DCN) scheme is based on identifying these "large" and "fast" flows and providing congestion notification only to them. While the idea is simple, the challenge is to design an algorithm with minimal state requirements that identifies the few long-lived, high-bandwidth flows within a large aggregate of flows and provides them with a congestion signal when appropriate. An important dimension of this problem is that, of all the flows carrying large responses, we most want to signal flows that are also transmitting at a high rate. These are the flows that are consuming the most bandwidth and hence will produce the greatest effect when they reduce their rate. Additionally, we must ensure that flows receiving negative differential treatment are not subject to undue starvation.

Our DCN AQM design has two main components: identification of high-bandwidth flows and a decision procedure for determining when early congestion notification is in order.

3.1 Identifying High-bandwidth Flows
Our approach to identifying high-bandwidth, long-lived flows is based on the idea that packets of high-bandwidth flows are closely paced (i.e., their interarrival times are short) [3]. DCN tracks the number of packets that have been recently seen from each flow. If this count exceeds a threshold, the flow is considered to be a "long-lived and high-bandwidth" flow. The flow's rate is then monitored and its packets are eligible for dropping. As long as a flow remains classified as high-bandwidth, it remains eligible for dropping. If a flow reduces its transmission rate, it is removed from the list of monitored flows and is no longer eligible for dropping.

DCN uses two hash tables for classifying flows: HB ("high bandwidth") and SB ("scoreboard"). The HB table tracks flows that are considered high-bandwidth and stores each flow's flow ID (the IP addressing 5-tuple) and a count of the number of forwarded packets. The SB table tracks a fixed number of flows not stored in HB. For these flows, SB stores their flow ID and a count of "recently" forwarded packets.
When a packet arrives at a DCN router, the HB table is checked to see if the packet belongs to a high-bandwidth flow. If the packet's flow is found in HB, it is handled as described below. If the packet's flow ID is not in HB, the packet is enqueued and its flow is tested to see if it should be entered into HB. The SB table is searched for the flow's ID. If the flow ID is not present, it is added to SB.1 If the flow ID is present in SB, the flow's packet count is incremented.

1 If a flow ID hashes to an entry in SB for another flow, the new flow overwrites the entry for the previous flow. This ensures that all SB operations can be performed in constant time. Thus, SB stores the packet counts for all currently active non-high-bandwidth flows, modulo hash-function collisions on flow IDs.

A flow is classified as long-lived and high-bandwidth if the number of packets from the flow arriving within a "clearing interval" exceeds a threshold. Once the flow's packet count in SB has been incremented, if the count exceeds the threshold, the flow's entry in SB is added to HB.2 If no packets have been received for the flow within a clearing interval, the flow's packet count is reset to 0.

2 If a collision occurs in HB when trying to insert the new flow, a hash chain is used to locate a free table entry.

A high-level flowchart of the DCN algorithm is given in Figure 3. All operations on the SB table are performed in O(1) time. Since the number of flows identified as high-bandwidth is small (e.g., ~2,000 for the traffic generated during the experiments reported herein), hash collisions in HB are rare for a table size of a few thousand entries. Thus, operations on the HB table are also usually executed in O(1) time.

Figure 3: High-level DCN flowchart.
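To make the table interactions concrete, the following sketch mimics the SB/HB logic described above. The table sizes, the promotion threshold, and the clearing-interval value are illustrative assumptions, and the periodic decrement of pktcount by pref (Section 3.2) is omitted.

    import time

    class DcnClassifierSketch:
        """Minimal sketch of DCN's SB/HB classification (illustrative parameters)."""
        def __init__(self, sb_size=4096, threshold=4, clearing_interval=0.5):
            self.sb_size = sb_size
            self.threshold = threshold              # packets within a clearing interval
            self.clearing_interval = clearing_interval
            self.sb = [None] * sb_size              # fixed table: (flow_id, count, last_seen)
            self.hb = {}                            # flow_id -> pktcount for monitored flows

        def _sb_index(self, flow_id):
            return hash(flow_id) % self.sb_size

        def on_packet(self, flow_id, now=None):
            now = time.time() if now is None else now
            if flow_id in self.hb:
                self.hb[flow_id] += 1
                return "high-bandwidth"             # eligible for dropping (Section 3.2)

            i = self._sb_index(flow_id)
            entry = self.sb[i]
            if entry is None or entry[0] != flow_id:
                # empty slot, or a colliding flow: overwrite so SB stays O(1)
                self.sb[i] = (flow_id, 1, now)
                return "enqueue"

            _, count, last_seen = entry
            if now - last_seen > self.clearing_interval:
                count = 0                           # no recent packets: restart the count
            count += 1
            if count >= self.threshold:
                self.hb[flow_id] = count            # promote to the high-bandwidth table
                self.sb[i] = None
                return "high-bandwidth"
            self.sb[i] = (flow_id, count, now)
            return "enqueue"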
3.2 Early Congestion Notification
Packets from a high-bandwidth flow are dropped with probability 1 - pref / pktcount, where pktcount is the number of packets from that flow that have arrived at the router within a period of Tdec, and pref is the current "fair share" of a flow on a congested link. When congestion is suspected in the router, we target high-bandwidth flows for dropping in proportion to their deviation from their fair share (pref packets within an interval Tdec) [16, 19].
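For example, if the current fair share is pref = 10 packets per interval and a monitored flow has forwarded pktcount = 40 packets in the current interval, its packets are dropped with probability 1 - 10/40 = 0.75; when pktcount is at or below pref the expression is non-positive, so the flow's packets are effectively not dropped.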
DCN uses a simple control-theoretic algorithm based on the well-known proportional-integral (PI) controller to compute pref. The instantaneous length of the queue in the router is periodically sampled with period Tupdate. A flow's fair share of the queue at the kth sampling period is given by:

pref(k·Tupdate) = pref((k-1)·Tupdate) + a·(q(k·Tupdate) - qref) - b·(q((k-1)·Tupdate) - qref)

where a and b, a < b, are control coefficients (constants) that depend on the average number of flows and the average RTT of flows (see [7] for a discussion), q(·) is the length of the queue at a given time, and qref is a target queue length for the controller. Since a < b, pref decreases when the queue length is larger than qref (an indication of congestion), and hence packets from high-bandwidth flows are dropped with a high probability. When congestion abates and the queue length drops below qref, pref increases and the probability of dropping becomes low. Pan et al. and Misra et al. use the same equation in the design of AFD and PI, respectively [16, 7].

The flow ID of a high-bandwidth flow is kept in the HB table as long as the flow's counter pktcount is positive. After each interval Tdec, the counter pktcount is decreased by pref. If a high-bandwidth flow's packet count becomes negative, the flow is deleted from HB. We set Tdec to 800 ms in our experiments because the maximum RTT in our network can be up to 400 ms. Furthermore, we want to avoid situations where a new high-bandwidth flow is detected at the end of an interval Tdec and immediately removed from HB. We experimented with different parameter settings for a, b, Tupdate, and Tdec, and report here only the results for our empirically determined best parameter settings.
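A minimal sketch of this update and of the resulting drop decision is given below. The controller coefficients, the initial share, and the lower bound on pref are illustrative placeholders (only the target queue length of 24 packets matches the setting reported in Section 5); this is a sketch, not the ALTQ implementation used in our experiments.

    class PrefController:
        """Sketch of DCN's PI-style update of the fair share pref (placeholder constants)."""
        def __init__(self, a=0.0005, b=0.0006, q_ref=24, p_init=20.0):
            assert a < b                 # a < b makes pref shrink while the queue sits above q_ref
            self.a, self.b = a, b
            self.q_ref = q_ref           # target queue length (packets)
            self.p_ref = p_init          # current fair share (packets per Tdec)
            self.prev_q = q_ref

        def on_sample(self, q_now):
            """Called every Tupdate with the instantaneous queue length."""
            self.p_ref += self.a * (q_now - self.q_ref) - self.b * (self.prev_q - self.q_ref)
            self.p_ref = max(self.p_ref, 1.0)   # illustrative safeguard to keep the share positive
            self.prev_q = q_now
            return self.p_ref

    def drop_probability(pktcount, p_ref):
        """Drop probability for a monitored flow (pktcount > 0): 1 - pref/pktcount, floored at 0."""
        return max(0.0, 1.0 - p_ref / pktcount)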
4. Experimental Methodology
To evaluate DCN we ran experiments in the testbed network described in [10]. The network, illustrated in Figure 4, emulates a peering link between two Internet service provider (ISP) networks. The testbed consists of approximately 50 Intel-processor-based machines running FreeBSD 4.5. Machines at the edge of the network execute one of a number of synthetic traffic generation programs described below. These machines have 100 Mbps Ethernet interfaces and are attached to switched VLANs with both 100 Mbps and 1 Gbps ports on 10/100/1000 Ethernet switches. At the core of this network are two router machines running the ALTQ extensions to FreeBSD [9]. ALTQ extends IP-output queuing at the network interfaces to include alternative queue-management disciplines. We used the ALTQ infrastructure to implement DCN, AFD, RIO-PS, and PI.

Figure 4: Experimental network setup (two ISP router machines connected by 100 Mbps/1 Gbps links, with browser/server machines on Ethernet switches at each edge and network monitors on the core link).

Each router has sufficient network interfaces to create either a point-to-point 100 Mbps or 1 Gbps Ethernet network between the two routers. The Gigabit Ethernet network is used to conduct calibration experiments to benchmark the traffic generators on an unloaded network. To evaluate DCN and compare its performance to other AQM schemes, we create a bottleneck between the routers by altering the (static) routes between the routers so that all traffic flowing in each direction uses a separate 100 Mbps Ethernet segment. This setup allows us to emulate the full-duplex behavior of a typical wide-area network link.

So that we can emulate flows that traverse a longer network path than the one in our testbed, we use a locally modified version of dummynet [8] to configure outbound packet delays on machines at the edge of the network. These delays emulate different round-trip times on each TCP connection (thus giving per-flow delays). Our version of dummynet delays all packets from each flow by the same randomly chosen minimum delay. The minimum delay in milliseconds assigned to each flow is sampled from a discrete uniform distribution on the range [10, 150], with a mean of 80 ms. The minimum and maximum values for this distribution were chosen to approximate a typical range of Internet round-trip times within the continental U.S., and the uniform distribution ensures a large variance in the values selected over this range.

A TCP window size of 16K bytes was used on the end systems because widely used OS platforms, e.g., most versions of Windows, typically have default windows of 16K or less.

4.1 Synthetic Generation of TCP Traffic
Two synthetically generated TCP workloads are used to evaluate DCN. The first is an HTTP workload derived from a large-scale analysis of web traffic [18]. Synthetic HTTP traffic is generated according to an application-level description of how the HTTP 1.0 and 1.1 protocols are used by web browsers and servers today. The specific model of synthetic web browsing is as described in [10]; here we note that the model is quite detailed as it, for example, includes the use of persistent HTTP connections and distinguishes between web objects that are "top-level" (e.g., HTML files) and objects that are embedded (e.g., image files).

The second workload is based on a more general model of network traffic derived from measurements of the full mix of TCP connections present on Internet links. For the experiments here we emulate the traffic observed on an Internet2 backbone link between Cleveland and Indianapolis [20]. Thus, in addition to generating synthetic HTTP connections, this model also generates synthetic FTP, SMTP, NNTP, and peer-to-peer connections. Details on this model can be found in [20].

For both workloads, end-to-end response times for TCP data exchanges are our primary measure of performance. Response time is defined as the time interval necessary to complete the exchange of application-level data units between two endpoints. For example, for the HTTP workload, response time is defined as the time between when an (emulated) web browser makes a request for content to an (emulated) web server and the time when the last byte of the response is delivered from the server to the browser.
4.2 Experimental Procedures
To evaluate DCN we performed experiments on the two emulated ISP networks in our testbed connected with 100 Mbps links. For the evaluation we congest these 100 Mbps links with varying degrees of traffic. To quantify the traffic load in each experiment we define offered load as the network traffic (in bits/second) resulting from emulating the behavior of a fixed-size population of users (e.g., a population of users browsing the web in the case of the HTTP workload). More specifically, load is expressed as the long-term average throughput on an uncongested link that would be generated by that user population. For example, to describe the load offered by emulating a population of 20,000 users evenly distributed on our network testbed, we would first emulate this user population with the two ISP networks connected by a gigabit/second link and measure the average throughput in one direction on this link. The measured throughput, approximately 105 Mbps in the case of the HTTP workload, is our value of offered load.

Although the TCP traffic we generate is highly bursty (e.g., for the HTTP workload we generate a long-range dependent packet arrival process at the routers [10]), the range of loads we attempt to generate in our experiments (approximately 50-120 Mbps) is such that congestion is never present on the gigabit network. Since experiments are ultimately performed with the two ISP networks connected at 100 Mbps, a calibration process (described in [10]) is used to determine the population of users required to achieve a particular degree of congestion on the 100 Mbps network.

Each experiment was run using offered loads of 90%, 98%, or 105% of the capacity of the 100 Mbps link connecting the two router machines. It is important to emphasize that terms like "105% load" are shorthand for "a population of users that would generate a long-term average load of 105 Mbps on a 1 Gbps link." As offered loads approach saturation of the 100 Mbps link, the actual link utilization will, in general, be less than the intended offered load. This is because, as utilization increases, response times become longer and emulated users have to wait longer before they can generate new requests, and hence they generate fewer requests per unit time.

Each experiment was run for 120 minutes to ensure very large samples (over 10,000,000 TCP data exchanges), but data were collected only during a 90-minute interval to eliminate startup effects at the beginning and termination synchronization anomalies at the end.

The key indicators of performance we use in reporting our results are the end-to-end response times for each data exchange (e.g., an HTTP request/response). We report these as CDFs of response times up to 2 seconds. We also report the fraction of IP datagrams dropped at the link queues, the utilization of the bottleneck link, and the number of request/response exchanges completed in each experiment.
5. Experimental Results
We compared the performance of DCN against a large number of AQM designs, but report here only the results for DCN versus PI, AFD, and RIO-PS. PI was selected because it was the best-performing non-differential AQM scheme in a previous study [10]. AFD and RIO-PS were chosen because they were the best-performing differential schemes. We also include results from experiments with drop-tail FIFO queue management to illustrate the performance of no AQM (i.e., the performance to beat), and results from drop-tail experiments on the uncongested gigabit network to illustrate the best possible performance. (The experiments on the uncongested network provide the best possible response times, as there is no queue present on either router at any of the load levels considered here.)

For PI, DCN, and AFD, an important parameter is the target queue length the algorithm attempts to maintain. After extensive experimentation, we found that AFD performs best with a target queue length of 240 packets at all loads. DCN and PI obtain their best performance with a target queue length of 24 packets. We used the parameter settings for RIO-PS that were recommended by its inventors [6]. In all cases we set the maximum queue size to a number of packets large enough to ensure that tail drops did not occur.

5.1 Experimental Results for the HTTP Workload
Figures 5-8 give the results for DCN, PI, AFD, and RIO-PS. They show the cumulative distribution functions (CDFs) of response times for HTTP request/response exchanges at offered loads of 90%, 98%, and 105%. We also report other statistics for these experiments in Table 1.

Figure 5 shows that at all loads, the per
