(12) United States Patent
McCanne et al.

(10) Patent No.: US 8,069,225 B2
(45) Date of Patent: Nov. 29, 2011
`
`(54) TRANSPARENT CLIENT-SERVER
`TRANSACTION ACCELERATOR
`
`
(75) Inventors: Steven McCanne, Berkeley, CA (US); Michael J. Demmer, San Francisco, CA (US); Arvind Jain, Santa Clara, CA (US); David Tze-Si Wu, Fremont, CA (US)
`
FOREIGN PATENT DOCUMENTS

EP    0 313 326 A2    12/1997
(Continued)
`
(73) Assignee: Riverbed Technology, Inc., San Francisco, CA (US)

OTHER PUBLICATIONS

Knutsson, Bjorn et al., "Transparent Proxy Signalling," Journal of Communications and Networks, Mar. 2001, pp. 1-11.
`
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 2039 days.
`
(21) Appl. No.: 10/640,405

(22) Filed: Aug. 12, 2003
`
(65) Prior Publication Data

US 2004/0215746 A1    Oct. 28, 2004

Related U.S. Application Data

(60) Provisional application No. 60/462,990, filed on Apr. 14, 2003.
`
(51) Int. Cl.
     G06F 15/16    (2006.01)
(52) U.S. Cl. ........ 709/219; 706/21
(58) Field of Classification Search ........ 711/158; 717/169; 704/256; 709/223, 219, 236, 233, 231, 250, 203, 228; 707/1; 706/21
     See application file for complete search history.
`
(56) References Cited
`
U.S. PATENT DOCUMENTS

5,754,774 A      5/1998  Bittinger et al.
6,178,461 B1     1/2001  Chan et al.
6,219,642 B1*    4/2001  Asghar et al. ........ 704/256.8
6,415,329 B1*    7/2002  Gelman et al.
6,519,575 B1*    2/2003  Freeman .............. 705/21
(Continued)
`
`Primary Examiner — Dustin Nguyen
`(74) Attorney, Agent, or Firm — Philip H. Albert; Kilpatrick
`Townsend & Stockton LLP
`
(57) ABSTRACT

In a network, a transaction accelerator for accelerating transactions involving data transfer between at least one client and at least one server over the network comprises a client-side engine, a server-side engine and a transaction predictor configured to predict, based on past transactions, which transactions are likely to occur in the future between the client and server. The transaction predictor might be in the server-side engine, the client-side engine, or both. The client-side engine receives indications of requests from the client and includes a transaction buffer for storing results of predicted transactions received from the server or the server-side engine ahead of receipt of a corresponding request, and a collator for collating the requests from the client with the stored results or received results, wherein a request and a response that are matched by the collator are identified and the matched response is provided to the client in response to the matched request. The server-side engine receives indications of transactions including requests and responses and conveys requests to the server in response to actual transactions or predicted transactions.
`
`33 Claims, 10 Drawing Sheets
`
[Cover figure annotations:]
- Transactions are predicted based on past behavior, then pipelined
- All transactions still go back to the server
- Any data that is sent is optimized using the hierarchical cloning system
- Actual amount of data sent is a small fraction of original size
- Initial latency delay could be optimized away as well
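To make the collation mechanism concrete, here is a minimal Python sketch of the client-side behavior the abstract describes. It is illustrative only, not the patented implementation: the class and method names are hypothetical, and the transaction buffer is reduced to a dictionary keyed by request.

```python
# Illustrative sketch only; names are hypothetical, not from the patent.
class ClientSideEngine:
    def __init__(self):
        self.transaction_buffer = {}  # predicted request -> buffered result

    def store_predicted_result(self, request_key, result):
        # Pushed by the server-side engine ahead of the actual client request.
        self.transaction_buffer[request_key] = result

    def handle_client_request(self, request_key, fetch_from_server):
        # Collator: match the request against buffered predicted results;
        # on a miss, fall back to a normal round trip over the WAN.
        if request_key in self.transaction_buffer:
            return self.transaction_buffer.pop(request_key)
        return fetch_from_server(request_key)
```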
`
`
`
`
U.S. PATENT DOCUMENTS

6,546,428 B2      4/2003  Baber et al.
6,563,517 B1*     5/2003  Bhagwat et al. ........ 715/735
6,574,618 B2*     6/2003  Eylon et al. .......... 707/1
6,578,073 B1*     6/2003  Starnes et al. ........ 709/219
6,622,168 B1*     9/2003  Datta ................. 709/219
6,789,255 B1*     9/2004  Pedrizetti et al. ..... 717/169
6,820,133 B1*    11/2004  Grove et al. .......... 709/238
6,839,682 B1*     1/2005  Blume et al. .......... 705/10
6,874,017 B1      3/2005  Inoue et al.
7,082,456 B2*     7/2006  Mani-Meitav et al. .... 709/203
7,149,817 B2*    12/2006  Pettey ................ 709/250
7,224,839 B2*     5/2007  Zeineh ................ 382/232
7,619,545 B2     11/2009  Samuels et al.
7,809,818 B2     10/2010  Plamondon
7,827,237 B2     11/2010  Plamondon
7,865,585 B2      1/2011  Samuels et al.
7,916,047 B2      3/2011  Samuels et al.
2002/0013853 A1   1/2002  Baber et al.
2002/0023145 A1*  2/2002  Orr et al. ............ 709/219
2002/0029326 A1*  3/2002  Reuter et al. ......... 711/206
2002/0087547 A1   7/2002  Kausik et al.
2002/0107971 A1*  8/2002  Bailey et al. ......... 709/231
2002/0138511 A1   9/2002  Psounis et al.
2002/0147895 A1* 10/2002  Glance et al. ......... 711/158
2002/0194382 A1  12/2002  Kausik et al.
2003/0009583 A1*  1/2003  Chan et al. ........... 709/236
2004/0088376 A1   5/2004  McCanne et al.
2004/0215746 A1  10/2004  McCanne et al.

FOREIGN PATENT DOCUMENTS

WO    WO 01/63420 A1    8/2001

OTHER PUBLICATIONS

Amer et al., "File Access Prediction with Adjustable Accuracy," Conf. Proc. of 2002 IEEE Int. Performance, Computing, and Communications, 21:131-140, conference held Apr. 3-5, 2002 in Phoenix, AZ.
`
Padmanabhan et al., "Using Predictive Prefetching to Improve World Wide Web Latency," ACM SIGCOMM Computer Communication Review, 26(3):22-36 (1996).
Caceres, Ramon et al., "Web Proxy Caching: The Devil Is in the Details," Jun. 1998, Proceedings of the Workshop on Internet Server Performance, Madison, Wisconsin, pp. 111-118.
Deshpande, Mukund et al., "Selective Markov Models for Predicting Web-Page Accesses," 2004, ACM Transactions on Internet Technology, vol. 4, Issue 2, pp. 163-184.
Fan, Li et al., "Summary Cache: A Scalable Wide Area Web Cache Sharing Protocol," Jun. 2000, IEEE/ACM Transactions on Networking, vol. 8, No. 3, pp. 281-293.
Feldmeier, D.C. et al., "Protocol Boosters," Apr. 1998, IEEE JSAC, vol. 16, Issue No. 3, pp. 437-444.
Griffioen, James et al., "Automatic Prefetching in a WAN," Oct. 1993, Proceedings of the IEEE Workshop on Advances in Parallel and Distributed Systems, Technical Report #CS243-93, pp. 8-12.
Griffioen, James et al., "Reducing File System Latency using a Predictive Approach," Jun. 1994, Proceedings of the USENIX Summer 1994 Technical Conference, vol. 1.
Lei, Hui et al., "An Analytical Approach to File Prefetching," Jan. 1997, Proceedings of the Annual Conference on USENIX Annual Technical Conference, Anaheim, California, pp. 1-12.
Oly, James et al., "Markov Model Prediction of I/O Requests for Scientific Applications," Jun. 2002, Proceedings of the 16th International Conference on Supercomputing, pp. 147-155.
Rhea, Sean C. et al., "Value-Based Web Caching," May 2003, Proceedings of the 12th International Conference on World Wide Web, Budapest, Hungary, pp. 619-628.
Tolia, Niraj, et al., "An Architecture for Internet Data Transfer," May 2006, Third Symposium on Networked Systems Design and Implementation.
Yang, Qiang et al., "Mining Web Logs for Prediction Models in WWW Caching and Prefetching," Aug. 2001, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD'01, San Francisco, California, pp. 473-478.
`Office Action in U.S. Appl. No. 12/191,514 dated May 17, 2011.
`
`* cited by examiner
`
`
`
`
[FIG. 1: graph of TCP throughput (Mb/s) versus latency (ms)]
`
`
[Sheet 2 of 10: FIGS. 2 and 3]
`
`
`
`
`
[FIG. 4: device with dual-port NIC, Linux intercept module and host memory]
`
[FIG. 5: block diagram of client-side and server-side engines (transport, parser, segment store, prediction database and encoder modules)]
`
[FIG. 6: client-server transaction timeline with segment stores; annotations as on the cover figure]
`
[FIG. 7: pipelining of predicted requests and responses between client and server]
`
[FIG. 8 (Prior Art)]
`
`
[FIG. 9]
`
`
[FIG. 10]
`
`
[FIGS. 11A, 11B and 12]
`
`
`TRANSPARENT CLIENT-SERVER
`TRANSACTION ACCELERATOR
`
`CROSS-REFERENCES TO RELATED
`APPLICATIONS
`
This application claims priority from U.S. Provisional Patent Application No. 60/462,990, entitled "Transport-Level Network Accelerator," filed Apr. 14, 2003, which is incorporated by reference herein for all purposes.
`The present disclosure is related to the following com-
`monly assigned co-pending U.S. patent applications:
`U.S. patent application Ser. No. 10/285,315 entitled
`“Transaction Accelerator for Client-Server Communi-
`cation Systems” (hereinafter “McCanne I”) filed on Oct.
`30, 2002 and is incorporated by reference herein for all
`purposes;
`U.S. patent application Ser. No. 10/285,330 entitled “Con-
`tent-Based Segmentation Scheme for Data Compres-
`sion in Storage and Transmission Including Hierarchical
`Segment Representation” (hereinafter “McCanne II”)
`filed on Oct. 30, 2002 and is incorporated by reference
`herein for all purposes; and
U.S. patent application Ser. No. 10/640,562 (now U.S. Pat. No. 7,318,100) entitled "Cooperative Proxy Auto-Discovery and Connection Interception" (hereinafter "McCanne IV") filed concurrently herewith and is incorporated by reference herein for all purposes.
`
`FIELD OF THE INVENTION
`
`The present invention relates to data transport over net-
`works in general and more particularly may relate to improve-
`ments in data transport at the transport level between a client
`and a server.
`
`BACKGROUND OF THE INVENTION
`
`Local Area Network (LAN) communication is character-
`ized by generous bandwidths, low latencies and considerable
`enterprise control over the network. By contrast, Wide Area
`Networks (WANs) often have lower bandwidths and higher
latencies than LANs and often have a measure of network control that is outside the enterprise for which the WAN is being used.
`Wide-area client-server applications are a critical part of
`almost any large enterprise. A WAN might be used to provide
`access to widely used and critical infrastructure, such as file
`servers, mail servers and networked storage. This access most
`often has very poor throughput when compared to the perfor-
`mance across a LAN. Whether an enterprise is taking a cen-
`tralized approach or a distributed approach, high performance
communication across the WAN is essential in order to minimize costs and maximize productivity. Enterprise IT managers today typically take one of two approaches to compensate for the performance challenges inherent in WANs:
1. If the IT infrastructure is decentralized and they intend to
`keep it that way, corporate network and server managers
`typically have to deploy local file servers, data storage,
`mail servers, and backup systems with some redundancy
in each remote office to ensure fast and reliable access to critical data and applications at each office. They also
`may maintain over-provisioned WAN links in order to
enable reasonable levels of performance for file transfers
`and similar data-intensive tasks.
`
`2. If the IT infrastructure is already centralized or semi-
`centralized, the enterprise will be faced with a constant
demand for “more bandwidth” to remote sites in an effort to improve the performance for distributed users.
`Causes of Poor WAN Throughput
`The two primary causes of the slow throughput on WANs
`are well known: high delay (or latency) and limited band-
width. The "bandwidth" of a network or channel refers to a measure of the number of bits that can be transmitted over a link or path per unit of time. "Latency" refers to a measure of
`the amount of time that transpires while the bits traverse the
network, e.g., the time it takes a given bit transmitted from the
`sender to reach the destination. “Round-trip time” refers to
`the sum of the “source-to-destination” latency and the “des-
`tination-to-source” latency. If the underlying paths are asym-
`metric, the round-trip latency might be different than twice a
one-way latency. The term "throughput" is sometimes confused with bandwidth but refers to a measure of an attained transfer rate that a client-server application, protocol, etc.
`achieves over a network path. Throughput is typically less
`than the available network bandwidth.
`
`The speed of light, a fundamental and fixed constant,
`implies that information transmitted across a network always
`incurs some nonzero latency as it travels from the source to
`the destination. In practical terms, this means that sending a
packet from Silicon Valley to New York and back could never
`occur faster than about 30 milliseconds (ms), the time infor-
`mation in an electromagnetic signal would take to travel that
`distance in a direct path cross-country. In reality, this cross-
`country round trip time is more in the range of 100 ms or so,
`as signals in fiber or copper do not always travel at the speed
`of light in a vacuum and packets incur processing delays
`through each switch and router. This amount of latency is
`quite significant as it is at least two orders of magnitude
`higher than typical sub-millisecond LAN latencies.
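A rough check of the 30 ms figure cited above; the distance is approximate and switching delays are ignored:

```python
# Ideal round trip between Silicon Valley and New York at light speed.
distance_m = 4_100_000             # ~4,100 km one way (approximate)
c = 3.0e8                          # speed of light in a vacuum, m/s
rtt_ms = 2 * distance_m / c * 1000
print(f"ideal round trip: {rtt_ms:.0f} ms")  # ~27 ms before any processing delay
```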
`Other round-trips might have more latency. Round trips
`from the West Coast of the U.S. to Europe can be in the range
`of 100-200 ms, and some links using geo-stationary satellites
`into remote sites can have latencies in the 500-800 ms range.
`With latencies higher than about 50 ms, many client-server
`protocols and applications will function poorly relative to a
`LAN as those protocols and applications expect very low
`latency.
`While many employees routinely depend upon Fast Ether-
`net (100 Mbps) or Gigabit Ethernet (1 Gbps) within most
`corporate sites and headquarters facilities, the bandwidth
`interconnecting many corporate and industrial sites in the
`world is much lower. Even with DSL, Frame Relay or other
`broadband technologies, WAN connections are slow relative
to a LAN. For example, 1 Mbps DSL service offers only 1/100th the bandwidth of Fast Ethernet and 1/1,000th of what is available using Gigabit Ethernet.
`While some places might have high bandwidth backbone
`networks, such as the Metro Ethernet available in South
`Korea and Japan, the latency and bandwidth issues persist
`whenever data needs to travel outside areas with such net-
`
works. For example, a Japanese manufacturer with plants in Japan and the U.S. might need to send CAD/CAM files back and forth between plants. The latency from Japan to the East Coast of the U.S. might be as high as 200 ms and trans-Pacific bandwidth can be expensive and limited.
`WAN network bandwidth limits almost always impact cli-
`ent-server application throughput across the WAN, but more
bandwidth can be bought. Lower latency, by contrast, cannot be bought when achieving it would require faster-than-light communication. In some cases, network latency is the bottleneck on
`performance or throughput. This is often the case with win-
`dow-based transport protocols such as TCP or a request-
`response protocol such as the Common Internet File System
`
`(CIFS) protocol or the Network File System (NFS) protocol.
`High network latency particularly slows down “chatty” appli-
cations, even if the actual amounts of data transmitted in each
`transaction are not large. “Chatty” applications are those in
`which client-server interactions involve many back-and-forth
`steps that might not even depend on each other. Adding band-
`width (or compressing data) does not improve the throughput
`of these protocols/applications when the round-trip time
`exceeds some critical point and once the latency reaches that
`critical point, throughput decays quickly.
`This phenomenon can be understood intuitively: the rate of
work that can be performed by a client-server application that
`executes serialized steps to accomplish its tasks is inversely
`proportional to the round-trip time between the client and the
`server. If the client-server application is bottlenecked in a
`serialized computation (i.e., it is “chatty”), then increasing
`the round-trip by a factor of two causes the throughput to
`decrease by a factor of two because it takes twice as long to
`perform each step (while the client waits for the server and
`vice versa).
`More generally, the throughput of client-server applica-
`tions that are not necessarily chatty but run over a window-
`based protocol (such as TCP) can also suffer from a similar
fate. This can be modeled with a simple equation that accounts for the round-trip time (RTT) and the protocol window (W). The window defines how much data the sender can transmit before requiring receipt of an acknowledgement from the receiver. Once a window's worth of data is sent, the sender must wait until it hears from the receiver. Since it takes a round-trip time to receive the acknowledgement from the receiver, the rate at which data can be sent is simply the window size divided by the round trip time:

T = W / RTT
`
The optimal choice of window size depends on a number of factors. To perform well across a range of network conditions, a TCP device attempts to adapt its window to the underlying capacity of the network. So, if the underlying bottleneck bandwidth (or the TCP sender's share of the bandwidth) is roughly B bits per second, then a TCP device attempts to set its window to B × RTT, and the throughput, T, would be:

T = (B × RTT) / RTT = B
`
`In other words, the throughput would be equal to the avail-
`able rate. Unfortunately, there are often other constraints.
`Many protocols, such as TCP and CIFS, have an upper bound
`on the window size that is built into the protocol. For example,
`the maximum request size in CIFS is 64 KB and in the
`original TCP protocol, the maximum window size was lim-
`ited by the fact that the advertised window field in the protocol
`header is 16 bits, limiting the window also to 64 KB. While
`modern TCP stacks implement the window scaling method in
`RFC 1323 to overcome this problem, there are still many
`legacy TCP implementations that do not negotiate scaled
`windows, and there are more protocols such as CIFS that have
`application-level limits on top of the TCP window limit. So,
in practice, the throughput is actually limited by the maximum window size:

T = min(B × RTT, MWS) / RTT ≤ B
`
Even worse, there is an additional constraint on throughput that is fundamental to the congestion control algorithm designed into TCP. This flaw turns out to be non-negligible in wide-area networks where bandwidth is above a few megabits and is probably the key reason why enterprises often fail to see marked performance improvements of individual applications after substantial bandwidth upgrades.
`
`Essentially, this problem stems from conflicting goals of
`the TCP congestion control algorithm that are exacerbated in
`a high-delay environment. Namely, upon detecting packet
`loss, a TCP device reacts quickly and significantly to err on
`the side of safety (i.e., to prevent a set of TCP connections
`from overloading and congesting the network). Yet, to probe
`for available bandwidth, a TCP device will dynamically
`adjust its sending rate and continually push the network into
`momentary periods of congestion that cause packet loss to
`detect bandwidth limits. In short, a TCP device continually
`sends the network into congestion then aggressively backs
off. In a high-latency environment, the slow reaction time results in throughput limitations.
An equation was derived in the late 1990s that models the behavior of a network as a function of the packet loss rate that TCP induces, and that equation is:

CWS = 1.2 × S / sqrt(p)
`
As indicated by that equation, the average congestion window size (CWS) is roughly determined by the packet size (S) and the loss rate (p). Taking this into account, the actual throughput of a client-server application running over TCP is:

T = W / RTT = min(MWS, CWS, B × RTT) / RTT
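The following sketch evaluates this model numerically; the packet size and loss rate are illustrative values chosen to match the scenario discussed around FIG. 1, not parameters given by the patent:

```python
import math

# T = min(MWS, CWS, B x RTT) / RTT, with CWS = 1.2 x S / sqrt(p).
def throughput(B, rtt, mws, S=1460 * 8, p=0.001):
    cws = 1.2 * S / math.sqrt(p)   # average congestion window, in bits
    return min(mws, cws, B * rtt) / rtt

B = 45e6                # T3 line rate, bits/s
MWS = 64 * 1024 * 8     # 64 KB protocol window, in bits
for rtt_ms in (5, 15, 50, 100):
    T = throughput(B, rtt_ms / 1000, MWS)
    print(f"RTT {rtt_ms:3d} ms -> {T / 1e6:5.2f} Mb/s")
# At 100 ms this yields roughly 4.4 Mb/s of the 45 Mb/s link,
# consistent with the hyperbolic decay described for FIG. 1.
```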
`
`FIG. 1 is a graph that illustrates this problem from a very
`practical perspective. That graph shows the performance of a
`TCP data transfer when the network is experiencing a low
degree of network loss (less than 1/10 of 1 percent) for increas-
`ing amounts of latency. The bottom curve represents the TCP
`throughput achievable from a T1 line, which is roughly equal
`to the available bandwidth (1.544 Mb/s) all the way up to 100
`ms latencies. The top curve, on the other hand, illustrates the
`performance impact of the protocol window at higher band-
`widths. With a T3 line, the TCP throughput starts out at the
`available line rate (45 Mb/s) at low latencies, but at higher
`latencies the throughput begins to decay rapidly (in fact,
`hyperbolically). This effect is so dramatic that at a 100 ms
`delay (i.e., a typical cross-country link), TCP throughput is
only 4.5 Mb/s of the 45 Mb/s link.
`Under such conditions, application performance does not
`always increase when additional bandwidth is added. As FIG.
`1 shows, if the round trip time (RTT) is greater than a critical
point (just 15 ms or so in this example), then increasing the bandwidth of the link will only marginally improve throughput at higher latency, and at even higher latencies, throughput is not increased at all with increases in bandwidth.
`
FIG. 2 graphs a surface of the throughput model derived above,
`presuming a TCP transfer over a 45 Mb/s T3 link. The surface
`plots throughput as a function of both round-trip times and
`loss rates. This graph shows that both increasing loss and
`increasing latency impair performance. While latency has the
`more dramatic impact, they combine to severely impact per-
`formance. In environments with relatively low loss rates and
`normal WAN latencies, throughput can be dramatically lim-
`ited.
`
Existing Approaches to Overcoming WAN Throughput Problems
`
Given the high costs and performance challenges of WAN-based enterprise computing and communication, many
`approaches have been proposed for dealing with these prob-
`lems.
`
Perhaps the simplest approach to dealing with performance
`is to simply upgrade the available bandwidth in the network.
`Of course this is the most direct solution, but it is not always
`the most effective approach. First of all, contrary to popular
`belief, bandwidth is not free and the costs add up quickly for
`large enterprises that may have hundreds of offices. Second,
`as discussed earlier, adding bandwidth does not necessarily
`improve throughput. Third,
`in some places adding more
`bandwidth is not possible, especially across international
`sites, in remote areas, or where it is simply too expensive to
`justify.
Another approach is to embed intelligence in the applications themselves, e.g., to exploit the fact that data often changes in incremental ways, so that the application can be designed to send just incremental updates between clients and servers. Usually, this type of approach employs some sort of versioning system to keep track of version numbers of files (or data objects) so that differences between versioned data can be sent between application components across the network. For example, some content management systems have this capability and storage backup software generally employs this basic approach. However, these systems do not deal with scenarios where data is manipulated outside of their domain. For example, when a file is renamed and re-entered into the system, the changes between the old and new versions are not captured. Likewise, when data flows between distinct applications (e.g., a file is copied out of a content management system and into a file system), versioning cannot be carried out between the different components.
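As a toy illustration of the incremental-update idea (real versioning systems track version numbers and typically ship binary deltas, not text diffs):

```python
import difflib

old = ["line one\n", "line two\n", "line three\n"]
new = ["line one\n", "line two, changed\n", "line three\n"]

# Only the delta between versions needs to cross the network.
delta = "".join(difflib.unified_diff(old, new, fromfile="v1", tofile="v2"))
print(delta)
```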
`This approach of managing versions and communicating
`updates can be viewed as one specific (and application-spe-
`cific) approach to compression. More generally, data com-
`pression systems can be utilized to ameliorate network band-
`width bottlenecks. Compression is a process of representing
one set of data with another set of data wherein the second set of data is, on average, a smaller number of bits than the first
`set of data, such that the first set of data, or at least a sufficient
`approximation of the first set of data, can be recovered from
`an inverse of the compression process in most cases. Com-
`pression allows for more efficient use of a limited bandwidth
`and might result in less latency, but in some cases, no latency
`improvement occurs. In some cases, compression might add
`to the latency, if time is needed to compress data after the
`request is made and time is needed to decompress the data
after it is received. This might be improved if the data
`can be compressed ahead of time, before the request is made,
`but that may not be feasible if the data is not necessarily
`available ahead of time for compression, or if the volume of
`data from which the request will be served is too large relative
`to the amount of data likely to be used.
`One way to deploy compression is to embed it in applica-
`tions. For example, a Web server can compress the HTML
`pages it returns before delivering them across the network to
`end clients. Another approach is to deploy compression in the
`network without having to modify the applications. For many
`years, network devices have included compression options as
features (e.g., in routers, modems, dedicated compression devices, etc.) [D. Rand, "The PPP Compression Control Protocol (CCP)", Request for Comments 1962, June 1996]. This
`is a reasonable thing to do, but the effectiveness is limited.
`Most methods of lossless data compression typically reduce
`the amount of data (i.e., bandwidth) by a factor of 1.5 to 4,
`depending on the inherent redundancy present. While helpful,
`it is not enough to dramatically change performance if the
`amount of data being sent is large or similar data is sent
`repeatedly, perhaps over longer time scales. Also, when per-
`formance is limited by network latency, compressing the
`underlying data will have little or no impact.
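A quick experiment showing how strongly the achievable ratio depends on redundancy; the inputs here are contrived extremes, and typical enterprise data falls between them, near the 1.5-to-4 factor cited above:

```python
import os
import zlib

for label, data in [("random", os.urandom(100_000)),
                    ("repetitive", b"the same record, over and over\n" * 3000)]:
    packed = zlib.compress(data, level=6)
    print(f"{label:10s} {len(data):6d} -> {len(packed):6d} bytes "
          f"({len(data) / len(packed):.1f}x)")
```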
`Rather than compress the data, another approach to work-
`ing around WAN bottlenecks is to replicate servers and server
`data in local servers for quick access. This approach in par-
`ticular addresses the network latency problem because a cli-
`ent in a remote site can now interact with a local server rather
than a remote server. There are several methods available to enterprises to store redundant copies of data in replicated file
`systems, redundant or local storage servers, or by using any
`number of distributed file systems. The challenge with this
`kind of approach is the basic problem of managing the ever-
`exploding amount of data, which requires scaling up storage,
`application and file servers in many places, and trying to make
`sure that the files people need are indeed available where and
`when they are needed. Moreover, these approaches are gen-
`erally non-transparent, meaning the clients and servers must
`be modified to implement and interact with the agents and/or
`devices that perform the replication function. For example, if
`a file server is replicated to a remote branch, the server must
`be configured to send updates to the replica and certain clients
`must be configured to interact with the replica while others
`need to be configured to interact with the original server.
`Rather than replicate servers, another approach is to deploy
`transport-level or application-level devices called “proxies”,
`which function as performance-enhancing intermediaries
`between the client and the server. In this case, a proxy is the
terminus for the client connection and initiates another connection to the server on behalf of the client. Alternatively, the
`proxy connects to one or more other proxies that in turn
`connect to the server. Each proxy may forward, modify, or
`otherwise transform the transactions as they flow from the
`client to the server and vice versa. Examples of proxies
`include (1) Web proxies that enhance performance through
`caching or enhance security by controlling access to servers,
`(2) mail relays that forward mail from a client to another mail
`server, (3) DNS relays that cache DNS name resolutions, and
`so forth.
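A minimal sketch of such a transport-level proxy in Python; the addresses are placeholders, and a production proxy would add error handling, connection management and protocol awareness:

```python
import socket
import threading

def relay(src, dst):
    # Copy bytes in one direction until the source side closes.
    while (chunk := src.recv(4096)):
        dst.sendall(chunk)
    dst.close()

def serve(listen_port=8080, server_addr=("server.example.com", 80)):
    lsock = socket.socket()
    lsock.bind(("", listen_port))
    lsock.listen()
    while True:
        client, _ = lsock.accept()  # the proxy is the terminus for the client
        upstream = socket.create_connection(server_addr)  # ...and opens its own
        threading.Thread(target=relay, args=(client, upstream), daemon=True).start()
        threading.Thread(target=relay, args=(upstream, client), daemon=True).start()
```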
`
`With a proxy situated between the client and server, the
`performance impairments of network latency can be
`addressed by having the proxy cache data. Caching is a pro-
`cess of storing previously transmitted results in the hopes that
`the user will request the results again and receive a response
`more quickly from the cache than if the results had to come
`from the original provider. Caching also provides some help
`in mitigating both latency and bandwidth bottlenecks, but in
`some situations it does not help much. For example, where a
`single processor is retrieving data from memory it controls
`and does so in a repetitive fashion, as might be the case when
`reading processor instructions from memory, caching can
`greatly speed a processor’ s tasks. Similarly, file systems have
`employed caching mechanisms to store recently accessed
`disk blocks in host memory so that subsequent accesses to
`cached blocks are completed much faster than reading them
in from disk again, as in the BSD Fast File System [McKusick, et al., "A Fast File System for UNIX", ACM Transactions on Computer Systems, Vol. 2(3), 1984], the Log-structured File Sys-
`tem [Rosenblum and Ousterhout, “The Design and Imple-
`mentation of a Log-structured File System”, ACM Transac-
`tions on Computer Systems, Vol. 10(1), 1992], etc. In a
`typical cache arrangement, a requestor requests data from
`some memory, device or the like and the results are provided
`to the requestor and stored in a cache having a faster response
`time than the original device supplying the data. Then, when
the requestor requests that data again, if it is still in the cache,
`the cache can return the data in response to the request before
`the original device could have returned it and the request is
`satisfied that much sooner.
`
`Caching has its difficulties, one of which is that the data
`might change at the source and the cache would then be
`supplying “stale” data to the requestor. This is the “cache
consistency" problem. Because of this, caches are often "read-only", requiring that changes to data be transmitted through the cache back to the source in a "write-through" fashion.
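The read-through, write-through arrangement just described, as a small Python sketch; the backing store stands in for the slower original device:

```python
class WriteThroughCache:
    def __init__(self, backing_store):
        self.backing_store = backing_store   # the slower original device
        self.cache = {}

    def read(self, key):
        if key not in self.cache:            # miss: go to the source once
            self.cache[key] = self.backing_store[key]
        return self.cache[key]               # hit: faster than the source

    def write(self, key, value):
        self.backing_store[key] = value      # write-through to the source...
        self.cache[key] = value              # ...keeping the cache consistent
                                             #    for writes made through it
```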
`
`Another problem with caching is that the original source of
`the data might want to track usage of data and would not be
`aware of uses that were served from the cache as opposed to
`from the original source. For example, where a Web server is
`remote from a number of computers running Web browsers
`that are “pointed to” that Web server, the Web browsers might
`cache Web pages from that site as they are viewed, to avoid
`delays that might occur in downloading the Web page again.
`While this would improve performance in many cases, and
`reduce the load on the Web server, the Web server operator
`might try to track the total number of “page views” but would
`be ignorant of those served by the cache. In some cases, an
`Internet service provider might operate the cache remote
`from the browsers and provide cached content for a large
`number of browsers, so a Web server operator might even
`miss unique users entirely.
`Where loose consistency can be tolerated, caching can
`work remarkably well. For example, the Domain Name Sys-
tem (DNS), dating back to the early 1980