United States Patent [19]
Packer

[11] Patent Number: 6,038,216
[45] Date of Patent: *Mar. 14, 2000

[54] METHOD FOR EXPLICIT DATA RATE CONTROL IN A PACKET COMMUNICATION ENVIRONMENT WITHOUT DATA RATE SUPERVISION

[75] Inventor: Robert L. Packer, Los Gatos, Calif.

[73] Assignee: Packeteer, Inc., Cupertino, Calif.

[*] Notice: This patent issued on a continued prosecution application filed under 37 CFR 1.53(d), and is subject to the twenty year patent term provisions of 35 U.S.C. 154(a)(2).

[21] Appl. No.: 08/742,994

[22] Filed: Nov. 1, 1996

[51] Int. Cl.7: H04L 12/26
[52] U.S. Cl.: 370/231; 370/232; 370/235
[58] Field of Search: 370/229-238; 370/60

[56] References Cited

U.S. PATENT DOCUMENTS

5,042,029   8/1991  Hayakawa
5,193,151   3/1993  Jain
5,359,593  10/1994  Derby et al.
5,426,635   6/1995  Mitra et al.
5,455,826  10/1995  Ozveren et al.

OTHER PUBLICATIONS

Balakrishnan, H., et al., "Improving TCP/IP Performance Over Wireless Networks", Proc. of 1st ACM Conf. on Mobile Computing and Networking, Berkeley, CA, Nov. 1995, 10 pages.
RFC 793, "Transmission Control Protocol, DARPA Internet Program Protocol Specification", Postel, ed., 1981, 87 pages.
RFC 1122, "Requirements for Internet Hosts", Braden, ed., 1989, 116 pages.
"20.3 Sliding Windows", TCP/IP Illustrated, vol. I, pp. 280-284.
"10 Protocol Layering", TCP/IP, vol. I, pp. 139-144.
"2.5 The Idea Behind Sliding Windows", TCP/IP, vol. I, pp. 175-177.
"12.10 Variable Window Size and Flow Control", TCP/IP, vol. I, pp. 182-194.
"14 TCP: Flow Control and Adaptive Retransmission", TCP/IP, vol. II, pp. 261-283.
Thomas, Stephen A., "IPng and the TCP/IP Protocols", John Wiley & Sons, Inc., pp. 239-240, 1996.
Fengmin Gong et al., "Study of a two-level flow control scheme and buffering strategies", INFOCOM '94, Networking for Global Communications, 13th Proceedings IEEE (94CH3401-7), vol. 3, pp. 1224-1233, Jun. 1994.
Roberts, Lawrence G., "Explicit Rate Flow Control", Apr. 1997, lroberts@ziplink.net; http://www.ziplink.neti-lroberts/Ex . . . ate/Explicit-Rate-Flow-Control.htm; pp. 1-14.

Primary Examiner—Melvin Marcelo
Attorney, Agent, or Firm—Townsend and Townsend and Crew LLP; Kenneth R. Allen

[57] ABSTRACT

A method for explicit data rate control is introduced into a packet communication environment (10) which does not have data rate supervision by adding latency to the acknowledgment (ACK) packet and by adjusting the size of the flow control window associated with the packet in order to directly control the data rate of the source data at the station (12 or 14) originating the packet.

19 Claims, 10 Drawing Sheets

[Front-page drawing, FIG. 1: client TCP end system 14 and server TCP end system 12 connected through routers 16 and 24, access links 18 and 22, and network cloud 20, with rate control device 26 adjacent the server.]

Cloudflare - Exhibit 1059, page 1
[Sheet 1 of 10, FIG. 1: block diagram of system 10, showing server TCP end system 12, client TCP end system 14, routers 16 and 24, access links 18 and 22, network cloud 20, and rate control device 26.]
[Sheet 2 of 10, FIG. 2A: flowchart 201 of tcpSchedule, steps 202-220: test for TCP data and update the received data rate (202, 204); on the start of a new burst, find the CIR/EIR policy for this connection and direction (206, 208); with no target rate, schedule an immediate transmission and return (210, 212); dispose of a held retransmission from the packet buffer (214, 216); test whether this ACK sequence exceeds the last forwarded sequence (218) and whether the data rate on the other half connection exceeds its target rate (220).]
[Sheet 3 of 10, FIG. 2B: continuation flowchart, steps 222-234: set lastAckForwarded for this half connection to tcpHdr->ack (222); set timeLastFreshAckForwarded on this half connection to the current time to cause an immediate transmit (224); if there is TCP data to schedule, change the ACK on the data packet and forward it immediately (228, see FIG. 2C); if an ACK was created, invoke tcpDownShift to handle TCP ACK packets (232, see FIG. 2D); otherwise schedule the packet for immediate transmission (234).]
[Sheet 4 of 10, FIG. 2C: flowchart 205 of tcpSeparateData (step 228 of FIG. 2B), steps 240-252: if no ACK packet yet exists for this connection, copy the front of the packet (IP and TCP headers) into a new ackPacket (242) and update its length to exclude data (244); otherwise set a flag on this half connection to reuse the ackPacket when it is eventually recycled at transmit complete (246); change the ACK and checksum in the original packet and send it (248); for a new ackPacket, compute and set its checksum (250, 252); return the ackPacket.]
[Sheet 5 of 10, FIG. 2D: flowchart 207 of tcpDownShift (step 232 of FIG. 2B), steps 255-280: if an ACK is already outstanding, mark the ACK packet for reuse and recycle (255, 257); otherwise set up a new ACK packet with a recycle function to capture transmit-complete confirmation (256); if the amount of data exceeds two MSS, clamp the new ACK at lastAckForwarded + 2*MSS (260, 262); compute the time since the last ACK, the target rate, the actual rate, and the time until an ACK is required (263); test whether the data rate reduction is achievable in one step (264, see FIG. 2F) and set the other half connection's actual rate to the target rate or to the ratio of data rate to RTO time (266, 268); if late in sending an ACK, apply or set late-carryover state and recover time by acknowledging more data (270-279); check that the window size is not causing the ACK to be late (280, see FIG. 2G).]
[Sheet 6 of 10, FIG. 2E, steps 281-288: compute a transmission time delay equal to the time until the next ACK must be sent to maintain the target data rate minus the time since the last ACK was sent (281, computed in FIG. 2D, step 263); if the advertised receive window is too large (282, see FIG. 2H), scale it down (284); update the checksum in the packet header (286); schedule the packet for transmission at the present time plus the delay computed in step 281 (288).]
[Sheet 7 of 10, FIG. 2F: flowchart 209 of tcpRtoTime (step 264 of FIG. 2D), steps 290-294: determine the time to RTO (290), reduce the time to RTO by the time to send an ACK (292), and further reduce the RTO by a safety factor (294).]
[Sheet 8 of 10, FIG. 2G: flowchart of tcpCheckMinWinSize (step 280 of FIG. 2D), steps 300-310: determine the time since the last increase in window size (300); determine the minimum window increase time times the maximum window adjustment (302); take the lesser of the quantities computed in steps 300 and 302 (304); scale that quantity by the minimum window size (306); reduce the maximum window adjustment segments by the quantity computed in step 306 (308); increase the minimum number of MSS-sized segments in the window by the quantity computed in step 308 (310).]
[Sheet 9 of 10, FIG. 2H: flowchart 213 of tcpUpdateAckRxWindow (step 282 of FIG. 2E): if the original amount acknowledged exceeds 2*MSS and there is no carryover, set the window size to allow the specified rate given the current RTT and the detected speed (314), clamp the receive window to at least 2*MSS, move the revised window into the packet, and return true; otherwise set or reset the window-closing flag (the adjustment will occur in the next ACK) and return false.]
[Sheet 10 of 10, FIG. 2I: flowchart 215 of tcpSchedAckEmitted, steps 330-340: if the connection is still valid (330), set lastAckForwarded for this half connection (332) and set timeLastFreshAckForwarded for this half connection to the current time (334); if lastAckForwarded equals lastAckRcvd (336), return; otherwise set the new ACK equal to lastAckRcvd (338) and invoke tcpDownShift to compute the new ACK time and reschedule the packet (340, see FIG. 2D).]
METHOD FOR EXPLICIT DATA RATE CONTROL IN A PACKET COMMUNICATION ENVIRONMENT WITHOUT DATA RATE SUPERVISION

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION

This invention relates to digital packet telecommunications, and particularly to data flow rate control at a particular layer of a digitally-switched packet telecommunications environment normally not subject to data flow rate control, wherein data packets are communicated at a variety of rates without supervision as to rate of data transfer, such as under the TCP/IP protocol suite.

The widely-used TCP/IP protocol suite, which implements the world-wide data communication network environment called the Internet and is also employed in local networks (Intranets), intentionally omits any explicit supervisory function over the rate of data transport over the various media which comprise the network. While there are certain perceived advantages, this characteristic has the consequence of juxtaposing very high-speed packets and very low-speed packets in potential conflict, with resultant inefficiencies. Certain loading conditions can even cause instabilities which could lead to overloads that could stop data transfer temporarily. It is therefore considered desirable to provide some mechanism to optimize efficiency of data transfer while minimizing the risk of data loss.

In order to understand the exact context of the invention, an explanation of technical aspects of the Internet/Intranet telecommunications environment may prove helpful.

Internet/Intranet technology is based largely on the TCP/IP protocol suite, where IP is the network-level Internet Protocol and TCP is the transport-level Transmission Control Protocol. At the network level, IP provides a "datagram" delivery service. By contrast, TCP builds a transport-level service on top of the datagram service to provide guaranteed, sequential delivery of a byte stream between two IP hosts.

TCP has 'flow control' mechanisms operative at the end stations only, to limit the rate at which a TCP endpoint will emit data, but it does not employ explicit data rate control. The basic flow control mechanism is a 'sliding window', an allowable window which by its sliding operation essentially limits the amount of unacknowledged transmit data that a transmitter can emit.
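The sliding-window constraint just described can be sketched in a few lines; the class and attribute names below are illustrative assumptions, not drawn from any TCP implementation or from this patent's pseudocode.

```python
class SlidingWindowSender:
    """Illustrative sender: it may only emit data that fits inside the
    advertised window beyond the last acknowledged byte."""

    def __init__(self, window_size):
        self.window_size = window_size   # receiver-advertised window (bytes)
        self.last_acked = 0              # highest cumulatively ACKed byte
        self.next_to_send = 0            # next sequence number to emit

    def can_send(self, nbytes):
        # Unacknowledged data in flight plus the new segment must not
        # exceed the advertised window.
        in_flight = self.next_to_send - self.last_acked
        return in_flight + nbytes <= self.window_size

    def send(self, nbytes):
        if not self.can_send(nbytes):
            return False                 # window full: transmitter stalls
        self.next_to_send += nbytes
        return True

    def on_ack(self, ack_seq):
        # A cumulative ACK slides the window forward, freeing capacity.
        self.last_acked = max(self.last_acked, ack_seq)
```

A sender with an 8 KB window can emit 8 KB, then stalls until an acknowledgment slides the window forward; this stall is exactly the lever the invention later exploits by delaying ACKs.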
Another flow control mechanism is a congestion window, which is a refinement of the sliding window scheme involving a conservative expansion to make use of the full, allowable window. A component of this mechanism is sometimes referred to as 'slow start'.

The sliding window flow control mechanism works in conjunction with the Retransmit Timeout Mechanism (RTO), which is a timeout to prompt a retransmission of unacknowledged data. The timeout length is based on a running average of the Round Trip Time (RTT) for acknowledgment receipt; i.e., if an acknowledgment is not received within (typically) the smoothed RTT plus four times the mean deviation, then packet loss is inferred and the data pending acknowledgment is retransmitted.
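The timeout computation described above (smoothed RTT plus four times the mean deviation) can be sketched as follows; the smoothing gains are the commonly used values and are an assumption here, since the text does not specify them.

```python
def update_rto(srtt, rttvar, measured_rtt, alpha=0.125, beta=0.25):
    """One RTT-sample update. Returns (srtt, rttvar, rto), where the
    retransmit timeout is the smoothed RTT plus four times the smoothed
    mean deviation, as described above. The gains alpha and beta are
    conventional choices, not taken from the patent text."""
    err = measured_rtt - srtt
    srtt = srtt + alpha * err                      # exponentially smoothed RTT
    rttvar = rttvar + beta * (abs(err) - rttvar)   # smoothed mean deviation
    rto = srtt + 4.0 * rttvar
    return srtt, rttvar, rto
```

Feeding in steady samples shrinks the deviation term, so the timeout converges toward the smoothed RTT itself; a single late sample widens it again.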
Data rate flow control mechanisms which are operative end-to-end without explicit data rate control draw a strong inference of congestion from packet loss (inferred, typically, by RTO). TCP end systems, for example, will 'back off', i.e., inhibit transmission in increasing multiples of the base RTT average as a reaction to consecutive packet loss.

Bandwidth Management in TCP/IP Networks

Bandwidth management in TCP/IP networks is accomplished by a combination of TCP end systems and routers which queue packets and discard packets when some congestion threshold is exceeded. The discarded and therefore unacknowledged packet serves as a feedback mechanism to the TCP transmitter. (TCP end systems are clients or servers running the TCP transport protocol, typically as part of their operating system.)

The term 'bandwidth management' is often used to refer to link-level bandwidth management, e.g., multiple line support for Point to Point Protocol (PPP). Link-level bandwidth management is essentially the process of keeping track of all traffic and deciding whether an additional dial line or ISDN channel should be opened or an extraneous one closed. The field of this invention is concerned with network-level bandwidth management, i.e., policies to assign available bandwidth from a single logical link to network flows.

Routers support various queuing options. These options are generally intended to promote fairness and to provide a rough ability to partition and prioritize separate classes of traffic. Configuring these queuing options with any precision or without side effects is in fact very difficult, and in some cases not possible. Seemingly simple things, such as the length of the queue, have a profound effect on traffic characteristics. Discarding packets as a feedback mechanism to TCP end systems may cause large, uneven delays perceptible to interactive users.

Routers can only control outbound traffic. A 5% load or less on outbound traffic can correspond to a 100% load on inbound traffic, due to the typical imbalance between an outbound stream of acknowledgments and an inbound stream of data.

A mechanism is needed to control traffic which is more efficient, in that it is more tolerant of and responsive to traffic loading.

As background, further information about TCP/IP and the state of the art of flow control may be had in the following publications:

Comer, Douglas. Internetworking with TCP/IP, Vol. I. Prentice Hall, 1991.

Comer, Douglas and Stevens, David. Internetworking with TCP/IP, Vol. II: Design, Implementation, and Internals. Prentice Hall, 1991.

W. Richard Stevens. TCP/IP Illustrated, Vol. I: The Protocols. Addison-Wesley, 1994.

RFC 793. Transmission Control Protocol. Postel, 1981.

RFC 1122. Host Requirements. Braden, 1989.

A particularly relevant reference to the present work is:

Hari Balakrishnan, Srinivasan Seshan, Elan Amir, Randy H. Katz. Improving TCP/IP Performance over Wireless Networks. Proc. 1st ACM Conf. on Mobile Computing and Networking, Berkeley, Calif., November 1995.

The above document reports efforts of a research group at the University of California at Berkeley to implement TCP 'interior spoofing' to mitigate the effects of single packet
loss in micro-cellular wireless networks. Its mechanism is the buffering of data and the performing of retransmissions to preempt the end-to-end RTO. It is a software mechanism at a wireless network base station which will aggressively retry transmission a single time when it infers that a single packet loss has occurred. This is a more aggressive retransmission than the normal TCP RTO mechanism or the 'quick recovery' mechanism, whereby a transmitter retransmits after receiving N consecutive identical acknowledgments when there is a window of data pending acknowledgment.

Sliding window protocols are known, as in Comer, Vol. I, page 175. Known sliding window protocols are not time based. Rate is a byproduct of the characteristics of the network and the end systems.

SUMMARY OF THE INVENTION

According to the invention, a method for explicit network-level data rate control is introduced into a level of a packet communication environment at which there is a lack of data rate supervision, to control assignment of available bandwidth from a single logical link to network flows. The method includes adding latency to the acknowledgment (ACK) packet of the network level and adjusting the reported size of the existing flow control window associated with the packet in order to directly control the data rate of the source data at the station originating the packet.

Called direct feedback rate control, the method comprises a mechanism that mitigates TCP packet-level traffic through a given link in order to manage the bandwidth of that link. A software mechanism to implement the function may be a software driver, part of a kernel of an operating system, or a management function implemented on a separate dedicated machine in the communication path.

The invention has the advantage of being transparent to all other protocol entities in a TCP/IP network environment. For example, in the connections controlled according to the invention, it is transparent to TCP end systems (i.e., end systems using the TCP protocol).

The invention will be better understood upon reference to the following detailed description in connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a system according to the invention.

FIGS. 2A-2I depict flowcharts of a particular embodiment according to the invention.

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

Referring to FIG. 1, a system 10 which uses the invention comprises a first TCP end system 12, such as a server device, and a second TCP end system 14, such as a client device, which is connected through a first router 16 and thence via a first access link 18 into a network cloud 20 of a character as hereinafter explained, which in turn provides a logical connection via a second access link 22 to a second router 24. According to the invention, there is provided, between at least one of the end systems 12 and one of the routers 24, a rate control device 26 which is operative to control the rate at which a TCP transmitter can emit packets. In this invention, rate control is accomplished by (1) delaying the transmission of an acknowledgment (known as an ACK), which may be a special packet, issued by the end system 12 or 14; and/or by (2) modifying the report of the length of the receive window in those same packets; and/or by (3) generating acknowledgments substantially independently of the acknowledgments generated by the end receiving station.

The rate control device 26 is preferably placed at the end system acting as the server, but a rate control device may be placed adjacent to any system in a path for data. At the server, it is most effective in controlling flow rate because it typically receives and relays the bulk of the traffic of interest.

Several factors are employed to weight the length of the added acknowledgment delay: the amount of data being acknowledged, the measured and smoothed round trip time, the targeted transmission rate, and the mean of the deviation between the measured and smoothed round trip time.
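One way those factors might combine into a delay is sketched below. The formula and every name in it are illustrative assumptions for exposition; the patent's actual weighting appears in its pseudocode embodiment, not here.

```python
def ack_delay(bytes_acked, target_rate, srtt, rtt_mdev, last_ack_time, now):
    """Hypothetical combination of the factors listed above: amount of
    data acknowledged, target rate, smoothed RTT, and RTT mean deviation.
    Units: bytes, bytes/second, seconds. Not the patent's formula."""
    # Time the acknowledged bytes "should" occupy at the target rate.
    required_interval = bytes_acked / target_rate
    elapsed = now - last_ack_time        # time since the previous ACK went out
    delay = required_interval - elapsed
    # Never hold an ACK so long that the sender's retransmit timer
    # (roughly smoothed RTT plus a deviation margin) could fire and
    # provoke a spurious retransmission.
    safety_cap = srtt + 2.0 * rtt_mdev
    return max(0.0, min(delay, safety_cap))
```

The cap reflects the same concern the flowcharts address in step 216: an ACK held past the sender's RTO would itself cause retransmissions.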
The direct feedback rate control mechanism may be incorporated conveniently into computer-executable code, assuming the end system or the router is programmable. It can be used in an open-loop environment, or it can be used with a rate monitoring mechanism that provides an indication of flow rate.

The following is a detailed description of a specific embodiment of a direct feedback rate control mechanism expressed in pseudocode. For example, the pseudocode explains a mechanism of generating acknowledgments, including substituting acknowledgments. This pseudocode also addresses specifics of the TCP environment. It should be noted that latency and the sliding window flow control can be implemented anywhere in the data path.

According to an embodiment of the present invention, a method for controlling the data rate of data packets in a digital data packet communication environment is provided. This environment employs TCP/IP protocols and has a plurality of interconnected digital packet transmission stations. TCP/IP data packet communication environments lack explicit end-to-end data rate supervision, such as that found in ATM or Frame Relay systems. In this invention, a first digital packet transmission station sends at least one source packet to a second digital packet transmission station over the network. The first digital packet transmission station waits for typically one acknowledgment packet before sending additional source packets. The second digital packet transmission station sends an acknowledgment packet toward the first digital packet transmission station after receipt of the at least one source packet.

The second digital packet transmission station can thus establish a limit on the explicit rate of emission of packets from the first transmission station to the second transmission station. It invokes this limit by the timing of transmission of the acknowledgment packet from the second digital packet transmission station to the first digital packet transmission station. Specifically, the second digital packet transmission station will insert a latency or delay in issuance of the acknowledgment packet. The latency will be based upon the selected limit. The latency can be invoked at any point traversed by the acknowledgment packet, by delaying propagation of the acknowledgment packet through that point. One way this is done is by intercepting the acknowledgment packet at an intermediate point, rescheduling it and reusing it, i.e., substituting a duplicate acknowledgment packet at a delay from normal propagation. An alternative to the duplicate substitution is the substitution of a sequence of values which include the acknowledgment sequence indicators. The substitution sequence might also include other indicia, for example, an indication of a suitable window size to be communicated to the first transmission station. These indicators serve to signal to the first transmission station the information needed to repackage groupings of packets, which also imposes an upper limit on explicit transmission rate.

Substitute acknowledgment packets can be generated at any point of latency in response to groupings of packets from the first digital packet transmission station. The acknowledgments and the groupings can be generated without correlation as to time, range of data being acknowledged, window size being advertised to the first transmission station, and number of acknowledgments. The selection of the latency time in theory can be a manually preset value. Alternatively, it can be selected based on a simple formula relating latency to explicit maximum rate of transmission. Since the rate of emission of packets from a source or first transmission station to the second transmission station is limited by the fact that acknowledgments must be received before the next emission of a packet or grouping of packets, the latency of acknowledgment imposes an upper bound on explicit transmission rate.
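The upper bound that added latency imposes can be made concrete. With at most one window of data outstanding, throughput cannot exceed one window per round trip; holding each acknowledgment stretches the effective round trip. This is the standard window-per-RTT bound, stated here as an illustration of the "simple formula" mentioned above rather than as the patent's exact formula.

```python
def max_rate(window_bytes, rtt, added_ack_latency):
    """Upper bound on throughput (bytes/second) when at most
    `window_bytes` may be unacknowledged and each acknowledgment is
    held for an extra `added_ack_latency` seconds: one full window can
    be delivered per (RTT + added latency)."""
    return window_bytes / (rtt + added_ack_latency)
```

For example, an 8192-byte window over a 100 ms round trip allows roughly 80 KB/s; holding each ACK a further 100 ms halves that to roughly 40 KB/s, which is the direct-rate-control effect the invention relies on.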
FIG. 2A depicts a flow chart 201 of processing steps in the subroutine tcpSchedule. This routine is invoked for every TCP segment to schedule its emission. In a decisional step 202, a check is made whether there is TCP data present. If this is so, then in a step 204 the data rate received for this connection is updated and processed. Otherwise, or in any event, processing continues with a decisional step 206, wherein a check is made whether the packets being tested are the start of a new burst. If this is so, then in a step 208 an appropriate policy for this connection and this direction of traffic flow is found. This yields a combination of CIR (committed information rate) and EIR (excess information rate). Otherwise, or in any event, processing continues with a decisional step 210, wherein a check is performed to see if there is a target rate associated with this flow. If there is no target rate, then in a step 212 the packet is scheduled for an immediate transmission, after which processing returns. Otherwise, if there is a target rate for this particular flow, then in a decisional step 214 a check is performed to see if an acknowledgment packet or sequence (ACK) for the outbound retransmission of TCP data is being retained. This check determines whether: 1) there is TCP data, AND 2) it is a retransmitted data packet, AND 3) an ACK is being retained for this data. If all these conditions are true, the retransmitted data packet is discarded in a step 216. A buffer containing the packet is cleared and processing returns. This step ensures that there is no retransmission caused by our own acknowledgment delay. If this is not so, then in a step 218 a check is made to see if this acknowledgment (ACK) sequence is greater than the last forwarded acknowledgment sequence. If this is so, then in a decisional step 220 a check is made to see if the data rate on the other half connection (data flowing in the other direction along this connection) exceeds its target rate.

FIG. 2B depicts a continuation flow chart 203. If the test in step 218 in FIG. 2A, of whether this ACK sequence is greater than the last forwarded ACK sequence, fails, then processing continues with a step 234 in FIG. 2B, wherein the packet is scheduled for immediate transmission and then processing returns. If the test of decisional step 220 in FIG. 2A, of whether the data rate on the other half connection exceeds its target rate, fails, then processing continues with a step 222 in FIG. 2B, wherein a lastAckForwarded (last acknowledgment forwarded) variable for this particular half connection is set to the ACK value of the TCP header (tcpHdr->ack). Then in a step 224, the variable timeLastFreshAckForwarded for this half connection is set to the current time in order to cause an immediate transmit. Then in step 234, the packet is scheduled for immediate transmission and processing returns. Finally, if the test of step 220 in FIG. 2A, whether the data rate on the other half connection exceeds its target rate, is true, then processing continues with a decisional step 226 in FIG. 2B, in which a check is made whether there is TCP data to schedule. If this is so, then in a step 228 a procedure tcpSeparateData is invoked in order to change the ACK value on the data packet and forward it immediately. Then in a decisional step 230, a test is performed to determine whether a new ACK was created. If step 230 is not true, then processing returns. Otherwise, a new ACK was created, so processing continues with a step 232, in which the routine tcpDownShift is invoked to handle TCP ACK packets. After step 232, processing returns. If the test in decisional step 226 fails because there is no TCP data to schedule, then processing continues directly with step 232 to invoke tcpDownShift to handle TCP ACK packets. After step 232, processing returns.
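Read together, FIGS. 2A and 2B amount to a dispatch over a handful of boolean tests. The condensed rendering below is a hypothetical restatement of the flowcharts for orientation, not the patent's code; the parameter names and returned labels are assumptions.

```python
def schedule_action(has_data, has_target_rate, is_held_retransmission,
                    ack_advances, other_half_over_target):
    """Return the disposition the FIG. 2A/2B flowcharts would choose
    for a segment, given the outcomes of their decisional steps."""
    if not has_target_rate:
        return "transmit-now"                     # step 212
    if has_data and is_held_retransmission:
        return "discard-retransmission"           # step 216: avoid a
        # retransmission caused by our own acknowledgment delay
    if not ack_advances:
        return "transmit-now"                     # step 218 fails -> 234
    if not other_half_over_target:
        return "forward-ack-and-transmit-now"     # steps 222, 224, 234
    if has_data:
        return "separate-data-then-downshift"     # steps 228, 230, 232
    return "downshift-ack"                        # step 232 directly
```

Only the final two branches, where the other half connection exceeds its target rate, lead into the rate-reducing machinery of FIGS. 2C and 2D; every other path forwards the segment at once.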
FIG. 2C depicts a flow chart 205 of the process steps of the routine tcpSeparateData, which is step 228 of FIG. 2B. This subroutine is invoked by routine tcpSchedule to separate data from IP and TCP headers, assign it an older (i.e., prior) ACK and schedule the transmission, or to create a new ACK-only packet with a specified ACK sequence and to return it. In a decisional step 240, a determination is made whether an ACK packet for this connection already exists. If this is not so, then a new ACK packet is created. In a step 242, the front of the packet, i.e., the IP and TCP headers, is copied into the newly created ACK packet. Then in a step 244, the length in the newly created ACK packet is updated in order to exclude the length of the data.

If the test in step 240 is true, then in a step 246 a flag for this half connection is set to cause a reuse of the existing ACK packet when it is eventually recycled at transmit-complete time. Subsequently, in a step 248, the ACK and checksum in the original packet are altered and it is sent. Then in a decisional step 250, a determination is made whether a new ACK packet was created. If this is so, then in a step 252 the checksum is computed and set for the newly created packet. Otherwise, or in any event, the ACK packet is returned.
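The header-copy-and-shrink of steps 242 and 244 can be sketched at the byte level. This is an illustrative fragment under the assumption of a plain IPv4 packet with known header lengths; checksum recomputation (steps 248 and 252) is deliberately omitted, and the function name is an invention of this sketch.

```python
import struct

def separate_ack(packet: bytes, ip_hlen: int, tcp_hlen: int) -> bytes:
    """Illustrative FIG. 2C split: copy the IP and TCP headers of a
    data segment into a new ACK-only packet (step 242) and rewrite the
    IPv4 total-length field, at byte offset 2, so the payload is
    excluded (step 244). Checksums are not updated here."""
    hdr_len = ip_hlen + tcp_hlen
    ack_only = bytearray(packet[:hdr_len])        # headers only, data dropped
    struct.pack_into("!H", ack_only, 2, hdr_len)  # new IPv4 total length
    return bytes(ack_only)
```

The data portion then travels on with the prior ACK value, while the ACK-only packet is held back and rescheduled, which is what lets the device delay acknowledgments without delaying data.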
FIG. 2D depicts a flow chart 207 of the processing steps of the subroutine tcpDownShift. This subroutine is called from the function tcpSchedule and from a recycle routine, to schedule an acknowledgment for a half connection which is exceeding its target rate, based on information supplied to it. TcpDownShift is the function to choose a latency time to establish a limit on the explicit rate of emission of packets from the first transmission station to the second transmission station. One technique for establishing the limit on rate is to generate substitute acknowledgment packets at a point of latency in response to groupings of packets received from the first digital packet transmission station. In a first decisional step 255, a determination is made whether there is already an acknowledgment outstanding for this half connection. If this is so, then in a step 257 that outstanding acknowledgment packet is marked for reuse at the appropriate time after its scheduled transmission, and the routine returns to its caller. Otherwise, in a step 256, a new acknowledgment packet is prepared. An indicator for this half connection's acknowledgment packet is set to point to the new acknowledgment packet. A recycle function and argument to capture the transmit-complete confirmation from the driver is also associated with the new acknowledgment packet. Then, in a decisional step 260, a determination is made whether the amount of data to be acknowledged is greater than two times the MSS (Maximum Segment Size). If this is the case, then in a step 262 processing "clamps" the new acknowledgment at the lastAckForwarded value added to the amount of two times the MSS. This has the effect of limiting the acknowledgment to the position where the last acknowledgment occurred, plus a number of bytes equal to two times the MSS. Otherwise, or in any event, in a step 263, values are computed for: 1) the time since the last acknowledgment was sent for the particular connection, 2) a target data rate for the connection, 3) an actual rate for the connection, and 4) the amount of time until the acknowledgment must be sent back to achieve the target data rate. The time to send an acknowledgment is computed from the amount of data sent from the first transmission station and the target rate for the connec