Cheriton

(10) Patent No.:     US 7,027,393 B1
(45) Date of Patent: Apr. 11, 2006
`
`(54) TCP OPTIMIZED SINGLE RATE POLICER
`
(75) Inventor: David R. Cheriton, Palo Alto, CA (US)
`
(73) Assignee: Cisco Technology, Inc., San Jose, CA (US)
`
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1124 days.
`
`(21) Appl. No.: 09/798,648
`
(22) Filed: Mar. 2, 2001
`
(51) Int. Cl.
        H04J 1/16    (2006.01)
        H04J 3/14    (2006.01)
        H04L 12/26   (2006.01)
`
(52) U.S. Cl. .................. 370/230.1; 370/232; 370/235.1
(58) Field of Classification Search ................. 370/235,
        370/232, 229, 230, 230.1, 231, 233, 234,
        370/235.1, 253, 252; 709/224, 225, 232,
        709/233, 234, 235; 709/223
`See application file for complete search history.
`
(56) References Cited
`
`U.S. PATENT DOCUMENTS
5,734,825 A *   3/1998  Lauck et al. ................ 709/233
6,144,639 A *  11/2000  Zhao et al. ................. 370/235
6,147,969 A *  11/2000  Benmohamed et al. ........... 370/230
6,324,165 B1 * 11/2001  Fan et al. .................. 370/232
6,400,684 B1 *  6/2002  Benmohamed et al. ........ 370/230.1
2003/0223370 A1 * 12/2003  Jain et al. .............. 370/235
2004/0057378 A1 *  3/2004  Gronberg ................. 370/230
2004/0170127 A1 *  9/2004  Tanaka ................... 370/235
`
`* cited by examiner
`
Primary Examiner-Feben M. Haile
(74) Attorney, Agent, or Firm-Campbell Stephenson Ascolese LLP
`
(57) ABSTRACT
`
An extension to the conventional single rate microflow policer that provides dual rate policing with a minimum of extra resource utilization. Using the extended microflow policer, an aggressive TCP flow ramps up to exceed the policer rate, setting a burst drop flag. Once the flow rate exceeds the burst rate, a single packet is dropped and the burst drop flag is cleared. On seeing the single packet drop, the TCP sender is then expected to reduce its rate. Flows that do not back off will eventually exceed a higher, hard drop threshold and experience packet drop. An aggressive TCP rate thus oscillates around the burst rate, efficiently approaching the hard drop rate without exceeding it. The addition of only a single bit flag avoids the cost of a dual-rate policer and the tail drop behavior induced by a single rate policer.

26 Claims, 3 Drawing Sheets
`
[FIG. 3 (flowchart 300): Receive packet in RPM (310); Clear tcpBurstDrop flag (325); Drop first packet and Set tcpBurstDrop flag (330, 340); Drop packet (370).]
`Cloudflare - Exhibit 1035, page 1
`
`
`
`
`
U.S. Patent    Apr. 11, 2006    Sheet 2 of 3    US 7,027,393 B1

[FIG. 2: high-level block diagram of generalized switch/router 200, showing rate policing module (RPM) 201.]
`
`
`
`
`
`TCP OPTIMIZED SINGLE RATE POLICER
`
`BACKGROUND OF THE INVENTION
`
1. Field of the Invention
The present invention relates to digital communications systems, in particular computer networking, and specifically data flow rate control.
2. Description of the Related Art
In the field of computer networking, one area of concern is maintaining and supplying a pre-negotiated quality of service (QoS) and/or a guaranteed packet rate. Further discussion of the general quality of service problem can be found in James F. Kurose and Keith W. Ross, Computer Networking: A Top Down Approach Featuring the Internet (Addison Wesley 2000), Chapter 6.6, incorporated herein by reference in its entirety.
Many systems attempt to provide a guaranteed bit rate or packet rate for designated flows through a switching or routing system. A "flow" is here defined as a unique data connection between a certain designated source address and a designated destination address. Generally speaking, a "flow" is a defined subset of the packet/cell traffic between designated endpoints, not merely a transport connection.
Policers are a critical component in providing quality of service in data networks. Policers are used to hold a packet flow to a target rate in the presence of burst traffic. Token bucket and leaky bucket mechanisms are well known approaches to policing packet streams. See, for example, Kurose and Ross, cited above. In addition, there are "virtual time" based approaches to policing, such as that described in the ATM Forum Traffic Management Specification (version 4.0, af-tm-0056.000, June 1996) as the theoretical arrival time (TAT) algorithm. The ATM Forum Traffic Management Specification is incorporated herein by reference in its entirety. However, all of these approaches have the same drawbacks seen in packet buffering, namely tail dropping. Tail dropping, as that term is understood in the art, refers to the complete drop of all packets in a transmission burst after the bursting flow exceeds its designated maximum flow rate.
The problem of tail dropping in packet buffers is described in S. Floyd and V. Jacobson, Random Early Detection Gateways for Congestion Avoidance, IEEE/ACM Transactions on Networking, vol. 1, no. 4, August 1993, pp. 397-413, and in V. Jacobson, K. Nichols, and K. Poduri, RED in a Different Light, Technical Report, April 1999. Both of these papers are incorporated herein by reference in their entireties.
Generally speaking, bandwidth management on the links between routers and switches is the key element in maintaining quality of service. As noted in Kurose and Ross, there are three aspects of a flow's packet rate among which one could choose to implement a policing scheme. These three important policing criteria, which differ from each other according to the time scale over which the packet flow is policed, are as follows:
Average Rate. The network may wish to limit the long term average rate (i.e., packets per time interval) at which a flow's packets can be sent into the network. A crucial issue here is the interval of time over which the average rate will be policed. For example, a flow whose average rate is limited to 100 packets per second is more constrained than a flow that is limited to 6,000 packets per minute, even though both have the same average rate over a long enough interval of time. The latter constraint would allow a flow to send 1000 packets in a given second-long interval of time (subject to the constraint that the rate be less than 6,000 packets in a minute), while the former constraint would disallow this sending behavior entirely.
Peak Rate. While the average rate constraint limits the amount of traffic that can be sent into the network over a relatively long period of time, a peak rate constraint limits the maximum number of packets that can be sent over a shorter period of time. Using the example above, the network may police a flow at an average rate of 6,000 packets per minute, while limiting the flow's peak rate to 1,500 packets per second.
Burst Size. The network may also wish to limit the maximum number of packets (i.e., the burst packets) that can be sent into the network in an extremely short interval of time. As this interval length approaches zero, the burst size limits the number of packets that can be instantaneously sent into the network. While it is physically impossible to instantaneously send multiple packets (after all, every link has a physical transmission rate that cannot be exceeded), the abstraction of a maximum burst size is a useful one.
One model that can be used to characterize different policing schemes is known as the "leaky bucket" mechanism (sometimes called the leaky bucket algorithm). A leaky bucket consists of a bucket (a logical container) that can hold up to b tokens.
In the leaky bucket mechanism, tokens are added to the bucket as follows: new tokens (which may potentially be added) are always generated at a rate of r tokens per second. If the bucket is filled with less than b tokens when a token is generated, the newly generated token is added to the bucket. Otherwise, the newly generated token is ignored and the token bucket remains full to its capacity of b tokens. The "leak" arises from the fact that tokens are removed from the bucket according to a defined rule representing the act by which the policed parameter is consumed (here, packet transmission).
The leaky bucket mechanism can be used to police a packet flow in the following manner: suppose that before a packet is transmitted into the network it must first remove a token from the token bucket. If the token bucket is empty, the packet must wait for a token. In this way, packets cannot enter the network until a token is available for them. This is analogous to requiring a ticket to enter a freeway.
Alternatively, rather than waiting for a token, a packet that arrives at an output queue looking for a token could be dropped if there are insufficient tokens to allow it to be enqueued. This is an example of a leaky bucket mechanism employed as an output queue control device.
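The token accounting described above can be captured in a brief sketch (an illustrative model only, not an implementation from this disclosure; the class and parameter names are chosen for this sketch):

```python
class TokenBucket:
    """Minimal token bucket policer: r tokens/sec, capacity b tokens."""

    def __init__(self, rate, burst):
        self.rate = rate      # r: token generation rate (tokens per second)
        self.burst = burst    # b: bucket capacity (tokens)
        self.tokens = burst   # bucket starts full
        self.last = 0.0       # time of the last update

    def allow(self, now):
        """Charge one packet arriving at time `now`; True = pass, False = drop."""
        # Accrue tokens generated since the last packet, capped at capacity b.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0   # the packet consumes one token
            return True
        return False             # bucket empty: police (drop) the packet
```

Over any interval of length t, at most r*t+b packets can pass such a policer, which is the bound the leaky bucket model is designed to enforce.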
The virtual time policing scheme, also well-known in the art, can also be used, as virtual time policers are generally considered an alternative to leaky bucket algorithms. In the virtual time scheme, the process first determines the "next time" that a flow is allowed to send a packet. When the next packet in that flow arrives, its time of arrival is compared to the "next time." If the packet has arrived earlier than the "next time," it needs to be policed or perhaps dropped. If the packet arrived later than the "next time," it is allowed. A burst parameter is usually associated with each policer to indicate how much earlier than the "next time" a packet can arrive before it is policed.
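The "next time" comparison can be sketched as follows (a minimal illustration of one virtual-time variant; the names and the per-byte charge of length times share are assumptions of this sketch):

```python
class VirtualTimePolicer:
    """Police by virtual time: each byte advances next_time by `share` seconds."""

    def __init__(self, share, burst_time):
        self.share = share            # seconds of virtual time charged per byte
        self.burst_time = burst_time  # how far ahead of real time the flow may run
        self.next_time = 0.0          # earliest "next time" the flow may send

    def charge(self, now, pkt_len):
        """True = packet conforms and is charged; False = packet is policed."""
        if (self.next_time - now) > self.burst_time:
            return False              # arrived too much earlier than "next time"
        if self.next_time < now:
            self.next_time = now      # flow was idle: catch up to real time
        self.next_time += pkt_len * self.share
        return True
```

With share = 0.001 s/byte this policer enforces roughly 1000 bytes per second, with the burst parameter controlling how early a packet may arrive before it is policed.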
The question now becomes, "How does the network behave in response to a packet that is either dropped or held (i.e., buffered)?" Adaptive flows, such as TCP, typically respond to a lack of packet transmission, designated by the failure to receive a return acknowledgement from the receiving (destination) system, by reducing their transmit rate. In this way, an adaptive flow (often called a well-behaved flow) can slowly reduce its rate in response to unsuccessful transmissions.

In the presence of a packet transmission burst from a given flow, a leaky bucket mechanism will be able to pass at most b packets simply because the maximum size of the leaky bucket is b packets. Furthermore, because the token generation rate is r, the maximum number of packets that can enter the network in any interval of time length t is rt+b. Thus, the token generation rate r serves to limit the long term average rate at which packets can enter the network by causing the well-behaved, adaptive flows to lower their average, aggregated transmit (sending) rate to r.

One problem seen in the art, and especially vexatious in situations requiring fine-grained, per-flow policing (also known as microflow policing), is that a TCP flow will ramp up to the policer rate and then experience a hard drop. In other words, in accordance with standard behavior of TCP flows, the sender will continue to increase its transmission rate until it fails to transmit a packet successfully. At this point, again according to the TCP standard, the packet drop (as indicated by the receipt of a double acknowledgment message at the sender) will cause the TCP sender to re-send the first unacknowledged packet and adjust its transmit rate downwards. If there is just one packet dropped, the flow will recover and continue at the reduced rate. However, if several packets have been dropped, the TCP connection will receive further duplicate acknowledgements. At that point, the sender will resort to a retransmission timeout.

A retransmission timeout, also by definition, causes the TCP sender to reset its transmission rate to the lowest supported rate on the link. The net result is that the TCP transmission rate will drop far below the policing rate on the occurrence of the first set of multiple packet drops and will remain at a sub-policing rate for a relatively long period of time. The situation is illustrated in FIG. 1A, wherein the sawtooth behavior of the transmit rates results from the re-transmission timeout response to packet drops.

Some solutions for this problem, and the resulting loss in transmission efficiency, use two levels of policing, one of which only causes a mark or an upstream message that congestion is occurring. The second level, set at a slightly higher rate, causes a hard packet drop. The idea behind this approach is that the mark message will cause adaptive flows to reduce their rate by a small increment rather than starting all over at the minimum TCP rate and ramping up. In systems using this approach, a burst transmission momentarily supplying a rate in excess of the mark rate results in a slight decrease in transmitter rate, rather than a hard drop.

The disadvantage of this scheme is that it is difficult to implement in router and switch hardware. Such a dual-level or dual-rate policing scheme requires a great deal of additional memory and computational resources within the switch because the packet flow rate must be tested against two different rates, rather than one.

What is needed is a system that can provide fine-grained policing on a per-flow basis and is relatively immune to re-transmission timeout and the concomitant loss of transmission efficiency. Such a system must operate without consuming too much of the scarce processor and memory resources available in modern network devices.

SUMMARY

The present invention provides an extension to the conventional single rate microflow policer that provides the essential functionality of dual rate policing for TCP flows with a minimum of extra resource utilization. Using this extended microflow policer, an aggressive TCP flow will ramp up to exceed the policer rate, setting a burst drop flag. Once the flow crosses into the burst area, i.e., the flow rate exceeds the burst rate, a single packet will be dropped, in one embodiment of the present invention, and the burst drop flag will be cleared. On seeing the single packet drop, the TCP sender is then expected to back off, i.e., reduce its rate before going over the higher hard drop threshold. A TCP flow will thus oscillate in rate, approaching the hard drop rate without exceeding it.

In an alternate embodiment, a switching device employing the present extended microflow policer system may also include a burst drop enable flag (provided on a per port, per switch, or per flow basis) to turn on or off the extended dual-rate policing feature. With this flag cleared, the particular policer affected behaves like a conventional single threshold policer.

The extended microflow policer presently disclosed is implemented by providing a single bit extension to the state variable representing each microflow. This extension is over and above the state variable traditionally used in conventional per-flow policing schemes. The addition of only a single bit provides an efficient and cost-effective method of providing microflow policing of TCP flows without the cost of a full scale, dual-rate policer and without the tail drop behavior induced by a conventional single rate policer.

In a further alternate embodiment, this extended microflow policer technique can be employed with flows of any well-behaved and adaptive protocol and is thus not limited to use with TCP flows. This is so because adaptive flows, by definition, reduce their rates in response to a single drop without going into a re-transmission timeout period and without resetting rates all the way to a low and inefficient minimum value before ramping up again.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure may be better understood and its numerous features and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1A is an illustration of the sawtooth rate behavior over time in prior art TCP flow policers.

FIG. 1B illustrates the approximate rate behavior over time in one embodiment of the present invention.

FIG. 2 is a high level schematic diagram of a generic switch/router system showing a policer module according to one embodiment of the present invention.

FIG. 3 is a flowchart of the method of one embodiment of the present invention.

The use of the same reference symbols in different drawings indicates similar or identical items.

DETAILED DESCRIPTION

In a packet processing system such as that found in a modern router or switch, quality of service (QoS) decisions need to be made after the access control list (ACL) or policy routing decisions are made. Part of the QoS decision is ensuring that the guaranteed rate (i.e., the rate contracted for by the sender) is provided on the designated network connection. In such a system, the element that provides the rate guarantee is designated a rate-policing module (RPM).

FIG. 2 shows a high level block diagram of a generalized switch/router 200 used in data communications. Rate policing module 201 is the element that reads a predetermined policing parameter (of which more later) and tests the
`
`
`
corresponding designated flow against that parameter. In other words, a packet flow defined by some combination of packet parameters including (but not limited to) packet type, input and/or output ports, input and/or output classification (e.g., type of service), source address, and destination address is tested against a particular parameter to determine whether or not the rate demanded by that flow meets or exceeds the policing limitation.
`Note that the term "policer" is also used to refer to an
`entry in a policer table. This entry comprises a value
`representing one or more aspects of how the data rate is to
`be guaranteed for a particular microflow.
`The policing module works both on the input packet flow
`(i.e., before the forwarding decision) and on the output flow
`(i.e., after the forwarding decision) within the overall packet
`processing device. Thus, a policer can apply quality of
`service rate control rules based on either input information
`(such as the source address) or output information (such as
the destination address for the next hop). The decision as to how and where to apply the policing is a function of the overall flow itself and is therefore controlled by configuration (i.e., programming) of the switch/router.
`The overall function of the rate policing module is to
`prevent a transmitting host from sending more than a certain
`number of bits per second through a link. Thus, the policer
`is (generally speaking) a rate limiter.
Rate limiting is required on TCP flows because TCP begins operation by sending at a slow rate (also known as slow start) and ramps up the transmission rate until the sender discovers a packet drop. This approach allows the TCP flow sender to determine how fast data can be sent through a particular data transmission path.
TCP is designed so that when the flow reaches its rate limit, packets will be dropped. Packet drop is, as discussed previously, signaled by detecting a double acknowledgment message (ACK) at the sender. Detection of the double acknowledgment has an inherent latency because of the time lag between when the source sends the packet and when it receives the second ACK. This latency is affected by the roundtrip time (RTT) between the source and the destination points.
A problem arises in this architecture because a single rate policer drops all packets once the rate has been exceeded: the single-rate policer effectively reduces the link rate to zero immediately upon detecting a burst exceeding the designated maximum link rate. To the source, this zero rate (i.e., 100% packet drop) does not necessarily imply congestion; it can also signal a link failure. Because of this, the TCP protocol is designed to slow start transmission all over again, beginning from a near-zero rate and slowly ramping up. As shown in FIG. 1A, this results in a rate profile that starts near zero, ramps up to a maximum rate, and then immediately drops to zero once again. In terms of system performance, this behavior increases the amount of time in which the system is transmitting at less than full rate. As one can see from inspection of FIG. 1A, the system is transmitting at much less than its maximum rate for most of the time.
`The problem goes deeper than a lack of efficiency in
`transmission rate. Because policers do not provide (by
`definition) any buffering whatsoever, all packets that are
`dropped because they exceed the designated maximum rate
`are in fact lost and must be re-transmitted. The overall delay
`in transmission increases drastically for packets that are
`dropped, making such a link completely unacceptable for
`time critical packet transmissions. Re-transmission also has
`the undesirable side effects of increasing network loading
`and overall packet latency.
`
A dual-rate policer addresses both of these problems by maintaining two rates at which it takes action. The first rate, designated R1 for the burst threshold, is the rate where a first packet is either marked or dropped to signal to the transmitting source that the policer rate has been exceeded. This signaling results in the source slowing down or reducing its transmission rate. Above the second rate, R2 (also known as the hard drop rate), the flow will suffer 100% packet drop. R2 exceeds R1 by an amount selected by the system operator based on system capacity, loading, desired or guaranteed flow rate, and/or other factors commonly used in network tuning by those of ordinary skill in the art. In one embodiment of the present invention, R2=2*R1.
For example, if the target rate for a flow is 25 Mbps, then R1 is set at 25 Mbps and R2 could be set to 50 Mbps. R1 is set to 25 Mbps so that the flow experiences one drop (or some congestion signal) when it hits the desired maximum rate. In a conventional TCP implementation, the rate can at most double on each round of rate increases, during at most one round-trip time. Thus, a flow going at 25 Mbps will reach at most 50 Mbps before it detects the single packet drop and then can react by reducing its rate. Thus, a well-behaved TCP implementation will oscillate in rate from above 25 Mbps to below 25 Mbps based on these packet drops, achieving the desired behavior. A misbehaving TCP flow might not respond to the packet drop, continuing to increase its rate all the way up to 50 Mbps, but at least it is strictly limited to at most this data rate.
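The two-threshold decision can be sketched as follows (an illustrative model only; the function name and the direct rate input are assumptions of this sketch, since the disclosed embodiments sense rate through virtual time rather than measuring it directly):

```python
def dual_rate_action(measured_rate, r1, r2):
    """Classify a flow's measured rate against burst (R1) and hard-drop (R2) levels."""
    if measured_rate > r2:
        return "hard-drop"   # above R2: flow suffers 100% packet drop
    if measured_rate > r1:
        return "mark"        # above R1: mark or drop a first packet to signal the source
    return "forward"         # conforming traffic passes untouched

# Example values from the text: R1 = 25 Mbps, R2 = 2 * R1 = 50 Mbps.
```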
In earlier systems that attempted to implement dual-rate policing, the hardware implementation costs were exceedingly high. In fact, prototyping and experimentation determined that dual-rate implementation costs were at least double that of the single-rate policer. This follows logically because whatever mechanism is necessary to keep track of system behavior and to mark or drop packets exceeding a given rate must be implemented for both R1 and R2.
The cost of implementation is determined at least in part by the requirement to maintain state for each flow. That is, if the switch is expected to have 128K discrete flows, that switch must have storage space for 128K discrete state vectors. In a dual rate scheme, a second state vector is needed for each flow. Regardless of whether or not the state vector space is hashed, twice as many state vectors are required per flow in a dual-rate policing scheme.
The solution to this problem is found by noting that real TCP flows occurring in operational networks today never present a smoothly changing rate. In practice, the linear sawtooth rate ramp of FIG. 1A is rarely seen. The rate of any real world TCP flow always varies over a given time period and usually takes the form of a very rough sawtooth. This is so because the destination system (the ultimate receiving host in the network) will always have an upper limit on the amount of data it can receive. TCP, by design, always increases its rate as far as it can in order to try and capture as much link bandwidth as possible and therefore provide the most efficient link transmission.
In one embodiment of the present invention, when the received flow is a TCP flow and its rate first exceeds first threshold R1 (i.e., on receipt of the first packet whose bytes per time period exceed R1 but do not exceed R2), the system will drop that first packet and remember that it has dropped it. More specifically, upon receipt of the first packet at a rate greater than R1, the system sets a flag called tcpBurstDrop to prevent another drop. The presence of a set (value equals one) tcpBurstDrop flag indicates that the system has already received a burst packet. Because decisions are made based on a measured rate, this embodiment necessarily uses a virtual time approach or similar method for rate sensing. The virtual time approach, and its many variations, are well-known to those of ordinary skill in the art.
`
The following pseudocode description illustrates one embodiment of a virtual time policing algorithm operating in accordance with the present invention. This sample algorithm is run on each packet arrival. Note that a packet is not charged to (i.e., acted upon by) a policer if it has already been dropped by a (logically) earlier processing step.

    // currentTime refers to wall clock, i.e., global system time
    if ((policer.nextTime - currentTime) > policer.burstTime) {
        police();
        policer.overRateBytes += pktLength;
    }
    else {
        if (policer.nextTime < currentTime)
            policer.nextTime = currentTime;
        policer.nextTime += pktLength * policer.share;
        policer.underRateBytes += pktLength;
    }

In order to police at the microflow level, the above pseudocode sample must be adapted to operate on individual microflows. The following code example shows one variation of such an adaptation. One of ordinary skill in the art will readily see that other variations are possible. Accordingly, the present disclosure should not be read as limited to a single pseudocode example.

    // mflow policing.
    pod.mflowPolOver = 0;
    mflowPolSingleTcpDrop = 0;
    if ((newFdtle.nextTime - currentTime) > newFdtle.burstTime) {
        pod.mflowPolOver = 1;  // hard drop
    }
    else if (mflowPolCtrlReg.tcpBurstEn[pod.maskSel] &&
             f2nr.flowLabelType == Ipv4 &&
             f2nr.flowLabel.ipProtocol == TCP) {
        if ((newFdtle.nextTime - currentTime) > newFdtle.burstTime/2) {
            // Do single packet drop & set flag
            if (!newFdtle.tcpBurstDrop) {
                newFdtle.tcpBurstDrop = 1;
                mflowPolSingleTcpDrop = 1;
            }
        }
    }

The tcpBurstDrop flag is reset later if the rate is then below that called for by the policer, as handled by the next pseudocode sample below. Here, mflowPolOver indicates if we are over the policer rate, and inpQosPermit indicates whether the packet has been permitted (or not dropped) by previous processing. The newFdtle variable designates a microflow policer (actually a whole microflow entry).

    if (!mflowPolOver && inpQosPermit) {
        // we can charge the packet to the mflow policer
        if (newFdtle.nextTime < currentTime) {
            newFdtle.nextTime = currentTime;
            newFdtle.tcpBurstDrop = 0;
        }
        newFdtle.nextTime += pktLength * newFdtle.share;
    }

While this form of algorithm description (i.e., pseudocode) is well-known to those of skill in the art, it is worth noting that pseudocode is only a high-level representation of what actual software algorithms or hardware systems would necessarily implement.

By adding the single bit tcpBurstDrop flag to the flow state vector, the system effectively provides a second policing rate level. This is so because the act of dropping a single packet at a burst level and then continuing to transmit packets at or even above the burst level (though less than hard drop level R2) enables the link to absorb burst packets. Absorption continues at a rate greater than R1 for a long enough period of time for the double ACK from the single dropped packet to reach the sender and thereby cause it to throttle back (reduce) its rate.
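The mechanism described above (hard drop when far over rate, one remembered drop at a lower burst level, and flag reset once the flow falls back under rate) can be sketched as a runnable model. This is an illustrative sketch only: the class name, the use of the half-burst level as the single-drop threshold, and the string return values are assumptions of this sketch, not the disclosed implementation.

```python
class ExtendedMicroflowPolicer:
    """Virtual-time policer with a one-bit tcpBurstDrop extension (sketch)."""

    def __init__(self, share, burst_time):
        self.share = share            # seconds of virtual time charged per byte
        self.burst_time = burst_time  # full burst allowance (hard-drop level)
        self.next_time = 0.0
        self.tcp_burst_drop = False   # the single added state bit

    def on_packet(self, now, pkt_len, is_tcp=True):
        """Return 'drop', 'single-drop', or 'pass' for one arriving packet."""
        ahead = self.next_time - now
        if ahead > self.burst_time:
            return "drop"                      # hard drop: far over rate
        if is_tcp and ahead > self.burst_time / 2:
            if not self.tcp_burst_drop:
                self.tcp_burst_drop = True     # remember the one burst drop
                return "single-drop"           # drop exactly one packet
        if self.next_time < now:
            self.next_time = now
            self.tcp_burst_drop = False        # under rate again: clear the bit
        self.next_time += pkt_len * self.share # charge the packet
        return "pass"
```

Running the model on a steady burst shows the intended oscillation window: one single drop at the lower threshold, continued forwarding while the sender reacts, and hard drop only if the flow keeps climbing.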
The net effect of this mechanism is shown in FIG. 1B, which represents a "sawtooth on a sawtooth" rate history over time. In region 115, we see the slow start of the standard TCP packet flow. At threshold R1, we see the rate crossing into the burst region 120. At threshold R2, the hard drop limit, we see the packet rate dropping back down. However, hard drop region 130 does not necessarily begin exactly at rate R2; in reality, it occurs as soon as the transmitting source realizes it needs to send at a lower rate. The rate drops straight down and then recovers somewhere below rate R1 and begins to ramp up again in region 120. It is this sawtooth behavior in region 120 that maintains the rate through the link at a relatively high efficiency compared to the strong sawtooth behavior shown in FIG. 1A.
FIG. 2 shows a high-level block diagram of a network switching or routing device 200 providing the dual-rate policer system according to one embodiment of the present invention. A number of input flows 220 are presented to the unit. These flows each consist of multiple packets of data in a variety of sizes and presented at a variety of rates. Additionally, flows may be presented in different protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) and the related User Datagram Protocol (UDP), File Transfer Protocol (FTP), Terminal Emulation Protocol (Telnet), and Hypertext Transfer Protocol (HTTP). Other internetworking protocols are found in the literature, such as Merilee Ford, et al., Internetworking Technologies Handbook, Cisco Press 1997 (hereinafter Ford), incorporated herein by reference in its entirety. The packets are buffered in a buffer pool 230, which is typically random access memory (RAM). Buffering is accomplished according to the directives of a controller 260 and a buffer manager 225. The flows are sent to the proper output port 270 by way of a set of output queues 240 and a port scheduler 250, discussed below. Controller 260 is conventionally implemented as one or more high speed microprocessors with associated interface circuitry. Buffer manager 225 and port scheduler 250 can be implemented as part of a switch ASIC. Within port scheduler 250 is rate policing module (RPM) 201, according to one embodiment of the present invention. RPM 201 provides the dual-rate policing functionality.
The present invention may be readily implemented on individual flows at any level of granularity. In other words, flows may be defined down to the microflow level representing a single TCP connection in one direction between a designated source and a designated destination address.
`
`
`
`
`Alternatively, the present invention may be implemented in
`the same way across multiple TCP connections up to and
`including the aggregate flow level representing all TCP
`flows from a single sour