`
`(12) United States Patent
`Olesinski et al.
`
(10) Patent No.: US 7,782,793 B2
(45) Date of Patent: Aug. 24, 2010
`
`(54) STATISTICAL TRACE-BASED METHODS
`FOR REAL-TIME TRAFFIC
`CLASSIFICATION
`
(75) Inventors: Wladyslaw Olesinski, Kanata (CA);
Peter Rabinovitch, Kanata (CA)
`
`(73) Assignee: Alcatel Lucent, Paris (FR)
`
( * ) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 899 days.
`
`(21) Appl. No.: 11/226,328
`
(22) Filed: Sep. 15, 2005

(65) Prior Publication Data

US 2007/0076606 A1    Apr. 5, 2007
`
(51) Int. Cl.
H04L 12/26    (2006.01)
H04L 12/56    (2006.01)
H04L 12/28    (2006.01)
(52) U.S. Cl.: 370/253; 370/391; 370/395.43
(58) Field of Classification Search: 370/252
See application file for complete search history.
`
(56) References Cited

U.S. PATENT DOCUMENTS

6,873,600 B1 *       3/2005   Duffield et al.     370/252
7,080,136 B2 *       7/2006   Duffield et al.     709/223
7,286,535 B2 *      10/2007   Ishikawa et al.     370/392
7,313,100 B1 *      12/2007   Turner et al.       370/253
7,376,085 B2 *       5/2008   Yazaki et al.       370/235
7,376,731 B2 *       5/2008   Khan et al.         709/224
2003/0012197 A1 *    1/2003   Yazaki et al.       370/392
2007/0214504 A1 *    9/2007   Comparetti et al.   726/23
`
FOREIGN PATENT DOCUMENTS

WO    WO 96/38955    12/1996

OTHER PUBLICATIONS

Sun et al., "Statistical Identification of Encrypted Web Browsing
Traffic", Microsoft Research for Proc. IEEE Symposium on Security
and Privacy, IEEE, May 2002.*
Zhang et al., "Detecting Backdoors", Proceedings of the 9th
USENIX Security Symposium, Denver, Colorado, Aug. 2000, pp.
1-11.*
M. Roughan, S. Sen, O. Spatscheck, N. Duffield, "Class-of-service
mapping for QoS: a statistical signature-based approach to IP traffic
classification", 2004, pp. 135-148.

(Continued)

Primary Examiner: Daniel J. Ryman
Assistant Examiner: Cassandra Decker
(74) Attorney, Agent, or Firm: Kramer & Amado P.C.

(57) ABSTRACT

Apparatus and methods for real-time traffic classification
based on off-line determined traffic classification rules are
provided. Traces of real traffic are obtained and subjected to
statistical analysis. The statistical analysis identifies the
multidimensional domain space of characteristic traffic
parameters. Classification rules associated with the identified
domains are derived and provided to traffic classification
points for real-time traffic classification. Traffic classification
points, typically edge network nodes, sample packets in
aggregate streams with a predetermined probability. Statistical
information regarding the sampled flows is tracked in a
table, the number of times a flow was sampled providing a
probabilistic measure of the flow's duration before the flow
terminates. The table entries, which predominantly track high
bandwidth flows, are subjected to the classification rules for
real-time classification of the sampled flows. Optionally,
rules include an action to be taken in respect of flows whose
characteristics match the rule. Advantages are derived
from low-overhead on-line real-time classification of
high-bandwidth flows before flow termination.

32 Claims, 2 Drawing Sheets
`
[Representative drawing (FIG. 1): network element with flow
information table 100, traffic classification entity 250, and
traffic control point 252.]
`
`Cloudflare - Exhibit 1045, page 1
`
`
`
OTHER PUBLICATIONS

Konstantinos Psounis, Arpita Ghosh, and Balaji Prabhakar: "SIFT: A
Low-complexity Scheduler for Reducing Flow Delays in the
Internet", 2004, pp. 1-13.
Duffield N. et al.: "Estimating Flow Distributions From Sampled
Flow Statistics", vol. 33, No. 4, October 2003, pp. 325-336.
Karagiannis, T., et al., Transport Layer Identification of P2P Traffic,
ACM, 2004.
Prabhakar, B., Network Processor Algorithms: Design and Analysis,
Stochastic Networks Conference, 2004.
Roughan, M., et al., Class-of-Service Mapping for QoS: A Statistical
Signature-Based Approach to IP Traffic Classification, ACM, 2004.
Sen, S., et al., Accurate, Scalable, In-Network Identification of P2P
Traffic Using Application Signatures, ACM, 2004.

* cited by examiner
`
`
`
[Drawing Sheet 1 of 2. FIG. 1 (schematic diagram). Recoverable
labels: "Traffic Control Point" 252; "Classifier" 250 with
"Rules"; "Notifications", "Flow Info. Requests", "Flow
Information"; "Classification Rule Distribution"; "Classification
Module"; "Statistical Information Tracking Module" 154;
"Aggregate Flow"; "Incoming Sampled Flow"; "Outgoing Controlled
Sampled Flow"; flow information table 100 with columns "Flow ID"
104, "# Sampled Packets" 106, "Cumulative Amt. Sampled" 108, and
"Last Sampling Time" 110. Reference numerals 150, 152, 254, 300,
400, and 410 also appear.]
`
`
`
[Drawing Sheet 2 of 2. FIG. 2, FIG. 3, and FIG. 4 (flow
diagrams). Recoverable labels: "Flow Table Entry", "Assign Traffic
Flow Type", "Issue Table Entry Update Notification", "Select Next
Table Entry", "End". Reference numerals 200, 212, 220, 300, and
400 appear.]
`
`
`
STATISTICAL TRACE-BASED METHODS
FOR REAL-TIME TRAFFIC
CLASSIFICATION

FIELD OF THE INVENTION

The invention relates to content delivery at the edge of
communications networks, and in particular methods and
apparatus providing real-time trace-based traffic classification.

BACKGROUND OF THE INVENTION
`
Traffic classification is important for many reasons in
delivering content to customers at the edge of communications
networks. For example, Quality of Service (QoS)
requires the traffic to be segregated first in order to assign
packets to particular Classes of Service (CoS). A network
operator can provide a different level of service to each class
as well as a pricing structure.

Knowledge of traffic characteristics can help optimize the
usage of the communications network infrastructure
employed, and can help ensure a desired level of performance
for applications/services important to the customers. The
intention has always been that application requirements be
considered in offering a level of service. Traditional methods
of traffic detection and classification rely on monitoring
logical port specifications typically carried in packet headers as,
in the past, applications and/or services were, in a sense,
assigned well known logical ports.

A large percentage of the traffic conveyed by communications
networks today consists of peer-to-peer (P2P) traffic.
Because peer-to-peer traffic is conveyed between pairs of
customer network nodes, it is not necessary that a well known
logical port be allocated, reserved, and assigned to traffic
generated by applications generating peer-to-peer traffic and/or
applications retrieving peer-to-peer content. Therefore
known approaches to traffic classification are no longer valid
as logical ports are undefined for peer-to-peer applications
and/or logical ports may be dynamically allocated as needed,
such as in the case of the standard File Transfer Protocol (FTP)
and others.

Peer-to-peer content exchange techniques are increasingly
being used to convey, without permission, content subject to
intellectual property protection, such as music and movies.
Network operators are under increasing regulatory pressure
to detect peer-to-peer traffic and to control illicit
peer-to-peer traffic, while rogue users are seeking ways to defy
traffic classification to avoid detection.

Besides peer-to-peer traffic detection, means and methods
are being sought on a continual basis for detecting short
duration traffic flows to help identify possible intrusions such
as, but not limited to, Denial of Service (DoS) attacks.

Statistical billing is another domain in which knowledge of
traffic characteristics is necessary. Network operators
increasingly employ resource utilization measurements as a
component in determining customer charges.

Returning to peer-to-peer traffic detection, not all
peer-to-peer traffic is illicit: in view of the high levels of resource
utilization demanded by peer-to-peer traffic, network operators
may want to charge customers generating peer-to-peer
traffic and retrieving peer-to-peer content more for their high
bandwidth usage. Resource utilization alone is not always an
adequate traffic characteristic differentiator as in many
instances content conveyed to, and received from, multiple
customers is aggregated at the managed edge and within the
managed transport communications network.
`
Attempts to characterize traffic, to detect traffic types, with
a view of classifying traffic, include Deep Packet Inspection
(DPI) techniques. Deep packet inspection techniques are
described by Sen S., Spatscheck O. and Wang D. in "Accurate,
Scalable In-Network Identification of P2P Traffic Using
Application Signatures", Proceedings of the 13th international
conference on World Wide Web, New York, N.Y., 2004;
and by Karagiannis T., Broido A., Faloutsos M., Claffy K. in
"Transport layer identification of P2P traffic", Proceedings of
the 4th ACM SIGCOMM Conference on Internet Measurement,
Taormina, Sicily, Italy, 2004.

Proposed deep packet inspection techniques, as the name
suggests, assume the availability of unlimited resources to
inspect entire packets to perform packet characterization.
Therefore deep packet inspection incurs high processing
overheads and is subject to high costs. Deep packet inspection
also suffers from a complexity associated with the requirement
of inspecting packet payloads at high line rates. For
certainty, deep packet inspection is not suited at all for typical
high throughput communications network nodes deployed in
current communications networks. Deep packet inspection
also suffers from a high maintenance overhead as the detection
techniques rely on signatures; peer-to-peer applications,
especially, are known for concealing their identities: a deep
packet inspection detection signature that provides conclusive
detection now may not work in the future, and another
conclusive signature would have to be found and coded
therein.

Traffic classification means and methods are being actively
sought by network operators in order to determine the types of
traffic present in a managed communications network for
traffic and network engineering purposes, on-line marking of
packets, quality of service assessment/assurance, billing, etc.
In view of impending regulatory pressures, efficient detection
and classification of peer-to-peer traffic is especially desired,
as peer-to-peer traffic consumes large, disproportional
percentages of bandwidth and other communication network
resources. Network operators have to employ a combination
of: peer-to-peer traffic control in order to reserve network
resources for other types of traffic, charging peer-to-peer users
different rates to curb behavior, and/or even blocking
peer-to-peer traffic completely in accordance with regulations
imposed on network operators. There is therefore a need to solve
the above mentioned issues to provide traffic classification means
and methods which avoid the complexities of deep packet
inspection and the pitfalls of logical port based packet
classification.
`
`SUMMARY OF THE INVENTION
`
`In accordance with an aspect of the invention, a packet flow
`classification apparatus for on-line real-time traffic flow clas-
`sification at a communications network node is provided.
`Packet sampling means randomly selects packets from an
`aggregate flow with a pre-determined sampling probability.
`Packet inspection means determines the packet size of each
`sampled packet and obtains the flow identification of the
`sampled traffic flow with which the sampled packet is asso-
`ciated. A sampled flow information table has sampled flow
`table entries for storing real-time sampled flow statistical
`information. Flow information tracking means maintain the
`flow information table in real-time. And, a packet classifier
`classifies sampled traffic flows on-line in real-time based on a
`group of classification rules trained off-line on statistical trace
`traffic flow information.
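
The apparatus described above can be sketched in a few lines of
Python. This is only an illustrative sketch under stated assumptions,
not the patented implementation: the names FlowEntry, FlowMonitor,
observe, and classify are hypothetical, and the rule shown in the usage
note uses made-up thresholds standing in for rules that would come from
off-line training. The table fields mirror the flow identifier, number
of packets sampled, cumulative amount of content sampled, and last
sampling time named in the description.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional, Tuple


@dataclass
class FlowEntry:
    """One sampled flow table entry (hypothetical field names)."""
    flow_id: Hashable        # flow identifier
    sampled_packets: int     # number of packets sampled so far
    sampled_bytes: int       # cumulative amount of content sampled
    last_sampled_at: float   # sampling time of the last sampled packet


# A rule pairs a traffic class with a predicate over a table entry.
# Real rules would be trained off-line; these are stand-ins.
Rule = Tuple[str, Callable[[FlowEntry], bool]]


class FlowMonitor:
    """Samples packets of an aggregate flow and tracks per-flow statistics."""

    def __init__(self, sampling_probability: float, rules: List[Rule],
                 rng: Optional[random.Random] = None) -> None:
        if not 0.0 < sampling_probability <= 1.0:
            raise ValueError("sampling_probability must be in (0, 1]")
        self.p = sampling_probability
        self.rules = rules
        self.rng = rng or random.Random()
        self.table: Dict[Hashable, FlowEntry] = {}

    def observe(self, flow_id: Hashable, size: int, now: float) -> bool:
        """Offer one packet to the sampler; return True if it was sampled."""
        # random() is in [0.0, 1.0), so p == 1.0 samples every packet.
        if self.rng.random() >= self.p:
            return False
        entry = self.table.get(flow_id)
        if entry is None:
            # First sample of this flow: create and initialize the entry.
            self.table[flow_id] = FlowEntry(flow_id, 1, size, now)
        else:
            # Flow already tracked: update count, byte total, and timestamp.
            entry.sampled_packets += 1
            entry.sampled_bytes += size
            entry.last_sampled_at = now
        return True

    def classify(self, flow_id: Hashable) -> Optional[str]:
        """Return the class of the first matching rule, or None."""
        entry = self.table.get(flow_id)
        if entry is None:
            return None
        for traffic_class, matches in self.rules:
            if matches(entry):
                return traffic_class
        return None
```

As a usage note, a monitor could be built with a low sampling
probability, for example FlowMonitor(0.01, rules), where rules holds a
hypothetical entry such as ("bulk", lambda e: e.sampled_packets >= 3
and e.sampled_bytes / e.sampled_packets > 1000). Because a flow must
carry many packets to be sampled repeatedly, the sampled packet count
serves as a rough indicator of a long-lived, high-bandwidth flow while
the flow is still active.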
In accordance with another aspect of the invention, a
packet flow classification system for on-line real-time traffic
flow classification of a plurality of traffic flows conveyed
through a network node of a communications network is
provided. The packet flow classification system includes: a
group of classification rules trained off-line on statistical trace
traffic flow information, at least one traffic flow monitor at the
communications network node, and a packet classifier for
classifying sampled traffic flows on-line in real-time based on
the group of off-line trained classification rules. The traffic
flow monitor includes: packet sampling means for randomly
selecting packets from an aggregate flow with a pre-determined
sampling probability, packet inspection means for
determining the packet size of each sampled packet and for
obtaining flow identification of the sampled traffic flow with
which the sampled packet is associated, a sampled flow
information table having sampled flow table entries for storing
real-time sampled flow statistical information, and flow
information tracking means for maintaining the flow information
table in real-time.

In accordance with yet another aspect of the invention, a
method of classifying traffic flows on-line in real-time based
on at least one classification rule trained off-line on statistical
traffic flow information is provided. Packets are randomly
sampled from an aggregate flow with a predetermined
sampling probability. The packet size of each sampled packet is
extracted. A flow identifier of the sampled traffic flow with
which the sampled packet is associated is obtained.
Information regarding the sampled flow is tracked in a sampled flow
information table entry. And, the sampled traffic flow is
classified in real-time by subjecting the information tracked in the
flow table entry to the at least one classification rule.

Advantages are derived from simple, low overhead and
inexpensive on-line real-time classification of high-bandwidth
flows before flow termination.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the invention will become
more apparent from the following detailed description of the
exemplary embodiment(s) with reference to the attached
diagrams wherein:

FIG. 1 is a schematic diagram showing an edge communications
network element sampling traffic sporadically and
tracking statistical flow information in a table in accordance
with the exemplary embodiment of the invention;

FIG. 2 is a flow diagram showing process steps in sampling
traffic sporadically and tracking statistical flow information
in a table in accordance with the exemplary embodiment of
the invention;

FIG. 3 is a flow diagram showing process steps of a clean-up
process ensuring that the statistical information in a flow
information table is current, in accordance with the
exemplary embodiment of the invention; and

FIG. 4 is a flow diagram showing process steps of an
on-line real-time classification process classifying
high-bandwidth sampled flows in accordance with the exemplary
embodiment of the invention.

It will be noted that in the attached diagrams like features
bear similar labels.

DETAILED DESCRIPTION OF THE
EMBODIMENTS

The realities of Internet service provisioning to customers
are such that service level agreements are described in terms
of expected aggregate traffic characteristics with the assumption
that most of the user traffic is highly bursty and relatively
low bandwidth such as the occasional email, intermittent web
page download followed by a reading period, and the infrequent
electronic bank transaction. Although the equipment is
prevalent, video conferencing is relatively rare. Service level
agreements include enough long-term transport bandwidth
for comparatively higher bandwidth netradio audio streaming.
Customers are assumed to be nice and occasional transgressions
rarely translate into higher bills at the end of the
month. It is assumed that nice customers do not listen to
netradio, nor download MP3's from traceable and reputable
sources, 24/7. At the same time, in view of the intense
competition in communications, the available transport bandwidth
in the core of the managed communications network is
oversold.

Problems arise when customers engage in illicit/regulated
activities such as exchanging large amounts of content subject
to intellectual property protection. Such rogue customers
do not want to pay higher fees for levels of service which
would provide them with increased bandwidth in order not to
attract attention to themselves. Sophisticated rogue customers
are willing to put up with sending and/or receiving content
at transfer rates well below aggregate service level agreement
limits for long periods of time. Network operators are
faced with the conundrum that: rogue customers do not violate
their service level agreements; because the rogue
customers have not signed up at a higher level of service, the
resources are overused; and because the bandwidth is
oversold, services provisioned to nice customers are being
impacted. Therefore given that input traffic from multiple
customers and traffic output to multiple customers is typically
aggregated at the edge, aggregate traffic metrics only point out
that network resources are utilized to a very high degree and
that the average and the typical customer is nice.

With the prevalence of peer-to-peer activity, it does not
make business sense for network operators to deny services to
any user engaging in peer-to-peer file sharing; after all, some
peer-to-peer traffic such as a video conference is legitimate.
What is desired is to identify regulated peer-to-peer traffic in
order to reduce the allocation of network resources thereto
thereby making the network unfriendly only to questionable/
undesirable peer-to-peer traffic.

Current traffic characterization techniques, in order to be
effective, employ a determination of particular traffic flows
and their duration to determine the amount of resources
expended. Because of the willingness of rogue users to make
peer-to-peer traffic compliant over the short term with service
level agreements subscribed to at the cost of long duration
uploads/downloads, knowing the duration of a traffic flow has
been found to be very important in characterizing traffic. The
trouble is that using current techniques, the duration of a
traffic flow can only be found after the traffic flow ends, when
it is too late to effect any control over the traffic flow. For this
very reason, current deep packet inspection techniques,
which are designed to detect the initiation and the termination
of a traffic flow, are inadequate as a trigger to real-time
peer-to-peer traffic control as the traffic classification is provided
after the termination of the monitored activity.

Having identified the problem, real-time traffic identification
is needed to provide a measure of how the available
network resources are utilized by, and partitioned to, rogue
customers in real-time. From a business point of view, any
solution has to assume that customers are nice. This is
necessary both from the customer relations point of view and
because assuming the converse would require a prohibitive
amount of resources to be devoted to traffic monitoring.

In "Class-of-Service Mapping for QoS: A statistical
signature-based approach to IP traffic classification", ACM
SIGCOMM Internet Measurement Workshop, Taormina, Sicily,
`
`
`
Italy, 2004, which is incorporated herein by reference,
Roughan M., Sen S., Spatscheck O. and Duffield N.
address the fact that although various mechanisms exist for
providing QoS, QoS has yet to be widely deployed. Roughan
et al. are of the opinion that employing previously known
techniques to map traffic to QoS classes is prohibitive due to
the high overheads incurred. Falling short of providing
real-time traffic classification, Roughan et al. predicate their
off-line traffic classification solution on the fact that it would be
unrealistic for effective traffic classification to inspect every
packet, and propose an off-line trace-based statistical method
of traffic characterization. Roughan et al. confirm that flow
durations are essential in characterizing and distinguishing
between different types of traffic. However, just like deep
packet inspection techniques, the trace-based methods
proposed by Roughan et al. only obtain the flow duration after
each flow terminates and are therefore inadequate for
real-time traffic characterization.

Other relevant research in the art of packet queuing
includes a proposal by Psounis K., Ghosh A. and Prabhakar B.
entitled "SIFT: A low-complexity scheduler for reducing flow
delays in the Internet", Technical report, CENG-2004-01,
USC, which is incorporated herein by reference, and
describes an algorithm for identifying high bandwidth flows
in order to queue packets of identified high bandwidth flows
in a special queue. The SIFT proposal includes sampling
packets with a pre-selected low probability. With the
assumption that most low bandwidth flows consist of few packets,
very few low bandwidth flows would be sampled. Conversely
high bandwidth flows would be sampled with greater
certainty. The flow identifier of each sampled flow is provided to
a packet classifier/queue manager which queues every
subsequent packet bearing one of the provided flow identifiers
into the special queue. The proposed use of the special queue(s)
would be prohibitive in terms of the necessary resources
for high throughput deployments.

In accordance with the exemplary embodiment of the
invention, information is gathered about real typical traffic
patterns for statistically relevant periods of time. The
outcome of the information gathering step is a trace of traffic (packet
headers and, possibly, portions of payloads) and perhaps a
collection of gathered and/or derived statistics. The trace
traffic information is gathered with the intent of subjecting
it to off-line traffic characterization training.

Because the proposed traffic characterization is performed
off-line, information about known and/or determinable traffic
flow types may also be used as inputs to a rule creation
process referred to as off-line training. Using the trace
information, different applications generating/consuming the
traffic are identified. Diverse off-line methods can be used for this
purpose, for example, the methods described in the above
referenced prior art Sen and Karagiannis and other deep
packet inspection methods to the extent that portions of
payloads have been obtained. This step associates traffic flows
with the applications.

Classes of traffic, or traffic types, are defined. For example,
Roughan et al. propose the following exemplary traffic
classes:

interactive, including: remote login sessions, interactive
Web content access, etc.;

bulk data transfer, including: FTP, music downloads,
peer-to-peer traffic, etc.;

streaming, including: video conferencing, netradio, etc.;
and

transactional, including: distributed database access.

Traffic characterization means are employed off-line to
characterize the traffic flows based on the gathered trace
information and the associated statistics, to create off-line
rules based on statistical properties of the gathered
information and the application associativity. The outcome of the
characterization step is a relation between traffic statistics
(packet sizes, flow durations, etc.), application associativity
and the traffic classes the traffic flows belong to. In creating
the rules, Roughan et al. propose subjecting the traffic flow
statistical parameters to statistical analysis in order to define
domains in the multi-dimensional space of statistical flow
parameters. Subjecting traffic flow statistical parameters to
statistical analysis is referred to herein as training. The domains
correspond to statistically distinct traffic classes. Statistical
analysis methods include, but are not limited to, Nearest
Neighbour (NN), Linear Discriminant Analysis (LDA), etc.
Given traffic classes, gathered statistics, trace information,
and application associativity, a set of rules for
classifying future data is trained.

For example, assuming that only two statistics, average
packet sizes and session durations, are all that is needed to
characterize traffic, a rule can indicate that if the average
duration of a traffic flow is greater than 40 seconds, and the
average packet size of that traffic flow does not exceed 300 bytes,
then the traffic flow most likely belongs to the interactive class
(e.g., a Telnet session). As another example, peer-to-peer
flows, based on the number of packets conveyed, statistically
fall into bulk data transfer flows and streaming flows; however,
average packet sizes would characterize peer-to-peer
flows as bulk data transfer flows regardless of the application
or logical port used to convey the content. Besides flow
durations, traffic flows can be characterized based on statistical
traffic flow parameters such as: average/median packet size,
packet size variance, root-mean-square packet size, largest
packet sampled so far, shortest packet sampled so far,
average/median inter-packet arrival delay, inter-packet arrival
delay variance, bytes per flow, packets per flow, etc.

In accordance with an exemplary implementation of the
exemplary embodiment of the invention, classification rules
pertaining to expected/uninteresting traffic patterns may be
deleted from the off-line determined set of classification
rules. Reducing the number of classification rules provides
desirable overhead reductions.

Proposed real-time methods include real-time statistics
collection and real-time traffic classification.

In accordance with the exemplary embodiment of the
invention, random packet sampling techniques are employed
on a monitored aggregated flow irrespective of the individual
aggregated flows contained therein for the purpose of
estimating flow durations of individual sampled flows. Random
packet sampling techniques conform to the desired
assumption that initially all customers' traffic flows are nice.

In accordance with the exemplary embodiment of the
invention, network elements shown in FIG. 1 such as, but not
limited to, edge switching equipment and routing equipment,
include packet selection means 150, packet inspection means
152, statistical information tracking means 154, and a flow
information table 100, having sampled flow table entries 102.
Each sampled flow entry 102 exemplarily includes fields for
the flow identifier 104, the number of packets sampled 106,
the cumulative amount of content sampled 108, and the sampling
time of the last packet 110. All information necessary to
populate and update the fields 104, 106, 108 and 110 can
either be extracted by the packet inspection means 152 or
derived by the statistical information tracking means 154
from information extracted from sampled packet headers. For
certainty, dependent on the implementation, the packet
selection means 150, the packet inspection means 152, the
statistical information tracking means 154, and the flow
information table 100, without limiting the invention, can be
`information regarding current high bandwidth flows presents
`the network operator with the most plausible flows to con-
`associated with a physical port, a logical port, a group of
`sider in identifying illicit/regulated content transfers regard-
`physical/logical ports, all ports, etc. Also, depending on the
`less of the rogue customers' attempts to foil detection.
`implementation, the proposed traffic monitoring may only be
`In accordance with an exemplary implementation of the
`performed on best-effort traffic and/or available bit rate traffic 5
`exemplary embodiment of the invention, an on-line real-time
`conveyed by the implementing equipment.
`traffic classification entity 250, typically associated with the
`In accordance with the exemplary embodiment of the
`traffic control point 252, is provided with the traffic classifi-
`invention, a real-time traffic monitoring process 200 shown in
`FIG. 2 maintains the flow information table 100. Based on a
`cation rules and performs an on-line real-time classification
`preset sampling probability, the traffic monitoring process 10 process 400 as exemplary shown in FIG. 4. Accordingly,
`200 determines 202 whether to sample the next packet upon
`on-line real-time traffic classification is achieved by subject-
`ing 404 the suspect traffic flows monitored via the flow infor-
`arrival. If the next packet is to be sampled, then the process
`200, waits for the next packet to arrive, and inspects 204 the
`mation table 100 to the off-line trained classification rules.
`Simply put, subjecting flow entry 102 to each rule answers the
`received packet to extract information such as, but not limited
`to, flow identification, and packet size. Typically, the packet 15 question whether the sampled flow, to which the flow entry
`inspection 204 is typically limited to reading at most the
`102 corresponds, has statistical characteristics which would
`packet header, application layer connection information hid-
`locate the traffic flow in the multi-dimensional domain
`den in packet payloads is typically not searched for. Existing
`expressed in the rule. Without limiting the invention, the
`on-line real-time classification process 400, may be triggered
`packet inspection means otherwise used in packet processing
`20 402 via notifications 220 or in accordance with a discipline,
`may be reused.
`If a flow table entry 102 for the identified flow does not
`for example on a schedule or periodically. If there is a hit 404
`exist 206 in the flow information table 100, then a flow table
`in respect of a traffic flow, the classification rule hit specifies
`entry 102 is created 208 and the flow table entry is initialized
`406 the traffic class/type. Accordingly, information about the
`210. Initializing 210 the flow table entry 102 includes, but is
`current state of the monitored communications network is
`not limited to, filing in the flow identifier field 104, setting the 25 provided allow the network operator to use this information to
`number of sampled packets 106 to 1, setting the cumulative
`take further action in respect of specific flows in real-time if
`amount of content sampled 108 to the size of the sampled
`and when necessary.
`packet, and writing the sampling time in field 110.
`In accordance with the exemplary embodiment of the
`If a flow table entry 102 exists for the identified flow, then
`invention, the off-line trained traffic classification rules are
`the fields of the table entry 102 are updated 212 by: incre- 30 also used to identify illicit/regulated content transfers from
`menting the number of sampled packets 106 by 1, adding the
`the suspect flows identified. Depending on implementation,
`and without limiting the invention, rule hits 404 may either be
`size of the sampled packet to the cumulative amount of con-
`tent sampled 108, and overwriting field 110 with the sampling
`(410) logged, alarms may be raised, and/or traffic control may
`be enforced at a traffic control point 252. Depending on
`time.
`Having created 208 or updated 212 the flow table ent