(10) Patent No.: US 7,660,248 B1
(45) Date of Patent: Feb. 9, 2010

Duffield et al.

(54) STATISTICAL, SIGNATURE-BASED APPROACH TO IP TRAFFIC CLASSIFICATION
`
(76) Inventors: Nicholas G. Duffield, 101 W. 12th St.,
Apt. 7 S, New York, NY (US) 10011;
Matthew Roughan, 15 Locust St.,
Apt. H6, Chatham, NJ (US) 07928;
Oliver [remainder of inventor list illegible in source]
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 776 days.
`
(21) Appl. No.: 10/764,001
(22) Filed: Jan. 23, 2004
(51) Int. Cl.
H04L 12/26 (2006.01)
(52) U.S. Cl. ........ 370/230.1; 370/229; 370/232; 370/235; 370/252
(58) Field of Classification Search ........ 370/229 [remaining class numbers illegible in source]
See application file for complete search history.
(56) References Cited

U.S. PATENT DOCUMENTS

7,302,682 B2 * 11/2007 Turkoglu .................... 717/174
7,305,676 B1 * 12/2007 Boll et al. ................ 718/107
7,359,320 B2 * 4/2008 Klaghofer et al. ...... 370/230
7,433,943 B1 * 10/2008 Ford ......................... 709/223
7,441,267 B1 * 10/2008 Elliott ......................... 726/13
OTHER PUBLICATIONS

C. Dewes, et al., An Analysis of Internet Chat Systems, Oct. 2003,
Proceedings of ACM SIGCOMM Internet Measurement Conference.
* cited by examiner

Primary Examiner: Pankaj Kumar
Assistant Examiner: Mark Mais
(74) Attorney, Agent, or Firm: Henry Brendzel
`
(57) ABSTRACT

A signature-based traffic classification method maps traffic
into preselected classes of service (CoS). By analyzing a
known corpus of data that clearly belongs to identified ones of
the preselected classes of service, in a training session the
method develops statistics about a chosen set of traffic
features. In an analysis session, relative to traffic of the network
where QoS treatments are desired (target network), the
method obtains statistical information relative to the same
chosen set of features for values of one or more predetermined
traffic attributes that are associated with connections
that are analyzed in the analysis session, yielding a statistical
features signature of each of the values of the one or more
attributes. A classification process then establishes a mapping
between values of the one or more predetermined traffic
attributes and the preselected classes of service, leading to the
establishment of QoS treatment rules.
`
7,251,218 B2 * 7/2007 Jorgensen ................... 370/235
`
`1 Claim, 1 Drawing Sheet
`
[Representative drawing: FIG. 1, a flow chart of the training session (on the training network) and the analysis session (on the target network), followed by classification, assignment of arriving packets to a class, and application of QoS based on the assigned class.]
`
`Cloudflare - Exhibit 1021, page 1
`
`
`
`
U.S. Patent          Feb. 9, 2010          US 7,660,248 B1

FIG. 1

TRAINING SESSION (ON TRAINING NETWORK)
  Block 10: OBTAIN STATISTICAL INFORMATION RELATIVE TO SELECTED
    FEATURES FOR EACH OF A CHOSEN SET OF CLASSES
    Output: STATISTICAL "FEATURES-CLASS" MAPPING

ANALYSIS SESSION (ON TARGET NETWORK)
  Block 20: OBTAIN STATISTICAL INFORMATION RELATIVE TO THE SAME
    SELECTED FEATURES, FOR VALUES OF ONE OR MORE CONNECTION ATTRIBUTES
    Output: STATISTICAL FEATURES SIGNATURE OF EACH VALUE OF THE ONE
    OR MORE ATTRIBUTES
  Block 30: ESTABLISH A CLASSIFICATION: MAPPING EACH OF THE VALUES OF
    THE ONE OR MORE ATTRIBUTES HAVING A FEATURES SIGNATURE INTO A CLASS
  Block 40: ASSIGN PACKETS ARRIVING AT THE TARGET NETWORK TO A CLASS
    BASED ON THE ESTABLISHED CLASSIFICATION
  APPLY QoS BASED ON THE ASSIGNED CLASS
`
`STATISTICAL, SIGNATURE-BASED
`APPROACH TO IP TRAFFIC
`CLASSIFICATION
`
`BACKGROUND OF THE INVENTION
`
`This invention relates to traffic classification and, more
`particularly to statistical classification of IP traffic.
`The past few years have witnessed a dramatic increase in
`the number and variety of applications running over the Inter-
`net and over enterprise IP networks. The spectrum includes
interactive (e.g., telnet, instant messaging, games, etc.), bulk
data transfer (e.g., ftp, P2P file downloads), corporate (e.g.,
Lotus Notes, database transactions), and real-time applications
(voice, video streaming, etc.), to name just a few.
`Network operators, particularly in enterprise networks,
`desire the ability to support different levels of Quality of
`Service (QoS) for different types of applications. This desire
`is driven by (i) the inherently different QoS requirements of
different types of applications, e.g., low end-to-end delay for
interactive applications, high throughput for file transfer
applications, etc.; (ii) the different relative importance of
different applications to the enterprise, e.g., Oracle database
`transactions are considered critical and therefore high prior-
`ity, while traffic associated with browsing external web sites
`is generally less important; and (iii) the desire to optimize the
`usage of their existing network infrastructures under finite
`capacity and cost constraints, while ensuring good perfor-
`mance for important applications.
Various approaches have been studied, and mechanisms
developed, for providing different QoS in a network. See, for
example, S. Blake, et al., RFC 2475: An architecture for
differentiated service, December 1998,
http://www.faqs.org/rfcs/rfc2475.html;
C. Gbaguidi, et al., A survey of differentiated services
architectures for the Internet, March 1998,
http://sscwww.epfl.ch/Pages/publications/ps_files/tr98_020.ps;
and Y. Bernet, et al., A framework for differentiated services,
Internet Draft (draft-ietf-diffserv-framework-02.txt), February 1999,
http://search.ietf.org/internet-drafts/draft-ietf-diffserv-framework-02.txt.
Previous work also has examined the variation of flow
characteristics according to applications. M. Allman, et al.,
TCP congestion control, IETF Network Working Group RFC
2581, 1999, investigated the joint distribution of flow duration
and number of packets, and its variation with flow parameters
such as inter-packet timeout. Differences were observed
`between the distributions of some application protocols,
`although overlap was clearly also present between some
`applications. Most notably, the distribution of DNS transac-
`tions had almost no overlap with that of other applications
`considered. However, the use of such distributions as a dis-
`criminator between different application types was not con-
`sidered.
`
There also exists a wealth of research on characterizing and
modeling workloads for particular applications, with A.
Krishnamurthy, et al., Web Protocols and Practice, Chapter
10, Web Workload Characterization, Addison-Wesley, 2001;
and J. E. Pitkow, Summary of WWW characterizations, W3J,
2:3-13, 1999, being but two examples of such research.
`An early work in this space, reported in V. Paxson,
`“Empirically derived analytic models of wide-area TCP con-
`nections,” IEEE/ACM Transactions on Networking, vol. 2,
`no. 4, pp. 316-336, 1994, examines the distributions of flow
`bytes and packets for a number of different applications.
Interflow and intraflow statistics are another possible
dimension along which application types may be distinguished,
and research has been conducted. V. Paxson, et al.,
`“Wide-area traffic: The failure of Poisson modeling,” IEEE/
`ACM Transactions on Networking, vol. 3, pp. 226-244, June
1995, for example, found that user-initiated events, such as
telnet packets within flows or FTP-data connection arrivals,
`can be described well by a Poisson process, whereas other
`connection arrivals deviate considerably from Poisson.
`Signature-based detection techniques have also been
`explored in the context of network security, attack and
anomaly detection; e.g., P. Barford, et al., Characteristics of
Network Traffic Flow Anomalies, Proceedings of ACM SIGCOMM
Internet Measurement Workshop, October 2001; and
P. Barford, et al., A Signal Analysis of Network Traffic
Anomalies, Proceedings of ACM SIGCOMM Internet Measurement
Workshop, November 2002, where one typically
`seeks to find a signature for an attack.
`Actually, realization of a service differentiation capability
`requires (i) association of the traffic with the different appli-
`cations, (ii) determination of the QoS to be provided to each,
`and finally, (iii) mechanisms in the underlying network for
`providing the QoS; i.e., for controlling the traffic to achieve a
`particular quality of service.
While some of the above-mentioned studies assume that
one can identify the application traffic unambiguously and
then obtain statistics for that application, none of them has
considered the dual problem of inferring the application from
the traffic statistics. This type of approach has been suggested
in very limited contexts such as identifying chat traffic in C.
Dewes, et al., An analysis of Internet chat systems, Proceedings
of ACM SIGCOMM Internet Measurement Conference,
October 2003.
`
`Still, in spite of a clear perceived need, and the prior art
`work reported above, widespread adoption of QoS control of
`traffic has not come to pass. It is believed that the primary
`reason for the slow spread of QoS-use is the absence of
`suitable mapping techniques that can aid operators in classi-
`fying the network traffic mix among the different QoS
`classes. We refer to this as the Class of Service (CoS) mapping
`problem, and perceive that solving this would go a long way
`in making the use of QoS more accessible to operators.
`
`SUMMARY
`
An advance in the art of providing specified QoS in an IP
network is achieved with a signature-based traffic classification
method that maps traffic into preselected classes of service
(CoS). By analyzing, in a training session, a known
corpus of data that clearly belongs to identified ones of the
preselected classes of service, the method develops statistics
about a chosen set of traffic features. In an analysis session,
relative to traffic of the network where QoS treatments are
desired (target network), the method obtains statistical information
relative to the same chosen set of features for values of one or
more predetermined traffic attributes that are associated with
connections that are analyzed in the analysis session, yielding
a statistical features signature of each of the values of the one
or more attributes. A classification process then establishes a
mapping between values of the one or more predetermined
traffic attributes and the preselected classes of service, leading
to the establishment of rules. Once the rules are established,
traffic that is associated with particular values of the
predetermined traffic attributes is mapped to classes of service,
which leads to a designation of QoS.
Illustratively, the preselected classes of service may be
interactive traffic, bulk data transfer traffic, streaming traffic
and transactional traffic. The chosen set of traffic features
may be packet-level features, flow-level features,
connection-level features, intra-flow/connection features, and multi-flow
`features. The predetermined traffic attributes may be the
`server port, and the server IP address. An illustrative rule
`might state that "a connection that specifies port x belongs to
`the class of interactive traffic." An administrator of the target
`network may choose to give the highest QoS level to such
`traffic.
`
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 presents a flow chart of the IP traffic classification
method disclosed herein.

DETAILED DESCRIPTION

In accord with the principles disclosed herein, QoS
implementations are based on mapping of traffic into classes of
service. In principle, the division of traffic into CoS could be
done by end-points of the network, where traffic actually
originates, for instance by end-user applications. However,
for reasons of trust and scalability of administration and
management, it is typically more practical to perform the CoS
mapping within the network; for instance, at the router that
connects the Local Area Network (LAN) to the Wide Area
Network (WAN). Alternatively, there might be appliances
connected near the LAN to WAN transition point that can
perform packet marking for QoS.
CoS mapping inside the network is a non-trivial task.
Ideally, a network system administrator would possess precise
information on the applications running inside the
administrator's network, along with simple and unambiguous
mappings, which information is based on easily obtained traffic
measurements (e.g., by port numbers, or source and destination
IP addresses). This information is vital not just for the
implementation of CoS, but also in planning the capacity
required for each class, and balancing tradeoffs between cost
and performance that might occur in choosing class allocations.
For instance, one might have an application whose
inclusion in a higher priority class is desirable but not cost
effective (based on traffic volumes and pricing), and so some
difficult choices must be made. Good data is required for
these to be informed choices.
In general, however, the required information is rarely
up-to-date, or complete, if it is available at all. The traditional
ad-hoc growth of IP networks, the continuing rapid proliferation
of new applications, the merger of companies with different
networks, and the relative ease with which almost any
user can add a new application to the traffic mix with no
centralized registration are all factors that contribute to this
"knowledge gap". Furthermore, over recent years it has
become harder to identify network applications within IP
traffic. Traditional techniques such as port-based classification
of applications, for example, have become much less
accurate.
One approach that is commonly used for identifying
applications on an IP network is to associate the observed traffic
(using flow level data, or a packet sniffer) with an application
based on TCP or UDP port numbers. Alas, this method is
inadequate.
The TCP/UDP port numbers are divided into three ranges:
the Well Known Ports (0-1023), the Registered Ports
(1024-49,151), and the Dynamic and/or Private ports
(49,152-65,535). A typical TCP connection starts with a
SYN/SYN-ACK/ACK handshake from a client to a server. The client
addresses its initial SYN packet to the well-known server port
of a particular application. The client typically chooses the
source port number of the packet dynamically. UDP uses
ports similarly to TCP, though without connection semantics.
All future packets of a session, in either a TCP or UDP
session, use the same pair of ports to identify the client and
server side of the session. Therefore, in principle, the TCP or
UDP server port number can be used to identify the higher
layer application by simply identifying in an incoming packet
the server port and mapping this port to an application using
the IANA (Internet Assigned Numbers Authority) list of
registered ports (http://www.iana.org/assignments/port-numbers).
However, port-based application classification has
limitations. First, the mapping from ports to applications is
not always well defined. For instance:
  Many implementations of TCP use client ports in the
  registered port range. This might mistakenly classify the
  connection as belonging to the application associated
  with this port. Similarly, some applications (e.g., old
  bind versions) use port numbers from the well-known
  ports to identify the client site of a session.
  Ports are not defined with IANA for all applications, e.g.,
  P2P applications such as Napster and Kazaa.
  An application may use ports other than its well-known
  ports to circumvent operating system access control
  restrictions. E.g., non-privileged users often run WWW
  servers on ports other than port 80, which is restricted to
  privileged users on most operating systems.
  There are some ambiguities in the port registrations, e.g.,
  port 888 is used for CDDBP (CD Database Protocol)
  and access-builder.
  In some cases server ports are dynamically allocated as
  needed. For example, FTP allows the dynamic negotiation
  of the server port used for the data transfer. This
  server port is negotiated on an initial TCP connection,
  which is established using the well-known FTP control
  port.
The use of traffic control techniques like firewalls to block
unauthorized and/or unknown applications from using a network
has spawned many work-arounds which make port-based
application authentication harder. For example, port 80
is being used by a variety of non-web applications to circumvent
firewalls which do not filter port-80 traffic. In fact, available
implementations of IP over HTTP allow the tunneling of
all applications through TCP port 80.
Trojans and other security attacks generate a large volume
of bogus traffic which should not be associated with the
applications of the port numbers those attacks use.
A second limitation of port-number based classification is
that a port can be used by a single application to transmit
traffic with different QoS requirements. For example, (i)
Lotus Notes transmits both email and database transaction
traffic over the same ports, (ii) scp (secure copy), a file transfer
protocol, runs over ssh (secure shell), an interactive application
using default TCP port 22. This use of the same port for
traffic requiring different QoS requirements is quite legitimate,
and yet a good classification must separate different use
cases for the same application. A clean QoS implementation
is still possible through augmenting the classification rules to
include IP address-based disambiguation. Server lists exist in
some networks but, again, in practice these lists are often
incomplete, or a single server could be used to support a
variety of different types of traffic, so we must combine port
and IP address rules.
A possible alternative to port-based classification is to use
a painstaking process involving installation of packet sniffers
and parsing packets for application-level information to identify
the application class of each individual TCP connection
or UDP session. However, this approach cannot be used with
more easily collected flow level data, and its collection is
computationally expensive, limiting its application to lower
`bandwidth links. Also this approach requires precise prior
knowledge of applications and their packet formats, something
that may not always be possible. Furthermore, the introduction
of payload encryption is increasingly limiting our
`ability to see inside packets for this type of information.
`For the above reasons, a different approach is needed.
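By way of illustration only, the port-based approach discussed above can be sketched in a few lines of Python. The port ranges are those given in the text; the small port-to-application table and the function names are hypothetical and are not part of the disclosure. The sketch is intended to show why such a lookup is brittle, not to recommend it.

```python
# Port ranges as described in the text.
WELL_KNOWN = range(0, 1024)
REGISTERED = range(1024, 49152)
DYNAMIC = range(49152, 65536)

# Hypothetical, deliberately tiny port-to-application table (assumption:
# a real deployment would load the full IANA list instead).
PORT_TO_APP = {20: "ftp-data", 21: "ftp", 22: "ssh", 23: "telnet",
               53: "dns", 80: "http", 443: "https"}


def port_range(port: int) -> str:
    """Name the range that a TCP/UDP port number falls into."""
    if port in WELL_KNOWN:
        return "well-known"
    if port in REGISTERED:
        return "registered"
    if port in DYNAMIC:
        return "dynamic/private"
    raise ValueError(f"not a valid port number: {port}")


def classify_by_port(server_port: int) -> str:
    """Naive classification: look the server port up in the table."""
    return PORT_TO_APP.get(server_port, "unknown")
```

Such a lookup reports "ssh" for every connection to port 22, whether it carries an interactive shell or a bulk scp transfer, and reports "http" for anything tunneled through port 80, which are the limitations described above.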
`In accord with the principles disclosed herein CoS map-
`ping is achieved using a statistical method. Advantageously,
the disclosed method performs CoS mapping based on a simply
and easily determined attribute, or attributes, of the traffic.
Specifically, the disclosed method assigns traffic to classes
based on a selected attribute or attributes, using a mapping
derived from a statistical analysis that forms a signature for
traffic having particular values for those attributes.
`Thus, in accord with the principles disclosed herein, a
`three-stage process is undertaken, as depicted in FIG. 1; to
`wit,
1. statistics collection (blocks 10 and 20),
2. classification and rule creation (block 30), and
3. application of rules to active traffic (block 40).
`Block 10 obtains statistical information, in a training ses-
`sion, relative to selected features for each of a chosen set of
`classes by using training data that includes collections of
`traffic, where each collection clearly belongs to one of the
chosen classes, and there is a collection for each of the
chosen set of classes. This may be termed statistical
“features-class” mapping.
`Specifically, first the classes of traffic are selected/identi-
`fied to which administrators of networks may wish to apply
`different QoS treatment, and traffic from a network having a
`well-established set of applications that belong to the identi-
`fied classes (training network) is employed to obtain a set of
statistics for a chosen set of features. The notion here is that if
it is concluded, from the data of the training network, that
feature A of class x applications is characterized by a narrow
range in the neighborhood of value Y, then, at a later time, if
one encounters traffic in a target network where feature A has
the value Y, one may be able to conclude with a high level of
confidence that the traffic belongs to class x.
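The notion just described can be illustrated with a minimal Python sketch. It assumes a single numeric feature and a simple decision rule (choose the class whose training mean is closest, measured in units of that class's standard deviation); this rule, the function names, and the sample values are assumptions made for illustration and are not prescribed by the text.

```python
from statistics import mean, stdev


def train(samples_by_class):
    """Summarize one feature per class as (mean, standard deviation).

    samples_by_class maps a class name to a list of feature values
    observed on the training network (at least two per class).
    """
    return {c: (mean(v), stdev(v)) for c, v in samples_by_class.items()}


def classify(value, model):
    """Return the class whose training range lies closest to `value`."""
    def distance(c):
        m, s = model[c]
        if s == 0:
            return 0.0 if value == m else float("inf")
        return abs(value - m) / s
    return min(model, key=distance)


# Made-up training data: average packet size in bytes per connection.
TRAINING = {
    "interactive": [60.0, 64.0, 70.0, 66.0],
    "bulk": [1400.0, 1460.0, 1500.0, 1440.0],
}
MODEL = train(TRAINING)
```

With this made-up data, a target-network aggregate whose average packet size is 68 bytes falls in the narrow range learned for the interactive class and is labeled accordingly.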
With respect to class definitions, it makes sense to limit the
set of selected classes to those that corporate network
administrators might wish to employ for service differentiation.
It is noted that today’s corporate networks carry four
`broad application classes, which are described below, but it
`should be understood that additional, or other, classes can be
`selected. The four application classes are:
Interactive: The interactive class contains traffic that is
  required by a user to perform multiple real-time interactions
  with a remote system. This class includes such
  applications as remote login sessions or an interactive
  Web interface.
Bulk data transfer: The bulk data transfer class contains
  traffic that is required to transfer large data volumes over
  the network without any real-time constraints. This class
  includes applications such as FTP, software updates, and
  music or video downloads.
Streaming: The streaming class contains multimedia traffic
  with real-time constraints. This class includes such
  applications as streaming and video conferencing.
Transactional: The transactional class contains traffic that
  is used in a small number of request response pairs that
  can be combined to represent a transaction. DNS and
  Oracle transactions belong to this class.
In order to characterize each application class, it is clear
that a reference data set is needed for each class. The problem
is that one needs to identify the class before the statistics
for the chosen features can be gathered, but the
features that ought to be chosen should be ones that characterize
and disambiguate the classes. To break this circular
`dependency, in accord with the principles disclosed herein
`one or more specific “reference” applications are selected for
`each class that, based on their typical use, have a low likeli-
`hood of being contaminated by traffic belonging to another
`class. To select those applications, it makes sense to select
`applications that:
`are clearly within one class (to avoid mixing the statistics
`from two classes);
`are widely used, so as to assure we get a good data-set;
`have server ports in the well-known port range to reduce
`the chance of mis-usage of these ports.
`In a representative embodiment of the disclosed method,
`the reference applications selected for each application class
are:

Interactive: Telnet,
Bulk data: FTP-data, Kazaa,
Streaming: RealMedia streaming,
Transactional: DNS, HTTPS.
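As a hedged illustration, the reference applications listed above could be used to label training traffic as sketched below in Python. The class-to-application table follows the list above; the server port numbers are assumptions introduced here (commonly used default ports, not stated in the text), and the names are hypothetical.

```python
# Reference applications per class, as listed in the text.
REFERENCE_APPS = {
    "interactive": ["telnet"],
    "bulk_data": ["ftp-data", "kazaa"],
    "streaming": ["realmedia"],
    "transactional": ["dns", "https"],
}

# ASSUMPTION: default server ports for the reference applications.
# These numbers are not given in the text and are illustrative only.
ASSUMED_SERVER_PORT = {
    "telnet": 23,
    "ftp-data": 20,
    "kazaa": 1214,
    "realmedia": 554,
    "dns": 53,
    "https": 443,
}

# Invert the two tables into a direct port -> class lookup.
PORT_TO_CLASS = {
    ASSUMED_SERVER_PORT[app]: cls
    for cls, apps in REFERENCE_APPS.items()
    for app in apps
}


def training_label(server_port):
    """Return the training class for a server port, or None if the
    port does not belong to a reference application."""
    return PORT_TO_CLASS.get(server_port)
```

Connections whose server port is not in the table return None and would simply be left out of the training set under this sketch.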
As indicated above, the statistical information that is gathered
for each class pertains to the chosen set of features. As for
the features that one might consider, it is realized that the list
of possible features is very large, and that the actual selection
is left to the practitioner. However, it is beneficial to note that
one can broadly classify those features into categories:
`1. Simple packet-level features such as packet size and
`various moments thereof, such as variance, RMS (root mean
`square) size etc., are simple to compute, and can be gleaned
`directly from packet-level information. One advantage of
`such features is that they offer a characterization of the appli-
`cation that is independent of the notion of flows, connections
`or other higher-level aggregations. Another advantage of such
`features is that packet-level sampling is widely used in net-
`work data collection and has little impact on these statistics.
`Another set of statistics that can be derived from simple
`packet data are time series, from which one can derive a
`number of statistics; for instance, statistics relating to corre-
`lations over time (e.g., parameters of long-range dependence
`such as the Hurst parameter). An example of this type of
classification can be seen in Z. Liu, et al., Profile-based traffic
characterization of commercial web sites, Proceedings of the
18th International Teletraffic Congress (ITC-18), volume 5a,
pages 231-240, Berlin, Germany, 2003, where the authors use
time-of-day traffic profiles to categorize web sites.
2. Flow-level statistics are summary statistics at the grain
of network flows. A flow is defined to be a unidirectional
sequence of packets that have some field values in common,
typically, the 5-tuple (source IP, destination IP, source port,
destination port, IP Protocol type). Example flow-level features
include flow duration, data volume, number of packets,
variance of these metrics, etc. There are some more complex
forms of information one can also glean from flow (or packet
data) statistics; for instance, one may look at the proportion of
internal versus external traffic within a category; external
traffic (traffic to the Internet) may have a lower priority within
a corporate setting. These statistics can be obtained using
flow-level data collected at routers using, e.g., Cisco NetFlow,
described in White paper: NetFlow services and applications,
http://www.cisco.com/warp/public/cc/pd/iosw/ioft/neflct/tech/napps_wp.htm.
These do not require the more
resource-intensive process of finer grain packet-level traces.
A limitation is that flow-collection may sometimes aggregate
packets that belong to multiple application-level connections
into a single flow, which would distort the flow-level features.
`
3. Connection-level statistics are required to trace some
interesting behavior associated with connection-oriented
transport-level connections such as TCP connections. A typical
TCP connection starts and ends with well-defined handshakes
from a client to a server. The collection process needs
to track the connection state in order to collect
connection-level statistics. In addition to the features mentioned for the
flow-level, other features that are meaningful to compute at
the TCP connection level are the amount of symmetry of a
connection, advertised window sizes and throughput distribution.
The connection-level data generally provides better
quality data than the flow-level information, but requires
additional overhead, and would also be impacted by sampling
or asymmetric routing at the collection point.
4. Intra-flow/connection features are features that are based
on the notion of a flow or TCP connection, but require statistics
about the packets within each flow. A simple example is
the statistics of the inter-arrival times between packets in
flows. This requires data collected at a packet level, but then
grouped into flows. The relative variance of these inter-arrival
times may be used as a measure of the burstiness of a traffic
stream. Intra-flow/connection features include loss rates,
latencies, etc.
`
5. Multi-flow: Sometimes interesting characteristics can be
captured only by considering statistics across multiple flows/
connections. For instance, many peer-to-peer applications
achieve the download of a large file by bulk downloads of
smaller chunks from multiple machines; the individual
chunk downloads are typically performed close together in
time. For some multimedia streaming protocols, the high
volume data connection is accompanied by a concurrent,
separate connection between the same set of end-systems,
containing low volume, intermittent control data (e.g., RTSP;
see H. Schulzrinne, et al., Real time streaming protocol
(RTSP), request for comments 2326, April 1998,
ftp://ftp.isi.edu/in-notes/rfc2326.txt). These multi-flow features are
more complex and computationally more expensive to capture
than flow or connection data alone.
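To make the first, second, and fourth feature categories above more concrete, the following Python sketch computes a few of the named features from a small, made-up packet trace. The packet record layout, the sample values, and the function names are hypothetical; "relative variance" is taken here to mean the variance divided by the squared mean, which is an assumption, since the text does not define the term.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pvariance

# Hypothetical packet record:
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, size_bytes)
PACKETS = [
    (0.00, "10.0.0.1", "10.0.0.9", 40000, 23, "tcp", 60),
    (0.10, "10.0.0.1", "10.0.0.9", 40000, 23, "tcp", 64),
    (0.40, "10.0.0.1", "10.0.0.9", 40000, 23, "tcp", 68),
    (0.05, "10.0.0.2", "10.0.0.9", 40001, 20, "tcp", 1500),
    (0.06, "10.0.0.2", "10.0.0.9", 40001, 20, "tcp", 1500),
]


def packet_level(packets):
    """Category 1: packet-size moments, independent of any flow notion."""
    sizes = [p[6] for p in packets]
    return {
        "mean_size": mean(sizes),
        "var_size": pvariance(sizes),
        "rms_size": sqrt(mean([s * s for s in sizes])),
    }


def group_into_flows(packets):
    """Group packets by the 5-tuple (src IP, dst IP, src port, dst port, protocol)."""
    flows = defaultdict(list)
    for p in packets:
        flows[p[1:6]].append(p)
    return dict(flows)


def flow_level(flow):
    """Category 2: summary statistics of one unidirectional flow."""
    times = [p[0] for p in flow]
    return {
        "duration": max(times) - min(times),
        "bytes": sum(p[6] for p in flow),
        "packets": len(flow),
    }


def interarrival_relative_variance(flow):
    """Category 4: burstiness as variance / mean^2 of inter-arrival times.

    Returns None when the flow has fewer than two inter-arrival gaps.
    """
    times = sorted(p[0] for p in flow)
    gaps = [b - a for a, b in zip(times, times[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pvariance(gaps) / mean(gaps) ** 2
```

The packet-level function needs only individual packet sizes, whereas the other two require packets to be grouped into flows first, which mirrors the difference in collection cost noted in the list above.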
`
Turning attention to block 20 of FIG. 1, in accord with the
principles disclosed herein statistical information is collected
relative to traffic that is identified by one or more predetermined
attributes. More specifically, block 20 obtains statistical
information, in an analysis session that employs traffic of
the target network, relative to the same selected features that
were analyzed in block 10, for one or more predetermined
attributes that are associated with connections that are analyzed
in the analysis session. Block 20 yields a statistical
features-signature of each of the analyzed values of the one or
more predetermined attributes. That is, in connection with
each value of any one of the predetermined attributes, statistical
information is gathered regarding the aggregate traffic
that is accumulated in the analysis session. For illustrative
purposes, the traffic attributes that are considered herein are
the server port $P_i$ and the server IP address $I_i$. The traffic
aggregates are the collections of traffic relative to a particular
server port, or relative to a particular IP address.
Thus, in accord with the principles of this disclosure, a
vector of statistics $S^C(i)$ is formed for each connection $i$,
where the elements of the vector are the chosen features, and
used to update the statistics of each aggregate in which
connection $i$ is involved, for instance statistics $S^C(P_i)$ for port
aggregates, and $S^I(I_i)$ for server aggregates. To illustrate for
statistics collected on TCP connections, the procedure might
be as in the following pseudocode.
`
foreach packet
    if (packet represents a new TCP connection)
        assign the connection index i++
        determine the aggregates for connection i
            server port P_i = dst port of SYN
            server IP address I_i = dst IP of SYN
        initialize a set of statistics S^C(i)
    elseif (packet belongs to an existing TCP connection i)
        update connection statistics S^C(i)
    elseif (packet represents end TCP connection i)
        update connection statistics S^C(i)
        update statistics for each aggregate
            by server port: S^C(P_i)
            by server IP address: S^I(I_i)
    endif
end foreach
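The pseudocode above might be rendered in Python roughly as follows. This is a sketch under stated assumptions rather than the disclosed implementation: the packet dictionary keys ('conn', 'syn', 'fin', 'dst_port', 'dst_ip', 'size') are hypothetical, the only statistic tracked is the mean packet size, and a packet that ends a connection is tested for before the general "existing connection" case so that the end branch is reachable.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningMean:
    """A statistic that can be updated on-line: only a count and the
    current mean are stored, never the individual samples."""
    n: int = 0
    mean: float = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        self.mean += (x - self.mean) / self.n


def collect(packets):
    """Return (per-port, per-server-IP) aggregates of mean packet size.

    Each packet is a dict with the hypothetical keys 'conn', 'syn',
    'fin', 'dst_port', 'dst_ip' and 'size'.
    """
    conn_stats = {}                       # per-connection statistics
    conn_attrs = {}                       # (server port, server IP) per connection
    by_port = defaultdict(RunningMean)    # aggregate by server port
    by_ip = defaultdict(RunningMean)      # aggregate by server IP address
    for pkt in packets:
        key = pkt["conn"]
        if pkt["syn"] and key not in conn_stats:
            # New connection: record its aggregates, initialize statistics.
            conn_attrs[key] = (pkt["dst_port"], pkt["dst_ip"])
            conn_stats[key] = RunningMean()
            continue
        if key not in conn_stats:
            continue  # start of this connection was never observed
        conn_stats[key].update(pkt["size"])
        if pkt["fin"]:
            # End of connection: fold its statistics into each aggregate.
            port, ip = conn_attrs.pop(key)
            finished = conn_stats.pop(key)
            by_port[port].update(finished.mean)
            by_ip[ip].update(finished.mean)
    return by_port, by_ip
```

Because state is kept per open connection and per aggregate only, memory grows with the number of connections and attribute values rather than with the number of packets.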
`
The update procedure for connections depends on the statistic
in question. Ideally, statistics should be chosen that can
be updated on-line in a streaming fashion, i.e., recursively,
because that would allow the method to store data not for each
packet but, rather, per connection. For example, it is desirable
to employ an algorithm like
`
$$S_k^C(i) \leftarrow f\big(X_j^i(k),\, S_k^C(i),\, \phi(i)\big), \tag{1}$$

where $X_j^i(k)$ is the measurement for packet $j$, relative to
statistic (feature) $k$, in connection $i$; $S_k^C(i)$ is the $k$th statistic
(feature) for connection $i$; and $\phi(i)$ is some (small) set of state
information (e.g., the packet number $j$) for connection $i$. With
an update algorithm as specified by equation (1), the memory
required to store the state depends on the number of connections.
The following gives a number of specific examples that
comport with equation (1):
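1. Average:

$$\bar{X}_{j+1} = \frac{1}{j+1}\,X_{j+1} + \frac{j}{j+1}\,\bar{X}_j, \tag{2}$$

2. Variance:

$$\operatorname{var}(X_{j+1}) = \frac{1}{j}\,X_{j+1}^2 + \frac{j-1}{j}\,\operatorname{var}(X_j) + \bar{X}_j^2 - \frac{j+1}{j}\,\bar{X}_{j+1}^2, \tag{3}$$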
`
where $\bar{X}_j$ and $\operatorname{var}(X_j)$ are the mean and variance, respectively,
of the first $j$ samples (e.g., packets) of data. However, even for
more difficult statistics, such as quantiles, there are a number
of approximation algorithms that can be used to approximate
the statistic on-line. See A. C. Gilbert, et al.,
“Fast, small-space algorithms for approximate histogram maintenance,”
STOC, 2002. Equations (2) and (3) use “X” without
`the index that designates the feature that is being measured,
`for s