
(12) United States Patent
Duffield et al.

(10) Patent No.: US 7,660,248 B1
(45) Date of Patent: Feb. 9, 2010

(54) STATISTICAL, SIGNATURE-BASED APPROACH TO IP TRAFFIC CLASSIFICATION
`
`
`
(76) Inventors: Nicholas G. Duffield, 101 W. 12th St., Apt. 7S, New York, NY (US) 10011; Matthew Roughan, 15 Locust St., [...]; [...] Rd., Apt H6, Chatham, NJ (US) 07928; Oliver Spatscheck, [...] Rd., [...] (US) 07896
`
`
`
`
`
`
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 776 days.
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
(21) Appl. No.: 10/764,001
(22) Filed: Jan. 23, 2004
(51) Int. Cl.: [...] (2006.01)
(52) U.S. Cl.: 370/230.1; 370/229; 370/232; [...]
(58) Field of Classification Search: [...]
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

  7,251,218 B2*   7/2007   Jorgensen          370/235
  7,302,682 B2*  11/2007   Turkoglu           [...]
  7,305,676 B1*  12/2007   Boll et al.        718/107
  7,359,320 B2*   4/2008   Klaghofer et al.   370/230
  7,433,943 B1*  10/2008   Ford               709/223
  7,441,267 B1*  10/2008   Elliott            726/13

OTHER PUBLICATIONS

Dewes, et al., An Analysis of Internet Chat Traffic, Oct. 2003, Proceedings of ACM SIGCOMM Internet Measurement Conference.

* cited by examiner

Primary Examiner: Pankaj Kumar
Assistant Examiner: Mark Mais
(74) Attorney, Agent, or Firm: Henry Brendzel

(57) ABSTRACT

A signature-based traffic classification method maps traffic into preselected classes of service (CoS). By analyzing a known corpus of data that clearly belongs to identified ones of the preselected classes of service, in a training session the method develops statistics about a chosen set of traffic features. In an analysis session, relative to traffic of the network where QoS treatments are desired (target network), the method obtains statistical information relative to the same chosen set of features for values of one or more predetermined traffic attributes that are associated with connections that are analyzed in the analysis session, yielding a statistical features signature of each of the values of the one or more attributes. A classification process then establishes a mapping between values of the one or more predetermined traffic attributes and the preselected classes of service, leading to the establishment of QoS treatment rules.

1 Claim, 1 Drawing Sheet
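The training/analysis split described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. It assumes, hypothetically, two features per connection (mean packet size and duration), per-class feature means as the "statistics", the server port as the traffic attribute, and a nearest-mean rule as the classification process. The function names, scale factors, port numbers, and sample values are all invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical feature vector per connection:
# (mean packet size in bytes, duration in seconds).

def class_statistics(training):
    """Training session: average each feature over the connections of each class."""
    by_class = defaultdict(list)
    for cos, features in training:
        by_class[cos].append(features)
    return {cos: tuple(mean(col) for col in zip(*rows))
            for cos, rows in by_class.items()}

def attribute_signatures(observed):
    """Analysis session: average each feature per attribute value (here, server port)."""
    by_port = defaultdict(list)
    for port, features in observed:
        by_port[port].append(features)
    return {port: tuple(mean(col) for col in zip(*rows))
            for port, rows in by_port.items()}

def classify(signatures, stats, scales):
    """Map each attribute value to the class with the nearest statistics.

    Distance is a scaled squared Euclidean distance; the scales are an
    assumption that keeps one feature from dominating the other.
    """
    def dist(a, b):
        return sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales))
    return {port: min(stats, key=lambda cos: dist(sig, stats[cos]))
            for port, sig in signatures.items()}

# Invented training data: connections already known to belong to a class.
training = [
    ("interactive", (80.0, 300.0)),
    ("interactive", (100.0, 500.0)),
    ("bulk", (1400.0, 60.0)),
    ("bulk", (1300.0, 40.0)),
]
# Invented target-network data: connections keyed by server port.
observed = [
    (2222, (90.0, 350.0)),
    (2222, (70.0, 450.0)),
    (8021, (1350.0, 55.0)),
]

stats = class_statistics(training)
rules = classify(attribute_signatures(observed), stats, scales=(1000.0, 100.0))
print(rules)
```

With this toy data, port 2222 lands in the interactive class and port 8021 in the bulk class. A practical classifier would likely use more features and a more robust statistical test than distance to the mean.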
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
Splunk Inc. Exhibit 1021 Page 1
`
`

`

`
FIG. 1 (flow chart)

TRAINING SESSION (ON TRAINING NETWORK)
  Block 10: Obtain statistical information relative to selected features for each of a chosen set of classes, yielding a statistical "features-class" mapping.

ANALYSIS SESSION (ON TARGET NETWORK)
  Block 20: Obtain statistical information relative to the same selected features, for values of one or more connection attributes, yielding a statistical features signature of each value of the one or more attributes.
  Block 30: Establish a classification, mapping each of the values of the one or more attributes having a features signature into a class.
  Block 40: Assign packets arriving at the target network to a class based on the established classification.
  Apply QoS based on the assigned class.
`
`
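The last steps of FIG. 1 (assigning arriving packets to a class and applying QoS) can be pictured as a table lookup followed by a policy decision. The Python sketch below is an illustration under stated assumptions: the `Packet` type, the rule table, the priority numbers, the addresses, and the "unclassified" default are all hypothetical and do not come from the patent. It also assumes the server side of each packet has already been identified.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    server_ip: str      # assumed already identified, e.g. by connection tracking
    server_port: int

# Hypothetical result of the classification step: attribute values mapped to classes.
# A key of (None, port) matches any server; (ip, port) is a more specific rule.
RULES = {
    (None, 23): "interactive",
    (None, 20): "bulk",
    (None, 22): "interactive",
    ("10.0.0.5", 22): "bulk",   # e.g. a server used mainly for file copies over ssh
}

# Hypothetical administrator policy: a smaller number means higher priority.
QOS_PRIORITY = {"interactive": 0, "streaming": 1, "transactional": 2, "bulk": 3}
DEFAULT_CLASS = "unclassified"
DEFAULT_PRIORITY = 4

def assign_class(pkt: Packet) -> str:
    """Assign a packet to a class, preferring an (IP, port) rule over a port-only rule."""
    specific = RULES.get((pkt.server_ip, pkt.server_port))
    if specific is not None:
        return specific
    return RULES.get((None, pkt.server_port), DEFAULT_CLASS)

def qos_priority(pkt: Packet) -> int:
    """Choose a QoS treatment (here, just a priority level) from the assigned class."""
    return QOS_PRIORITY.get(assign_class(pkt), DEFAULT_PRIORITY)

for pkt in (Packet("10.0.0.9", 22), Packet("10.0.0.5", 22), Packet("10.0.0.9", 4444)):
    print(pkt, assign_class(pkt), qos_priority(pkt))
```

Keeping the class assignment separate from the priority table reflects the idea that the classification is derived statistically while the QoS level given to each class remains an administrator's choice.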

`

`
`
STATISTICAL, SIGNATURE-BASED APPROACH TO IP TRAFFIC CLASSIFICATION

BACKGROUND OF THE INVENTION

This invention relates to traffic classification and, more particularly, to statistical classification of IP traffic.

The past few years have witnessed a dramatic increase in the number and variety of applications running over the Internet and over enterprise IP networks. The spectrum includes interactive (e.g., telnet, instant messaging, games, etc.), bulk data transfer (e.g., ftp, P2P file downloads), corporate (e.g., Lotus Notes, database transactions), and real-time applications (voice, video streaming, etc.), to name just a few.

Network operators, particularly in enterprise networks, desire the ability to support different levels of Quality of Service (QoS) for different types of applications. This desire is driven by (i) the inherently different QoS requirements of different types of applications, e.g., low end-to-end delay for interactive applications, high throughput for file transfer applications, etc.; (ii) the different relative importance of different applications to the enterprise (e.g., Oracle database transactions are considered critical and therefore high priority, while traffic associated with browsing external web sites is generally less important); and (iii) the desire to optimize the usage of their existing network infrastructures under finite capacity and cost constraints, while ensuring good performance for important applications.

Various approaches have been studied, and mechanisms developed, for providing different QoS in a network. See, for example, S. Blake, et al., RFC 2475: An architecture for differentiated service, December 1998, http://www.faqs.org/rfcs/rfc2475.html; and C. Gbaguidi, et al., A survey of differentiated services architectures for the Internet, March 1998, http://sscwww.epfl.ch/Pages/publications/ps_files/tr98_020.ps; and Y. Bernet, et al., A framework for differentiated services, Internet Draft (draft-ietf-diffserv-framework-02.txt), February 1999, http://search.ietf.org/internet-drafts/draft-ietf-diffserv-framework-02.txt.

Previous work also has examined the variation of flow characteristics according to applications. M. Allman, et al., TCP congestion control, IETF Network Working Group RFC 2581, 1999, investigated the joint distribution of flow duration and number of packets, and its variation with flow parameters such as inter-packet timeout. Differences were observed between the distributions of some application protocols, although overlap was clearly also present between some applications. Most notably, the distribution of DNS transactions had almost no overlap with that of other applications considered. However, the use of such distributions as a discriminator between different application types was not considered.

There also exists a wealth of research on characterizing and modeling workloads for particular applications, with A. Krishnamurth, et al., Web Protocols and Practice, Chapter 10, Web Workload Characterization, Addison-Wesley, 2001; and J. E. Pitkow, Summary of WWW characterizations, W3J, 2:3-13, 1999 being but two examples of such research.

An early work in this space, reported in V. Paxson, "Empirically derived analytic models of wide-area TCP connections," IEEE/ACM Transactions on Networking, vol. 2, no. 4, pp. 316-336, 1994, examines the distributions of flow bytes and packets for a number of different applications.

Interflow and intraflow statistics are another possible dimension along which application types may be distinguished, and research has been conducted. V. Paxson, et al., "Wide-area traffic: The failure of Poisson modeling," IEEE/ACM Transactions on Networking, vol. 3, pp. 226-244, June 1995, for example, found that user-initiated events, such as telnet packets within flows or FTP-data connection arrivals, can be described well by a Poisson process, whereas other connection arrivals deviate considerably from Poisson.

Signature-based detection techniques have also been explored in the context of network security, attack and anomaly detection; e.g., P. Barford et al., Characteristics of Network Traffic Flow Anomalies, Proceedings of ACM SIGCOMM Internet Measurement Workshop, October 2001; and P. Barford, et al., A Signal Analysis of Network Traffic Anomalies, Proceedings of ACM SIGCOMM Internet Measurement Workshop, November 2002, where one typically seeks to find a signature for an attack.

Actually, realization of a service differentiation capability requires (i) association of the traffic with the different applications, (ii) determination of the QoS to be provided to each, and finally, (iii) mechanisms in the underlying network for providing the QoS; i.e., for controlling the traffic to achieve a particular quality of service.

While some of the above-mentioned studies assume that one can identify the application traffic unambiguously and then obtain statistics for that application, none of them have considered the dual problem of inferring the application from the traffic statistics. This type of approach has been suggested in very limited contexts such as identifying chat traffic in C. Dewes, et al., An analysis of Internet chat systems, Proceedings of ACM SIGCOMM Internet Measurement Conference, October 2003.

Still, in spite of a clear perceived need, and the prior art work reported above, widespread adoption of QoS control of traffic has not come to pass. It is believed that the primary reason for the slow spread of QoS-use is the absence of suitable mapping techniques that can aid operators in classifying the network traffic mix among the different QoS classes. We refer to this as the Class of Service (CoS) mapping problem, and perceive that solving this would go a long way in making the use of QoS more accessible to operators.

SUMMARY

An advance in the art of providing specified QoS in an IP network is achieved with a signature-based traffic classification method that maps traffic into preselected classes of service (CoS). By analyzing, in a training session, a known corpus of data that clearly belongs to identified ones of the preselected classes of service, the method develops statistics about a chosen set of traffic features. In an analysis session, relative to traffic of the network where QoS treatments are desired (target network), the method obtains statistical information relative to the same chosen set of features for values of one or more predetermined traffic attributes that are associated with connections that are analyzed in the analysis session, yielding a statistical features signature of each of the values of the one or more attributes. A classification process then establishes a mapping between values of the one or more predetermined traffic attributes and the preselected classes of service, leading to the establishment of rules. Once the rules are established, traffic that is associated with particular values of the predetermined traffic attributes is mapped to classes of service, which leads to a designation of QoS.

Illustratively, the preselected classes of service may be interactive traffic, bulk data transfer traffic, streaming traffic and transactional traffic. The chosen set of traffic features may be packet-level features, flow-level features, connection-level features, intra-flow/connection features, and multi-flow
features. The predetermined traffic attributes may be the server port, and the server IP address. An illustrative rule might state that "a connection that specifies port x belongs to the class of interactive traffic." An administrator of the target network may choose to give the highest QoS level to such traffic.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 presents a flow chart of the IP traffic classification method disclosed herein.

DETAILED DESCRIPTION

In accord with the principles disclosed herein, QoS implementations are based on mapping of traffic into classes of service. In principle the division of traffic into CoS could be done by end-points of the network, where traffic actually originates, for instance by end-user applications. However, for reasons of trust and scalability of administration and management, it is typically more practical to perform the CoS mapping within the network; for instance, at the router that connects the Local Area Network (LAN) to the Wide Area Network (WAN). Alternatively, there might be appliances connected near the LAN to WAN transition point that can perform packet marking for QoS.

CoS mapping inside the network is a non-trivial task. Ideally, a network system administrator would possess precise information on the applications running inside the administrator's network, along with simple and unambiguous mappings, which information is based on easily obtained traffic measurements (e.g., by port numbers, or source and destination IP addresses). This information is vital not just for the implementation of CoS, but also in planning the capacity required for each class, and balancing tradeoffs between cost and performance that might occur in choosing class allocations. For instance, one might have an application whose inclusion in a higher priority class is desirable but not cost effective (based on traffic volumes and pricing), and so some difficult choices must be made. Good data is required for these to be informed choices.

In general, however, the required information is rarely up-to-date, or complete, if it is available at all. The traditional ad-hoc growth of IP networks, the continuing rapid proliferation of new applications, the merger of companies with different networks, and the relative ease with which almost any user can add a new application to the traffic mix with no centralized registration are all factors that contribute to this "knowledge gap". Furthermore, over recent years it has become harder to identify network applications within IP traffic. Traditional techniques such as port-based classification of applications, for example, have become much less accurate.

One approach that is commonly used for identifying applications on an IP network is to associate the observed traffic (using flow level data, or a packet sniffer) with an application based on TCP or UDP port numbers. Alas, this method is inadequate.

The TCP/UDP port numbers are divided into three ranges: the Well Known Ports (0-1023), the Registered Ports (1024-49,151), and the Dynamic and/or Private ports (49,152-65,535). A typical TCP connection starts with a SYN/SYN-ACK/ACK handshake from a client to a server. The client addresses its initial SYN packet to the well-known server port of a particular application. The client typically chooses the source port number of the packet dynamically. UDP uses ports similarly to TCP, though without connection semantics. All future packets of a session, in either a TCP or UDP session, use the same pair of ports to identify the client and server side of the session. Therefore, in principle, the TCP or UDP server port number can be used to identify the higher layer application by simply identifying in an incoming packet the server port and mapping this port to an application using the IANA (Internet Assigned Numbers Authority) list of registered ports (http://www.iana.org/assignments/port-numbers). However, port-based application classification has limitations. First, the mapping from ports to applications is not always well defined. For instance:

  Many implementations of TCP use client ports in the registered port range. This might mistakenly classify the connection as belonging to the application associated with this port. Similarly, some applications (e.g., old bind versions) use port numbers from the well-known ports to identify the client side of a session.

  Ports are not defined with IANA for all applications, e.g., P2P applications such as Napster and Kazaa.

  An application may use ports other than its well-known ports to circumvent operating system access control restrictions. E.g., non-privileged users often run WWW servers on ports other than port 80, which is restricted to privileged users on most operating systems.

  There are some ambiguities in the port registrations, e.g., port 888 is used for CDDBP (CD Database Protocol) and access-builder.

  In some cases server ports are dynamically allocated as needed. For example, FTP allows the dynamic negotiation of the server port used for the data transfer. This server port is negotiated on an initial TCP connection, which is established using the well-known FTP control port.

The use of traffic control techniques like firewalls to block unauthorized and/or unknown applications from using a network has spawned many work-arounds which make port-based application authentication harder. For example, port 80 is being used by a variety of non-web applications to circumvent firewalls which do not filter port-80 traffic. In fact, available implementations of IP over HTTP allow the tunneling of all applications through TCP port 80.

Trojans and other security attacks generate a large volume of bogus traffic which should not be associated with the applications of the port numbers those attacks use.

A second limitation of port-number based classification is that a port can be used by a single application to transmit traffic with different QoS requirements. For example, (i) Lotus Notes transmits both email and database transaction traffic over the same ports, (ii) scp (secure copy), a file transfer protocol, runs over ssh (secure shell), an interactive application using default TCP port 22. This use of the same port for traffic requiring different QoS requirements is quite legitimate, and yet a good classification must separate different use cases for the same application. A clean QoS implementation is still possible through augmenting the classification rules to include IP address-based disambiguation. Server lists exist in some networks but, again, in practice these lists are often incomplete, or a single server could be used to support a variety of different types of traffic, so we must combine port and IP address rules.

A possible alternative to port-based classification is to use a painstaking process involving installation of packet sniffers and parsing packets for application-level information to identify the application class of each individual TCP connection or UDP session. However, this approach cannot be used with more easily collected flow level data, and its collection is computationally expensive, limiting its application to lower
bandwidth links. Also this approach requires precise prior knowledge of applications and their packet formats, something that may not always be possible. Furthermore, the introduction of payload encryption is increasingly limiting our ability to see inside packets for this type of information.

For the above reasons, a different approach is needed.

In accord with the principles disclosed herein, CoS mapping is achieved using a statistical method. Advantageously, the disclosed method performs CoS mapping based on a simply and easily determined attribute, or attributes, of the traffic. Specifically, the disclosed method assigns traffic to classes, based on a selected attribute or attributes, using a mapping derived from a statistical analysis that forms a signature for traffic having particular values for those attributes.

Thus, in accord with the principles disclosed herein, a three-stage process is undertaken, as depicted in FIG. 1; to wit,
1. statistics collection (blocks 10 and 20),
2. classification and rule creation (block 30), and
3. application of rules to active traffic (block 40).

Block 10 obtains statistical information, in a training session, relative to selected features for each of a chosen set of classes by using training data that includes collections of traffic, where each collection clearly belongs to one of the chosen classes, and there is found a collection for each of the chosen set of classes. This may be termed statistical "features-class" mapping.

Specifically, first the classes of traffic are selected/identified to which administrators of networks may wish to apply different QoS treatment, and traffic from a network having a well-established set of applications that belong to the identified classes (training network) is employed to obtain a set of statistics for a chosen set of features. The notion here is that if it is concluded, from the data of the training network, that feature A of class x applications is characterized by a narrow range in the neighborhood of value Y, then, at a later time, if one encounters traffic in a target network where feature A has the value Y, one may be able to conclude with a high level of confidence that the traffic belongs to class x.

With respect to class definitions, it makes sense to limit the set of selected classes to those that corporate network administrators might wish to employ for service differentiation. It is noted that today's corporate networks carry four broad application classes, which are described below, but it should be understood that additional, or other, classes can be selected. The four application classes are:

  Interactive: The interactive class contains traffic that is required by a user to perform multiple real-time interactions with a remote system. This class includes such applications as remote login sessions or an interactive Web interface.

  Bulk data transfer: The bulk data transfer class contains traffic that is required to transfer large data volumes over the network without any real-time constraints. This class includes applications such as FTP, software updates, and music or video downloads.

  Streaming: The streaming class contains multimedia traffic with real-time constraints. This class includes such applications as streaming and video conferencing.

  Transactional: The transactional class contains traffic that [...]

[...] features that ought to be chosen should be ones that characterize and disambiguate the classes. To break this circular dependency, in accord with the principles disclosed herein one or more specific "reference" applications are selected for each class that, based on their typical use, have a low likelihood of being contaminated by traffic belonging to another class. To select those applications, it makes sense to select applications that:

  are clearly within one class (to avoid mixing the statistics from two classes);

  are widely used, so as to assure we get a good data-set;

  have server ports in the well-known port range to reduce the chance of mis-usage of these ports.

In a representative embodiment of the disclosed method, the reference applications selected for each application class are:

  Interactive: Telnet,
  Bulk data: FTP-data, Kazaa,
  Streaming: RealMedia streaming,
  Transactional: DNS, HTTPS.

As indicated above, the statistical information that is gathered for each class pertains to the chosen set of features. As for the features that one might consider, it is realized that the list of possible features is very large, and the actual selection is left to the practitioner. However, it is beneficial to note that one can broadly classify those features into categories:

1. Simple packet-level features such as packet size and various moments thereof, such as variance, RMS (root mean square) size etc., are simple to compute, and can be gleaned directly from packet-level information. One advantage of such features is that they offer a characterization of the application that is independent of the notion of flows, connections or other higher-level aggregations. Another advantage of such features is that packet-level sampling is widely used in network data collection and has little impact on these statistics.

Another set of statistics that can be derived from simple packet data are time series, from which one can derive a number of statistics; for instance, statistics relating to correlations over time (e.g., parameters of long-range dependence such as the Hurst parameter). An example of this type of classification can be seen in Z. Liu, et al., Profile-based traffic characterization of commercial web sites, Proceedings of the 18th International Teletraffic Congress (ITC-18), volume 5a, pages 231-240, Berlin, Germany, 2003, where the authors use time-of-day traffic profiles to categorize web sites.

2. Flow-level statistics are summary statistics at the grain of network flows. A flow is defined to be a unidirectional sequence of packets that have some field values in common, typically, the 5-tuple (source IP, destination IP, source port, destination port, IP Protocol type). Example flow-level features include flow duration, data volume, number of packets, variance of these metrics etc. There are some more complex forms of information one can also glean from flows (or packet data) statistics; for instance, one may look at the proportion of internal versus external traffic within a category; external traffic (traffic to the Internet) may have a lower priority within a corporate setting. These statistics can be obtained using flow-level data collected at routers using, e.g.,
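As a rough illustration of the packet-level and flow-level feature categories described above, the Python sketch below computes the mean, variance, and RMS of packet size, and then groups packets into unidirectional flows by 5-tuple to obtain duration, data volume, and packet count. The `Pkt` record layout, field names, and sample values are assumptions made for this example; the source does not prescribe a data format.

```python
import math
from collections import defaultdict
from typing import NamedTuple

class Pkt(NamedTuple):
    ts: float        # arrival time, seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int       # IP protocol number (6 = TCP, 17 = UDP)
    size: int        # packet size, bytes

def packet_level_features(pkts):
    """Mean, population variance, and RMS of packet size; no notion of flows needed."""
    sizes = [p.size for p in pkts]
    n = len(sizes)
    mean = sum(sizes) / n
    variance = sum((s - mean) ** 2 for s in sizes) / n
    rms = math.sqrt(sum(s * s for s in sizes) / n)
    return {"mean": mean, "variance": variance, "rms": rms}

def flow_level_features(pkts):
    """Group packets into unidirectional flows by 5-tuple and summarize each flow."""
    flows = defaultdict(list)
    for p in pkts:
        flows[(p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)].append(p)
    return {
        key: {
            "duration": max(p.ts for p in group) - min(p.ts for p in group),
            "bytes": sum(p.size for p in group),
            "packets": len(group),
        }
        for key, group in flows.items()
    }

# Invented sample: two packets client-to-server, one packet server-to-client.
packets = [
    Pkt(0.0, "10.0.0.1", "10.0.0.2", 50000, 23, 6, 60),
    Pkt(1.5, "10.0.0.1", "10.0.0.2", 50000, 23, 6, 80),
    Pkt(2.0, "10.0.0.2", "10.0.0.1", 23, 50000, 6, 100),
]

pf = packet_level_features(packets)
ff = flow_level_features(packets)
print(pf)
print(ff)
```

Because a flow is unidirectional, the two directions of the same telnet-like session appear here as two separate flows. Pairing them into a connection would be a further, connection-level step.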
