US007664048B1

(12) United States Patent
     Yung et al.

(10) Patent No.: US 7,664,048 B1
(45) Date of Patent: Feb. 16, 2010

(54) HEURISTIC BEHAVIOR PATTERN MATCHING OF DATA FLOWS IN ENHANCED NETWORK TRAFFIC CLASSIFICATION

(75) Inventors: Weng-Chin Yung, Folsom, CA (US); Mark Hill, Los Altos, CA (US); Anne Cesa Klein, Cupertino, CA (US)

(73) Assignee: Packeteer, Inc., Cupertino, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 870 days.

(21) Appl. No.: 10/720,329

(22) Filed: Nov. 24, 2003

(51) Int. Cl.
     H04L 12/26 (2006.01)
(52) U.S. Cl. ....................... 370/253; 370/235; 370/252; 709/224
(58) Field of Classification Search ................. 370/223, 370/224, 229, 230, 231, 236.1, 238, 235, 370/253, 252; 709/224, 226, 233, 235, 246
     See application file for complete search history.
`
(56) References Cited

U.S. PATENT DOCUMENTS

4,914,650 A      4/1990  Sriram
5,828,846 A     10/1998  Kirby
6,003,077 A     12/1999  Bawden
6,023,456 A      2/2000  Chapman
6,038,216 A *    3/2000  Packer ........................ 370/231
6,046,980 A *    4/2000  Packer ........................ 370/230
6,122,670 A *    9/2000  Bennett et al. ................ 709/236
6,144,636 A *   11/2000  Aimoto et al. ................. 370/229
6,219,050 B1     4/2001  Schaffer
6,285,660 B1     9/2001  Ronen
6,363,056 B1     3/2002  Beigi
6,397,359 B1     5/2002  Chandra
6,584,467 B1     6/2003  Haught
6,591,299 B2 *   7/2003  Riddle et al. ................. 709/224
6,625,648 B1     9/2003  Schwaller
6,628,938 B1     9/2003  Rachabathuni
6,681,232 B1     1/2004  Sistanizadeh
6,690,918 B2     2/2004  Evans
6,701,359 B1     3/2004  Calabrez
6,738,352 B1     5/2004  Yamada
6,798,763 B1     9/2004  Kimura
6,894,972 B1     5/2005  Phaal
7,010,611 B1 *   3/2006  Wiryaman et al. ............... 709/232
7,120,931 B1    10/2006  Cheriton
7,154,416 B1    12/2006  Savage
7,155,502 B1    12/2006  Galloway
7,193,968 B1     3/2007  Kapoor
7,215,637 B1     5/2007  Ferguson
7,224,679 B2     5/2007  Solomon
`
`(Continued)
`
`OTHER PUBLICATIONS
`
`Pazos, C.M. et al., "Flow Control and Bandwidth Management in
`Next Generation Internets" IEEE, Jun. 22, 1998, pp. 123-132.*
`
`(Continued)
`
`Primary Examiner-Donald L Mills
`(74) Attorney, Agent, or Firm-Baker Botts L.L.P.
`
`(57)
`
`ABSTRACT
`
Methods, apparatuses and systems facilitating enhanced classification of network traffic that extends beyond analysis of explicitly presented packet attributes and holistically analyzes data flows, and in some implementations, related data flows against known application behavior patterns to classify the data flows. Implementations of the present invention facilitate the classification of encrypted or compressed network traffic, or where the higher layer information in the data flows are formatted according to a non-public or proprietary protocol.
`
`31 Claims, 11 Drawing Sheets
`
`
`
`
U.S. PATENT DOCUMENTS

7,292,531 B1       11/2007  Hill
7,296,288 B1       11/2007  Hill
7,324,447 B1        1/2008  Morford
7,385,924 B1        6/2008  Riddle
7,554,983 B1        6/2009  Muppala
2002/0122427 A1     9/2002  Kamentsky
2002/0143901 A1    10/2002  Lupo
2003/0035385 A1     2/2003  Walsh
2003/0112764 A1     6/2003  Gaspard
2003/0185210 A1    10/2003  McCormack
2004/0125815 A1     7/2004  Shimazu
2006/0045014 A1     3/2006  Charzinski
`
`OTHER PUBLICATIONS
`
Ye, Guanhua et al., "Using explicit congestion notification in stream control transmisson provided in networks", IEEE, May 19-22, 2003, pp. 704-709.*
Yung, U.S. Appl. No. 10/917,952, entitled: Examination of connection handshake to enhance classification of encrypted network traffic, Aug. 2004.
`
`* cited by examiner
`
`
`
[Sheet 1 of 11: FIG. 1. Block labels: Traffic Monitoring Device; Traffic Monitoring Module; Packet Processor; Traffic Classification Engine. Reference numerals shown: 30, 40, 42, 44, 50, 72, 75, 76, 82, 86.]
`
`
`
[Sheet 2 of 11: FIG. 2. Recoverable label: "(Inside)". Reference numerals shown: 21, 24, 25, 28, 40, 44, 50, 130.]
`
`
`
[Sheet 3 of 11: FIG. 3. Block labels: Administrator Interface; Measurement Engine; Traffic Classification Engine; Flow Database; Management Information Base; Host Database; Traffic Discovery Module; Packet Processor; Flow Control Module; Data Packet In; Data Packet Out. Reference numerals shown: 131, 132, 134, 135, 137, 138, 139, 140, 150.]
`
`
`
[Sheet 4 of 11: FIG. 4. Flow chart steps: Receive Data Packet; Construct Flow Object; Fetch/Update Flow Object; Flag Packet Data for Traffic Discovery; Record Flow Measurement Variables; Identify Traffic Class. Reference numerals shown: 102, 106, 114, 116.]
`
`
`
[Sheet 5 of 11: FIG. 5A. Flow chart steps: Return Previous Classification; Increment Related Flow Count; Classify as Unknown; Pattern Match based on Suspected Application Type; Classify as Suspected Application; Classify as Unknown, End Pattern Match For Flow; Identify Suspected Application; Process Flow for Related Flow Tracking. Reference numerals shown: 305, 306, 308, 310, 318, 320, 322.]
`
`
`
[Sheet 6 of 11: FIG. 5B. Flow chart steps: Identify Suspected Application; Select First Application; Advance to Next Application; Return "No Match"; Compute Packet Data Entropy Value; Return Suspected Application. Reference numerals shown: 330, 332, 340, 344.]
`
`
`
[Sheet 7 of 11: FIG. 5C. Flow chart steps: Related Flow Tracking; Record Arrival Time of 1st Packet, Host Address and Suspected Application; New Host Address/Suspected Application Pair?; Related Flow Count = 0, Last Flow Time = 1st Packet Time; Is Δ b/w 1st Packet Time and Last Flow Time > Limit?; Return. Reference numerals shown: 350, 356.]
`
`
`
[Sheet 8 of 11: FIG. 5D. Flow chart steps: Pattern Match Based on Suspected Application Type; Does Packet; Return No Match; Return Match. Reference numerals shown: 360, 362, 364, 366.]
`
`
`
[Sheet 9 of 11: FIG. 6. Block labels: Client Device; PtoP App; Tunnel Client; Tunnel Proxy Server; Network Resource. Reference numerals shown: 42, 50, 71, 72, 74, 75.]
`
`
`
[Sheet 10 of 11: FIG. 7. Flow chart steps: Receive Data Packet; Construct Control Block; Fetch/Update Control Block; Write Traffic Class & Policies into Control Block; Flag Packet Data for Traffic Discovery; Pass Packet to Flow Control Module (P); Record Flow Measurement Variables; Identify Traffic Class. Reference numerals shown: 202, 212, 214, 222, 224.]
`
`
`
[Sheet 11 of 11: FIG. 8. Flow chart steps: Pass to Classification Engine; Flag for Traffic Discovery; Traffic Class Identified By Auto-Discovery?; Pass to Pattern Matching Classification Mechanism. Reference numerals shown: 402, 406, 410.]
`
`
`
`HEURISTIC BEHAVIOR PATTERN
`MATCHING OF DATA FLOWS IN ENHANCED
`NETWORK TRAFFIC CLASSIFICATION
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS AND PATENTS
`
This application makes reference to the following commonly owned U.S. patent applications and patents, which are incorporated herein by reference in their entirety for all purposes:

U.S. patent application Ser. No. 08/762,828 now U.S. Pat. No. 5,802,106 in the name of Robert L. Packer, entitled "Method for Rapid Data Rate Detection in a Packet Communication Environment Without Data Rate Supervision;"

U.S. patent application Ser. No. 08/970,693 now U.S. Pat. No. 6,018,516, in the name of Robert L. Packer, entitled "Method for Minimizing Unneeded Retransmission of Packets in a Packet Communication Environment Supporting a Plurality of Data Link Rates;"

U.S. patent application Ser. No. 08/742,994 now U.S. Pat. No. 6,038,216, in the name of Robert L. Packer, entitled "Method for Explicit Data Rate Control in a Packet Communication Environment without Data Rate Supervision;"

U.S. patent application Ser. No. 09/977,642 now U.S. Pat. No. 6,046,980, in the name of Robert L. Packer, entitled "System for Managing Flow Bandwidth Utilization at Network, Transport and Application Layers in Store and Forward Network;"

U.S. patent application Ser. No. 09/106,924 now U.S. Pat. No. 6,115,357, in the name of Robert L. Packer and Brett D. Galloway, entitled "Method for Pacing Data Flow in a Packet-based Network;"

U.S. patent application Ser. No. 09/046,776 now U.S. Pat. No. 6,205,120, in the name of Robert L. Packer and Guy Riddle, entitled "Method for Transparently Determining and Setting an Optimal Minimum Required TCP Window Size;"

U.S. patent application Ser. No. 09/479,356 now U.S. Pat. No. 6,285,658, in the name of Robert L. Packer, entitled "System for Managing Flow Bandwidth Utilization at network, Transport and Application Layers in Store and Forward Network;"

U.S. patent application Ser. No. 09/198,090 now U.S. Pat. No. 6,412,000, in the name of Guy Riddle and Robert L. Packer, entitled "Method for Automatically Classifying Traffic in a Packet Communications Network;"

U.S. patent application Ser. No. 09/198,051, in the name of Guy Riddle, entitled "Method for Automatically Determining a Traffic Policy in a Packet Communications Network;"

U.S. patent application Ser. No. 09/206,772, in the name of Robert L. Packer, Brett D. Galloway and Ted Thi, entitled "Method for Data Rate Control for Heterogeneous or Peer Internetworking;"

U.S. patent application Ser. No. 10/039,992, in the name of Michael J. Quinn and Mary L. Laier, entitled "Method and Apparatus for Fast Lookup of Related Classification Entities in a Tree-Ordered Classification Hierarchy;"

U.S. patent application Ser. No. 10/108,085, in the name of Wei-Lung Lai, Jon Eric Okholm, and Michael J. Quinn, entitled "Output Scheduling Data Structure Facilitating Hierarchical Network Resource Allocation Scheme;"

U.S. patent application Ser. No. 10/155,936 now U.S. Pat. No. 6,591,299, in the name of Guy Riddle, Robert L. Packer, and Mark Hill, entitled "Method For Automatically Classifying Traffic With Enhanced Hierarchy In A Packet Communications Network;"

U.S. patent application Ser. No. 10/236,149, in the name of Brett Galloway and George Powers, entitled "Classification Data Structure enabling Multi-Dimensional Network Traffic Classification and Control Schemes;"

U.S. patent application Ser. No. 10/295,391, in the name of Mark Hill, Guy Riddle and Robert Purvy, entitled "Methods, Apparatuses, and Systems Allowing for Bandwidth Management Schemes Responsive to Utilization Characteristics Associated with Individual Users;"

U.S. patent application Ser. No. 10/334,467, in the name of Mark Hill, entitled "Methods, Apparatuses and Systems Facilitating Analysis of the Performance of Network Traffic Classification Configurations;"

U.S. patent application Ser. No. 10/453,345, in the name of Scott Hankins, Michael R. Morford, and Michael J. Quinn, entitled "Flow-Based Packet Capture;" and

U.S. patent application Ser. No. 10/611,573, in the name of Roopesh Varier, David Jacobson, and Guy Riddle, entitled "Network Traffic Synchronization Mechanism."
`
`
`FIELD OF THE INVENTION
`
`The present invention relates to computer networks and,
`more particularly, to enhanced network traffic classification
`mechanisms that allow for identification of encrypted data
`flows, or data flows where attributes necessary to proper
`classification are otherwise obscured or unknown.
`
`BACKGROUND OF THE INVENTION
`
`
Efficient allocation of network resources, such as available network bandwidth, has become critical as enterprises increase reliance on distributed computing environments and wide area computer networks to accomplish critical tasks. The widely-used Transmission Control Protocol (TCP)/Internet Protocol (IP) protocol suite, which implements the world-wide data communications network environment called the Internet and is employed in many local area networks, omits any explicit supervisory function over the rate of data transport over the various devices that comprise the network. While there are certain perceived advantages, this characteristic has the consequence of juxtaposing very high-speed packets and very low-speed packets in potential conflict and produces certain inefficiencies. Certain loading conditions degrade performance of networked applications and can even cause instabilities which could lead to overloads that could stop data transfer temporarily.
In order to understand the context of certain embodiments of the invention, the following provides an explanation of certain technical aspects of a packet based telecommunications network environment. Internet/Intranet technology is based largely on the TCP/IP protocol suite. At the network level, IP provides a "datagram" delivery service; that is, IP is a protocol allowing for delivery of a datagram or packet between two hosts. By contrast, TCP provides a transport level service on top of the datagram service allowing for guaranteed delivery of a byte stream between two IP hosts. In other words, TCP is responsible for ensuring at the transmitting host that message data is divided into packets to be sent, and for reassembling, at the receiving host, the packets back into the complete message.
TCP has "flow control" mechanisms operative at the end stations only to limit the rate at which a TCP endpoint will emit data, but it does not employ explicit data rate control. The basic flow control mechanism is a "sliding window", a window which by its sliding operation essentially limits the amount of unacknowledged transmit data that a transmitter is allowed to emit. Another flow control mechanism is a congestion window, which is a refinement of the sliding window scheme involving a conservative expansion to make use of the full, allowable window.
The sliding window flow control mechanism works in conjunction with the Retransmit Timeout Mechanism (RTO), which is a timeout to prompt a retransmission of unacknowledged data. The timeout length is based on a running average of the Round Trip Time (RTT) for acknowledgment receipt, i.e. if an acknowledgment is not received within (typically) the smoothed RTT+4*mean deviation, then packet loss is inferred and the data pending acknowledgment is re-transmitted. Data rate flow control mechanisms which are operative end-to-end without explicit data rate control draw a strong inference of congestion from packet loss (inferred, typically, by RTO). TCP end systems, for example, will "back-off," i.e., inhibit transmission in increasing multiples of the base RTT average as a reaction to consecutive packet loss.
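
By way of illustration only, the timeout rule stated above (smoothed RTT plus four times the mean deviation) can be sketched as follows. The smoothing gains of 1/8 and 1/4, the initialization from the first sample, and the doubling back-off are assumptions taken from common TCP practice rather than from this specification; the names RtoEstimator and backed_off_rto are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8  # assumed gain for the smoothed RTT
BETA = 1 / 4   # assumed gain for the mean deviation


@dataclass
class RtoEstimator:
    """Tracks a smoothed RTT and its mean deviation (hypothetical helper)."""
    srtt: Optional[float] = None
    rttvar: float = 0.0

    def update(self, sample: float) -> float:
        """Fold one RTT sample (seconds) into the running averages; return the RTO."""
        if self.srtt is None:
            # First measurement: assumed initialization.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Update the deviation using the previous smoothed RTT, then the RTT itself.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
        return self.rto()

    def rto(self) -> float:
        """Timeout = smoothed RTT + 4 * mean deviation, as described in the text."""
        if self.srtt is None:
            raise ValueError("no RTT sample recorded yet")
        return self.srtt + 4 * self.rttvar


def backed_off_rto(base_rto: float, consecutive_losses: int) -> float:
    """Assumed exponential back-off: the timeout doubles per consecutive loss."""
    return base_rto * (2 ** consecutive_losses)
```

This sketch omits clock granularity and any minimum or maximum bound on the timeout, which practical TCP implementations generally apply.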
A crude form of bandwidth management in TCP/IP networks (that is, policies operable to allocate available bandwidth from a single logical link to network flows) is accomplished by a combination of TCP end systems and routers which queue packets and discard packets when some congestion threshold is exceeded. The discarded and therefore unacknowledged packet serves as a feedback mechanism to the TCP transmitter. Routers support various queuing options to provide for some level of bandwidth management. These options generally provide a rough ability to partition and prioritize separate classes of traffic. However, configuring these queuing options with any precision or without side effects is in fact very difficult, and in some cases, not possible. Seemingly simple things, such as the length of the queue, have a profound effect on traffic characteristics. Discarding packets as a feedback mechanism to TCP end systems may cause large, uneven delays perceptible to interactive users. Moreover, while routers can slow down inbound network traffic by dropping packets as a feedback mechanism to a TCP transmitter, this method often results in retransmission of data packets, wasting network traffic and, especially, inbound capacity of a Wide Area Network (WAN) link. In addition, routers can only explicitly control outbound traffic and cannot prevent inbound traffic from over-utilizing a WAN link. A 5% load or less on outbound traffic can correspond to a 100% load on inbound traffic, due to the typical imbalance between an outbound stream of acknowledgments and an inbound stream of data.
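
The imbalance described above can be illustrated with a rough calculation. The following sketch assumes a symmetric link, 1500-byte inbound data packets, 40-byte outbound acknowledgments (IP and TCP headers without options), and ignores link-layer overhead; none of these figures comes from this specification, and the function name outbound_ack_load is hypothetical.

```python
def outbound_ack_load(inbound_load: float,
                      segment_bytes: int = 1500,
                      ack_bytes: int = 40,
                      segments_per_ack: int = 1) -> float:
    """Estimate the outbound load fraction generated purely by acknowledgments.

    inbound_load is the fraction (0.0 to 1.0) of inbound link capacity used by
    data packets. All default sizes are illustrative assumptions.
    """
    if not 0.0 <= inbound_load <= 1.0:
        raise ValueError("inbound_load must be between 0 and 1")
    # Each group of `segments_per_ack` inbound packets triggers one outbound ACK.
    return inbound_load * ack_bytes / (segment_bytes * segments_per_ack)


if __name__ == "__main__":
    # A fully loaded inbound link, one ACK per data packet.
    print(f"{outbound_ack_load(1.0):.2%}")
    # The same link with one ACK for every two data packets.
    print(f"{outbound_ack_load(1.0, segments_per_ack=2):.2%}")
```

Under these assumptions a saturated inbound direction produces an outbound acknowledgment load of well under 5%, which is consistent with the proportion stated above.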
In response, certain data flow rate control mechanisms have been developed to provide a means to control and optimize efficiency of data transfer as well as allocate available bandwidth among a variety of business enterprise functionalities. For example, U.S. Pat. No. 6,038,216 discloses a method for explicit data rate control in a packet-based network environment without data rate supervision. Data rate control directly moderates the rate of data transmission from a sending host, resulting in just-in-time data transmission to control inbound traffic and reduce the inefficiencies associated with dropped packets. Bandwidth management devices allow for explicit data rate control for flows associated with a particular traffic classification. For example, U.S. Pat. No. 6,412,000, above, discloses automatic classification of network traffic for use in connection with bandwidth allocation mechanisms. U.S. Pat. No. 6,046,980 discloses systems and methods allowing for application layer control of bandwidth utilization in packet-based computer networks. For example, bandwidth management devices allow network administrators to specify policies operative to control and/or prioritize the bandwidth allocated to individual data flows according to traffic classifications. In addition, certain bandwidth management devices, as well as certain routers, allow network administrators to specify aggregate bandwidth utilization controls to divide available bandwidth into partitions. With some network devices, these partitions can be configured to ensure a minimum bandwidth and/or cap bandwidth as to a particular class of traffic. An administrator specifies a traffic class (such as File Transfer Protocol (FTP) data, or data flows involving a specific user) and the size of the reserved virtual link, i.e., minimum guaranteed bandwidth and/or maximum bandwidth. Such partitions can be applied on a per-application basis (protecting and/or capping bandwidth for all traffic associated with an application) or a per-user basis (controlling, prioritizing, protecting and/or capping bandwidth for a particular user). In addition, certain bandwidth management devices allow administrators to define a partition hierarchy by configuring one or more partitions dividing the access link and further dividing the parent partitions into one or more child partitions.

While the systems and methods discussed above that allow for traffic classification and application of bandwidth utilization controls on a per-traffic-classification basis operate effectively for their intended purposes, they possess certain limitations. As discussed more fully below, identification of traffic types associated with data flows traversing an access link involves the application of matching criteria or rules to explicitly presented or readily discoverable attributes of individual packets against an application signature which may comprise a protocol identifier (e.g., TCP, Hypertext Transfer Protocol (HTTP), User Datagram Protocol (UDP), Multipurpose Internet Mail Extensions (MIME) types, etc.), a port number, and even an application-specific string of text in the payload of a packet. After identification of a traffic type corresponding to a data flow, a bandwidth management device associates and subsequently applies bandwidth utilization controls (e.g., a policy or partition) to the data flow corresponding to the identified traffic classification or type. Accordingly, simple changes to an application, such as a change to a string of text appearing in the payload or the use of encryption, may allow the application to evade proper classification and corresponding bandwidth utilization controls or admission policies.
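
The signature-based matching described above, and the ease with which it can be evaded, can be sketched as follows. This is a minimal illustration under stated assumptions, not the classification method of any particular device: the Signature and Packet types, the classify function, the traffic class names and the payload string "EXAMPLE-P2P/1.0" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Signature:
    """Hypothetical application signature: protocol, optional port, optional text."""
    traffic_class: str
    protocol: str
    port: Optional[int] = None
    payload_substring: Optional[bytes] = None


@dataclass(frozen=True)
class Packet:
    """Hypothetical view of the explicitly presented attributes of one packet."""
    protocol: str
    dst_port: int
    payload: bytes


def classify(packet: Packet, signatures: Sequence[Signature]) -> str:
    """Return the traffic class of the first signature whose criteria all match."""
    for sig in signatures:
        if sig.protocol != packet.protocol:
            continue
        if sig.port is not None and sig.port != packet.dst_port:
            continue
        if sig.payload_substring is not None and sig.payload_substring not in packet.payload:
            continue
        return sig.traffic_class
    return "unknown"


# Illustrative rule set: a payload-string signature followed by a port-based one.
SIGNATURES = [
    Signature("ExampleP2P", "TCP", payload_substring=b"EXAMPLE-P2P/1.0"),
    Signature("HTTP", "TCP", port=80),
]

plain = Packet("TCP", 6346, b"EXAMPLE-P2P/1.0 CONNECT")
encrypted_on_80 = Packet("TCP", 80, b"\x8f\x02\xa1\x77\x10\x3c")
encrypted_other = Packet("TCP", 6346, b"\x8f\x02\xa1\x77\x10\x3c")
```

With this rule set, the plain packet is classified as "ExampleP2P", while the same application sending an encrypted payload is classified as "HTTP" on port 80 or as "unknown" on another port, which is the evasion problem the paragraph above identifies.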
Indeed, a common use of bandwidth management devices is to limit the bandwidth being consumed by unruly, bandwidth-intensive applications, such as peer-to-peer applications (e.g., Kazaa, Napster, etc.), and/or other unauthorized applications. Indeed, the rich Layer 7 classification functionality of Packetshaper® bandwidth management devices offered by Packeteer®, Inc. of Cupertino, Calif. is an attractive feature for network administrators, as it allows for accurate identification of a variety of application types. This traffic classification functionality, in many instances, uses a combination of known protocol types, port numbers and application-specific attributes to differentiate between various application traffic traversing the network. An increasing number of such peer-to-peer applications, however, employ data compression, encryption technology, and/or proprietary protocols that obscure or prevent identification of various application-specific attributes, often leaving well-known port numbers as the only basis for classification. In fact, as networked applications get increasingly complicated, data encryption has become a touted feature. Indeed, encryption addresses the concern of security and privacy issues, but it also makes it much more difficult to identify unauthorized applications using encryption, such as the peer-to-peer applications "Earthstation 5" and "Winny." In addition, traffic classification based solely on well-known port numbers can be problematic, especially where the application uses dynamic port number assignments or an application incorrectly uses a well-known port number, leading to misclassification of the data flows. In addition, classifying such encrypted network traffic as "unknown" and applying a particular rate or admission policy to unknown traffic classes undermines the granular control otherwise provided by bandwidth management devices and, further, may cause legitimate, encrypted traffic to suffer as a result.

In addition, network savvy users (such as students in a campus or university environment) have also become aware that bandwidth management devices have been deployed to limit or restrict unauthorized peer-to-peer application traffic. As a result, users often attempt to bypass or thwart the bandwidth management scheme effected by such bandwidth management devices by creating communications tunnels (proxy tunnels) through which unauthorized or restricted network traffic is sent. The attributes discernible from the content of these tunneled data flows, however, often reveal little information about their true nature. For example, commercial HTTP tunnel services (such as loopholesoftware.com, TotalRc.net, and http-tunnel.com, etc.) allow users to send all network traffic in the form of HTTP traffic through an HTTP tunnel between a tunnel client and an HTTP proxy server maintained by the tunnel services provider. FIG. 6 illustrates the functionality and operation of a typical HTTP proxy tunnel. Client device 42 includes a client application (such as a peer-to-peer application 71) and a tunnel client 72. The client application sends data to the tunnel client 72 which tunnels the data over HTTP to a tunnel proxy server 74. The tunnel proxy server 74 then forwards the data to the intended destination (here, network resource 75), and vice versa. Such HTTP tunnels typically feature encryption; accordingly, a bandwidth management device 30, encountering the tunneled traffic in this form, may not detect the exact nature of the traffic and, in fact, classify such data flows as legitimate or regular HTTP traffic. Accordingly, these tunneling mechanisms and other techniques for evading bandwidth utilization controls implemented by bandwidth management devices present new challenges to network administrators and bandwidth management device manufacturers desiring to effectively control unauthorized or restricted network traffic.

In light of the foregoing, a need in the art exists for methods, apparatuses and systems that facilitate the classification of encrypted or compressed network traffic. A need further exists for methods, apparatuses and systems that facilitate the classification of network traffic associated with a non-public, proprietary protocol or application. Embodiments of the present invention substantially fulfill these needs.

SUMMARY OF THE INVENTION

The present invention provides methods, apparatuses and systems facilitating enhanced classification of network traffic. As discussed above, typical mechanisms that classify network traffic analyze explicitly presented or readily discoverable attributes of individual packets against an application signature, such as a combination of protocol identifiers, port numbers and text strings. The present invention extends beyond analysis of such explicitly presented packet attributes and holistically analyzes the data flows, and in some implementations, the behavior of host or end systems as expressed in related data flows against known application behavior patterns to classify the data flows. Implementations of the present invention facilitate the classification of encrypted or compressed network traffic, or where the higher layer information in the data flows are formatted according to a non-public or proprietary protocol. In one embodiment, the enhanced classification functionality analyzes the behavioral attributes of encrypted data flows against a knowledge base of known application behavior patterns to classify the data flows. In one embodiment, the enhanced classification mechanisms described herein operate seamlessly with other Layer 7 traffic classification mechanisms that operate on attributes of the packets themselves. Implementations of the present invention can be incorporated into a variety of network devices, such as traffic monitoring devices, packet capture devices, firewalls, and bandwidth management devices.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a functional block diagram showing a traffic monitoring device according to an embodiment of the present invention.

FIG. 2 is a functional block diagram illustrating a computer network environment including a bandwidth management device according to an embodiment of the present invention.

FIG. 3 is a functional block diagram setting forth the functionality in a bandwidth management device according to an embodiment of the present invention.

FIG. 4 is a flow chart diagram providing a method, according to an embodiment of the present invention, directed to the processing of packets in a traffic monitoring device.

FIGS. 5A thru 5D are flow chart diagrams illustrating methods, according to an embodiment of the present invention, directed to classifying data flows based on one or more observed behavioral attributes.

FIG. 6 is a functional block diagram illustrating a proxy tunnel which may be used in attempts to evade appropriate classification and circumvent the bandwidth utilization controls implemented by bandwidth management devices.

FIG. 7 is a flow chart diagram providing a method directed to enforcing bandwidth utilization controls on data flows.

FIG. 8 is a flow chart diagram showing how the application behavior pattern matching functionality can be applied in combination with other network traffic classification processes.

DESCRIPTION OF PREFERRED EMBODIMENT(S)

FIG. 1 illustrates a basic network environment in which an embodiment of the present invention operates. FIG. 1 shows a first network device 40 (such as a hub, switch, router, and/or a variety of combinations of such devices implementing a LAN or WAN) interconnecting two end-systems (here, client computer 42 and server 44). FIG. 1 also provides a second network device 22, such as a router, operably connected to network cloud 50, which in one implementation could be an open, wide-area network. As FIG. 1 shows, traffic monitoring device 30 comprises traffic monitoring module 75, and first and second network interfaces 71, 72, which operably connect traffic monitoring device 30 to the communications path between first network device 40 and second network device 22. Traffic monitoring module 75 generally refers to the functionality implemented by traffic monitoring device 30. In one embodiment, traffic monitoring module 75 is a combination of hardware and software, such as a central processing unit, memory, a system bus, an operating system and one or more software modules implementing the functionality described herein. In one embodiment, traffic monitoring module 75 includes a packet processor 82, and a traffic classification engine 86. In one embodiment, the packet processor 82 is operative to process data packets, such as storing packets in a buffer structure, detecting new data flows, and parsing the data packets for various attributes (such as source and destination addresses, and the like) and maintaining one or more measurement variables or statistics in connection with the flows. The traffic classification engine 86, as discussed more fully below, is operative to classify data flows based on one or more attributes associated with the data flows. Traffic classification engine 86 is also operative to classify data flows based on a heuristic comparison of certain observed behavioral attributes of the data flows relative to a set of at least one known application behavior pattern.
The functionality of traffic monitoring device 30 can be integrated into a variety of network devices that classify network traffic, such as firewalls, gateways, proxies, packet capture devices (see U.S. application Ser. No. 10/453,345), network traffic monitoring and/or bandwidth management devices, that are typically located at strategic points in computer networks. In one embodiment, first and second network interfaces 71, 72 are implemented as a combination of hardware and software, such as network interface cards and associated software drivers. In addition, the first and second network interfaces 71, 72 can be wired network interfaces, such as Ethernet interfaces, and/or wireless network interfaces, such as 802.11, Bluetooth, satellite-based interfaces, and the like. As FIG. 1 illustrates, traffic monitoring device 30, in one embodiment, includes persistent memory 76, such as a hard disk drive or other suitable memory device, such as writable CD, DVD, or tape drives.
As FIGS. 1 and 2 show, the traffic monitoring device 30 (or bandwidth management device 130), in one embodiment, is disposed on the link between a local area network 40 and router 22. In other embodiments, multiple traffic monitoring devices can be disposed at strategic points in a given network infrastructure to achieve various objectives. In addition, traffic monitoring device 30 need not be directly connected to the link between two network devices, but may also be connected to a mirror port. In addition, the traffic monitoring functionality described herein may be deployed in multiple network devices and used in r