US007185368B2

(12) United States Patent
Copeland, III

(10) Patent No.: US 7,185,368 B2
(45) Date of Patent: Feb. 27, 2007

(54) FLOW-BASED DETECTION OF NETWORK INTRUSIONS

(75) Inventor: John A. Copeland, III, Atlanta, GA (US)

(73) Assignee: Lancope, Inc., Atlanta, GA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is
extended or adjusted under 35 U.S.C. 154(b) by 887 days.

(21) Appl. No.: 10/000,396

(22) Filed: Nov. 30, 2001

(65) Prior Publication Data
US 2003/0105976 A1, Jun. 5, 2003

Related U.S. Application Data
(60) Provisional application No. 60/265,194, filed on Jan. 31, 2001,
provisional application No. 60/250,261, filed on Nov. 30, 2000.

(51) Int. Cl.
G06F 11/30 (2006.01)
(52) U.S. Cl. ........ 726/25; 726/22; 726/23; 726/26; 713/151;
709/203; 709/224; 709/227; 705/51
(58) Field of Classification Search ........ None
See application file for complete search history.

(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
5,437,244 A *   8/1995  Van Gilst ..................... 119/73
5,557,686 A *   9/1996  Brown et al. ............... 382/115
5,557,742 A     9/1996  Smaha et al.
5,621,889 A     4/1997  Lermuzeaux et al.
5,796,942 A *   8/1998  Esbensen .................... 713/201
5,825,750 A *  10/1998  Thompson .................. 370/244
5,970,227 A    10/1999  Dayan et al.

FOREIGN PATENT DOCUMENTS

WO  PCT/US99/29080  6/2000
`
`(Continued)
`
`OTHER PUBLICATIONS
`
Javitz H S et al.: "The SRI IDES Statistical Anomaly Detector",
Proceedings of the Symposium on Research in Security and Privacy,
US, Los Alamitos, IEEE Comp. Soc. Press, vol. Symp. 12, pp. 316-326,
XP000220803, ISBN: 0-8186-2168-0, p. 316, col. 1, line 1, p. 318,
col. 1, line 3.*
`
`(Continued)
`
`Primary Examiner-Nasser Moazzami
`Assistant Examiner-Ronald Baum
`(74) Attorney, Agent, or Firm-Morris, Manning & Martin,
`LLP
`
`(57)
`
`ABSTRACT
`
A flow-based intrusion detection system for detecting intrusions
in computer communication networks. Data packets representing
communications between hosts in a computer-to-computer
communication network are processed and assigned to various
client/server flows. Statistics are collected for each flow. Then,
the flow statistics are analyzed to determine if the flow appears
to be legitimate traffic or possible suspicious activity. A concern
index value is assigned to each flow that appears suspicious. By
assigning a value to each flow that appears suspicious and adding
that value to the total concern index of the responsible host, it is
possible to identify hosts that are engaged in intrusion activity.
When the concern index value of a host exceeds a preset alarm
value, an alert is issued and appropriate action can be taken.
`
`(Continued)
`
`37 Claims, 9 Drawing Sheets
`
`EX1007
`Palo Alto Networks v. Sable Networks
`IPR2020-01712
`
`
`
`
`U.S. PATENT DOCUMENTS
`
5,991,881 A         11/1999  Conklin et al.
6,119,236 A *        9/2000  Shipley ...................... 713/201
6,182,226 B1         1/2001  Reid et al.
6,275,942 B1         8/2001  Bernhard et al.
6,321,338 B1        11/2001  Porras et al.
6,363,489 B1 *       3/2002  Comay et al. ................ 726/22
6,453,345 B2 *       9/2002  Trcka et al. ................ 709/224
6,502,131 B1 *      12/2002  Vaid et al. .................. 709/224
6,628,654 B1 *       9/2003  Albert et al. ............... 370/389
6,853,619 B1 *       2/2005  Grenot ....................... 370/232
6,891,839 B2 *       5/2005  Albert et al. ............... 370/401
2002/0104017 A1 *    8/2002  Stefan ........................ 713/201
2002/0133586 A1 *    9/2002  Shanklin et al. ............. 709/224
2004/0187032 A1 *    9/2004  Gels et al. .................. 713/201
2004/0237098 A1 *   11/2004  Watson et al. ................ 725/25

FOREIGN PATENT DOCUMENTS

WO  PCT/US00/29490  5/2001
`
`OTHER PUBLICATIONS
`
Lunt T F et al.: "Knowledge-based Intrusion Detection", Proceedings
of the Annual Artificial Intelligence Systems in Government Conf.,
US, Washington, IEEE Comp. Soc. Press, vol. Conf. 4, pp. 102-107,
XP000040018, p. 102, col. 1, line 1, p. 105, col. 2, line 21.*
Mahoney, M., "Network Traffic Anomaly Detection Based on Packet
Bytes", ACM, 2003, Fl. Institute of Technology, entire document,
http://www.cs.fit.edu/-mmahoney/paper6.pdf.*
Copeland, John A., et al., "IP Flow Identification for IP Traffic
Carried Over Switched Networks," The International Journal of
Computer and Telecommunications Networking, Computer Networks 31
(1999), pp. 493-504.
Cooper, Mark, "An Overview of Intrusion Detection Systems,"
Xinetica White Paper, (www.xinetica.com) Nov. 19, 2001.
Newman, P., et al., "RFC 1953: Ipsilon Flow Management Protocol
Specification for IPv4 Version 1.0" (www.xyweb.com/rfc/rfc1953.html)
May 19, 1999.
Paxson, Vern, "Bro: A System for Detecting Network Intruders in
Real-Time," 7th USENIX Security Symposium, Lawrence Berkeley
National Laboratory, San Antonio, TX, Jan. 26-29, 1998.
Mukherjee, Biswanath, et al., "Network Intrusion Detection," IEEE
Network, May/Jun. 1994.
"Network- vs Host-Based Intrusion Detection: A Guide to Intrusion
Detection," ISS Internet Security Systems, Oct. 2, 1998, Atlanta, GA.
Barford, Paul, et al., "Characteristics of Network Traffic Flow
Anomalies," ACM SIGCOMM Internet Measurement Workshop 2001
(http://www.cs.wisc.edu/pb/ublications.html) Jul. 2001.
Frincke, Deborah, et al., "A Framework for Cooperative Intrusion
Detection," 21st National Information Systems Security Conference,
Oct. 1998, Crystal City, VA.
Phrack Magazine, vol. 8, Issue 53, Jul. 8, 1998, Article 11 of 15.
"LANSleuth Fact Sheet," LANSleuth LAN Analyzer for Ethernet and
Token Ring Networks, (www.lansleuth.com/features.html), Aurora,
Illinois.
"LANSleuth General Features," (www.lansleuth.com/features.html),
Aurora, Illinois.
`
`* cited by examiner
`
`
`
[FIG. 1: Flow-based intrusion detection. Hosts exchange interleaved
packets 101 (P1 through P14), each with a packet header (IP address,
port) and data. The packets are grouped into flows F1 (e.g. FTP),
F2 (e.g. HTTP), F3 (e.g. SMTP), and F4 (e.g. HTTP). Annotations:
"TIME = 330 sec => FLOW TERMINATION" and "CI > ALARM THRESHOLD
(e.g. 3,500) --> ALERT".]
`
`
`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 2 of 9
`
`US 7,185,368 B2
`
[FIG. 2: Packet headers. A TCP/IP packet 210 consists of an IP header
220 (version, IHL, type of service, total length, identification,
flags, fragment offset, time to live, protocol, header checksum,
source IP address, destination IP address), a TCP header 230 (source
port, destination port, sequence number, acknowledgment number,
offset, reserved, flags U/A/P/R/S/F, window, checksum, urgent
pointer), and a TCP data segment 235. A UDP packet 240 consists of a
UDP header (UDP source port, UDP destination port, UDP message
length, UDP checksum) and a UDP data segment 255; the UDP pseudo
header 250 contains the source IP address, destination address, a
zero field, the IP protocol type, and the UDP length. Bit positions
0, 4, 8, 16, 19, 24, and 31 are marked along each 32-bit row.]
`
`
`
[FIG. 3: TCP/IP session 300, shown as two event columns.
Events at Host 1: send SYN; receive SYN-ACK, send ACK; ...; receive
ACK, send FIN-ACK; receive ACK; receive FIN-ACK, send ACK.
Events at Host 2: receive SYN, send SYN-ACK; receive ACK, send ACK;
...; receive FIN-ACK, send ACK; send FIN-ACK; receive ACK.]
`
`
[FIG. 4: Client/server flows between client 110 and server 130. Each
host address is drawn with a port axis running from 1 to 65,536, with
service ports 20, 21, 25, and 80 and the 1024 boundary marked. High
ports marked in the figure include 4993, 11059, 35,620, 42894, 49948,
and 51,132. Annotations: flows "SMTP 1" and "SMTP 2" with "TIME
DIFFERENTIAL DETERMINES SEPARATE FLOWS", and "ONE FLOW SERVICE USING
TWO PORTS".]
`
`
`
[FIG. 5: Flow-based engine 155 with database 160, showing the packet
classifier thread 510 (FIG. 9A) and the flow collector thread 520
(FIG. 9C). Legend: program threads are drawn as squares, data
structures as ovals, and data input/output as circles.]
`
`
`
TABLE I

NAME | POTENTIAL INTRUDER | RESPONSE | CI VALUE
POTENTIAL TCP PROBE | TCP PACKETS | RESET PACKETS | NUMBER OF PACKETS
POTENTIAL UDP PROBE | UDP PACKETS | ICMP PORT UNAVAILABLE PACKETS | NUMBER OF ICMP PORT UNAVAILABLE PACKETS
HALF-OPEN ATTACK | HIGH NUMBER AND RATE OF SYNS | SYN-ACKS | 5000+501 PER SYN-ACK
TCP STEALTH PORT SCAN | MULTIPLE PACKETS FROM SAME SOURCE PORT TO DIFFERENT DESTINATION PORTS | RESETS | 8000+1010 PER PORT OVER 4
UDP STEALTH PORT SCAN | MULTIPLE PACKETS FROM SAME SOURCE PORT TO DIFFERENT DESTINATION PORTS | NOTHING OR ICMP PORT UNAVAILABLE | 8000+1010 PER PORT OVER 4

FLOW-BASED CI VALUES
FIG. 6
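
The arithmetic implied by the last three rows of Table I can be sketched as follows. This is only an illustration: the function names are hypothetical, and reading "5000+501 PER SYN-ACK" and "8000+1010 PER PORT OVER 4" as a base value plus a per-unit increment is an assumption about the table, not something the table states explicitly.

```python
# Illustrative sketch only. The "base + increment * count" reading of
# Table I is an assumption; the function names are hypothetical.

def half_open_ci(syn_acks: int) -> int:
    """CI for a flow classified as a half-open attack:
    5000 plus 501 for each SYN-ACK response observed."""
    return 5000 + 501 * syn_acks


def stealth_scan_ci(ports_touched: int) -> int:
    """CI for a flow classified as a TCP or UDP stealth port scan:
    8000 plus 1010 for each destination port beyond the fourth."""
    return 8000 + 1010 * max(0, ports_touched - 4)


if __name__ == "__main__":
    print(half_open_ci(3))       # three SYN-ACKs seen
    print(stealth_scan_ci(10))   # ten destination ports probed
```

Under this reading, a flow already classified as a stealth scan carries the base value even when only four ports were touched; whether the classification itself requires more than four ports is not stated in the table.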
`
`
`
TABLE II

NAME | POTENTIAL INTRUDER | RESPONSE | CI VALUE
BAD FLAGS | TCP PACKET WITH UNDEFINED FLAGS | | 200
SHORT UDP | UDP PACKET WITH LESS THAN 2 DATA BYTES | | 200
ADDRESS SCAN | PACKETS TO MORE THAN 8 HOSTS ON SAME SUBNET | NOTHING OR RESETS | 3000 PER DETECT
PORT SCAN | PACKETS TO MORE THAN 4 PORTS | RESETS | 1010 PER PORT OVER 4

CI EVENT VALUES
FIG. 7
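
As a rough sketch of how the event values in Table II could feed a per-host accumulated concern index, the following example adds each value to a host total and reports when the total exceeds an alarm threshold. The class and function names are hypothetical, the threshold of 3,500 is the example value shown in FIG. 1, and reading "1010 PER PORT OVER 4" as 1010 times the number of ports beyond four is an assumption.

```python
from collections import defaultdict

# Illustrative sketch only; names are hypothetical.
ALARM_THRESHOLD = 3500   # example alarm threshold shown in FIG. 1
BAD_FLAGS_CI = 200       # TCP packet with undefined flags
SHORT_UDP_CI = 200       # UDP packet with fewer than 2 data bytes


def address_scan_ci(detections: int) -> int:
    """3000 per detected address scan."""
    return 3000 * detections


def port_scan_ci(ports_touched: int) -> int:
    """1010 per port beyond the fourth (assumed reading of the table)."""
    return 1010 * max(0, ports_touched - 4)


class HostTable:
    """Keeps an accumulated concern index (CI) for each host."""

    def __init__(self, threshold: int = ALARM_THRESHOLD):
        self.threshold = threshold
        self.ci = defaultdict(int)

    def add(self, host: str, value: int) -> bool:
        """Add value to the host's CI; return True if it now exceeds
        the alarm threshold."""
        self.ci[host] += value
        return self.ci[host] > self.threshold


if __name__ == "__main__":
    table = HostTable()
    intruder = "110.5.47.224"   # intruder address used in the description
    print(table.add(intruder, BAD_FLAGS_CI))
    print(table.add(intruder, address_scan_ci(1)))
    print(table.add(intruder, port_scan_ci(6)))
```

Accumulating per host rather than alarming on each event means a single odd packet contributes only a small value, while repeated suspicious activity from one host pushes its total over the threshold.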
`
`
`
[FIG. 8: Hardware architecture of the monitoring appliance 150.
Memory 805 holds the packet classifier thread 510, the flow collector
thread 520, the alert manager thread 530, flow data 162, host data
166, the operating system, NIC drivers 810, and other drivers 820.
The hardware 800 includes a processor 850, a hard drive 840, a
monitor NIC 834, and an admin NIC 838. The appliance connects through
a network device 135 to a network 899.]
`
`
`
[FIG. 9A: Flow chart of the packet classifier thread 510: start,
create flow record, update flow records 918.
FIG. 9B: Flow chart of the flow collector thread 540: start,
inactivity search 943, logic tree analysis (flow classification) 944,
assign concern index 945, write to log file 946, clear flow 947.
FIG. 9C: Flow chart of the alert manager thread 570: start, CI
search, alarm signal 976, create output files.]
`
`
`
`FLOW-BASED DETECTION OF NETWORK
`INTRUSIONS
`
CROSS REFERENCE TO RELATED APPLICATIONS

This Patent Application claims priority to the U.S. provisional
patent application Ser. No. 60/250,261 entitled "System and Method
for Monitoring Network Traffic" filed Nov. 30, 2000 and U.S.
provisional patent application Ser. No. 60/265,194 entitled "The Use
of Flows to Analyze Network Traffic" filed on Jan. 31, 2001, both of
which are incorporated in their entirety by reference and made a
part hereof.
`
`REFERENCE TO COMPUTER PROGRAM
`LISTING SUBMITTED ON CD
`
This application incorporates by reference the computer program
listing appendix submitted on one (1) CD-ROM entitled "Flow-Based
Engine Computer Program Listing" in accordance with 37 C.F.R.
§1.52(e). Pursuant to 37 C.F.R. §1.77(b)(4), the material on said
CD-ROM is incorporated by reference herein, said material being
identified as follows:

Size in Bytes | Date of Creation | File Name
154,450 | Nov. 30, 2001 | LANcope Code.txt
`
`A portion of the disclosure of this patent document
`including said computer code contains material that is
`subject to copyright protection. The copyright owner has no
`objection to the facsimile reproduction by anyone of the
`patent document or the patent disclosure, as it appears in the
`Patent and Trademark Office patent file or records, but
`otherwise reserves all copyright rights whatsoever.
`
`TECHNICAL FIELD
`
`The invention relates generally to the field of network
`monitoring and, more particularly, to an intrusion detection
`system that inspects all inbound and outbound network
`activity and identifies suspicious patterns that may indicate
`a network or system attack or intrusion.
`
`BACKGROUND ART
`
As the world proceeds into the 21st century, the Internet
continues to grow without bounds. Networks have become
indispensable for conducting all forms of business and
personal communications. Networked systems allow one to
access needed information rapidly, collaborate with partners,
and conduct electronic commerce. The benefits offered
by Internet technologies are too great to ignore. However, as
with all technology advances, a trade-off ensues. While
computer networks revolutionize the way one does business,
the risks introduced can be substantial. Attacks on networks
can lead to lost money, time, reputation, and confidential
information.
One primary danger to avoid is having outside intruders
gain control of a host on a network. Once control is
achieved, private company files can be downloaded, the
controlled host can be used to attack other computers inside
the firewall, or the controlled host can scan or attack
computers anywhere in the world. Many organizations have
pursued protecting their borders by the implementation of
firewalls and intrusion detection systems (IDS).
Firewalls merely limit access between networks. Firewalls
are typically designed to filter network traffic based on
attributes such as source or destination addresses, port
numbers, or transport layer protocols. Firewalls are susceptible
to maliciously crafted traffic designed to bypass the
blocking rules established. Additionally, almost all commercially
available IDS are signature based detection systems or
anomaly based systems.
Signature based detection systems piece together the
packets in a connection to collect a stream of bytes being
transmitted. The stream is then analyzed for certain strings
of characters in the data commonly referred to as "signatures."
These signatures are particular strings that have been
discovered in known exploits. The more signatures that are
stored in a database, the longer it takes to do an exhaustive
search on each data stream. For larger networks with massive
amounts of data transferred, a string comparison
approach is infeasible. Substantial computing resources are
needed to analyze all of the communication traffic.
Besides, even if a known exploit signature has been
discovered, the signature is not useful until it has been
installed and is available to the network. In addition, signature
analysis only protects a system from known attacks. Yet,
new attacks are being implemented all the time. Unfortunately,
a signature based detection system would not detect
these new attacks and would leave the network vulnerable.
Another approach to intrusion detection includes detection
of unusual deviation from normal data traffic, commonly
referred to as "anomalies." Like signature-based detection
systems, many current anomaly based intrusion detection
systems only detect known methods of attacks. Some of
these known anomaly based attacks include TCP/IP stack
fingerprinting, half-open attacks, and port scanning. However,
systems relying on known attacks are easy to circumvent
and leave the system vulnerable. In addition, some
abnormal network traffic happens routinely, often
non-maliciously, in normal network traffic. For example, an
incorrectly entered address could be sent to an unauthorized port
and be interpreted as an abnormality. Consequently, known
anomaly based systems tend to generate an undesirable
number of false alarms, which creates a tendency for all
alarms generated to be ignored.
Some known intrusion detection systems have tried to
detect statistical anomalies. The approach is to measure a
baseline and then trigger an alarm when deviation is
detected. For example, if a system typically has no traffic
from individual workstations at 2 am, activity during this
time frame would be considered suspicious. However, baseline
systems have typically been ineffective because the
small amount of malicious activity is masked by the large
amounts of highly variable normal activity. On the aggregate,
it is extremely difficult to detect the potential attacks.
Other intrusion detection systems compare long term
profiled data streams to short term profiled data streams. One
such system is described in U.S. Pat. No. 6,321,338 to Porras
et al. entitled "Network Surveillance." The system described
in this patent does not necessarily analyze all the network
traffic, but instead focuses on narrow data streams. The system
filters data packets into various data streams and compares
short term profiles to profiles collected over a long period.
However, data traffic is typically too varied to meaningfully
compare short term profiles to long term profiles. For
example, merely because the average FTP stream may be 3
megabytes over the long term does not indicate that a 20
megabyte stream is an anomaly. Consequently, these systems
generate a significant number of false alarms, or the
malicious activity can be masked by not analyzing the
proper data streams.
Consequently, there is a need for a scalable intrusion detection
system that effectively characterizes and tracks network activity
to differentiate abnormal behavior. Due to the impracticality
of analyzing all the data flowing through the network, the
system cannot rely on signature based methods. The detection
system must be able to function even with the data
traffic of larger networks. In addition, the system needs to
quickly and efficiently determine if the network has undergone
an attack without an excessive number of false alarms.
`
DISCLOSURE OF THE INVENTION

The present invention provides a more accurate and
reliable method for detecting network attacks based in large
part on "flows" as opposed to signatures or anomalies. This
novel detection system does not require an updated database
of signatures. Instead, the intrusion detection system
inspects all inbound and outbound activity and identifies
suspicious patterns that denote non-normal flows and may
indicate an attack. The computational simplicity of the
technique allows for operation at much higher speeds than is
possible with a signature-based system on comparable hardware.
According to one aspect of the invention, the detection
system works by assigning data packets to various
client/server (C/S) flows. Statistics are collected for each
determined flow. Then, the flow statistics are analyzed to
determine if the flow appears to be legitimate traffic or possible
suspicious activity. A value, referred to as a "concern index,"
is assigned to each flow that appears suspicious. By assigning
a value to each flow that appears suspicious and adding
that value to an accumulated concern index associated with
the responsible host, it is possible to identify hosts that are
engaged in intruder activity without generation of significant
unwarranted false alarms. When the concern index value of
a host exceeds a preset alarm value, an alert is issued and
appropriate action can be taken.
Generally speaking, the intrusion detection system analyzes
network communication traffic for potential detrimental
activity. The system collects flow data from packet
headers between two hosts or Internet Protocol (IP)
addresses. Collecting flow data from packet headers associated
with a single service where at least one port remains
constant allows for more efficient analysis of the flow data.
The collected flow data is analyzed to assign a concern index
value to the flow based upon a probability that the flow was
not normal for data communications. A host list is maintained
containing an accumulated concern index derived
from the flows associated with the host. Once the accumulated
concern index has exceeded an alarm threshold value,
an alarm signal is generated.

BRIEF DESCRIPTION OF THE DRAWINGS

Benefits and further features of the present invention will
be apparent from a detailed description of the preferred
embodiment thereof taken in conjunction with the following
drawings, wherein like elements are referred to with like
reference numbers, and wherein:
FIG. 1 is a functional block diagram illustrating a flow-based
intrusion detection system constructed in accordance
with a preferred embodiment of the present invention.
FIG. 2 is a diagram illustrating headers of datagrams.
FIG. 3 is a functional block diagram illustrating an
exemplary normal TCP communication.
FIG. 4 is a functional block diagram illustrating C/S
flows.
FIG. 5 is a functional block diagram illustrating a flow-based
intrusion detection engine.
FIG. 6 is a table illustrating concern index values for C/S
flows.
FIG. 7 is a table illustrating concern index values for other
host activities.
FIG. 8 is a functional block diagram illustrating hardware
architecture.
FIG. 9, consisting of FIGS. 9A through 9C, shows flow charts
of the program threads in an exemplary embodiment of the
invention.
`
`BEST MODE
`
The described embodiment discloses a system that provides
an efficient, reliable and scalable method of detecting
network intrusions by analyzing communication flow statistics.
The network intrusions are detected by a flow-based
engine that characterizes and tracks network activities to
differentiate between abnormal activity and normal communications.
Flow-based detection does not rely on analyzing
the data of packets for signatures of known attacks. Analyzing
character strings for known attacks is extremely
resource intensive and does not protect against new
unknown attacks. Instead, the present intruder detection is
accomplished by analyzing communication flows to determine
if the communication has the flow characteristics of
probes or attacks. Those skilled in the art will readily
appreciate that numerous communications in addition to
those explicitly described may indicate intrusion activity. By
analyzing communications for abnormal flow characteristics,
attacks can be determined without the need for
resource intensive packet data analysis.
However, it is useful to discuss the basics of Internet
communications to gain an understanding of the operation of
the flow-based engine. Consequently, an overview
of a flow-based detection system will be discussed first. Following
the overview, various aspects of Internet
communications will be discussed. The functionality of the
flow-based engine of the present invention is then described in
detail in reference to FIG. 5 through FIG. 9.
`
`Overview
Turning to the figures, in which like numerals indicate
like elements throughout the several figures, FIG. 1 provides
an overview of a flow-based intrusion detection system or
engine 155 in accordance with an exemplary embodiment of
the present invention. The flow-based intrusion detection
system 155 monitors network computer communications.
The network computer communications are routed via a
known global computer network commonly known as the
Internet 199. In accordance with an aspect of the invention,
the intrusion detection engine 155 is incorporated into a
monitoring appliance 150, together with a database 160 that
stores information utilized in the intrusion detection
methodology.
The operating environment of the intrusion detection
system 155 is contemplated to have numerous hosts connected
by the Internet 199, e.g. Host #1, Host #2, Host #3
(also referred to as H1-H3 respectively). Hosts are any
computers that have full two-way access to other computers
on the Internet 199 and have their own unique IP address.
For example, Host #1 has an exemplary IP address of
208.60.239.19. The Internet 199 connects clients 110 with a
host server 130 in a known client/server relationship.
In a typical configuration, some computers are referred to
as "servers", while others are referred to as "clients." A
server computer such as Host #2 130 typically provides
responses to requests from client computers and provides
services, data, resources, and the like. A client computer
such as Host #1 110, by contrast, typically requests and utilizes
the services, data, resources, and the like provided by the server.
`It is known in the art to send communications between
`hosts via the Internet 199. The Internet Protocol (IP) is the
`method by which data is sent from one host computer to
`another on the Internet 199. Each host on the Internet 199
`has an IP address that uniquely identifies it from all other
`computers. When data is transmitted, the message gets
`divided into packets 101. Packets 101 are discussed in more
`detail in reference to FIG. 2.
Each IP packet 101 includes a header that contains both
the sender's Internet address and receiver's Internet address.
The packets 101 are forwarded to the computer whose
address is specified. Illustrated is a legitimate user/client
110, host #1 (H1), with an IP address of 208.60.239.19 and
a server, host #2 (H2), with an IP address of 128.0.0.1.
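
As a minimal sketch of reading the addressing fields described above, the following example unpacks the fixed 20-byte portion of an IPv4 header and the first four bytes of the transport header, which carry the source and destination ports for both TCP and UDP. The function names are hypothetical, IP options are skipped using the header length field, and the sample packet is constructed only to exercise the code with the example addresses used in this description.

```python
import socket
import struct

# Illustrative sketch only; function names are hypothetical.


def parse_ipv4_header(packet: bytes) -> dict:
    """Extract addressing fields from the fixed 20-byte IPv4 header."""
    if len(packet) < 20:
        raise ValueError("packet shorter than minimum IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     _ttl, proto, _checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL counts 32-bit words
        "total_length": total_len,
        "protocol": proto,                   # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def parse_ports(packet: bytes, header_len: int) -> tuple:
    """Return (source port, destination port) from a TCP or UDP header."""
    return struct.unpack("!HH", packet[header_len:header_len + 4])


if __name__ == "__main__":
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("208.60.239.19"),
                     socket.inet_aton("128.0.0.1"))
    sample = ip + struct.pack("!HH", 49948, 80) + bytes(16)
    hdr = parse_ipv4_header(sample)
    print(hdr["src"], hdr["dst"], parse_ports(sample, hdr["header_len"]))
```

Only header fields are touched; the payload is never inspected, which matches the header-based approach the description contrasts with signature matching.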
As shown, a client 110 communicates with a server 130
by sending packets 101 of data. A packet 101 is a unit of data
that is routed between an origin and destination. As illustrated,
messages are segmented into numerous packets 101
and routed via the Internet 199 to the receiving host. The
receiving host reassembles the stream of packets 101 to
recreate the original message, which is then handled by
application programs running on the receiving computer
system.
However, some of the hosts may be intruders 120, commonly
referred to as hackers or crackers. Intruders 120
exploit vulnerable computers. As shown, the intruder 120 is
a host with its own IP address of 110.5.47.224. The intruder
120 also communicates by sending packets 101 via the
Internet 199. As previously stated, the packets 101 contain
the IP address of the originator and destination to ensure
proper routing. As shown, the stream of packets 101 sent by
the intruder 120 can be interleaved with the packets 101 sent
by other hosts. The packets 101 contain header information
that enables the receiving host to reassemble the interleaved
stream of packets into the original messages as sent.
Normal client/server (C/S) communication activity
includes sending e-mails, Web traffic, file transfers, and the
like. Communications via the Internet 199 need to be sent to
a specific IP address and to a specific service contact port. A
"port" is known to those skilled in the art as an arbitrarily
assigned number to which a particular type of computing
service is assigned in conventional Internet computer-to-computer
communications, e.g. web traffic is conventionally
on port 80, FTP traffic on ports 20 and 21, etc. The IP address
specifies a specific host while the service contact port
number identifies a particular server program or service that
the host computer may provide. Present day port numbers
range from 0 to 65,535. As shown in FIG. 1, a number of
frequently-used services or processes have conventionally
assigned service contact port numbers and are referred to as
well-known port numbers maintained by the Internet
Assigned Numbers Authority (IANA). These assigned port
numbers are well known in the art and are typically the low
numbered ports between 0 and 1023. Currently, certain
higher numbered ports have also been assigned.
A service port chart in FIG. 1 lists some common services
that present day Internet-based computer systems may provide.
Outgoing email typically utilizes the known Simple
Mail Transfer Protocol (SMTP), which is implemented over
the service contact port 25. For the Hypertext Transfer
Protocol (HTTP) communications, Web browsers open an
ephemeral high port number to initiate Web traffic that is
sent to the host server port 80. File Transfer Protocol (FTP)
control communications are sent to the server port 21, while
FTP data transfer originates from port 20. The FINGER
service utilizes service contact port 79, the domain name
service (DNS) utilizes service contact port 53, and Telnet
communications utilize service contact port 23. As illustrated,
common services are typically associated with specific
predetermined service contact ports.
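
The service contact ports named above can be captured in a small lookup, sketched below. The table contains only the ports listed in this description. The helper `guess_server_port` is a hypothetical heuristic added for illustration (treat the well-known port of a pair as the server side); it is an assumption and not a rule stated in the text.

```python
# Illustrative sketch only. The port table repeats the services named in
# the description; guess_server_port is a hypothetical heuristic.

SERVICE_PORTS = {
    20: "FTP data",
    21: "FTP control",
    23: "Telnet",
    25: "SMTP",
    53: "DNS",
    79: "FINGER",
    80: "HTTP",
}

WELL_KNOWN_MAX = 1023   # well-known ports are typically 0 through 1023
PORT_MAX = 65535        # port numbers range from 0 to 65,535


def is_well_known(port: int) -> bool:
    """True if the port lies in the well-known range 0-1023."""
    if not 0 <= port <= PORT_MAX:
        raise ValueError("port out of range")
    return port <= WELL_KNOWN_MAX


def service_name(port: int) -> str:
    """Name of the service conventionally found on the port, if listed."""
    return SERVICE_PORTS.get(port, "unknown")


def guess_server_port(port_a: int, port_b: int):
    """Assumed heuristic: if exactly one port of the pair is well known,
    treat it as the service contact port; otherwise return None."""
    a, b = is_well_known(port_a), is_well_known(port_b)
    if a and not b:
        return port_a
    if b and not a:
        return port_b
    return None


if __name__ == "__main__":
    print(service_name(25), guess_server_port(49948, 80))
```

A browser connecting from an ephemeral high port to port 80 would be resolved to the HTTP service under this heuristic, while a pair of two high ports is left undecided.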
Also illustrated in FIG. 1 are four flows, F1 through F4,
between client host #1 110 and server host #2 130. Flow
F1 is a file transfer utilizing the File Transfer Protocol (FTP).
As shown, the file transfer (flow F1) is delivered by a stream
of packets 101 (P1-P3) that will be reassembled by the
receiving host 110.
After the file transfer is completed, the client 110 initiates
an HTTP Web session (flow F2) with server 130. Those
skilled in the art understand that a Web session typically
occurs when an Internet browser computer program such as
MICROSOFT INTERNET EXPLORER or NETSCAPE
NAVIGATOR requests a web page from a World Wide Web
(WWW) service on port 80. Packets P4, P5, P6, and P9 are
associated with the Web traffic of flow F2. These packets
may contain data such as a JPG format picture to be
displayed, text, a JAVA program, or other informational
materials to be displayed or handled by the client's Internet
browser program.
Continuing the example of FIG. 1, while the web session
of flow F2 is still open, the client 110 sends an email
illustrated by flow F3. As shown, the email packets of flow
F3 may be interleaved with the previously opened Web
session of flow F2. As illustrated, packets P7, P8, and P12
contain the e-mail message.
Finally, the client 110 requests another web page from the
server 130, initiating yet another HTTP flow F4. Packets P9,
P10, P11, P12, and P14 represent the new Web traffic.
In accordance with an aspect of the invention, a flow is
considered terminated after a predetermined period of time
has elapsed on a particular connection or port. For example,
if HTTP Web traffic on port 80 ceases for a predetermined
period of time, but other traffic begins to occur on port 80
after the expiration of that predetermined time period, it is
considered that a new flow has begun, and the system
responds accordingly to assign a new flow number and track
the statistics and characteristics thereof. In the disclosed
embodiment, the predetermined time period is 330 seconds,
but those skilled in the art will understand that this time is
arbitrary and may be heuristically adjusted.
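
The timeout rule just described can be sketched as a small tracker that assigns each packet to a flow and starts a new flow when the gap since the last packet exceeds the predetermined period. The class name, the choice of (client address, server address, server port) as the flow key, and the per-flow record layout are assumptions made for illustration; only the 330 second value comes from the text.

```python
# Illustrative sketch only. The flow key and record layout are assumed.

FLOW_TIMEOUT = 330.0   # seconds, as in the disclosed embodiment


class FlowTracker:
    """Assigns packets to flows, delimiting flows by inactivity."""

    def __init__(self, timeout: float = FLOW_TIMEOUT):
        self.timeout = timeout
        self._active = {}    # flow key -> current flow record
        self._next_id = 1

    def observe(self, client_ip: str, server_ip: str,
                server_port: int, timestamp: float) -> int:
        """Record one packet and return the flow number it belongs to."""
        key = (client_ip, server_ip, server_port)
        rec = self._active.get(key)
        # No flow yet, or the previous one has been idle too long:
        # begin a new flow with a new flow number.
        if rec is None or timestamp - rec["last_seen"] > self.timeout:
            rec = {"flow_id": self._next_id, "packets": 0,
                   "last_seen": timestamp}
            self._next_id += 1
            self._active[key] = rec
        rec["packets"] += 1
        rec["last_seen"] = timestamp
        return rec["flow_id"]


if __name__ == "__main__":
    tracker = FlowTracker()
    client, server = "208.60.239.19", "128.0.0.1"
    print(tracker.observe(client, server, 80, 0.0))
    print(tracker.observe(client, server, 80, 100.0))
    print(tracker.observe(client, server, 80, 500.0))   # idle for 400 s
    print(tracker.observe(client, server, 25, 510.0))   # different service
```

Keeping the timeout as a constructor parameter reflects the statement that the period is arbitrary and may be adjusted heuristically.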
`Although the preferred embodiment utilizes the elapse of
`a predetermined period of time to delimit flows, those skilled
`in the art will understand and appreciate that other events,
`indica