United States Patent [19]
Arnold et al.

US005440723A
[11] Patent Number: 5,440,723
[45] Date of Patent: Aug. 8, 1995
[54] AUTOMATIC IMMUNE SYSTEM FOR COMPUTERS AND COMPUTER NETWORKS

[75] Inventors: William C. Arnold, Mahopac; David M. Chess, Mohegan Lake; Jeffrey O. Kephart, Yorktown Heights; Steven R. White, New York, all of N.Y.

[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 4,872

[22] Filed: Jan. 19, 1993

[51] Int. Cl.6 .......................... G06F 11/00
[52] U.S. Cl. ............................ 395/181; 395/700; 395/183.09; 395/183.14
[58] Field of Search ................... 395/575; 371/16.5, 19, 371/11.2, 8.2
[56] References Cited

U.S. PATENT DOCUMENTS

5,062,045  10/1991  Janis et al. ................ 371/16.5
5,084,816   1/1992  Boese et al. ............... 395/575
5,121,345   1/1992  Lentz ...................... 364/550
5,200,958   4/1993  Hamilton et al. ............ 371/16.5
5,218,605   1/1993  Low et al. ................. 371/19
                                                 371/16.5
5,255,208  10/1993  Thakore et al. ............. 380/4
5,278,901   1/1994  Shieh et al. ............... 371/16.5
5,291,590   3/1994  Ohnishi et al. ............. 371/19
5,297,150   3/1994  Clark ...................... 395/575
5,319,776   6/1994  Hile et al. ................ 380/4
5,359,659  10/1994  Rosenthal
5,361,359  11/1994  Tajalli et al. ............. 395/700
OTHER PUBLICATIONS

Qasem et al., "AI Trends in Virus Control", 1991 IEEE Proc. of Southeastcon, pp. 99-103, vol. 1.
Crocker et al., "A Proposal for a Verification-Based Virus Filter", 1989 IEEE Symposium on Security & Privacy, pp. 319-324.
Kephart et al., "Directed-Graph Epidemiological Models of Computer Viruses", 1991 IEEE Computer Society Symposium on Research in Security & Privacy, pp. 343-359.
Kumar et al., "A Generic Virus Scanner in C++", 1992 8th Ann. Computer Security Applications Proceedings, pp. 210-219.
Shoutkov et al., "Computer Viruses: Ways of Reproduction in MS DOS", 25th Ann. 1991 IEEE International Carnahan Conf. on Security Tech., pp. 168-176.
S. W. Shieh et al., "A Pattern-Oriented Intrusion-Detection Model and its Applications", Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 327-342.
H. S. Javitz et al., "The SRI IDES Statistical Anomaly Detector", Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 316-326.
W. Arnold et al., "System for Detecting Undesired Alteration of Software", IBM TDB, vol. 32, No. 11, Apr. 1990, pp. 48-50.
S. M. Katz, "Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer", IEEE Trans. ASSP-35, No. 3, Mar. 1987, pp. 400-401.
F. Cohen, A Short Course on Computer Viruses, ASP Press, Pittsburgh, 1990, pp. 9-15.
Primary Examiner—Robert W. Beausoliel, Jr.
Assistant Examiner—Joseph E. Palys
Attorney, Agent, or Firm—Perman & Green
[57] ABSTRACT

A method includes the following component steps, or some functional subset of these steps: (A) periodic monitoring of a data processing system (10) for anomalous behavior that may indicate the presence of an undesirable software entity such as a computer virus, worm, or Trojan Horse; (B) automatic scanning for occurrences of known types of undesirable software entities and taking remedial action if they are discovered; (C) deploying decoy programs to capture samples of unknown types of computer viruses; (D) identifying machine code portions of the captured samples which are unlikely to vary from one instance of the virus to another; (E) extracting an identifying signature from the executable code portion and adding the signature to a signature database; (F) informing neighboring data processing systems on a network of an occurrence of the undesirable software entity; and (G) generating a distress signal, if appropriate, so as to call upon an expert to resolve difficult cases. A feature of this invention is the automatic execution of the foregoing steps in response to a detection of an undesired software entity, such as a virus or a worm, within a data processing system. The automatic extraction of the identifying signature, the addition of the signature to a signature database, and the immediate use of the signature by a scanner provides protection from subsequent infections of the system, and also a network of systems, by the same or an altered form of the undesirable software entity.

46 Claims, 7 Drawing Sheets
[Sheet 1 of 7: FIG. 1A is a block diagram of the data processing system 10, showing the CPU, RAM, system bus, terminal control with display device and manual input, hard disk system control and hard disk, floppy disk system control and floppy disk drive, and a network adapter coupled to the network. FIG. 5 is a flow chart: build n-gram list (A); calculate n-gram probabilities (B); exact-match and fuzzy-match calculations (C); combine scores (D); report results (E).]
[Sheet 2 of 7: FIG. 1C shows an exemplary computer network in which an infected processor signals the other processors.]
[Sheet 3 of 7: FIG. 2 is a flow chart of the method: monitor system for anomalous behavior (A); scan for known viruses (B); alert user, remove virus (B1); deploy decoy programs to capture unknown virus (C); identify invariant code portions of virus sample(s) (D); extract signature from code, add to scanner's database (E); inform other network processors of virus infection (F); generate distress signal, if required (G).]
[Sheet 4 of 7: FIG. 3 is a more detailed flow chart depicting the operation of the method.]
[Sheet 5 of 7: FIG. 4 shows two filtering blocks: filter out portions of the virus that are observed to vary from one instance to another (A), and filter out portions of the virus that are intrinsically the most likely to vary from one instance to another (B). FIG. 6A shows corresponding byte strings from a 1st, 2nd, and 3rd instance of a virus, each consisting of variant portions surrounding a common "invariant" portion. FIG. 6B shows an exemplary 24-byte "invariant" portion of a computer virus (bytes 1 through 24).]
[Sheet 6 of 7: FIG. 7 is a flow chart: segregate corpus into probe, training and test sets; select byte-strings from probe set; use training set to estimate probabilities; count frequency in test set; produce list of estimated probabilities vs. frequency; determine false-positive and false-rejection probability; select threshold.]
[Sheet 7 of 7: FIG. 8 is a block diagram of a system operable for executing the method.]
AUTOMATIC IMMUNE SYSTEM FOR COMPUTERS AND COMPUTER NETWORKS
CROSS-REFERENCE TO A RELATED PATENT APPLICATION

This patent application is related to commonly assigned U.S. patent application Ser. No. 08/004,871, filed Jan. 19, 1993, entitled "Methods and Apparatus for Evaluating and Extracting Signatures of Computer Viruses and Other Undesirable Software Entities", by Jeffrey O. Kephart.

FIELD OF THE INVENTION

This invention relates generally to digital data processors and, in particular, to methods and apparatus for providing computational integrity for digital data processors and networks thereof.
BACKGROUND OF THE INVENTION

A computer virus has been defined by Frederick B. Cohen as a program that can infect other programs by modifying them to include a, possibly evolved, version of itself (A Short Course on Computer Viruses, ASP Press, Pittsburgh, 1990, page 11).

As employed herein, a computer virus is considered to include an executable assemblage of computer instructions or code that is capable of attaching itself to a computer program. The subsequent execution of the viral code may have detrimental effects upon the operation of the computer that hosts the virus. Some viruses have an ability to modify their constituent code, thereby complicating the task of identifying and removing the virus.

A worm is a program or collection of programs that can cause a possibly evolved version of itself to be executed on the same or possibly on a different computer.

A Trojan Horse is a block of undesired code that is intentionally hidden within a block of desirable code.

Both computer viruses, worms, and Trojan Horses are considered to be members of a class of undesirable software entities, the presence of which within a data processor, or network of data processors, is to be avoided so as to maintain computational integrity.

A widely-used method for the detection of computer viruses is known as a virus scanner. A virus scanner employs short strings of bytes to identify particular viruses in executable files, boot records, or memory. The byte strings (referred to as signatures) for a particular virus must be chosen with care such that they always discover the virus, if it is present, but seldom give a "false alarm", known as a false positive. That is, the signature must be chosen so that the byte string is one that is unlikely to be found in programs that are normally executed on the computer. Typically, a human expert makes this choice by converting the binary machine code of the virus to an assembler version, analyzing the assembler code, selecting sections of code that appear to be unusual or virus-like, and identifying the corresponding bytes in the binary machine code so as to produce the signature. Wildcard bytes can be included within the signature to provide a match with any code byte appearing in a virus.

Currently, a number of commercial computer virus scanners are successful in alerting users to the presence of viruses that are known a priori. However, conventional virus scanners are typically unable to detect the presence of computer viruses which they have not been programmed to detect explicitly. The problem of dealing with new viruses has typically been addressed by distributing updates of scanning programs and/or auxiliary files containing the necessary information for identifying the latest viruses. However, the increasing rate at which new viruses are being written is widening the gap between the number of viruses that exist and the number of viruses that can be detected by an appreciable fraction of computer users. As a result, it is becoming increasingly likely that a new virus will become widespread before any remedy is generally available.

At least one mechanism to inform other computers connected to a network of a presence of a viral infection has been previously advanced. For example, a "net hormones" concept has been proposed by David Stodolsky, wherein every computer on a network keeps archives of all potential virus-carrying contacts with every other computer. When a computer finds that it is infected with a virus, it sends a message to every computer that it has ever contacted or been contacted by, which in turn send messages to all of their contactees or contactors, etc. However, it is believed that this approach will result in most or all computers quickly becoming overloaded by infection messages from most of the other computers on the network.

As such, a need has arisen to develop methods for automatically recognizing and eradicating previously unknown or unanalyzed viruses on individual computers and computer networks. An efficient method is also required for informing other computers on the network as to the existence of a computer virus within the network.

It is an object of this invention to provide methods and apparatus to automatically detect and extract a signature from an undesirable software entity, such as a computer virus or worm.

It is a further object of this invention to provide methods and apparatus for immunizing a computer system, and also a network of computer systems, against a subsequent infection by a previously unknown and undesirable software entity.
SUMMARY OF THE INVENTION

The foregoing and other problems are overcome and the objects of the invention are realized by a method that provides computational integrity for a digital data processor and a network of digital data processors. The method includes the following component steps, or some functional subset of these steps:

(A) periodic monitoring of a data processing system for anomalous behavior that may indicate the presence of an undesirable informational state, such as one arising from the presence of an undesirable software entity, such as a computer virus or worm;

(B) automatic scanning for occurrences of known types of undesirable software entities;

(C) deploying decoy programs to capture samples of unknown types of computer viruses;

(D) automatically identifying portions of a captured virus sample which are likely to remain invariant from one instance of the virus to another;

(E) extracting an identifying signature from the invariant portion and adding the signature to a signature database;

(F) informing neighboring data processing systems on a network of an occurrence of the undesirable software entity; and
(G) generating a distress signal, if appropriate, so as to call upon an expert to resolve difficult cases.

A feature of this invention is the automatic execution of the foregoing steps in response to a detection of an undesired software entity, such as a virus, worm, or Trojan Horse, within a data processing system. Furthermore, the automatic extraction of the identifying signature, and the addition of the signature to a signature database, provides protection, or "immunity", to subsequent infections of the system, and also a network of systems, by the same or an altered form of the undesirable software entity.
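To make the sequencing of steps (A) through (G) concrete, the following Python sketch traces the order in which they would be exercised. Every name in it is an illustrative stand-in rather than anything defined by this patent, and the two boolean arguments merely represent the outcomes of the monitoring and scanning steps.

```python
def run_immune_cycle(anomaly_seen: bool, known_virus_found: bool):
    """Return the ordered list of component steps the method would execute;
    the two booleans stand in for the outcomes of steps (A) and (B)."""
    steps = ["A: monitor the system for anomalous behavior"]
    if not anomaly_seen:
        return steps
    steps.append("B: scan for known types of undesirable software entities")
    if known_virus_found:
        steps.append("B1: alert the user and remove the known virus")
        return steps
    steps += [
        "C: deploy decoy programs to capture a sample of the unknown virus",
        "D: identify the portions of the sample likely to remain invariant",
        "E: extract a signature and add it to the signature database",
        "F: inform neighboring systems on the network of the infection",
        "G: generate a distress signal, if appropriate",
    ]
    return steps


print(run_immune_cycle(anomaly_seen=True, known_virus_found=False))
```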
BRIEF DESCRIPTION OF THE DRAWING

The above set forth and other features of the invention are made more apparent in the ensuing Detailed Description of the Invention when read in conjunction with the attached Drawing, wherein:

FIG. 1a is a block diagram of a computer system that is suitable for use in practicing the invention;
FIG. 1b illustrates an exemplary computer network that includes the computer system of FIG. 1a;
FIG. 1c illustrates the exemplary computer network of FIG. 1b and the use of a kill signal that is transmitted by infected computers to neighboring computers;
FIG. 2 is a flow chart depicting the constituent steps of a method of the invention;
FIG. 3 is a more detailed flow chart depicting the operation of the method of the invention;
FIG. 4 is a flow chart depicting a method for use in segregating variant virus portions from invariant virus portions;
FIG. 5 is a flow chart that shows the operation of a statistically based computer virus signature extraction method;
FIG. 6a illustrates exemplary portions of three instances of a computer virus, the three portions each including an invariant portion and variant portions;
FIG. 6b illustrates an exemplary 24-byte "invariant" portion of a computer virus;
FIG. 7 is a flow chart that depicts a method of selecting a probability threshold used for accepting or rejecting a candidate signature; and
FIG. 8 is a block diagram of a system operable for executing the method of the invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1a is a block diagram of a data processing system 10 that is suitable for practicing the teaching of the invention. A bus 12 is comprised of a plurality of signal lines for conveying addresses, data, and controls between a Central Processing Unit 14 and a number of other system bus units. A RAM 16 is coupled to the system bus 12 and provides program instruction storage and working memory for the CPU 14. A terminal control subsystem 18 is coupled to the system bus 12 and provides outputs to a display device 20, typically a CRT monitor, and receives inputs from a manual input device 22, typically a keyboard. Manual input may also be provided from a pointing device, such as a mouse. A hard disk control subsystem 24 bidirectionally couples a rotating fixed disk, or hard disk 26, to the system bus 12. The control 24 and hard disk 26 provide mass storage for CPU instructions and data. A floppy disk control subsystem 28 bidirectionally couples one or more floppy disk drives 30 to the system bus 12.
The floppy disk drive 30 works in conjunction with a removable floppy diskette 30a. A network adapter (NA) 31 is provided for coupling the system 10 to a network.

The components illustrated in FIG. 1a may be embodied within a personal computer, a portable computer, a workstation, a minicomputer, a main frame computer, or a supercomputer. As such, the details of the physical embodiment of the data processing system 10, such as the structure of the bus 12 or the number of CPUs 14 that are coupled to the bus, is not crucial to the operation of the invention, and is not described in further detail below.

As employed herein, the informational state of the data processing system 10, at a given moment in time, is considered to be defined by the contents (bytes) of all memory or storage devices to which the system has access. These include, but are not limited to, RAM, registers, hard disks, floppy disks that are inserted into floppy disk drives, and magnetic tape that is inserted into a tape drive.

The informational state history of the system 10 is considered to be the history of its informational state at all moments in time.

For the case where the data processing system 10 (P1 of FIG. 1b) is connected to other data processing systems (P2-P7) via a network, such as the exemplary network depicted in FIG. 1b, an additional data processor (PD) may be added as a dedicated decoy program server, as described below.

The method of this invention includes the following component steps, as illustrated in FIG. 2, or some functional subset of these steps.

Step A: Periodic monitoring of the data processing system 10 for anomalous, potentially virus-like behavior.
Step B: Automatic scanning for occurrences of known viruses, and removal of any which are found to be present.
Step C: Deployment of decoy programs to capture virus samples.
Step D: Segregation of a captured virus sample into portions that are likely to vary from one instance of the computer virus to another instance, and into portions that are likely to be invariant from one instance to another.
Step E: Extraction of a viral signature from the invariant portion(s) and the addition of the signature to a signature database.
Step F: Informing neighboring data processing systems on a network of an occurrence of a viral infection.
Step G: Generation of a distress signal, if appropriate, so as to call upon human experts to resolve difficult cases.

A feature of this invention is the automatic execution of the foregoing steps in response to a detection of an undesired software entity, such as a virus or a worm, within the data processing system 10.

A discussion is now made of each of the above referenced Steps A-G of FIG. 2.

Step A: Anomaly Detection

The first step in the process of this invention detects anomalous system behavior of a type that may indicate the presence of an undesirable informational state resulting from the presence of a virus or some other undesirable software entity, such as a worm or a Trojan Horse. The detection of anomalous behavior within a computer or computer network can be accomplished through the use of known techniques, preferably a technique that detects anomalies that may indicate a virus.
One suitable anomaly detection technique, that uses patterns of activity undertaken by computational processes, is described in an article entitled "A Pattern-Oriented Intrusion-Detection Model and its Applications", S. W. Shieh and V. D. Gligor, Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy. Another suitable technique detects unexpected changes in executable or resource files. As an example, F. Cohen, A Short Course on Computer Viruses, ASP Press, Pittsburgh, 1990, discusses the use of integrity shells employing modification detectors. In this case, modification of files is detected by cryptographic or non-cryptographic checksums, and the user is notified if a modification is detected.

The method at Step A may employ any one or a combination of these or similar techniques. The end result is the detection by the system 10 of an anomalous system behavior that may be indicative of the presence of a computer virus, or some other undesirable software entity.

The anomaly detection process or processes may be run continually as a background task, or periodically as a foreground task, depending on the desire of the user and/or the nature of the underlying operating system.
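As a concrete illustration of the checksum-based form of modification detection mentioned above, the short Python sketch below records a cryptographic checksum for a set of files and later reports any file whose contents no longer match. SHA-256, the function names, and the file-handling details are illustrative assumptions, not part of the patent.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Record a SHA-256 checksum for each file whose integrity is monitored."""
    return {p: hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}


def modified_files(baseline):
    """Return the files whose current checksum no longer matches the baseline."""
    changed = []
    for path, old_digest in baseline.items():
        if hashlib.sha256(Path(path).read_bytes()).hexdigest() != old_digest:
            changed.append(path)
    return changed


# Example use: baseline = snapshot(["A.COM", "B.EXE"]); later on,
# modified_files(baseline) names any executables that have been altered,
# which the monitor would treat as anomalous behavior worth investigating.
```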
Step B: Scan for Known Viruses

If preliminary evidence of virus-like activity is detected, additional computational resources are devoted to obtaining more conclusive evidence of viral infection. If a virus is present, there is a high probability that the virus is a well-known virus that can be readily detected by a virus scanner. Thus, Step B scans an informational state history of the system, that is, all relevant media (e.g. files, boot records, memory), for known viruses. In one implementation, a pre-existing virus scanner is activated to search for a large set of patterns or signatures, each of which pertains to a different virus or virus family. It is highly desirable (but not essential) that the virus scanner be capable of detecting slight variations on known viruses. Many conventional scanners tend to possess this capability, to a limited extent, simply by virtue of the fact that they search based only on short contiguous strings of bytes found in the virus.

One especially suitable virus scanner, known as VIRSCAN, was available from the International Business Machines Corporation until recently, when it was incorporated into the IBM AntiVirus program. This virus scanner is particularly useful for detecting variations in known viruses because it permits a certain number of mismatches between a string of bytes in a file being examined and the virus signature string. This capability reduces the chance of having to proceed to Step C (decoy deployment), and is even more useful in the event that the virus is completely unrelated to any known virus, as is described below with respect to Step E. If the anomaly is found to be due to a known virus or some slight alteration of it, the method proceeds to Step B1 where the user is alerted, and the virus removed (killed) by traditional methods, such as restoration from backup (either automatically or manually by the user) or disinfection (removal of the virus from all of the software it has infected). In general, disinfection is only acceptable if the virus is found to be an exact copy of a known virus. This implies that the system verify the identification made by the virus scanner.
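The mismatch-tolerant matching described above can be illustrated with a small Python sketch: a signature is reported at any offset where the scanned bytes differ from the signature in at most a fixed number of positions. This is a naive sliding-window illustration of the general idea only, not the VIRSCAN algorithm, and the byte values in the example are invented.

```python
def fuzzy_scan(data: bytes, signature: bytes, max_mismatches: int = 0):
    """Return every offset at which `signature` matches `data` with no more
    than `max_mismatches` differing bytes (naive sliding-window scan)."""
    hits = []
    m = len(signature)
    for offset in range(len(data) - m + 1):
        window = data[offset:offset + m]
        if sum(1 for a, b in zip(window, signature) if a != b) <= max_mismatches:
            hits.append(offset)
    return hits


# A signature taken from one variant can still flag a slightly altered
# variant when a small number of mismatching bytes is tolerated:
scanned = bytes.fromhex("90e80000005e81c6")
print(fuzzy_scan(scanned, bytes.fromhex("90e80100005e81c6"), max_mismatches=1))  # [0]
```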
Step C: Decoy Deployment

If no known virus is detected, the anomaly may be due to some unknown virus. In this case, the method employs one or more decoy programs in an attempt to attract and obtain one or more samples of the unknown virus. The decoy programs could be commercially available computer programs which are installed on (and are capable of being executed on) the computer system. Preferably, however, they would be special-purpose programs which exist solely for the purpose of capturing viruses.

For the case of a DOS-based system, a variety of decoy programs having *.COM and *.EXE extensions (also referred to as executables) are created, executed, read, written to, copied, etc. The use of decoy programs for this purpose is described by W. C. Arnold and D. M. Chess in "System for Detecting Undesired Alteration of Software", IBM Technical Disclosure Bulletin, Vol. 32, No. 11, pages 48-50, April 1990.

In operation, original copies of the decoy programs are stored securely, by example in encrypted form or on a separate server, along with checksums of the originals. After a specified time interval, or after other specified conditions are met, checksums of some or all of the decoy programs are taken and compared directly to their originals. Alternatively, the decoy programs are compared directly to their originals. If none of the decoys are found to have changed, the user is alerted to the fact that an anomaly of some sort was detected, but that the system was unable to find conclusive evidence of a virus. In this case, it is desirable to maintain the system in a moderate state of alert for some time period, exercising the decoy programs at a reduced priority or less frequent intervals.

It is desirable to have a large variety of decoy programs that differ from one another in length, content, and name. Furthermore, the decoy programs should be placed in and run from various directories, particularly those directories in which there are modified executables.

Some DOS viruses are known to become resident in memory and are active until the computer is rebooted. In such a case, the virus is able to infect any executable at any time. Often, a resident virus will infect executables as they are brought into memory to be executed. Thus, if several decoy programs are run, there is a reasonable chance that one or more decoy programs will be infected by such a virus. In addition to running the decoys, the system should perform a variety of other operations with the decoy programs, including copying them, opening them, creating them, etc. All of these operations increase the probability of infection by a virus that is resident in memory.

However, some DOS viruses do not go resident, and can infect other programs only when the host program in which the virus is imbedded is executed. Such viruses typically use static criteria for selecting a program to infect, such as the program name or the directory in which the program is located. In this case, it is more difficult for the decoy programs to become infected. To increase the chance of infection, decoy programs are preferably placed in locations where they are more likely to be selected, such as the directory of the possibly-infected executable, or directories containing system programs (e.g. the root directory) and commonly used utility programs.
In order to infect the decoy programs, the potentially-infected executable should be run repeatedly.

If one or more decoy programs is subsequently found to have changed from the original, protected version, it can be assumed that the changes are due to a virus. A comparison of each modified decoy program with its corresponding uninfected version enables a copy of the virus to be isolated from each decoy.
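A minimal Python sketch of the decoy bookkeeping described in this step is shown below: clean decoy executables are planted in the directories of interest, their original bytes are remembered, and any decoy whose on-disk contents later differ from the protected original is reported so that the change can be examined as a captured sample. The file names and helper functions are illustrative assumptions rather than the patent's implementation.

```python
import shutil
from pathlib import Path


def plant_decoys(template, directories, names=("DECOY1.COM", "DECOY2.EXE")):
    """Copy a known-clean decoy executable into each target directory and
    remember the original bytes of every planted copy."""
    originals = {}
    for directory in directories:
        for name in names:
            decoy = Path(directory) / name
            shutil.copyfile(template, decoy)
            originals[decoy] = decoy.read_bytes()
    return originals


def changed_decoys(originals):
    """Return the decoys whose on-disk bytes differ from the protected
    originals; each one is a candidate carrier of a captured virus sample."""
    return {decoy: decoy.read_bytes()
            for decoy, original in originals.items()
            if decoy.read_bytes() != original}
```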
Step D: Identification of Invariant Code Portions

In order to maximize the likelihood that a chosen signature will appear in all or nearly all progeny of the virus, it is preferred to filter out all portions of the virus which are observed to or are likely to vary from one instance of the virus to another. To achieve this, the method employs one, or preferably both, of the techniques represented as Blocks A and B of FIG. 4.
The "data" portions of programs, which can include representations of numerical constants, character strings, screen images, work areas for computations, addresses, etc., tend to be more likely to vary from one instance of the program to another than "code" portions, which represent machine instructions. It may be that most or all of the "data" bytes in a given virus are the same in all instances of that virus, but without a careful analysis this is difficult to establish with certainty. Thus, a conservative approach assumes that "data" bytes may change, and therefore it is desirable to disqualify them from being chosen as portions of candidate signatures. Perfect classification of each byte as being "code" or "data" is not necessary. An erroneous classification of "code" bytes as "data" bytes is more tolerable than an erroneous classification of "data" bytes as "code" bytes.

By example, FIG. 6b illustrates a 24-byte "invariant" portion of a virus sample that was generated at Step C. The first 4 bytes represent machine instructions, bytes 5 and 6 represent data, bytes 7-19 represent code, and bytes 20-24 represent data. Thus, if Block B of FIG. 4 were employed, only bytes 1-4 and 7-19 would be passed along to Step E for signature extraction.

Suitable methods of code-data segregation include, but are not limited to, the following techniques.

1. Conventional static disassembly techniques, such as flow analysis.
2. Simulation of marked viral code in a virtual machine, wherein the virtual machine includes a hardware and software model of the computer or class of computers for which the virus signature is being obtained.
3. Running the marked viral code through an interpreter or debugger to determine which bytes are actually executed. This method requires a "safe" environment in which to run the virus, such as the dedicated server PD referred to above in the discussion of FIG. 1b.
4. Statistical classification techniques.

The second method, i.e., simulation of the marked viral code in a virtual machine, is the safest and most reliable, but is often the most computationally intensive. The second and third methods each have the advantage that they are capable of determining whether a virus is self-modifying. Viruses that modify themselves often do so for the purpose of obscuring their machine code portions by making the code appear different from one instance of the virus to the next. Such viruses are often referred to as self-garblers. Self-garblers typically consist of a short decrypting "head" portion followed by code which is transformed in an easily-invertible manner (e.g. XOR'ed) using a randomly-chosen decryption key when the virus is replicated. Thus, the only portion of the code which remains invariant, from one instance of the virus to another, is the head. It is helpful to know that a virus is self-modifying (and thus potentially self-garbling), so that signatures will only be drawn from the decrypting head. This information can also be helpful in identifying cases in which what appeared to be several different viruses in Step C are actually only different instances of a single, self-garbling virus.
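The point about self-garblers is easy to demonstrate: if the body of a sample is XOR'ed with a freshly chosen random key each time it replicates, only the short decrypting head remains byte-for-byte identical between instances. The byte values in the Python sketch below are invented purely for illustration.

```python
import os

HEAD = bytes.fromhex("b90c00be1001")              # stand-in decryptor "head"
BODY = bytes.fromhex("0102030405060708090a0b0c")  # stand-in viral body


def replicate():
    """Produce a new 'instance': the same head, body XOR'ed with a random key."""
    key = os.urandom(1)[0]
    return HEAD + bytes(b ^ key for b in BODY)


first, second = replicate(), replicate()
# The heads agree; the encrypted bodies almost certainly do not, so only the
# head region is usable for signature extraction.
print(first[:len(HEAD)] == second[:len(HEAD)], first[len(HEAD):] == second[len(HEAD):])
```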
At the conclusion of Step D, there exists one or more sequences of bytes which appear in every instance of the one or more viruses that were obtained or generated at Step C and which, according to any heuristics employed in Block B, are deemed likely to appear in most or all possible instances of the virus. The set of candidate viral signatures are all possible contiguous blocks of S bytes found in these "probably-invariant" or "sub-