`Arnold et al.
`
`54 AUTOMATIC IMMUNE SYSTEM FOR
`COMPUTERS AND COMPUTER
`NETWORKS
`75 inventors: William C. Arnold, Mahopac; David
`M. Chess, Mohegan Lake; Jeffrey O.
`Kephart, Yorktown Heights; Steven
`R. White, New York, all of N.Y.
`73) Assignee: International Business Machines
`Corporation, Armonk, N.Y.
`21 Appl. No.: 4,872
`22 Filed:
`Jan. 19, 1993
`511 Int. Cl. .............................................. G06F 11/00
`I52
`U.S. Cl. .................................... 395/181; 395/700;
`395/183.09; 395/183.14
`58 Field of Search ................. 395/575; 371/16.5, 19,
`371/11.2, 8.2
`
`56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`5,062,045 10/1991 Janis et al. .......................... 37/16.5
`5,084,816 1/1992 Boese et al. ......................... 395/575
`5,121,345 l/1992 Lentz ....... -
`... 364/550
`5,200,958 4/1993 Hamilton et al.
`... 371/16.5
`5,218,605 l/1993 Low et al. ............................. 371/19
`5,255,208 10/1993 Thakore et al. .
`371/16.5
`5,278,901 1/1994 Shieh et al. ............................. 38O/4
`5,291,590 3/1994. Ohnishi et al. ..
`371/6.5
`5,297,150 3/1994 Clark ...........
`... 371/19
`5,319,776 6/1994. Hile et al.....
`395/575
`5,359,659 10/1994 Rosenthal ............................... 380/4
`5,361,359 11/1994 Tajallie et al....................... 395/700
`OTHER PUBLICATIONS
`Qasem et al. “AI Trends in Virus Control' 1991 IEEE
`Proc. of Southeaston pp. 99-103 vol. 1.
`Crocker et al. "A Proposal for a Verification-Based
`Virus Filler” 1989 IEEE Symposium on Security &
`Privacy pp. 319-324.
`Kephort et al. “Directed Graph Epidemiological Mod
`ule of Computer Viruses' 1991 IEEE Computer Soci
`ety Symposium on Research in Security & Privacy pp.
`343-359.
`Kumor et al. "A Generic Virus Scanner in C----' 1992
`
`USOO544O723A
`11) Patent Number:
`45) Date of Patent:
`
`5,440,723
`Aug. 8, 1995
`
`8th Ann. Computer Security Applications Proceedings
`pp. 210-219.
`Shoutkov et al. "Computer Viruses: Ways of Reproduc
`tion in MS DOS 25th Ann. 1991 IEEE International
`Carnahan Conf. on Security Tech. pp. 168-176.
`(List continued on next page.)
`
`Primary Examiner-Robert W. Beausoliel, Jr.
`Assistant Examiner-Joseph E. Palys
`Attorney. Agent, or Firm-Perman & Green
`57
`ABSTRACT
`A method includes the following component steps, or
`Some functional subset of these steps: (A) periodic moni
`toring of a data processing system (10) for anomalous
`behavior that may indicate the presence of an undesir
`able software entity such as a computer virus, worm, or
`Trojan Horse; (B) automatic scanning for occurrences
`of known types of undesirable software entities and
`taking remedial action if they are discovered; (C) de
`ploying decoy programs to capture samples of un
`known types of computer viruses; (D) identifying ma
`chine code portions of the captured samples which are
`unlikely to vary from one instance of the virus to an
`other; (E) extracting an identifying signature from the
`executable code portion and adding the signature to a
`signature database, (F) informing neighboring data pro
`cessing systems on a network of an occurrence of the
`undesirable software entity; and (G) generating a dis
`tress signal, if appropriate, so as to call upon an expert
`to resolve difficult cases. A feature of this invention is
`the automatic execution of the foregoing steps in re
`sponse to a detection of an undesired software entity,
`Such as a virus or a worm, within a data processing
`System. The automatic extraction of the identifying
`signature, the addition of the signature to a signature
`data base, and the immediate use of the signature by a
`Scanner provides protection from subsequent infections
`of the system, and also a network of systems, by the
`same or an altered form of the undesirable software
`entity.
`
`46 Claims, 7 Drawing Sheets
`
`
`
`a
`
`ser
`we
`
`orror syste. For
`NAS WR
`
`es
`
`ETFY was
`code. PRTNS OF
`was Sarts
`
`Juniper Ex. 1033-p. 1
`Juniper v Huawei
`
`
`
`5,440,723
`Page 2
`
`OTHER PUBLICATIONS
`S. W. Shieh et al. "A Pattern-Oriented Intrusion-De
`tection Model and its Applications'. Proceedings of the
`1991 IEEE Computer Society Symposium on Reserach
`and Privacy, pp. 327-342.
`H. S. Javitz et al. “The SRI IDES Statistical Anomaly
`Detector, Proceedings of the 1991 IEEE Computer
`Symposium on Research in Security and Privacy, pp.
`316-326.
`
`W. Arnold et al. "System for Detecting Undesired Al
`teration of Software", IBM TDB, vol. 32, No. 11, Apr.
`1990, pp. 48–50.
`S. M. Katz, "Estimation of Probabilities from Sparse
`Data for the Language Model Component of a Speech
`Recognizer”, IEEE Trans. ASSP-35, No. 3, Mar. 1987,
`pp. 400-401.
`F. Cohen, A Short Course on Computer Viruses, ASP
`Press, Pittsburg, 1990, pp. 9-15.
`
`Juniper Ex. 1033-p. 2
`Juniper v Huawei
`
`
`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 1 of 7
`
`5,440,723
`
`2
`
`BUS
`
`4.
`
`6
`
`CPU
`
`RAM
`
`36
`
`NETWORK
`
`NA
`
`3.
`
`TERMINAL
`CONROL
`
`8
`
`24, 28
`HARD DSK
`FLOPPY DISK
`SYSTEM
`SYSTEM
`CONTROL
`CONTROL
`
`26 3O
`HARD DISK
`FLOPPY DISK
`DRIVE
`
`IF
`
`cp
`
`3Oo
`
`22
`2O
`DISPLAY MANUAL
`DEVCE
`INPUT
`
`FG. A
`
`O
`
`
`
`EXACT- MATCH
`CA CULATON
`
`
`
`FUZZY- MATCH
`CALCUATION
`
`F.G. 5
`
`Juniper Ex. 1033-p. 3
`Juniper v Huawei
`
`
`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 2 of 7
`
`5,440,723
`
`
`
`
`
`TO OTHER
`PROCESSORS
`
`FIG. B
`
`kNFECTED
`
`Juniper Ex. 1033-p. 4
`Juniper v Huawei
`
`
`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 3 of 7
`
`5,440,723
`
`
`
`
`
`ALERT USER,
`REMOVE VIRUS
`
`MONITOR SYSTEM FOR
`ANOMALOUS BEHAVOR
`
`A
`
`ANOMAOUS BEHAVOR
`SCAN FOR KNOWN
`DETECTED
`VRUSES
`B
`
`VIRUS
`
`UNKNOWN WRUS
`DEPLOY DECOY PROGRAMS C
`TO CAPTURE UNKNOWN
`VRUS
`
`DENTFY INVARANT
`CODE PORTIONS OF
`VIRUS SAMPLE (S)
`
`EXTRACT SIGNATURE
`FROM CODE, ADD TO
`SCANNER'S DATABASE
`
`NFORM OTHER NETWORK
`PROCESSORS OF VIRUS
`NFECTION
`
`GENERATE DISTRESS
`SIGNAL, F REQUIRED
`
`FIG. 2
`
`E
`
`F
`
`G
`
`Juniper Ex. 1033-p. 5
`Juniper v Huawei
`
`
`
`Juniper Ex. 1033-p. 6
`Juniper v Huawei
`
`
`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 5 of 7
`
`5,440,723
`
`FILTER OUT PORTIONS OF VRUS
`THAT ARE OBSERVED TO WARY
`FROM INSTANCE TO ANOTHER
`
`FILTER OUT PORTONS OF VIRUS
`THAT ARE INTRINSICALLY THE MOST
`LKELY TO WARY FROM NSTANCE
`TO ANOTHER
`
`FG. 4
`
`st INSTANCE
`
`
`
`BYTE
`2nd NSTANCE
`
`
`
`BYTE in
`
`326E3 AF579Beca 436D5c35239
`reisalie DBIABIs IBA4 4s isolscisses 23 A4
`4423A187AcADF4436D5C3525 to 7D
`VARIANT
`"INVARIANT"
`VARIANT
`PORTION
`|PORTION
`PORTION
`FG, 6A
`
`3rd NSTANCE
`
`BYE
`N 8E DE 8C O6 O OO 8C C6 8E DE 3 CO
`3 DB 3 C9 3 D2 CB F9 FB FF 6F F5-N
`BYTE 24
`
`F.G. 6B
`
`Juniper Ex. 1033-p. 7
`Juniper v Huawei
`
`
`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 6 of 7
`
`5,440,723
`
`SEGREGATE CORPUS
`INTO PROBE, TRAINING
`AND TEST SETS
`
`A
`
`SELECT BYTE-STRINGS
`FROM PROBE SET
`
`B
`
`USE TRAINING SET TO
`ESTMATE PROBABILITIES
`
`C
`
`COUNT FREQUENCY IN
`TEST SET
`
`PRODUCE LIST OF
`ESTMATE PROBABILITES
`VS FREQUENCY
`
`
`
`DETERMINE FALSE
`POSITIVE AND FALSE-
`REJECTION PROBABLTY
`
`D
`
`E
`
`F
`
`SELECT THRESHOLD
`
`G
`
`FIG. 7
`
`Juniper Ex. 1033-p. 8
`Juniper v Huawei
`
`
`
`Juniper Ex. 1033-p. 9
`Juniper v Huawei
`
`
`
`1
`
`AUTOMATIC IMMUNE SYSTEM FOR
`COMPUTERS AND COMPUTER NETWORKS
`
`CROSS-REFERENCE TO A RELATED PATENT
`APPLICATION
`This patent application is related to commonly as
`signed U.S. patent application Ser. No. 08/004,871, filed
`Jan. 19, 1993, entitled “Methods and Apparatus for
`Evaluating and Extracting Signatures of Computer
`10
`Viruses and Other Undesirable Software Entities', by
`Jeffrey O. Kephart
`FIELD OF THE INVENTION
`15
`This invention relates generally to digital data pro
`cessors and, in particular, to methods and apparatus for
`providing computational integrity for digital data pro
`cessors and networks thereof.
`BACKGROUND OF THE INVENTION
`A computer virus has been defined by Frederick B.
`Cohen as a program that can infect other programs by
`modifying them to include a, possibly evolved, version
`of itself (A Short Course on Computer Viruses, ASP
`Press, Pittsburg, 1990, page 11).
`25
`As employed herein, a computer virus is considered
`to include an executable assemblage of computer in
`structions or code that is capable of attaching itself to a
`computer program. The subsequent execution of the
`viral code may have detrimental effects upon the opera
`30
`tion of the computer that hosts the virus. Some viruses
`have an ability to modify their constituent code, thereby
`complicating the task of identifying and removing the
`virus.
`A worm is a program or collection of programs that
`35
`can cause a possibly evolved version of itself to be exe
`cuted on the same or possibly on a different computer.
`A Trojan Horse is a block of undesired code that is
`intentionally hidden within a block of desirable code.
`Both computer viruses, worms, and Trojan Horses
`40
`are considered to be members of a class of undesirable
`software entities, the presence of which within a data
`processor, or network of data processors, is to be
`avoided so as to maintain computational integrity.
`A widely-used method for the detection of computer
`45
`viruses is known as a virus scanner. A virus scanner
`employs short strings of bytes to identify particular
`viruses in executable files, boot records, or memory.
`The byte strings (referred to assignatures) for a particu
`lar virus must be chosen with care such that they always
`discover the virus, if it is present, but seldom give a
`"false alarm’, known as a false positive. That is, the
`signature must be chosen so that the byte string is one
`that is unlikely to be found in programs that are nor
`mally executed on the computer. Typically, a human
`55
`expert makes this choice by converting the binary ma
`chine code of the virus to an assembler version, analyz
`ing the assembler code, selecting sections of code that
`appear to be unusual or virus-like, and identifying the
`corresponding bytes in the binary machine code so as to
`produce the signature. Wildcard bytes can be included
`within the signature to provide a match with any code
`byte appearing in a virus.
`Currently, a number of commercial computer virus
`scanners are successful in alerting users to the presence
`65
`of viruses that are known apriori. However, conven
`tional virus scanners are typically unable to detect the
`presence of computer viruses which they have not been
`
`5,440,723
`2
`programmed to detect explicitly. The problem of deal
`ing with new viruses has typically been addressed by
`distributing updates of scanning programs and/or auxil
`iary files containing the necessary information for iden
`tifying the latest viruses. However, the increasing rate
`at which new viruses are being written is widening the
`gap between the number of viruses that exist and the
`number of viruses that can be detected by an apprecia
`ble fraction of computer users. As a result, it is becom
`ing increasingly likely that a new virus will become
`wide-spread before any remedy is generally available.
`At least one mechanism to inform other computers
`connected to a network of a presence of a viral infection
`has been previously advanced. For example, a "net
`hormones' concept has been proposed by David Sto
`dolsky, wherein every computer on a network keeps
`archives of all potential virus-carrying contacts with
`every other computer. When a computer finds that it is
`infected with a virus, it sends a message to every com
`puter that it has ever contacted or been contacted by,
`which in turn send messages to all of their contactees or
`contactors, etc. However, it is believed that this ap
`proach will result in most or all computers quickly
`becoming overloaded by infection messages from most
`of the other computers on the network.
`As such, a need has arisen to develop methods for
`automatically recognizing and eradicating previously
`unknown or unanalyzed viruses on individual comput
`ers and computer networks. An efficient method is also
`required for informing other computers on the network
`as to the existence of a computer virus within the net
`work.
`It is an object of this invention to provide methods
`and apparatus to automatically detect and extract a
`signature from an undesirable software entity, such as a
`computer virus or worm.
`It is further object of this invention to provide meth
`ods and apparatus for immunizing a computer system,
`and also a network of computer systems, against a sub
`sequent infection by a previously unknown and undesir
`able software entity.
`SUMMARY OF THE INVENTION
`The foregoing and other problems are overcome and
`the objects of the invention are realized by a method
`that provides computational integrity for a digital data
`processor and a network of digital data processors. The
`method includes the following component steps, or
`some functional subset of these steps:
`(A) periodic monitoring of a data processing system
`for anomalous behavior that may indicate the presence
`of an undesirable informational state, such as one arising
`from the presence of an undesirable software entity,
`such as a computer virus or worm;
`(B) automatic scanning for occurrences of known
`types of undesirable software entities;
`(C) deploying decoy programs to capture samples of
`unknown types of computer viruses;
`(D) automatically identifying portions of a captured
`virus sample which are likely to remain invariant from
`one instance of the virus to another;
`(E) extracting an identifying signature from the in
`variant portion and adding the signature to a signature
`database;
`(F) informing neighboring data processing systems
`on a network of an occurrence of the undesirable soft
`ware entity; and
`
`50
`
`60
`
`Juniper Ex. 1033-p. 10
`Juniper v Huawei
`
`
`
`10
`
`15
`
`5,440,723
`4
`3
`floppy diskette 30a. A network adapter (NA) 31 is pro
`(G) generating a distress signal, if appropriate, so as
`vided for coupling the system 10 to a network.
`to call upon an expert to resolve difficult cases.
`The components illustrated in FIG. 1a may be em
`A feature of this invention is the automatic execution
`bodied within a personal computer, a portable com
`of the foregoing steps in response to a detection of an
`puter, a workstation, a minicomputer, a main frame
`undesired software entity, such as a virus, worm, or
`computer, or a supercomputer. As such, the details of
`Trojan Horse, within a data processing system. Further
`the physical embodiment of the data processing system
`more, the automatic extraction of the identifying signa
`ture, and the addition of the signature to a signature data
`10, such as the structure of the bus 12 or the number of
`base, provides protection, or “immunity', to subsequent
`CPUs 14 that are coupled to the bus, is not crucial to the
`operation of the invention, and is not described in fur
`infections of the system, and also a network of systems,
`by the same or an altered form of the undesirable soft
`ther detail below.
`As employed herein, the informational state of the
`ware entity.
`data processing system 10, at a given moment in time, is
`BRIEF DESCRIPTION OF THE DRAWING
`considered to be defined by the contents (bytes) of all
`memory or storage devices to which the system has
`The above set forth and other features of the inven
`access. These include, but are not limited to, RAM,
`tion are made more apparent in the ensuing Detailed
`registers, hard disks, floppy disks that are inserted into
`Description of the Invention when read in conjunction
`floppy disk drives, and magnetic tape that is inserted
`with the attached Drawing, wherein:
`FIG. 1a is a block diagram of a computer system that
`into a tape drive.
`The informational state history of the system 10 is
`is suitable for use in practicing the invention;
`considered to be the history of its informational state at
`FIG. 1b illustrates an exemplary computer network
`that includes the computer system of FIG. 1a,
`all moments in time.
`For the case where the data processing system 10 (P1
`FIG. 1c illustrates the exemplary computer network
`of FIG. 1b) is connected to other data processing sys
`of FIG. 1b and the use of a kill signal that is transmitted
`by infected computers to neighboring computers;
`tems (P2-P7) via a network, such as the exemplary
`network depicted in FIG. 1b, an additional data proces
`FIG. 2 is a flow chart depicting the constituent steps
`sor (PD) may be added as a dedicated decoy program
`of a method of the invention;
`FIG. 3 is a more detailed flow chart depicting the
`server, as described below.
`The method of this invention includes the following
`operation of the method of the invention;
`component steps, as illustrated in FIG. 2, or some func
`FIG. 4 is a flow chart depicting a method for use in
`tional subset of these steps.
`segregating variant virus portions from invariant virus
`Step A: Periodic monitoring of the data processing
`portions;
`system 10 for anomalous, potentially virus-like behav
`FIG. 5 is a flow chart that shows the operation of a
`statistically based computer virus signature extraction
`O.
`Step B: Automatic scanning for occurrences of
`method;
`35
`known viruses, and removal of any which are found to
`FIG. 6a illustrates exemplary portions of three in
`be present.
`stances of a computer virus, the three portions each
`Step C: Deployment of decoy programs to capture
`including an invariant portion and variant portions;
`virus samples.
`FIG. 6b illustrates an exemplary 24-byte “invariant'
`Step D: Segregation of a captured virus sample into
`portion of a computer virus;
`portions that are likely to vary from one instance of the
`FIG. 7 is a flow chart that depicts a method of select
`computer virus to another instance, and into portions
`ing a probability threshold used for accepting or reject
`that are likely to be invariant from one instance to an
`ing a candidate signature; and
`FIG. 8 is a block diagram of a system operable for
`other.
`Step E: Extraction of a viral signature from the in
`executing the method of the invention.
`variant portion(s) and the addition of the signature to a
`DETAILED DESCRIPTION OF THE
`signature database.
`INVENTION
`Step F: Informing neighboring data processing sys
`FIG. 1a is a block diagram of a data processing sys
`tems on a network of an occurrence of a viral infection.
`Step G: Generation of a distress signal, if appropriate,
`tem 10 that is suitable for practicing the teaching of the
`SO
`so as to call upon human experts to resolve difficult
`invention. A bus 12 is comprised of a plurality of signal
`lines for conveying addresses, data, and controls be
`CaSCS.
`A feature of this invention is the automatic execution
`tween a Central Processing Unit 14 and a number of
`of the foregoing steps in response to a detection of an
`other system bus units. A RAM 16 is coupled to the
`system bus 12 and provides program instruction storage
`undesired software entity, such as a virus or a worm,
`55
`within the data processing system 10.
`and working memory for the CPU 12. A terminal con
`trol subsystem 18 is coupled to the system bus 12 and
`A discussion is now made of each of the above refer
`provides outputs to a display device 20, typically a CRT
`enced Steps A-G of FIG. 2.
`monitor, and receives inputs from a manual input device
`Step A: Anomaly Detection
`22, typically a keyboard. Manual input may also be
`60
`The first step in the process of this invention detects
`provided from a pointing device, such as a mouse. A
`anomalous system behavior of a type that may indicate
`hard disk control subsystem 24bidirectionally couples a
`the presence of an undesirable informational state re
`rotating fixed disk, or hard disk 26, to the system bus 12.
`sulting from the presence of a virus or some other unde
`The control 24 and hard disk 26 provide mass storage
`sirable software entity, such as a worm or a Trojan
`for CPU instructions and data. A floppy disk control
`65
`subsystem 28 bidirectionally couples one or more
`Horse. The detection of anomalous behavior within a
`computer or computer network can be accomplished
`floppy disk drives 30 to the system bus 12. The floppy
`through the use of known techniques, preferably a tech
`disk drive 30 works in conjunction with a removable
`
`40
`
`25
`
`30
`
`45
`
`Juniper Ex. 1033-p. 11
`Juniper v Huawei
`
`
`
`10
`
`15
`
`5,440,723
`5
`nique that detects anomalies that may indicate a virus.
`One suitable anomaly detection technique, that uses
`patterns of activity undertaken by computational pro
`cesses, is described in an article entitled "A Pattern
`Orientated Intrusion-Detection Model and its Applica
`tions', S. W. Shieh and V. D. Gligor, Proceedings of
`the 1991 IEEE Computer Society Symposium on Re
`search in Security and Privacy'. Another suitable tech
`nique detects unexpected changes in executable or re
`source files. As an example, F. Cohen, A Short Course on
`Computer Viruses, ASP Press, Pittsburg, 1990, discusses
`the use of integrity shells employing modification detec
`tors. In this case, modification of files is detected by
`cryptographic or non-cryptographic checksums, and
`the user is notified if a modification is detected.
`The method at Step A may employ any one or a
`combination of these or similar techniques. The end
`result is the detection by the system 10 of an anomalous
`system behavior that may be indicative of the presence
`of a computer virus, or some other undesirable software
`entity.
`The anomaly detection process or processes may be
`run continually as a background task, or periodically as
`a foreground task, depending on the desire of the user
`25
`and/or the nature of the underlying operating system.
`Step B: Scan for Known Viruses
`If preliminary evidence of virus-like activity is de
`tected, additional computational resources are devoted
`to obtaining more conclusive evidence of viral infec
`tion. If a virus is present, there is a high probability that
`the virus is a well-known virus that can be readily de
`tected by a virus scanner. Thus, Step A scans an infor
`mational state history of the system, that is, all relevant
`35
`media (e.g. files, boot records, memory), for known
`viruses. In one implementation, a pre-existing virus
`scanner is activated to search for a large set of patterns
`or signatures, each of which pertains to a different virus
`or virus family. It is highly desirable (but not essential)
`that the virus scanner be capable of detecting slight
`variations on known viruses. Many conventional scan
`ners tend to possess this capability, to a limited extent,
`simply by virtue of the fact that they search based only
`45
`on short contiguous strings of bytes found in the virus.
`One especially suitable virus scanner, known as VIR
`SCAN, was available from the International Business
`Machines Corporation until recently, when it was in
`corporated into the IBM AntiVirus program. This virus
`50
`scanner is particularly useful for detecting variations in
`known viruses because it permits a certain number of
`mismatches between a string of bytes in a file being
`examined and the virus signature string. This capability
`reduces the chance of having to proceed to Step C
`55
`(decoy deployment), and is even more useful in the
`event that the virus is completely unrelated to any
`known virus, as is described below with respect to Step
`E. If the anomaly is found to be due to a known virus or
`some slight alteration of it, the method proceeds to Step
`B1 where the user is alerted, and the virus removed
`(killed) by traditional methods, such as restoration from
`backup (either automatically or manually by the user)
`or disinfection (removal of the virus from all of the
`software it has infected.) In general, disinfection is only
`65
`acceptable if the virus is found to be an exact copy of a
`known virus. This implies that the system verify the
`identification made by the virus scanner.
`
`6
`Step C: Decoy Deployment
`If no known virus is detected, the anomaly may be
`due to some unknown virus. In this case, the method
`employs one or more decoy programs in an attempt to
`attract and obtain one or more samples of the unknown
`virus. The decoy programs could be commercially
`available computer programs which are installed on
`(and are capable of being executed on) the computer
`system. Preferably, however, they would be special
`purpose programs which exist solely for the purpose of
`capturing viruses.
`For the case of a DOS-based system, a variety of
`decoy programs having *.COM and *.EXE extensions
`(also referred to as executables) are created, executed,
`read, written to, copied, etc. The use of decoy programs
`for this purpose is described by W. C. Arnold and D. M.
`Chess in "System for Detecting Undesired Alteration of
`Software', IBM Technical Disclosure Bulletin, Vol. 32,
`No. 11, pages 48-50, April 1990.
`In operation, original copies of the decoy programs
`are stored securely, by example in encrypted form or on
`a separate server, along with checksums of the originals.
`After a specified time interval, or after other specified
`conditions are met, checksums of some or all of the
`decoy programs are taken and compared directly to
`their originals. Alternatively, the decoy programs are
`compared directly to their originals. If none of the de
`coys are found to have changed, the user is alerted to
`the fact that an anomaly of some sort was detected, but
`that the system was unable to find conclusive evidence
`of a virus. In this case, it is desirable to maintain the
`system in a moderate state of alert for some time period,
`exercising the decoy programs at a reduced priority or
`less frequent intervals.
`It is desirable to have a large variety of decoy pro
`grams that differ from one another in length, content,
`and name. Furthermore, the decoy programs should be
`placed in and run from various directories, particularly
`those directories in which there are modified executa
`bles.
`Some DOS viruses are known to become resident in
`memory and are active until the computer is rebooted.
`In such a case, the virus is able to infect any executable
`at any time. Often, a resident virus will infect executa
`bles as they are brought into memory to be executed.
`Thus, if several decoy programs are run, there is a rea
`sonable chance that one or more decoy programs will
`be infected by such a virus. In addition to running the
`decoys, the system should perform a variety of other
`operations with the decoy programs, including copying
`them, opening them, creating them, etc. All of these
`operations increase the probability of infection by a
`virus that is resident in memory.
`However, some DOS viruses do not go resident, and
`can infect other programs only when the host program
`in which the virus is imbedded is executed. Such viruses
`typically use static criteria for selecting a program to
`infect, such as the program name or the directory in
`which the program is located. In this case, it is more
`difficult for the decoy programs to become infected. To
`increase the chance of infection, decoy programs are
`preferably placed in locations where they are more
`likely to be selected, such as the directory of the possi
`bly-infected executable, or directories containing sys
`tem programs (e.g. the root directory) and commonly
`used utility programs. In order to infect the decoy pro
`
`40
`
`Juniper Ex. 1033-p. 12
`Juniper v Huawei
`
`
`
`5,440,723
`8
`7
`grams, which can include representations of numerical
`grams, the potentially-infected executable should be run
`repeatedly.
`constants, character strings, screen images, work areas
`for computations, addresses, etc., tend to be more likely
`If one or more decoy programs is subsequently found
`to vary from one instance of the program to another
`to have changed from the original, protected version, it
`than "code” portions, which represent machine instruc
`can be assumed that the changes are due to a virus. A
`comparison of each modified decoy program with its
`tions. It may be that most or all of the "data' bytes in a
`given virus are the same in all instances of that virus, but
`corresponding uninfected version enables a copy of the
`without a careful analysis this is difficult to establish
`virus to be isolated from each decoy.
`with certainty. Thus, a conservative approach assumes
`Step D: Identification of Invariant Code Portions
`that “data'' bytes may change, and therefore it is desir
`able to disqualify them from being chosen as portions of
`In order to maximize the likelihood that a chosen
`candidate signatures. Perfect classification of each byte
`signature will appear in all or nearly all progeny of the
`as being "code' or "data' is not necessary. An errone
`virus, it is preferred to filter out all portions of the virus
`ous classification of "code' bytes as "data'' bytes is
`which are observed to or are likely to vary from one
`more tolerable than an erroneous classification of
`instance of the virus to another. To achieve this, the
`15
`method employs one, or preferably both, of the tech
`“data'' bytes as "code' bytes.
`By example, FIG. 6b illustrates a 24-byte “invariant'
`niques represented as Blocks A and B in FIG. 4. The
`portion of a virus sample that was generated at Step C.
`order in which the two techniques are used can be re
`The first 4 bytes represent machine instructions, bytes 5
`versed, and the following description is correspond
`and 6 represent data, bytes 7-19 represent code, and
`ingly modified to accommodate the opposite ordering
`bytes 20–24 represent data. Thus, if Block B of FIG. 4
`of the two techniques.
`were employed, only bytes 1-4 and 7-19 would be
`If several decoy programs are found to be infected, a
`passed along to Step E for signature extraction.
`most desirable case, it is important to compare the vari
`Suitable methods of code-data segregation include,
`ous versions of the isolated virus programs with one
`but are not limited to, the following techniques.
`another (Block A). If all versions are identical, or if 25
`1. Conventional static disassembly techniques, such
`there is only one version, the entire virus is considered
`as flow analysis.
`to be "invariant', and can be passed along to block B
`Otherwise, if the various isolated computer virus pro
`2. Simulation of marked viral code in a virtual Ina
`chine, wherein the virtual machine includes a hardware
`grams differ, but have substantial areas in common, it
`and software model of the computer or class of comput
`can be assumed that the virus alters itself when it repli
`ers for which the virus signature is being obtained.
`cates. As a result, only the common areas are marked
`3. Running the marked viral code through an inter
`for further examination by the heuristics employed in
`preter or debugger to determine which bytes are actu
`Block B of FIG. 4. This procedure is illustrated in
`ally executed. This method requires a "safe' environ
`FIG. 6a, in which the variant and "invariant' por
`tions of a virus are identified by comparing three in
`ment in which to run the virus, such as the dedicated
`35
`stances of it. In FIG. 6a, each byte is shown expressed
`server PDreferred to above in the discussion of FIG. b.
`4. Statistical classification techniques.
`in a hexadecimal representation, and an invariant por
`tion is identified. The length of the invariant sequence
`The second method, i.e., simulation of the marked
`of bytes is exemplary. In practice, invariant sequences
`viral code in a virtual machine, is the safest and most
`of bytes are often considerably longer than five bytes,
`reliable, but is often the most computationally intensive.
`40
`and more than one such portion typically appears
`The second and third methods each have the advan
`tage that they are capable of determining whether a
`within the virus. If there are no substantially similar
`virus is self-modifying. Viruses that modify themselves
`areas, this may indicate the presence of a relatively
`sophisticated self-altering virus, or the presence of more
`often do so for the purpose of obscuring their machine
`code portions by making the code appear different from
`than one virus. In this case, each of what appear to be
`45
`different viruses are provided separately to the extrac
`one instance of the virus to the next. Such viruses are
`often referred to as self-garblers. Self-garblers typically
`tion method, which determines one or more signatures
`consist of a short decrypting "head' portion followed
`for each. The system is also placed in a moderate state
`by code which is transformed in an easily-invertible
`of alert, and the user is informed.
`manner (e.g. XOR'ed) using a randomly-chosen de
`It is noted that a situation may arise in which the
`cryption key when the virus is replicated. Thus, the
`computer is attacked by a sophisticated, unk