throbber
000001
`
`Symantec 1022
`IPR of U.S. Pat. No. 7,757,298
`
`

`
`Page 2
`
`5,440,723
`
`OTHER PUBLICATIONS
`S. W. Shieh et al. “A Pattern—Oriented Intrusion-De-
`
`tection Model and its Applications”, Proceedings of the
`1991 IEEE Computer Society Symposium on Reserach
`and Privacy, pp. 327-342.
`H. S. Javitz et al. “The SR1 IDES Statistical Anomaly
`Detector", Proceedings of the 1991 IEEE Computer
`Symposium on Research in Security and Privacy, pp.
`316-326.
`
`W. Arnold et al. “System for Detecting Undesired Al-
`teration of Software”, IBM TDB, vol. 32, No. 11, Apr.
`1990, pp. 48—50.
`S. M. Katz, “Estimation of Probabilities from Sparse
`Data for the Language Model Component of a Speech
`Recognizer”, IEEE Trans. ASSP—35, No. 3, Mar. 1987,
`pp. 400-401.
`F. Cohen, A Short Course on Computer Viruses, ASP
`Press, Pittsburg, 1990, pp. 9-15.
`
`OOOOO2
`
`000002
`
`

`
`U.S. Patent
`
`Aug. 8, 1995 4
`
`Sheet 1 of7
`
`5,440,723
`
`l2
`
`BUS
`
`NETWORK
`
`M 3!
`
`l4
`
`l6
`
`I8
`
`24
`
`28
`
`CPU W TERMINAL
`36
`
`CONTROL
`
`220
`
`22
`
`DISPLAY
`DEVICE
`
`MANUAL
`INPUT
`
`HARD DISK
`SYSTEM
`CONTROL
`as
`
`HARD DISK
`
`FLOPPY DISK
`SYSTEM
`CONTROL
`30
`
`FLOPPY DISK
`DRIVE
`
`I0
`
`300
`
`
`
`EXACT- MATCH I FUZZY-MATCH
`
`CALCULATION CALCULATION
`
`
`
`REPORT RESULTS
`
`
`
`
`
`FIG. 5
`
`000003
`
`000003
`
`

`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 2 of 7
`
`_EBOCESSORS
`
`TO OTHER
`
`FIG.
`
`IB
`
` fie INFECTED
`
`FIG. IC
`
`000004
`
`000004
`
`

`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 3 of 7
`
`5,440,723
`
`MONITOR SYSTEM FOR
`ANOMALOUS BEHAVIOR
`
`A
`
`,
`I
`
`
`
`ALERT USER.
`REMOVE vmus
`
`ANOMALOUS BEHAVIOR
`SCAN FOR KNOWN
`DETECTED
`vmuses
`B
`
`vuaus
`
`ummown vmus
`
`DEPLOY oscov PROGRAMS
`TO CAPTURE UNKNOWN
`vmus
`
`C
`
`IDENTIFY INVARIANT
`CODE PORTIONS OF
`VIRUS SAMPLEISI
`
`EXTRACT SIGNATURE
`
`FROM CODE, ADD TO
`SCANNER.'S DATABASE
`
`IN FORM OTHER NETWORK
`PROCESSORS OF vnaus
`INFECTION
`
`GENERATE DISTRESS
`
`SIGNAL, IF REQUIRED
`
`E
`
`F
`
`6)
`
`FIG. 2
`
`000005
`
`000005
`
`

`
`>u3<DZ_.._.200
`
`n_o._m>23¢
`
`
`
`4.._.oz...=
`
` U.S.Patent
`
`__
`
`>._._<o_ooEu._msxomzo23¢
`
`
`>..._<:z_»zoo93>23¢
`
`_mm._m<Souxm_362410n:
`
`mmozaion:
`
`aaxomzo23¢
`
`>._<s_oz<u:
`
`_
`
`zsozxn:
`
`m_..m.>
`
`omoz<.._oZ0z<omm_>23¢
`
`mu._mE.:omxm
`
`mmozaio02u:
`
`
`
`I0__.._&.mm..<Em
`
`
`
`22.82"E34Em
`
`
`_|OH._<zo_mozmm
`
`Im:m_>zaozx
`
`
`
`
`
`
`
`ozaomxomzmzqmooma>ooma>o._nma
`
`Aug.8,1995
`
`
`
`zo_Eon_..<m_>m:jom_._._<Qoowoamigos.nozoz<omm_>22.
`
`
`
`
`
`
`
`
`
`muzioqzamigos.n:
`
`aroomoV.
`
`wza:n5z<m._oo
`
`mm._m<Somxm
`
`55...“.
`
`Sheet4of7
`
`
`Dw._.Own._2_n:wm|_n=2<mM000mDm_>mommz
`I.
`
`
`
`mzmfiwwwfimdsqmw:m_>:93no
`
`2
`
`
`
`mzo_Eon_mooo>.._:zuo_
`
`
`
`EonmmEqmmzmo
`
`cgfifiomaxmzaz
`
`
`
`$EE:<zo_m._.o<Exm
`
`m.9“.
`
`mn5xo<mmo>mm>
`
`5,440,723
`
`
`
`O._..mEm3<zo_m84
`
`um<m<_.<a
`
`02n:
`
`amigos.
`
`m>oowo
`
`
`
`uo<mmms_mD._.<._.w
`
`mum:O...
`
`000006
`
`000006
`
`
`
`
`
`
`
`
`
`
`
`
`

`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 5 of 7
`
`5,440,723
`
`
`
`FILTER OUT PORTIONS OF VIRUS
`THAT ARE OBSERVED TO VARY
`FROM INSTANCE TO ANOTHER
`
`
`
`
`
`
`
`FILTER
`T PORTIONS OF VIRUS.
`THAT A
`INTRINSICALLY THE MOST
`LIKELY TO VARY FROM INSTANCE
`TO ANOTHER
`
`
`
`Ist
`
`INSTANCE
`
`
`
`BYTE n
`
`II
`
`BYTE I
`I
`2nd INSTANCE
`I
`flfifiafiflfililfiifiml
`I
`I
`3rd INSTANCE
`I
`,
`HIEEEBIEIEEEEEIEI
`I
`VARIANT
`I"|NVARlANT"
`IVARIANTI
`I
`PORTION
`[PORTION
`IPORTIONI
`
`II
`
`FIG. 6A
`
`BYTE I
`\BE DE 8C 06 0| OO 8C C6 8E DE 3| CO
`
`BI DB 3| C9 3I D2 CB F9 F8 FF SF F5\
`BYTE 24
`
`FIG. 6B
`
`OOOOO7
`
`000007
`
`

`
`U.S. Patent
`
`Aug. 8, 1995
`
`Sheet 6 of7
`
`.
`
`5,440,723
`
`SEGREGATE CORPUS
`INTo PROBE, TRAINING
`AND TEST SETS
`
`SELECT BYTE-STRINGS
`FROM PROBE SET
`
`USE TRAINING SET TO
`ESTIMATE PROBABILITIES
`
`COUNT FREQUENCY IN
`TEST SET '
`
`PRODUCE LIST OF
`ESTIMATE PROBABILITIES
`VS FREQUENCY
`
`A
`
`B
`
`C
`
`D
`
`E
`
`DETERMINE FALSE-
`POSITIVE AND FALSE-
`
`F REJECTION PROBABILTY
`
`SELECT THRESHOLD
`
`G
`
`FIG. 7
`
`000008
`
`000008
`
`

`
`U
`
`1
`
`e
`
`327.,O
`
`4.,mmoom.om.o_5.mo2_>_:.wmmmm0_n_
`
`4um:2ze_mo3<>
`
`o:.m_m:m:.m>z..%__mmm_mmmE:Eozmz
`
`o7
`
`mmE.__m<moE
`
`.m245-2M.2<mo»mw_m
`
`.5...uon%m
`
`
`
`e2%%3...mmE:ms
`
`255.2mm._2oz<w>.:.._m<momn_
`
`
`
`>t.__m<moEmo.Es_:mm.mo05wmmz_ms_oo
`
`ms_<mmo.EcmWEoammnomsmmoo0mmnzzmamo_.ow._mmM.9mo.._<>
`
`mmE.._m<moEmo»<s__.6m
`
`
`
`mm:.Ezo_mZ._.__m<moE
`
`
`
` mommmoomacmmtwEo_oz<ommB<ze.ws_<mo:zSoas.m.._o5:m.Eo_oz<o2.tmoDamom_mE8.Sxmozfimz
`
`
`
`mEEo-co.»
`
`
`8,
`
`§m.Es_<mmyNo_._o._.<s_m=2n_n..c._.cmu_m<._.A»zms_o<mnEn.s_<me-zElmwzzqow
`
`
`
`5<xmn_mm
`
`000009
`
`
`

`
`1
`
`i44Q723
`
`AUTOMATIC IMMUNE SYSTEM FOR
`COMPUTERS AND COMPUTER NETWORKS
`
`CROSS-REFERENCE TO A RELATED PATENT
`APPLICATION
`
`This patent application is related to commonly as-
`signed U.S. patent application Ser. No. 08/004,871, filed
`Jan. 19, 1993, entitled “Methods and Apparatus for
`Evaluating and Extracting Signatures of Computer
`Viruses and Other Undesirable Software Entities”, by
`Jeffrey O. Kephart
`FIELD OF THE INVENTION
`
`This invention relates generally to digital data pro-
`cessors and, in particular, to methods and apparatus for
`providing computational integrity for digital data pro-
`cessors and networks thereof.
`
`BACKGROUND OF THE INVENTION
`
`A computer virus has been defined by Frederick B.
`Cohen as a program that can infect other programs by
`modifying them to include a, possibly evolved, version
`of itself (A Short Course on Computer Viruses, ASP
`Press, Pittsburg, 1990, page 11).
`As employed herein, a computer virus is considered
`to include an executable assemblage of computer in-
`structions or code that is capable of attaching itself to a
`computer program. The subsequent execution of the
`viral code may have detrimental effects upon the opera-
`tion of the computer that hosts the virus. Some viruses
`have an ability to modify their constituent code, thereby
`complicating the task of identifying and removing the
`virus.
`
`A worm is a program or collection of programs that
`can cause a possibly evolved version of itself to be exe-
`cuted on the same or possibly on a different computer.
`A Trojan Horse is a block of undesired code that is
`intentionally hidden within a block of desirable code.
`Both computer viruses, worms, and Trojan Horses
`are considered to be members of a class of undesirable
`software entities, the presence of which within a data
`processor, or network of data processors,
`is to be
`avoided so as to maintain computational integrity.
`A widely-used method for the detection of computer
`viruses is known as a virus scanner. A virus scanner
`employs short strings of bytes to identify particular
`viruses in executable files, boot records, or memory.
`The byte strings (referred to as signatures) for a particu-
`lar virus must be chosen with care such that they always
`discover the virus, if it is present, but seldom give a
`“false alarm”, known as a false positive. That is, the
`signature must be chosen so that the byte string is one
`that is unlikely to be found in programs that are nor-
`mally executed on the computer. Typically, a human
`expert makes this choice by converting the binary ma-
`chine code of the virus to an assembler version, analyz-
`ing the assembler code, selecting sections of code that
`appear to be unusual or virus-like, andidentifying the
`corresponding bytes in the binary machine code so as to
`produce the signature. Wildcard bytes can be included
`within the signature to provide a match with any code
`byte appearing in a virus.
`Currently, a number of commercial computer virus
`scanners are successful in alerting users to the presence
`of viruses that are known apriori. However, conven-
`tional virus scanners are typically unable to detect the
`presence of computer viruses which they have not been
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`programmed to detect explicitly. The problem of deal-
`ing with new viruses has typically been addressed by
`distributing updates of scanning programs and/or auxil-
`iary files containing the necessary information for iden-
`tifying the latest viruses. However, the increasing rate
`at which new viruses are being written is widening the
`gap between the number of viruses that exist and the
`number of viruses that can be detected by an apprecia-
`ble fraction of computer users. As a result, it is becom-
`ing increasingly likely that a new virus will become
`wide-spread before any remedy is generally available.
`At least one mechanism to inform other computers
`connected to a network of a presence of a viral infection
`has been previously advanced. For example, a “net
`hormones” concept has been proposed by David Sto-
`dolsky, wherein every computer on a network keeps
`archives of all potential virus-carrying contacts with
`every other computer. When a computer finds that it is
`infected with a virus, it sends a message to every com-
`puter that it has ever contacted or been contacted by,
`which in turn send messages to all of their contactees or
`contactors, etc. However, it is believed that this ap-
`proach will result in most or all computers quickly
`becoming overloaded by infection messages from most
`of the other computers on the network.
`As such, a need has arisen to develop methods for
`automatically recognizing and eradicating previously
`unknown or unanalyzed viruses on individual comput-
`ers and computer networks. An efficient method is also
`required for informing other computers on the network
`as to the existence of a computer virus within the net-
`work.
`
`It is an object of this invention to provide methods
`and apparatus to automatically detect and extract a
`signature from an undesirable software entity, such as a
`computer virus or worm.
`It is further object of this invention to provide meth-
`ods and apparatus for immunizing a computer system,
`and also a network of computer systems, against a sub-
`sequent infection by a previously unknown and undesir-
`able software entity.
`
`SUMMARY OF THE INVENTION
`
`The foregoing and other problems are overcome and
`the objects of the invention are realized by a method
`that provides computational integrity for a digital data
`processor and a network of digital data processors. The
`method includes the following component steps, or
`some functional subset of these steps:
`(A) periodic monitoring of a data processing system
`for anomalous behavior that may indicate the presence
`of an undesirable informational state, such as one arising
`from the presence of an undesirable software entity,
`such as a computer virus or worm;
`(B) automatic scanning for occurrences of known
`types of undesirable software entities;
`(C) deploying decoy programs to capture samples of
`unknown types of computer viruses;
`(D) automatically identifying portions of a captured
`virus sample which are likely to remain invariant from
`one instance of the virus to another;
`(E) extracting an identifying signature from the in-
`variant portion and adding the signature to a signature
`database;
`(F) informing neighboring data processing systems
`on a network of an occurrence of the undesirable soft-
`ware entity; and
`
`000010
`
`000010
`
`

`
`i44Q723
`
`4
`
`3
`(G) generating a distress signal, if appropriate, so as
`to call upon an expert to resolve difficult cases.
`A feature of this invention is the automatic execution
`of the foregoing steps in response to a detection of an
`undesired software entity, such as a virus, worm, or
`Trojan Horse, within a data processing system. Further-
`more, the automatic extraction of the identifying signa-
`ture, and the addition of the signature to a signature data
`base, provides protection, or “immunity”, to subsequent
`infections of the system, and also a network of systems,
`by the same or an altered form of the undesirable soft-
`ware entity.
`BRIEF DESCRIPTION OF THE DRAWING
`
`The above set forth and other features of the inven-
`tion are made more apparent in the ensuing Detailed
`Description of the Invention when read in conjunction
`with the attached Drawing, wherein:
`FIG. 1a is a block diagram of a computer system that
`is suitable for use in practicing the invention;
`FIG. 1b illustrates an exemplary computer network
`that includes the computer system of FIG. 1a;
`FIG. 1c illustrates the exemplary computer network
`of FIG. 1b and the use of a kill signal that is transmitted
`by infected computers to neighboring computers;
`FIG. 2 is a flow chart depicting the constituent steps
`of a method of the invention;
`FIG. 3 is a more detailed flow chart depicting the
`operation of the method of the invention;
`FIG. 4 is a flow chart depicting a method for use in
`segregating variant virus portions from invariant virus
`portions;
`FIG. 5 is a flow chart that shows the operation of a
`statistically based computer virus signature extraction
`method;
`FIG. 6a illustrates exemplary portions of three in-
`stances of a computer virus, the three portions each
`including an invariant portion and variant portions;
`FIG. 6b illustrates an exemplary 24-byte “invariant”
`portion of a computer virus;
`FIG. 7 is a flow chart that depicts a method of select-
`ing a probability threshold used for accepting or reject-
`ing a candidate signature; and
`FIG. 8 is a block diagram of a system operable for
`executing the method of the invention.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`FIG. 1a is a block diagram of a data processing sys-
`tem 10 that is suitable for practicing the teaching of the
`invention. A bus 12 is comprised of a plurality of signal
`lines for conveying addresses, data, and controls be-
`tween a Central Processing Unit 14 and a number of
`other system bus units. A RAM 16 is coupled to the
`system bus 12 and provides program instruction storage
`and working memory for the CPU 12. A terminal con-
`trol subsystem 18 is coupled to the system bus 12 and
`provides outputs to a display device 20, typically a CRT
`monitor, and receives inputs from a manual input device
`22,
`typically a keyboard. Manual input may also be
`provided from a pointing device, such as a mouse. A
`hard disk control subsystem 24 bidirectionally couples a
`rotating fixed disk, or hard disk 26, to the system bus 12.
`The control 24 and hard disk 26 provide mass storage
`for CPU instructions and data. A floppy disk control
`subsystem 28 bidirectionally couples one or more
`floppy disk drives 30 to the system bus 12. The floppy
`disk drive 30 works in conjunction with a removable
`
`floppy diskette 30a. A network adapter (NA) 31 is pro-
`vided for coupling the system 10 to a network.
`The components illustrated in FIG. 1a may be em-
`bodied within a personal computer, a portable com-
`puter, a workstation, a minicomputer, a main frame
`computer, or a supercomputer. As such, the details of
`the physical embodiment of the data processing system
`10, such as the structure of the bus 12 or the number of
`CPUs 14 that are coupled to the bus, is not crucial to the
`operation of the invention, and is not described in fur-
`ther detail below.
`As employed herein, the informational state of the
`data processing system 10, at a given moment in time, is
`considered to be defined by the contents (bytes) of all
`memory or storage devices to which the system has
`access. These include, but are not limited to, RAM,
`registers, hard disks, floppy disks that are inserted into
`floppy disk drives, and magnetic tape that is inserted
`into a tape drive.
`The informational state history of the system 10 is
`considered to be the history of its informational state at
`all moments in time.
`For the case where the data processing system 10 (P1
`of FIG. 1b) is connected to other data processing sys-
`tems (P2—P7) via a network, such as the exemplary
`network depicted in FIG. 1b, an additional data proces-
`sor (PD) may be added as a dedicated decoy program
`server, as described below.
`The method of this invention includes the following
`component steps, as illustrated in FIG. 2, or some func-
`tional subset of these steps.
`Step A: Periodic monitoring of the data processing
`system 10 for anomalous, potentially virus-like behav-
`ior.
`
`Step B: Automatic scanning for occurrences of
`known viruses, and removal of any which are found to
`be present.
`Step C: Deployment of decoy programs to capture
`virus samples.
`Step D: Segregation of a captured virus sample into
`portions that are likely to vary from one instance of the
`computer virus to another instance, and into portions
`that are likely to be invariant from one instance to an-
`other.
`
`Step E: Extraction of a viral signature from the in-
`variant portion(s) and the addition of the signature to a
`signature database.
`Step F: Informing neighboring data processing sys-
`tems on a network of an occurrence of a viral infection.
`Step G: Generation of a distress signal, if appropriate,
`so as to call upon human experts to resolve difficult
`cases.
`A feature of this invention is the automatic execution
`of the foregoing steps in response to a detection of an
`undesired software entity, such as a virus or a worm,
`within the data processing system 10.
`A discussion is now made of each of the above refer-
`enced Steps A—G of FIG. 2.
`
`Step A: Anomaly Detection
`
`The first step in the process of this invention detects
`anomalous system behavior of a type that may indicate
`the presence of an undesirable informational state re-
`sulting from the presence of a virus or some other unde-
`sirable software entity, such as a worm or a Trojan
`Horse. The detection of anomalous behavior within a
`computer or computer network can be accomplished
`through the use of known techniques, preferably a tech-
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`000011
`
`000011
`
`

`
`i440fl23
`
`6
`
`Step C: Decoy Deployment
`
`5
`nique that detects anomalies that may indicate a virus.
`One suitable anomaly detection technique,
`that uses
`patterns of activity undertaken by computational pro-
`cesses, is described in an article entitled “A Pattem-
`Orientated Intrusion-Detection Model and its Applica-
`tions”, S. W. Shieh and V. D. Gligor, Proceedings of
`the 1991 IEEE Computer Society Symposium on Re-
`search in Security and Privacy”. Another suitable tech-
`nique detects unexpected changes in executable or re-
`source files. As an example, F. Cohen, A Short Course on
`Computer Viruses, ASP Press, Pittsburg, 1990, discusses
`the use of integrity shells employing modification detec-
`tors. In this case, modification of files is detected by
`cryptographic or non—cryptographic checksums, and
`the user is notified if a modification is detected.
`The method at Step A may employ any one or a
`combination of_ these or similar techniques. The end
`result is the detection by the system 10 of an anomalous
`system behavior that may be indicative of the presence
`of a computer virus, or some other undesirable software
`entity.
`The anomaly detection process or processes may be
`run continually as a background task, or periodically as
`a foreground task, depending on the desire of the user
`and/or the nature of the underlying operating system.
`
`Step B: Scan for Known Viruses
`
`If preliminary evidence of virus-like activity is de-
`tected, additional computational resources are devoted
`to obtaining more conclusive evidence of viral infec-
`tion. If a virus is present, there is a high probability that
`the virus is a well-known virus that can be readily de-
`tected by a virus scanner. Thus, Step A scans an infor-
`mational state history of the system, that is, all relevant
`media (e.g. files, boot records, memory), for known
`viruses.
`In one implementation, a pre-existing virus
`scanner is activated to search for a large set of patterns
`or signatures, each of which pertains to a different virus
`or virus family. It is highly desirable (but not essential)
`that the virus scanner be capable of detecting slight
`variations on known viruses. Many conventional scan-
`ners tend to possess this capability, to a limited extent,
`simply by virtue of the fact that they search based only
`on short contiguous strings of bytes found in the virus.
`One especially suitable virus scanner, known as VIR-
`SCAN, was available from the International Business
`Machines Corporation until recently, when it was in-
`corporated into the IBM AntiVirus program. This virus
`scanner is particularly useful for detecting variations in
`known viruses because it permits a certain number of
`mismatches between a string of bytes in a file being
`examined and the virus signature string. This capability
`reduces the chance of having to proceed to Step C
`(decoy deployment), and is even more useful
`in the
`event that the virus is completely unrelated to any
`known virus, as is described below with respect to Step
`E. If the anomaly is found to be due to a known virus or
`some slight alteration of it, the method proceeds to Step
`B1 where the user is alerted, and the virus removed
`(killed) by traditional methods, such as restoration from
`backup (either automatically or manually by the user)
`or disinfection (removal of the virus from all of the
`software it has infected.) In general, disinfection is only
`acceptable if the virus is found to be an exact copy of a
`known virus. This implies that the system verify the
`identification made by the virus scanner. .
`
`If no known virus is detected, the anomaly may be
`due to some unknown virus. In this case, the method
`employs one or more decoy programs in an attempt to
`attract and obtain one or more samples of the unknown
`virus. The decoy programs could be commercially-
`available computer programs which are installed on
`(and are capable of being executed on) the computer
`system. Preferably, however, they would be special-
`purpose programs which exist solely for the purpose of
`capturing viruses.
`For the case of a DOS-based system, a variety of
`decoy programs having *.COM and *.EXE extensions
`(also referred to as executables) are created, executed,
`read, written to, copied, etc. The use of decoy programs
`for this purpose is described by W. C. Arnold and D. M.
`Chess in “System for Detecting Undesired Alteration of
`Software”, IBM Technical Disclosure Bulletin, Vol. 32,
`No. 11, pages 48-50, April 1990.
`In operation, original copies of the decoy programs
`are stored securely, by example in encrypted form or on
`a separate server, along with checksums of the originals.
`After a specified time interval, or after other specified
`conditions are met, checksums of some or all of the
`decoy programs are taken and compared directly to
`their originals. Alternatively, the decoy programs are
`compared directly to their originals. If none of the de-
`coys are found to have changed, the user is alerted to
`the fact that an anomaly of some sort was detected, but
`that the system was unable to find conclusive evidence
`of a virus. In this case, it is desirable to maintain the
`system in a moderate state of alert for some time period,
`exercising the decoy programs at a reduced priority or
`less frequent intervals.
`It is desirable to have a large variety of decoy pro-
`grams that differ from one another in length, content,
`and name. Furthermore, the decoy programs should be
`placed in and run from various directories, particularly
`those directories in which there are modified executa-
`bles.
`Some DOS viruses are known to become resident in
`memory and are active until the computer is rebooted.
`In such a case, the virus is able to infect any executable
`at any time. Often, a resident virus will infect executa-
`bles as they are brought into memory to be executed.
`Thus, if several decoy programs are run, there is a rea-
`sonable chance that one or more decoy programs will
`be infected by such a virus. In addition to running the
`decoys, the system should perform a variety of other
`operations with the decoy programs, including copying
`them, opening them, creating them, etc. All of these
`operations increase the probability of infection by a
`virus that is resident in memory.
`However, some DOS viruses do not go resident, and
`can infect other programs only when the host program
`in which the virus is imbedded is executed. Such viruses
`typically use static criteria for selecting a program to
`infect, such as the program name or the directory in
`which the program is located. In this case, it is more
`difficult for the decoy programs to become infected. To
`increase the chance of infection, decoy programs are
`preferably placed in locations where they are more
`likely to be selected, such as the directory of the possi-
`bly-infected executable, or directories containing sys-
`tem programs (e.g. the root directory) and commonly-
`used utility programs. In order to infect the decoy pro-
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`000012
`
`000012
`
`

`
`5A4Q723
`
`7
`grams, the potentially-infected executable should be run
`repeatedly.
`If one or more decoy programs is subsequently found
`to have changed from the original, protected version, it
`can be assumed that the changes are due to a virus. A
`comparison of each modified decoy program with its
`corresponding uninfected version enables a copy of the
`virus to be isolated from each decoy.
`
`8
`grams, which can include representations of numerical
`constants, character strings, screen images, work areas
`for computations, addresses, etc., tend to be more likely
`to vary from one instance of the program to another
`than “code” portions, which represent machine instruc-
`tions. It may be that most or all of the “data” bytes in a
`given virus are the same in all instances of that virus, but
`without a careful analysis this is difficult to establish
`with certainty. Thus, a conservative approach assumes
`that “data” bytes may change, and therefore it is desir-
`able to disqualify them from being chosen as portions of
`candidate signatures. Perfect classification of each byte
`as being “code” or “data” is not necessary. An errone-
`ous classification of “code” bytes as “data” bytes is
`more tolerable than an erroneous classification of
`“data” bytes as “code” bytes.
`By example, FIG. 6b illustrates a 24-byte “invariant”
`portion of a virus sample that was generated at Step C.
`The first 4 bytes represent machine instructions, bytes 5
`and 6 represent data, bytes 7-19 represent code, and
`bytes 20-24 represent data. Thus, if Block B of FIG. 4-
`were employed, only bytes 1-4 and 7-19 would be
`passed along to Step E for signature extraction.
`Suitable methods of code-data segregation include,
`but are not limited to, the following techniques.
`1. Conventional static disassembly techniques, such
`as flow analysis.
`2. Simulation of marked viral code in a virtual ma-
`chine, wherein the virtual machine includes a hardware
`and software model of the computer or class of comput-
`ers for which the virus signature is being obtained.
`3. Running the marked viral code through an inter-
`preter or debugger to determine which bytes are actu-
`ally executed. This method requires a “safe” environ-
`ment in which to run the virus, such as the dedicated
`server PD referred to above in the discussion of FIG. lb.
`4. Statistical classification techniques.
`The second method,
`i.e., simulation of the marked
`viral code in a virtual machine, is the safest and most
`reliable, but is often the most computationally intensive.
`The second and third methods each have the advan-
`tage that they are capable of determining whether a
`virus is self-modifying. Viruses that modify themselves
`often do so for the purpose of obscuring their machine
`code portions by making the code appear different from
`one instance of the virus to the next. Such viruses are
`often referred to as self-garblers. Self-garblers typically
`consist of a short decrypting “head” portion followed
`by code which is transformed in an easily-invertible
`manner (e.g. XOR’ed) using a randomly-chosen de-
`cryption key when the virus is replicated. Thus, the
`only portion of the code which remains invariant, from
`one instance of the virus to another, is the head. It is
`helpful to know that a virus is self-modifying (and thus
`potentially self-garbling), so that signatures will only be
`drawn from the decrypting head. This information can
`also be helpful in identifying cases in which what ap-
`peared to be several different viruses in Step C are
`actually only different instances of a single, self-garbling
`virus.
`At the conclusion of Step D, there exists one or more
`sequences of bytes which appear in every instance of
`the one or more viruses that were obtained or generated
`at Step C and which, according to any heuristics em-
`ployed in Block B, are deemed likely to appear in most
`or all possible instances of the virus. The set of candi-
`date viral signatures are all possible contiguous blocks
`of S bytes found in these “probably-invariant” or “sub-
`
`l0
`
`15
`
`20
`
`Step D: Identification of Invariant Code Portions
`In order to maximize the likelihood that a chosen
`signature will appear in all or nearly all progeny of the
`virus, it is preferred to filter out all portions of the virus
`which are observed to or are likely to vary from one
`instance of the virus to another. To achieve this, the
`method employs one, or preferably both, of the tech-
`niques represented as Blocks A and B in FIG. 4. The
`order in which the two techniques are used can be re-
`versed, and the following description is correspond-
`ingly modified to accommodate the opposite ordering
`of the two techniques.
`If several decoy programs are found to be infected, a
`most desirable case, it is important to compare the vari-
`ous versions of the isolated virus programs with one
`another (Block A). If all versions are identical, or if 25
`there is only one version, the entire virus is considered
`to be “invariant”, and can be passed along to block B
`Otherwise, if the various isolated computer virus pro-
`grams differ, but have substantial areas in common, it
`can be assumed that the virus alters itself when it repli-
`cates. As a result, only the common areas are marked
`for further examination by the heuristics employed in
`Block B of FIG. 4. This procedure is illustrated in
`FIG. 6a, in which the variant and “invariant” por-
`tions of a virus are identified by comparing three in-
`stances of it. In FIG. 6a, each byte is shown expressed
`in a hexadecimal representation, and an invariant por-
`tion is identified. The length of the invariant sequence
`of bytes is exemplary. In practice, invariant sequences
`of bytes are often considerably longer than five bytes,
`and more than one such portion typically appears
`within the virus. If there are no substantially similar
`areas,
`this may indicate the presence of a relatively
`sophisticated self-altering virus, or the presence of more
`than one virus. In this case, each of what appear to be
`different viruses are provided separately to the extrac-
`tion method, which determines one or more signatures
`for each. The system is also placed in a moderate state
`of alert, and the user is informed.
`It is noted that a situation may arise in which the
`computer is attacked by a sophisticated, unknown virus,
`every instance of which looks completely different. In
`this case, at the completion of Block A in FIG. 4 the
`virus detector may be unable able to determine whether
`there are several viruses, or one virus that appears dif-
`ferent every time it is replicated. The use of a distress
`signal, described below in relation to Step G, may be
`appropriate for this situation.
`At this point, there are one or more sections of a virus
`program (or a set of different viruses) that are marked as
`being substantially “invariant”. However, even if a
`given section of the virus program is common to all
`copies which have been obtained in Step C, this does
`not guarantee that the section will appear in its entirely
`in every instance of the virus. The purpose of Block B
`is to filter out portions of the “invariant” sections which
`by their nature are more likely to be variable. As a
`general rule, non-executable “data” portions -of pro-
`
`65
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`000013
`
`000013
`
`

`
`9
`stantially invariant” byte sequences, where S is a signa-
`ture length specified by the user or determined by the
`system. Typically, S has a value in the range of approxi-
`mately 12 bytes to 36 bytes. The “probably-invariant”
`byte sequence(s) may be provided an an input file (IF
`32, FIG. 1a) for subsequent processing by the signature
`extraction technique described below.
`There may be several viruses (or virus signatures) of
`interest. As such, a record is maintained of which candi-
`date signatures belong to each virus.
`
`Step E: Signature Extraction
`
`In general, the signature extraction method searches
`through the “probably-invariant” sections of virus code
`for the n best possible signatures (where n is typically
`small, and perhaps 1). A “good” viral signature is one
`which is statistically unlikely to generate false positives
`when run on uninfected, legitimate programs, but will
`always identify the virus if it is present. Depending on
`the nature of the virus scanner which will employ the
`resulting signature, other criteria may be combined
`with this criterion to evaluate the desirability of a given
`candidate signature. Another desirable property is that
`the signature be flexible enough to allow for the detec-
`tion of slight alterations in the viral code. This is espe-
`cially useful because the decoys may not have captured
`the full range of variation of a particular virus. This is
`the reason that the virus scanner referred to in Step B
`should be capable of detecting slightly-altered

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket