`
`Symantec 1003
`IPR of U.S. Pat. No. 7,757,298
`
`
`
`U.S. Patent
`
`Jun. 10, 2003
`
`Sheet 1 of 2
`
`US 6,577,920 B1
`
`[FIG. 1: functional block diagram of the computer system; only reference numerals 17 and 18 survive from the drawing.]
`
`
`
`Sheet 2 of 2
`
`[FIG. 2: flow chart. Box text recovered from the drawing, in order of appearance:]
`  Generate file system event
`  Intercept file system event at file system driver
`  Pass file system event to macro virus controller
`  Does file contain a macro?
`  Determine signature of macro
`  Does first database contain a matching signature?
`  Does second database contain a matching signature?
`  Interrupt event and notify user "VIRUS FOUND"
`  Does third database contain a matching signature?
`  Continue processing file system event
`
`COMPUTER VIRUS SCREENING
`
`FIELD OF THE INVENTION
`
`The present invention relates to the screening of computer
`data for viruses and more particularly to the screening of
`computer data for macro viruses.
`
`BACKGROUND OF THE INVENTION
`
`Computer data viruses represent a potentially serious
`liability to all computer users and especially to those who
`regularly transfer data between computers. Computer
`viruses were first identified in the 1980s, and up until the
`mid-1990s consisted of a piece of executable code which
`attached itself to a bona fide computer program. At that time,
`a virus typically inserted a JUMP instruction into the start of
`the program which, when the program was executed, caused
`a jump to occur to the “active” part of the virus. In many
`cases, the viruses were inert and activation of a virus merely
`resulted in its being spread to other bona fide programs. In
`other cases however, activation of a virus could cause
`malfunctioning of the computer running the program
`including, in extreme cases, the crashing of the computer
`and the loss of data.
`
`Computer software intended to detect (and in some cases
`disinfect) infected programs has in general relied as a first
`step upon identifying those data files which contain executable
`code, e.g. .exe, .com, .bat. Once identified, these files
`are searched (or parsed) for certain signatures which are
`associated with known viruses. The producers of anti-virus
`software maintain up to date records of such signatures
`which may be, for example, checksums.
`
`WO 95/12162 describes a virus protection system in
`which executable data files about to be executed are passed
`from user computers of a computer network to a central
`server for virus checking. Checking involves parsing the
`files for signatures of known viruses as well as for signatures
`of files known to be clean (or uninfected).
`
`In 1995, a new virus strain was identified which infected,
`in particular, files of the Microsoft Office™ system. Given
`the dominant position of Microsoft Office™ in the computer
`market, the discovery of these viruses has caused much
`consternation.
`
`Microsoft Office™ makes considerable use of so-called
`“macros” which are generally small executable programs
`written in a simple high level language. Macros may be
`created, for example, to provide customised menu bars or
`“intelligent” document templates or may be embedded in
`some other file format. For example, macros may be embedded
`in template files (.dot) or even in Microsoft Word™ files
`(.doc).
`
`As the new strains of virus discovered in 1995 infect
`macro files, they are generally referred to as “macro
`viruses”. It will be appreciated that the possibility for macro
`viruses to be spread is great given the frequency with which
`Microsoft Office™ files are copied between two computers
`either by way of floppy disk or via some other form of
`electronic data transfer, e.g. the Internet. Indeed, viruses
`such as “WM/Concept” are known to have spread widely
`and rapidly at a global level.
`
`Producers of anti-virus software have approached the
`macro virus problem by maintaining and continuously
`updating records of macro viruses known to exist in the
`“wild”. As with more conventional viruses, a signature
`(commonly a checksum) is determined for each macro virus
`and these signatures are disseminated to end users of anti-virus
`software. The software generally scans data being
`written to or read from a computer’s hard disk drive for the
`presence of macros having a checksum corresponding to one
`of the identified viruses.
`
`There are a number of problems with these more or less
`conventional approaches. Firstly, the number of macro
`viruses is exploding, with around 3000 identified by mid-1998.
`There is inevitably a time lag between a virus being
`released and its being identified, by which time many
`computers may have been infected. Secondly, end users may
`be slow in updating their systems with the latest virus
`signatures. Again, this leaves a window of opportunity for
`systems to be infected.
`
`WO 98/14872 describes an anti-virus system which uses
`a database of known virus signatures as described above, but
`which additionally seeks to detect unknown viruses based
`upon expected virus properties. However, given the ingenuity
`of virus producers, such a system is unlikely to be
`completely effective against unusual and exotic viruses.
`
`SUMMARY OF THE PRESENT INVENTION
`
`It is an object of the present invention to overcome or at
`least mitigate the above noted disadvantages of existing
`anti-virus software.
`
`This and other objects are met by screening computer data
`to identify macros which do not correspond to known
`certified and acceptable macros.
`
`According to a first aspect of the present invention there
`is provided a method of screening a software file for viral
`infection, the method comprising:
`defining a database of signatures indicative of macros
`previously certified as being virus free;
`scanning said file to determine whether or not the file
`contains a macro; and
`if the file contains a macro, determining whether or not
`the macro has a signature corresponding to one of the
`signatures contained in said database.
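By way of illustration only, the screening step of the first aspect may be sketched as follows. The sketch assumes that the macros have already been extracted from the file as byte strings and that the signature is a SHA-1 digest; the function names and the sample macro bodies are hypothetical and do not appear in the specification.

```python
import hashlib


def macro_signature(macro_bytes: bytes) -> str:
    """Signature of a macro body; SHA-1 is an illustrative choice."""
    return hashlib.sha1(macro_bytes).hexdigest()


def uncertified_macros(macros, certified_signatures):
    """Return the macros whose signature is absent from the certified database."""
    return [m for m in macros if macro_signature(m) not in certified_signatures]


# Hypothetical macro bodies, used only to exercise the check.
known_good = b"Sub FormatLetter()\nEnd Sub"
never_seen = b"Sub AutoOpen()\nEnd Sub"

certified = {macro_signature(known_good)}
flagged = uncertified_macros([known_good, never_seen], certified)
```

In this sketch a file is treated as suspect as soon as `flagged` is non-empty, whether or not the flagged macro corresponds to any virus known at the time.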
`It will be appreciated that embodiments of the present
`invention have the advantage that they may be used to
`effectively block the transfer and/or processing of files
`which contain a previously unidentified (either to the local
`user or to the software producer) macro virus. It is therefore
`less critical (or even unnecessary) for the software to be
`updated to take account of newly detected viruses.
`Preferably, said step of defining a database of signatures
`indicative of macros previously certified as being virus free
`comprises scanning a set of end user applications which are
`known to be virus free to identify macros therein, determin-
`ing a signature for each of the identified macros, and
`compiling the determined signatures into the database. More
`preferably, the step of defining the database comprises the
`further steps of updating the database with additional macro
`signatures. This updating may be done via an electronic link
`between a computer hosting the database (where the scan-
`ning of the file is performed) and a remote central computer.
`Alternatively, the database may be updated by way of data
`stored on an electronic storage medium such as a floppy
`disk. The database may also include signatures correspond-
`ing to widely used proprietary macros, e.g. those used by
`large organisations.
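The compilation and updating of the database described above may be sketched as follows. This is a minimal sketch which assumes that the macros found in the known virus-free applications are already available as byte strings (the extraction itself is format specific and is not shown); all names are hypothetical.

```python
import hashlib


def signature(macro_bytes: bytes) -> str:
    """Illustrative signature: SHA-1 digest of the macro body."""
    return hashlib.sha1(macro_bytes).hexdigest()


def compile_database(clean_macros):
    """Compile one signature per macro identified in known virus-free applications."""
    return {signature(m) for m in clean_macros}


def update_database(database, additional_signatures):
    """Return a new database that also holds signatures received later,
    for example over an electronic link or from a floppy disk."""
    return set(database) | set(additional_signatures)
```

Because the database is modelled as a set, a macro that appears in several applications contributes a single signature, and applying the same update twice has no further effect.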
`
`Preferably, the method comprises defining a second database
`comprising signatures indicative of macro viruses, and
`scanning said file to determine whether or not the file
`contains a signature corresponding to one of the signatures
`contained in the second database. This second database may
`be created at a central site and disseminated to end users by
`floppy disk or direct electronic data transfer.
`Preferably, the method comprises creating a set of signa-
`tures corresponding to a set of user specific macros, certified
`by the user as being virus free. These signatures may be
`added to the first mentioned database, or may be included in
`a separate database. In either case, the method comprises
`scanning a macro identified in a file to determine whether or
`not the macro has a signature corresponding to a signature
`of a user certified macro. The user in this case may be an end
`user, but preferably is a network manager. In the latter case,
`database updates made by the network manager are com-
`municated to the network end user computers where the
`virus screening is performed.
`
`According to a second aspect of the present invention
`there is provided a method of screening a software file for
`viral infection, the method comprising:
`defining a first database of known macro virus signatures,
`a second database of known and certified commercial
`macro signatures, and a third database of known and
`certified local macro signatures;
`scanning said file to determine whether or not the file
`contains a macro; and, if the file contains a macro
`determining a signature for the macro and screening that
`signature against the signatures contained in said databases;
`and
`alerting a user in the event that the macro has a signature
`corresponding to a signature contained in said first
`database and/or in the event that the macro has a
`signature which does not correspond to a signature
`contained in either of the second and third databases.
`
`According to a third aspect of the present invention there
`is provided apparatus for screening a software file for viral
`infection, the apparatus comprising:
`a memory storing a set of signatures indicative of macros
`previously certified as being virus free; and
`a data processor arranged to scan said file to determine
`whether or not the file contains a macro and, if the file
`does contain a macro, to determine whether or not the
`macro has a signature corresponding to one of the
`signatures contained in said database.
`According to a fourth aspect of the present invention there
`is provided a computer memory encoded with executable
`instructions representing a computer program for causing a
`computer system to:
`maintain a database of signatures indicative of macros
`previously certified as being virus free;
`scan data files to determine whether or not the files
`contain a macro; and
`if a file contains a macro, determine whether or not the
`macro has a signature corresponding to one of the
`signatures contained in said database.
`Preferably, the computer program provides for the updating
`of said database with additional macro signatures.
`Preferably, the computer program causes a second database
`to be maintained which comprises signatures indicative
`of macro viruses, and further causes the files to be scanned
`to determine whether or not they contain a signature corresponding
`to one of the signatures contained in the second
`database. More preferably, the computer program causes a
`third database to be maintained which comprises signatures
`indicative of macros defined locally, e.g. at the level of a
`local network to which the programmed computer is connected.
`The computer program causes this third database to
`be scanned for a match between signatures of a file macro
`not already matched in the first and second databases, and
`signatures contained in the third database.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a functional block diagram of a computer system
`in which is installed macro virus screening software; and
`FIG. 2 is a flow chart illustrating the method of operation
`of the system of FIG. 1.
`
`DETAILED DESCRIPTION OF CERTAIN
`EMBODIMENTS
`
`For the purpose of illustration, the following example is
`described with reference to the Microsoft Windows™ series
`of operating systems, although it will be appreciated that the
`invention is also applicable to other operating systems such
`as Macintosh system and OS/2. With reference to FIG. 1, an
`end user computer 1 has a display 2 and a keyboard 3. The
`computer 1 additionally has a processing unit and a memory
`which provide (in functional terms) a graphical user inter-
`face layer 4 which provides data to the display 2 and
`receives data from the keyboard 3. The graphical user
`interface layer 4 is able to communicate with other com-
`puters via a network interface 5 and a network 6. The
`network is controlled by a network manager 7.
`
`Beneath the graphical user interface layer 4, a number of
`user applications are run by the processing unit. In FIG. 1,
`only a single application 8 is illustrated and may be, for
`example, Microsoft Word™. The application 8 communicates
`with a file system 9 which forms part of the Microsoft
`Windows™ operating system and which is arranged to
`handle file access requests generated by the application 8.
`These access requests include file open requests, file save
`requests, file copy requests, etc. The lowermost layer of the
`operating system is the disk controller driver 10 which
`communicates with and controls the computer’s hard disk
`drive 11. The disk controller driver 10 also forms part of the
`Microsoft Windows™ operating system.
`
`Located between the file system 9 and the disk controller
`driver 10 is a file system driver 12 which intercepts file
`system events generated by the file system 9. The role of the
`file system driver 12 is to co-ordinate virus screening
`operations for data being written to, or read from, the hard
`disk drive 11. A suitable file system driver 12 is, for example,
`the GATEKEEPER™ driver which forms part of the
`F-SECURE ANTI-VIRUS™ system available from Data
`Fellows Oy (Helsinki, Finland). In dependence upon certain
`screening operations to be described below, the file system
`driver 12 enables file system events to proceed normally or
`prevents file system events and issues appropriate alert
`messages to the file system 9.
`
`The file system driver 12 is functionally connected to a
`macro virus controller 13, such that file system events
`received by the file system driver 12 are relayed to the macro
`virus controller 13. The macro virus controller is associated
`with three databases 14 to 16 which each contain a set of
`“signatures” previously determined for respective macros.
`For the purposes of this example, the signature used is a
`checksum derived using a suitable checksum calculation
`algorithm, such as the US Department of Defence Secure
`Hash Algorithm (SHA) or the older CRC 32 algorithm.
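The two kinds of checksum mentioned above may be computed as in the following sketch, which uses the `hashlib` and `zlib` modules of the Python standard library. SHA-1 is assumed here as the member of the SHA family; the specification does not say which variant is intended, and the function names are hypothetical.

```python
import hashlib
import zlib


def sha1_signature(macro_bytes: bytes) -> str:
    """160-bit SHA-1 digest of the macro body, as a hexadecimal string."""
    return hashlib.sha1(macro_bytes).hexdigest()


def crc32_signature(macro_bytes: bytes) -> int:
    """32-bit CRC of the macro body, as an unsigned integer."""
    return zlib.crc32(macro_bytes) & 0xFFFFFFFF
```

A 32-bit CRC is cheap to compute but offers far fewer distinct values than a cryptographic digest, so an uncertified macro is more likely to share a signature with a certified one by coincidence or by design.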
`
`The first database 14 contains a set of signatures derived
`for known macro viruses. The signatures in this database 14
`are determined by the provider of the file system driver 12
`and the macro virus controller 13 and are regularly updated
`to take into account newly discovered viruses. Updates may
`be provided by way of floppy disks or directly by downloading
`them from a remote server 17 connected to the
`Internet 18.
`
`The second database 15 contains a set of signatures
`derived for commercially available macros. These macros
`include those supplied with the Microsoft Office™ operating
`system and with user applications such as Microsoft
`Word™. Again, these signatures are determined by the
`provider of the file system driver 12 and the macro virus
`controller 13 and are regularly updated to take into account
`newly available products.
`
`The third database 16 contains a set of signatures which
`are derived for macros created and used at the local network
`level, for example letter templates and the like (of course
`this database may be empty if no local macros are defined).
`Once a new local macro is created, typically at the network
`manager 7, the macro is processed by the network manager
`7 to derive the corresponding (checksum) signature. This is
`then relayed via the local network 6 to the end user computer
`1 where it is added to the third database 16. It is usually the
`case that only the network manager has the authority to
`modify this database 16, whilst the first and second databases
`14, 15 can be updated only by the network manager 7
`using signatures specified by the anti-virus software provider.
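The restriction that only the network manager may modify the third database may be sketched as follows. The class name, the role strings and the use of `PermissionError` are assumptions made purely for illustration; the specification does not describe how the authority is enforced.

```python
class LocalSignatureDatabase:
    """Sketch of the third database: locally certified macro signatures."""

    def __init__(self, authorised_role="network_manager"):
        self._signatures = set()
        self._authorised_role = authorised_role

    def add(self, signature, role):
        """Add a signature, refusing callers that lack the authorised role."""
        if role != self._authorised_role:
            raise PermissionError("only the network manager may modify this database")
        self._signatures.add(signature)

    def __contains__(self, signature):
        return signature in self._signatures
```

An end user computer holding such a database can still look signatures up with the `in` operator, whereas an attempt to add a signature under any other role is rejected.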
`
`Upon receipt of a file system event, the macro virus
`controller 13 first analyses the file associated with the event
`(and which is intended to be written to the hard disk drive
`11, read, copied, etc.) to determine if the file contains a
`macro. This may include examining the file name extension
`(e.g. to identify .dot, .doc files) and/or scanning the file for
`embedded macros. If one or more macros are identified in the
`file, a checksum signature is determined for the or each
`identified macro.
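The examination of the file name extension may be sketched as follows. Only the two extensions named above are listed, and the function name is hypothetical; scanning the file contents for embedded macros depends on the file format and is not attempted here.

```python
from pathlib import PureWindowsPath

# Extensions named in the description; a real deployment may need more.
MACRO_CAPABLE_EXTENSIONS = {".dot", ".doc"}


def may_contain_macro(file_name: str) -> bool:
    """Return True if the file name extension marks a macro-capable file."""
    return PureWindowsPath(file_name).suffix.lower() in MACRO_CAPABLE_EXTENSIONS
```

The comparison is made case-insensitive because file names on the operating systems of the example do not distinguish upper and lower case. Since an extension is easily changed, a check of this kind serves only as a first filter ahead of a scan of the file contents.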
`
`Assuming that a single macro is identified in the file, the
`macro virus controller 13 scans the first database 14 to
`determine whether or not the corresponding signature is
`present in that database 14. If the signature is found there,
`the macro virus controller 13 reports this to the file system
`driver 12. The file system driver 12 in turn causes the system
`event to be suspended and causes an alert to be displayed to
`the user that a known virus is present in the file. The file
`system driver 12 may also cause a report to be sent to the
`network manager 7 via the local network 6.
`
`If this first scan does not locate a known virus, the macro
`virus controller 13 proceeds to search the second database
`15 to determine whether or not the signature for the identified
`macro is present in that database 15. If the signature is
`found, then an appropriate report is sent to the file system
`driver 12, which in turn allows the file event to proceed
`normally. However, if the signature is not found in the
`second database 15, this indicates that the identified macro
`is unknown to the system and may be a new and unknown
`virus.
`
`Before a warning is issued to the user, the macro virus
`controller 13 searches the third database 16 to determine
`whether the as yet unidentified macro corresponds to a
`locally defined macro. If the answer is yes, then the macro
`virus controller 13 reports accordingly to the file system
`driver 12 and the event is allowed to proceed. On the other
`hand, if the identified macro signature is not found in the
`third database 16, then the macro virus controller 13 reports
`this to the file system driver 12 and the event is suspended.
`Again, a report is sent to the network manager 7, and also
`possibly to the remote server 17 of the software provider.
`This report may be accompanied by a copy of the “guilty”
`macro.
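The order in which the three databases are consulted may be sketched as follows. The sketch assumes that each database is a set of signature strings; the `Verdict` enumeration and the function names are hypothetical and serve only to make the outcome of each stage explicit.

```python
from enum import Enum


class Verdict(Enum):
    KNOWN_VIRUS = "known virus"
    CERTIFIED_COMMERCIAL = "certified commercial macro"
    CERTIFIED_LOCAL = "certified local macro"
    UNKNOWN = "unknown macro"


def classify(signature, virus_db, commercial_db, local_db):
    """Consult the first, second and third databases in that order."""
    if signature in virus_db:
        return Verdict.KNOWN_VIRUS
    if signature in commercial_db:
        return Verdict.CERTIFIED_COMMERCIAL
    if signature in local_db:
        return Verdict.CERTIFIED_LOCAL
    return Verdict.UNKNOWN


def event_may_proceed(verdict):
    """The file system event proceeds only for certified macros."""
    return verdict in (Verdict.CERTIFIED_COMMERCIAL, Verdict.CERTIFIED_LOCAL)
```

Because the first database is consulted before the others, a signature that appears both there and in a database of certified macros is still treated as a known virus. Keeping the verdict separate from the decision to proceed allows the caller to word the alert differently for a known virus and for an unknown macro.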
`
`The file scanning system described above is further illus-
`trated by reference to the flow chart of FIG. 2.
`
`It will be appreciated by the person of skill in the art that
`various modifications may be made to the embodiment
`described above without departing from the scope of the
`present invention. For example, the file system driver 12
`may make use of further virus controllers including control-
`lers arranged to screen files for viruses other than macro
`viruses. The file system driver 12 may also employ disin-
`fection systems and data encryption systems.
`It will also be appreciated that the file system driver 12
`typically receives all file access traffic, and not only that
`relating to hard disk access. All access requests may be
`passed to the macro virus controller 13 which may select
`only hard disk access requests for further processing or may
`also process other requests relating to, but not limited to,
`floppy disk data transfers, network data transfers, and
`CDROM data transfers.
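The selection of access requests for further processing may be sketched as follows. The `AccessRequest` record and the medium labels are hypothetical; the specification does not state how a request identifies the medium to which it relates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    path: str
    medium: str  # assumed labels: "hard_disk", "floppy", "network", "cdrom"


def select_for_screening(requests, media=frozenset({"hard_disk"})):
    """Keep only the requests whose medium is among those to be screened."""
    return [r for r in requests if r.medium in media]
```

With the default argument only hard disk requests are retained; passing a larger set of media extends the screening to floppy disk, network and CDROM transfers without altering the selection logic.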
`We claim:
`
`1. A method of screening a software file for viral infection,
`the method comprising:
`defining a first database of known macro virus signatures,
`a second database of known and certified commercial
`macro signatures, and a third database of known and
`certified local macro signatures;
`scanning said file to determine whether or not the file
`contains a macro; and, if the file contains a macro
`determining a signature for the macro and screening that
`signature against the signatures contained in said databases;
`and
`alerting a user in the event that the macro has a signature
`corresponding to a signature contained in said first
`database and/or in the event that the macro has a
`signature which does not correspond to a signature
`contained in either of the second and third databases.
`
`2. A method according to claim 1, wherein said step of
`defining a second database of known and certifiable com-
`mercial macro signatures comprises scanning a set of end
`user applications which are known to be virus free to
`identify macros therein, determining a signature for each of
`the identified macros, and compiling the determined signa-
`tures into the second database.
`
`3. A method according to claim 1, wherein the step of
`defining the third database comprises the further steps of
`updating the third database with additional macro signa-
`tures.
`
`4. A method according to claim 3, wherein said updating
`steps are done via an electronic link between a computer
`hosting the database, where the scanning of the file is
`performed, and a remote central computer.
`5. A method according to claim 1, wherein the user is a
`network manager and database updates made by the network
`manager are communicated to network end user computers
`where virus screening is performed.
`6. A method according to claim 1, wherein said step of
`determining a signature for the macro and screening that
`signature comprises deriving a signature of the macro and
`comparing the derived signature with signatures in the
`databases.
`
`7. A method of screening a software file to determine
`whether any macro contained therein does or does not
`contain a virus, the method comprising:
`defining a first database of known macro virus signatures,
`a second database of known and certified commercial
`macro signatures, and a third database of known and
`certified local macro signatures;
`scanning said file to determine whether or not the file
`contains a macro; and
`if the file contains a macro, determining whether or not
`the macro has a signature corresponding to one of the
`signatures contained in said databases.
`8. Apparatus for screening a software file for viral
`infection, the apparatus comprising:
`a memory storing a first database of known macro virus
`signatures, a second database of known and certified
`commercial macro signatures, and a third database of
`known and certified local macro signatures; and
`a data processor arranged to scan said file to determine
`whether or not the file contains a macro and, if the file
`does contain a macro, to determine whether or not the
`macro has a signature corresponding to one of the
`signatures contained in said databases.
`9. The apparatus according to claim 8, wherein, in order
`to determine whether or not the macro has a signature
`corresponding to one of the signatures contained in said
`databases, said data processor is arranged to derive a signature
`of the macro and to compare the derived signature
`with signatures in the databases.
`10. A computer memory encoded with executable instructions
`representing a computer program for causing a computer
`system to:
`maintain a first database of known macro virus signatures,
`a second database of known and certified commercial
`macro signatures, and a third database of known and
`certified local macro signatures;
`scan data files to determine whether or not the files
`contain a macro; and
`if a file contains a macro, determine whether or not the
`macro has a signature corresponding to one of the
`signatures contained in said second database.
`11. A computer memory according to claim 10, wherein
`the computer program provides for the updating of said third
`database with additional macro signatures.
`12. A computer memory according to claim 10, wherein
`the computer program causes the files to be scanned to
`determine whether or not they contain a signature corre-
`sponding to one of signatures contained in the first database.
`13. A computer memory according to claim 12, wherein
`the computer program causes the third database to be
`scanned for a match between signatures of a file macro not
`already matched in the first and second databases, and
`signatures contained in the third database.
`14. The computer memory according to claim 10, wherein
`in order to determine whether or not the macro has a
`signature corresponding to one of the signatures contained in
`said databases, said computer program causes the computer
`system to derive a signature of the macro and to compare the
`derived signature with signatures in the databases.
`
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`
`CERTIFICATE OF CORRECTION
`
`PATENT NO.       : 6,577,920 B1
`APPLICATION NO.  : 09/165279
`DATED            : June 10, 2003
`INVENTOR(S)      : Mikko Hypponen et al.
`
`It is certified that error appears in the above-identified patent and that said Letters Patent is
`hereby corrected as shown below:
`
`Title Page, Item (75): Please add --Alexey Kirichenko, Espoo (FI)-- (As third inventor)
`
`Signed and Sealed this
`
`Fourth Day of August, 2009
`
`JOHN DOLL
`Acting Director of the United States Patent and Trademark Office