Boston Park Plaza Hotel and Towers
20-22 September

Proceedings of
The Fifth International
Virus Bulletin Conference
Copyright © 1995
Virus Bulletin Ltd
21 The Quadrant, Abingdon, OX14 3YS, England
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,
without prior permission of the publishers.
No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a
matter of products liability, negligence or otherwise, or from any use or operation of any methods, products,
instructions or ideas contained in the material herein.
DAY 1
Corporate stream
The anti-virus strategy system
Sarah Gordon
Blessings in disguise: building out of disaster
Human dimension of computer viruses
Fully automated response for in the wild viruses (FAR-ITW)
Technical stream 1
The PC boot sequence, its risks and opportunities
Modern methods of detecting and eradicating known and unknown viruses
Evaluating distributed virus protection products
Technical stream 2
Dynamic detection and classification of computer viruses using general
Morton Swimmer
Flash BIOS - a new security loophole
Automatic virus analyser system
The problems in creating goat files
Automatic testing of memory resident anti-virus software
Late additions
Why do we need heuristics?
Frans Veldman
DAY 2
Corporate stream
A testing time
Paul Robinson
Fending off viruses in the university community: a case study of the Macintosh
Judy Edwards
Recent viruses, virus writers and routes of virus spread in Hong Kong and China
Allan Dyer
Case study of virus control in a large organisation
Lucijan Caric & Philip Kruss
Computer viruses: perspective
Steve White, Jeffrey Kephart & David Chess
Robin Kinney
Harmless and useful viruses can hardly exist
The effect of computer viruses on OS/2 and Warp
John Morar & David Chess
Technical stream 2
Heuristic scanners: artificial intelligence?
Righard Zwienenberg
Virus detection - ‘the brainy way’
Glenn Coates & David Leigh
Data security problems associated with high capacity IDE hard disks
Scanners of the year 2000: heuristics
Dmitry Gryaznov
Computer viruses and artificial intelligence
Late additions
The evolution of polymorphic viruses
Fridrik Skulason
Macro viruses - the sum of all fears?
Chris Baxter
Morton Swimmer
Virus Test Center, University of Hamburg, Odenwaldstr. 9, 20255 Hamburg, Germany
Tel +49 404 910041 - Fax +49 405 471 5226 - Email
Baudouin Le Charlier and Abdelaziz Mounji
F.U.N.D.P., Institut d’Informatique, University of Namur, Belgium
Email /
The number of files which need processing by virus labs is growing exponentially. Even though only a small
proportion of these files will contain a new virus, each file requires examination. The normal method for
dealing with files is still brute force manual analysis. A virus expert runs several tests on a given file and
delivers a verdict on whether it is virulent or not. If it is a new virus, it will be necessary to detect it. Some
tools have been developed to speed up this process, ranging from programs which identify previously
classified files to programs that generate detection data. Some anti-virus products have built-in mechanisms
based on heuristics, which enable them to detect unknown viruses. Unfortunately all these tools have
In this paper, we will demonstrate how an emulator is used to monitor the system activity of a virtual PC,
and how the expert system ASAX is used to analyse the stream of data which the emulator produces. We use
general rules to detect real viruses generically and reliably, and specific rules to extract details of their
behaviour. The resulting system is called VIDES: it is a prototype for an automatic analysis system for
computer viruses and possibly a prototype anti-virus product for the emerging 32-bit PC operating systems.
Virus researchers must cope with many thousands of suspected files each month, but the problem is not so
much the number of new viruses (which number perhaps a few hundred and grow at a nearly exponential
rate) as the number of files the researcher receives and must analyse - the glut. Out of perhaps one hundred
files, one may actually contain a new virus. Unfortunately, there are no shortcuts. Every file has to be analysed.
The standard method of sorting out such files is still brute force manual analysis, requiring specialists.
Some tools have been developed to help cope with the problem, ranging from programs which identify and
remove previously classified files and viruses to utilities which extract strings from infected files that aid in
identifying the viruses. However, none of the solutions is satisfactory. Clearly, more advanced tools are needed.
In this paper, the concept of dynamic analysis as applied to viruses is discussed. This is based on an idea
called VIDES (Virus Intrusion Detection Expert System), coined at the Virus Test Center [BFHS91]. The
system will comprise a PC emulation and an IDES-like expert system. It should be capable of detecting
viral behaviour using a set of a priori rules, as shown in the preliminary work done with Dr. Fischer-Hübner.
Furthermore, advanced rules will help in classifying the detected virus.
The present version of VIDES is only of interest to virus researchers; it is not designed to be a practical
system for the end-user - its demands on processing power and hardware platform are too high. However, it
can be used to identify unknown viruses rapidly and provide detection and classification information to the
researcher. It also serves as a prototype for the future application of intrusion detection technology in
detecting malicious software under future operating systems, such as OS/2, MS-Windows NT and 95,
Linux, Solaris, etc.
The rest of the paper is organized as follows: Section 2 presents the current state of the art in anti-virus
technology; Section 3 describes a generic virus detection rule; Section 4 discusses the architecture of the PC
auditing system; Section 5 shows how the expert system ASAX is used to analyse the activity data collected
by the PC emulator; and finally, Section 6 contains some concluding remarks.
For the purpose of discussion it will be necessary to define the term computer virus.
There is still no universally agreed definition for a computer virus. What is missing is a description which
is still general enough to account for all possible implementations of computer viruses. An attempt was
made in [Swi95], which is the result of many years of experience with viruses in the Virus Test Center. The
following definition for a computer virus is the result of discussion in comp.virus (Virus-L) derived from
Def 1
A Computer Virus is a routine or a program that can ‘infect’ other programs by modifying them
or their environment such that a call to an infected program implies a call to a possibly evolved,
functionally similar, copy of the virus.
A more formal, but less useful, definition of a computer virus can be found in [Coh85]. Using the formal
definition, it was possible to prove the virus property undecidable.
We talk of the infected file as the host program. System viruses infect system programs, such as the boot
or Master Boot Sector, whereas file viruses infect executable files such as EXE or COM files. For an
in-depth discussion of the properties of viruses, please refer to literature such as: [Hru92], [SK94], [Coh94] or
Today, anti-virus technology can be divided into two approaches: the virus-specific and the generic
approach. In principle, the former requires knowledge of the viruses before they can be detected. Due to
advances in technology, this prerequisite is no longer entirely valid in many of the modern anti-virus
products. This type of technology is known to us as a scanner. The latter attempts to detect a virus by
observing attributes characteristic of all viruses. For instance, integrity checkers detect viruses by checking
for modifications in executable files, a characteristic of many (although not all) viruses.
Virus-specific detection is by far the most popular type of virus protection used on PCs. Information
from the virus analysis is used in the so-called scanner to detect it. Usually, a scanner uses a database of
virus identification information which enables it to detect all viruses previously analysed.
The term scanner has become increasingly inaccurate. The term comes from lexical scanner, i.e.
a pattern matching tool. Traditionally scanners have been just that. The information extracted from viruses
consisted of strings which were representative of that particular virus. This means that the string has to:
- differ significantly from all other viruses, and
- differ significantly from strings found in bona fide anti-virus programs.
Finding such strings was the entire art of anti-virus program writing until polymorphic viruses appeared on
the scene.
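The string search described above can be illustrated with a minimal sketch. It is not taken from any product: the signature names and byte strings are invented for illustration, and a real scanner would use a far larger database and a faster multi-pattern search.

```python
from typing import Dict, List

# Hypothetical signature database: virus name -> byte string assumed to be
# representative of that virus. Names and bytes are invented for illustration.
SIGNATURES: Dict[str, bytes] = {
    "Example.A": bytes.fromhex("b44ecd21720ab8023d"),
    "Example.B": bytes.fromhex("e800005d81ed0301"),
}


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures that occur anywhere in data."""
    return sorted(name for name, sig in signatures.items() if sig in data)


# A buffer of NOP bytes stands in for a clean program; the second buffer has
# one of the hypothetical signatures embedded in it.
clean = b"\x90" * 64
infected = clean + SIGNATURES["Example.B"] + clean

print(scan(clean))
print(scan(infected))
```

The two requirements listed above correspond to the keys of such a table being unambiguous: a string that also occurs in another virus or in a legitimate program would produce a misidentification or a false alarm.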
Encrypted viruses were the first minor challenge to string searching methods. The body of the virus was
encrypted in the host file, and could not be sought, due to its variable nature. However, the body was
prepended by a decryptor-loader which must be in plain text (unencrypted code); otherwise it would not be
executable. This decryptor can still be detected using strings, even if it becomes difficult to differentiate
between viruses.
Polymorphic viruses are the obvious next step in avoiding detection. Here, the decryptor is implemented
in a variable manner, so that pattern matching becomes impossible or very difficult. Early polymorphic
viruses were identified using a set of patterns (strings with variable elements). Moreover, simple virus
detection techniques are made unreliable by the appearance of the so-called Mutation Engines such as
MtE and TPE (Trident Polymorphic Engine). These are object library modules generating variable
implementations of the virus decryptor. They can easily be linked with viruses to produce highly
polymorphic infectors. Scanning techniques are further complicated by the fact that the resulting viruses
do not have any scan strings in common even if their structure remains constant. When polymorphic
technology improved, statistical analysis of the opcodes was used.
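A pattern with variable elements, as used against early polymorphic viruses, can be sketched with a regular expression over bytes. The pattern below is a made-up example (a MOV AX,imm16 instruction with any 16-bit operand, followed by INT 21h); it is not a signature of any real virus.

```python
import re

# Hypothetical pattern: opcode B8 (MOV AX,imm16), two wildcard bytes for the
# operand, then CD 21 (INT 21h). re.DOTALL makes "." match every byte value,
# including 0x0A, which "." would otherwise skip.
PATTERN = re.compile(rb"\xB8..\xCD\x21", re.DOTALL)


def matches(data: bytes) -> bool:
    """Return True if the wildcard pattern occurs anywhere in data."""
    return PATTERN.search(data) is not None


print(matches(b"\x90\xB8\x00\x4C\xCD\x21"))  # operand bytes 00 4C
print(matches(b"\xB8\x0A\x0A\xCD\x21"))      # operand bytes 0A 0A
print(matches(b"\xB8\x00\xCD\x21"))          # only one operand byte
```

Such patterns tolerate variation in data bytes, but not the reordering and substitution of instructions that mutation engines introduce, which is why the text notes that they eventually stopped being sufficient.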
Recently, the best of the scanners have shifted course from merely detecting viruses to attempting to
identify the virus. This is often done with added strings, perhaps position dependent, or checksums, over the
invariant part of the virus. To support this, many anti-virus products have implemented machine-code
emulators so that the virus’ own decryptor can be used to decrypt the virus. Using these enhancements, the
positive identification of even polymorphic viruses poses no problem.
The next shift many scanners are presently experiencing is away from known-virus-only detection to
detection of unknown viruses. The method of choice is heuristics. Heuristics are built into an anti-virus
product in an attempt to deduce whether a file is infected or not. This is most often done by looking for a
pattern of certain code fragments that occur most often in viruses and hopefully not in bona fide programs.
Heuristic analysis suffers from a moderate to high false-positive rate. Of course, a manufacturer of a
heuristic scanner will improve the heuristics both to avoid false positives and still find all new viruses, but
both cannot be achieved completely. Usually, a heuristic scanner will contain a ‘traditional’ pattern-matching
component, so that viruses can be identified by name.
Computer viruses must replicate to be viruses. This means that a virus must be observable by its mechanism
of replication.
Unfortunately, it is not as easy to observe the replication as it may seem. DOS, in its various flavours,
provides no process isolation, or even protection of the operating system from programs. This means that
any monitoring program can be circumvented by a virus which has been programmed to do so. There used to
be many anti-virus programs which would try to monitor system activity for viruses, but were not proof
against all viruses. This problem led to the demise of many such programs. Later in the paper, we shall
discuss how we avoided the problem when implementing VIDES.
A more common approach is to detect symptoms of the infection such as file modifications. This type of
program is usually called an integrity checker or checksummer.
When programs are installed on the PC, checksums are calculated over the entire file, or over portions of the
file. These checksums are then used to verify that the programs have not been modified. The shortcoming of
this method is that the integrity checker can detect a modification in the file, but cannot determine whether
the modification is due to a virus or not. A legitimate modification to, for instance, the data area of a
program will cause the same alarm as a virus infection.
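The principle of a checksummer can be sketched in a few lines. The following is an illustration only: it works on in-memory file contents, and the use of SHA-256 and the file names are assumptions made for the example, not a description of any particular integrity checker.

```python
import hashlib
from typing import Dict, List


def checksum(content: bytes) -> str:
    """Checksum over the entire file content (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(content).hexdigest()


def build_baseline(files: Dict[str, bytes]) -> Dict[str, str]:
    """Record a checksum for every program at installation time."""
    return {name: checksum(content) for name, content in files.items()}


def modified(files: Dict[str, bytes], baseline: Dict[str, str]) -> List[str]:
    """Return the names of files whose checksum differs from the baseline.

    Files that have no baseline entry are reported as well.
    """
    return sorted(
        name for name, content in files.items()
        if baseline.get(name) != checksum(content)
    )


# Hypothetical programs, represented by their raw contents.
installed = {"EDIT.COM": b"\xC3", "TOOL.EXE": b"MZ\x00\x00"}
baseline = build_baseline(installed)

# Prepending a jump to EDIT.COM changes its checksum; TOOL.EXE is untouched.
later = {"EDIT.COM": b"\xE9\x10\x00" + b"\xC3", "TOOL.EXE": b"MZ\x00\x00"}
print(modified(later, baseline))
```

The sketch also shows the shortcoming named above: `modified` reports that EDIT.COM changed, but nothing in the checksum says whether a virus or a legitimate update caused the change.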
Another problem is virus technology aimed specifically against anti-virus products. Advances in stealth and
tunnelling technology have made updates necessary. There have also been direct attacks against
particular integrity checkers, rendering them useless. Again, the lack of support from the operating
system makes the prevention of such attacks very difficult. As a consequence, the acceptance of such
products is low.
The non-specific nature of the detection has little appeal for many users. Even generic repair
facilities in the anti-virus products do not help, despite these methods effectively rendering identification
unnecessary. The problem is partly understandable. The user is concerned with his data. Merely
disinfecting the programs is not enough if data has been manipulated. Only if the virus has been
identified and analysed can the user determine if his data was threatened.
Generic virus detection technology should not be dismissed. It is just as valid as virus-specific technology.
The problems so far have stemmed from the permissiveness of the underlying operating system, DOS, and
from the limits in the programs. Both problems can be addressed.
Before we can attempt to detect a virus using ASAX, we need to model the virus attack strategy. This is
then translated into RUSSEL, the rule-based language which ASAX uses to identify the virus attack.
State transition diagrams are eminently suitable for representing virus infection scenarios. In this model of
representation, we distinguish two basic components: a node in a state transition diagram represents some
aspects of the computing system state. Arcs represent actions performed by a program in execution.
Given a (current) state s_i, the action a takes the system from the state s_i to the state s_f, as shown in Figure
1. The infection process played by a virus can be viewed as a sequence of actions which drives the system
from an initial clean state to a final infectious state, where some files are infected. In order to get a complete
description of the actual scenario, a state is adorned by a set of assertions, characterizing the objects as
affected by actions.
Figure 1: State transition diagram
In practice, we only represent those actions relevant to the infection scenario. As a result, many possible
actions may occur between adjacent states, but are not recorded because they do not entail a modification in
the current state. In terms of auditing, irrelevant audit records may be present in the sequence of audit
records representing the infection signature.
For the sake of simplicity, discussion of the generic detection rules is based on the state transition
diagrams described above.
VIDES uses three types of detection rules: generic detection rules, virus-specific rules and other rules. As their
name implies, generic rules are used to detect all viruses which use a known attack pattern. For this, models
of virus behaviour are needed for the target system (in our case MS-DOS). Virus-specific rules use
information from a previous analysis to detect that specific virus, or direct variants. These rules are similar
to virus-specific detection programs, except for the fact that they analyse the dynamic behaviour of the virus
instead of its code. Finally, there are the ‘other rules’ for gleaning other information from the virus which
can be used in its classification.
We will not go into the virus-specific rules or the ‘other’ rules, concentrating instead on the generic rules.
In developing a generic rule for detecting viruses, we need to have a model for the virus attack. No one
model will do, because MS-DOS viruses can choose from many effective strategies. This is
compounded by the diversity of executable file types for MS-DOS. Fortunately for us, the majority of
viruses have chosen one particular strategy, and infect only two types of executable files. This means that
we can detect most viruses with very few rules. On the other hand, a virus which uses an unknown attack
strategy will not be detected. For this reason, the prototype analysis system contains an auxiliary static
analysis component to detect such problems.
In the following, we will develop a generic rule which detects file infectors that modify the file directly to
gain control over that file. We will concentrate on COM file infectors. EXE file infectors are detected in an
analogous way.
We must make two assumptions about the behaviour of DOS viruses to help us build the rule.
Assumption 1:
A file-infecting virus modifies the host file in such a way that it gains control over the
host file when the host file is run.
This is a specific version of the virus definition (Def 1). However, it doesn’t specify when the virus gains
control over the host file.
Assumption 2:
The virus in an infected file receives control over the file before the original host program does.
That is, when the infected file is run, the virus is run before the host program.
Discussion: If the virus never gained control over the host file, it would not fulfil the definition of a virus.
This observation leads to Assumption 1. However, there is no reason (in the definition) why the virus must
gain control before the host does.
We make an additional assumption that the virus does gain control before the host program does. The reason
we do this is to avoid very blatant false positives. However, it should be noted that Assumption 2 does not
result from the virus definition, and will cause some viruses to be missed. For these cases, other rules are needed.
With respect to Assumptions 1 and 2, we are looking for two possible infection strategies:
1 The virus is overwriting. Therefore, we are looking for a write to the beginning of the file (BOF),
without a previous read to the same location. Other reads and writes are permitted.
2 The virus is non-overwriting. We expect to see a read to BOF, then a write to BOF. Before, in
between, and after these two events, other reads and writes are permitted.
The assumption in both cases is that the write to BOF causes the virus to gain control on execution.
Figure 2: Generic rule for identifying COM file infectors
In the case of a non-overwriting virus, we assume that the virus first reads the original code at BOF and
then replaces it with its own code, usually a jump to the virus body. In most cases, the number of bytes read
will be the same as the number of bytes written, but we cannot assume this. In the case of an overwriting
virus, the code is not read (and saved somewhere), but overwritten.
Other reads and writes are not actually relevant to the detection of the virus. They can be logged and used in
generating virus-specific rules.
The rule is initiated by the opening of a file (in this case a COM file). The rule is terminated by a close of
the file, where this does not have to be done by the virus itself. In between these two events, we expect the
actual infection to occur. We look for the read BOF followed by the write BOF or the write BOF without
the read. Other administrative operations, like tracking the file position, are also done by the rule. This is
shown in the state transition diagram of Figure 2.
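The generic rule can be sketched as a small state machine over a stream of file events. This is an illustration in Python, not the RUSSEL rule used in VIDES: the `Event` record and the `classify` function are hypothetical, file positions are assumed to be already resolved to offsets, and reopening of the file is not handled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical audit record for one file operation."""
    op: str          # "open", "read", "write" or "close"
    name: str        # name of the file the operation refers to
    offset: int = 0  # file position at which a read or write starts


def classify(events: Iterable[Event], target: str) -> Optional[str]:
    """Apply the generic rule to one target file.

    Returns "overwriting" or "non-overwriting" if a write to the beginning
    of the file (offset 0) is seen between open and close, otherwise None.
    """
    is_open = False
    read_bof = False
    verdict: Optional[str] = None
    for ev in events:
        if ev.name.upper() != target.upper():
            continue                      # events on other files are ignored
        if ev.op == "open":
            is_open, read_bof, verdict = True, False, None  # rule initiated
        elif not is_open:
            continue                      # nothing to track before the open
        elif ev.op == "read" and ev.offset == 0:
            read_bof = True               # original code at BOF was read
        elif ev.op == "write" and ev.offset == 0 and verdict is None:
            verdict = "non-overwriting" if read_bof else "overwriting"
        elif ev.op == "close":
            return verdict                # rule terminated by the close
    return None                           # file never closed: no verdict


appender = [
    Event("open", "GOAT.COM"),
    Event("read", "GOAT.COM", 0),      # save the original first bytes
    Event("write", "GOAT.COM", 1000),  # append the virus body
    Event("write", "GOAT.COM", 0),     # plant the jump at BOF
    Event("close", "GOAT.COM"),
]
print(classify(appender, "GOAT.COM"))
```

Reads and writes at other offsets pass through without changing the state, which mirrors the statement that they are permitted before, between and after the two relevant events.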
Some viruses cause problems for the rule by closing the file after a first set of operations. This is handled
by a reopen mechanism which waits for a possible open event on the same file from the virus. In order that
this rule does not stay active indefinitely and clog up the rule memory, there are a number of terminating
events. In Figure 2, reopen is abstracted as a transition element, whereas its implementation is as a separate rule.
MS-DOS provides two methods of accessing files. The most common method uses file handles. Access
using file control blocks (FCB) was provided for compatibility with CP/M, and is rarely used, even by
viruses. However, because it is used, we need a separate rule to handle this method. The basic rule stays the
same, but internal handling of the data is different.
We could avoid this problem by abstracting the audit data to give us a generic view of the system events.
This way, we could reduce the number of audit records to only relevant higher-level records by using a
filter. After that, processing becomes simpler as the problems of reopens and handle/FCB use disappear.
This method also allows us to apply the rules on non-MS-DOS systems which provide similar file handling.
As a matter of fact, ASAX itself is the logical choice to act as the filter. The first ASAX system reads the
raw audit trail, converts it into generic data, and pipes its output as a NADF file for further processing (see
Section 5). Using ASAX as a filter allows us to reduce the complexity of maintaining such a system while
not sacrificing any power.
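The idea of such a filter can be sketched as follows. This is a plain Python illustration, not ASAX, RUSSEL or the NADF format: the `Raw` record layout is hypothetical, and only a handful of the standard INT 21h handle and FCB function numbers are mapped.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple


class Raw(NamedTuple):
    """Hypothetical raw audit record for one DOS call."""
    ah: int                       # INT 21h function number
    name: Optional[str] = None    # file name (handle open, or taken from the FCB)
    handle: Optional[int] = None  # file handle, for handle-based calls


# Handle-based and FCB-based calls that map to the same generic operations.
HANDLE_OPS = {0x3D: "open", 0x3E: "close", 0x3F: "read", 0x40: "write"}
FCB_OPS = {0x0F: "open", 0x10: "close", 0x14: "read", 0x15: "write"}


def normalise(records: Iterable[Raw]) -> Iterator[Tuple[str, str]]:
    """Turn raw records into generic (operation, file name) pairs."""
    open_handles: Dict[int, str] = {}
    for rec in records:
        if rec.ah in FCB_OPS and rec.name is not None:
            yield FCB_OPS[rec.ah], rec.name
        elif rec.ah in HANDLE_OPS and rec.handle is not None:
            op = HANDLE_OPS[rec.ah]
            if op == "open" and rec.name is not None:
                open_handles[rec.handle] = rec.name
            name = open_handles.get(rec.handle)
            if name is None:
                continue                  # handle never seen opened: skip
            if op == "close":
                del open_handles[rec.handle]
            yield op, name
        # every other function number is dropped as irrelevant


raw_trail = [
    Raw(0x3D, "A.COM", 5), Raw(0x3F, handle=5), Raw(0x2A),
    Raw(0x40, handle=1), Raw(0x40, handle=5), Raw(0x3E, handle=5),
    Raw(0x0F, "B.COM"), Raw(0x15, "B.COM"), Raw(0x10, "B.COM"),
]
print(list(normalise(raw_trail)))
```

After this step a detection rule sees only `open`, `read`, `write` and `close` on named files, so it no longer needs one variant for handles and another for FCBs.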
The prerequisite for using an Intrusion Detection (ID) system like ASAX is an audit system which securely
collects system activity data. In addition, integrity of the ID system itself must not be compromised: this
means that the audit data retrieval, analysis and archiving must be secured against corruption by viruses.
Moreover, the ID system must not be prevented from reporting (raising alarms, updating virus information
databases) the results of such analysis. DOS neither provides such a service, nor makes the implementation
of such a service easy. Its total lack of security mechanisms means that the collection of data can be
subverted. Even if the collection can be secured, the data is open to manipulation if stored on the same
For the prototype of VIDES, we were not bound to a real world implementation, so we explored various
alternative possibilities. The experience gained by the use of such a system will not benefit DOS users, but
should be applicable to users of various emerging 32-bit operating systems which offer DOS support.
We have made several attempts to build a satisfactory audit system: these are described hereafter.
All DOS services are provided to application programs via interrupts, which can be described as indexed
inter-segment calls. Primarily, interrupt 0x21 is used. The requested service is entered into the AH
register and its parameters are entered into the other registers. When the service is finished, it returns
control to the calling program and provides its results in registers or in buffers.
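The register convention can be made concrete with a short sketch that decodes the AX register captured at an interrupt 0x21 call. The function `describe` and the output format are hypothetical; the table lists only a few of the standard DOS function numbers relevant to file infection.

```python
# A few INT 21h services, keyed by the function number passed in AH.
DOS_FUNCTIONS = {
    0x3D: "open file",
    0x3E: "close file",
    0x3F: "read from file",
    0x40: "write to file",
    0x42: "move file pointer",
    0x4B: "load or execute program",
    0x4C: "terminate program",
}


def describe(ax: int) -> str:
    """Split a 16-bit AX value into AH and AL and name the requested service."""
    ah = (ax >> 8) & 0xFF   # high byte: requested service
    al = ax & 0xFF          # low byte: sub-function or parameter
    service = DOS_FUNCTIONS.get(ah, "other")
    return f"INT 21h AH={ah:02X}h AL={al:02X}h ({service})"


print(describe(0x3D02))
print(describe(0x4B00))
print(describe(0x3000))
```

An audit system built on this convention only has to record the register contents at each call to obtain a trace of the services a program requested.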
The very first implementation of an auditing system was a filter which was placed before DOS services and
registered all calls to DOS functions. This was done very early on, together with Dr. Fischer-Hübner, to
prove the feasibility of the VIDES concept. It also demonstrated the limits which DOS imposes on the
implementation of such an auditing system: it did not run reliably, and could be subverted by tunnelling
This implementation was soon scrapped, but it did prove that the premise was correct: viruses could be
found using ID technology. This was perhaps the first such trial that had been done [BFHS91].
The Intel iAPX 386 introduced the so-called virtual 8086 machine mode. A protected-mode operating system
can create many virtual 8086 machines in which tasks can run completely isolated from each other and from
the operating system. Each task ‘sees’ only its own environment. Operating systems such as OS/2 use these
constructs to provide a full DOS environment for DOS programs. All calls to the machine (via the BIOS
interface or direct port access) and DOS are redirected to the host operating system (OS/2 in this case) for
This mechanism can also be used to monitor the activity in a DOS session. Because all interrupts are being
redirected to the native operating system, the native operating system can record the activity securely and
Care has to be taken in the implementation of the virtual 8086 machine. The DOS windows in OS/2 have
been shown in tests at the VTC to be too permissive. In the course of a comprehensive test including the
entire collection of file viruses, many of the viruses running under a DOS window managed to harm vital
parts of the system. One problem was that OS/2 files could be manipulated directly from within the DOS
session. However, this did not explain the corruption of the running operating system.
Even though using a virtual 8086 machine was the original method of choice, such experiments showed that
building a safe implementation would be difficult. A more secure method was sought for
the prototype.
Hardware debugging systems, such as the Periscope IV, may be used to monitor system events closely in
real time. This is achieved by a card which is fitted between the CPU and the motherboard and which can set
break points on various types of events on the PC’s bus. The card is connected to a receiving card in a second PC
which is used to control the debugging session.
Monitoring system behaviour on a DOS machine can be accomplished by capturing the Interrupt 0x21
directly, or by setting a break point in the resident DOS kernel. Special memory areas can be monitored by
setting a break condition on access to those areas.
The monitoring is completely unobtrusive, i.e. the program will not notice a difference between running
with or without the debugger. When an event is triggered, the PC is stopped while the controlling PC is
processing the data. If the controlling PC is fast enough, the time delay should be nearly negligible.
A hardware solution using the Periscope IV is complicated by the problem of automating the processes
necessary to test large numbers of viruses on different operating systems. When such a solution is
implemented, it will offer the possibility of testing viruses on other PC operating systems which require full
iAPX 386 compatibility.
The solution which was finally chosen was the software emulation of the 8086 processor. An emulation is a
program which accepts the entire instruction set of a processor as input, and interprets the binary code as the
original processor would. All other elements of the machine must be implemented or emulated, e.g. the
various ports. To simplify and quicken the emulation, the BIOS code (Basic Input Output System - the
interface between the operating system and the hardware) can be replaced with special emulation hooks, so
that the complicated machine access can be skipped as long as all access to those services is routed via the
BIOS. In the case of a graphics adapter, the entire hardware must be emulated, whereas disk access can be

