throbber
A Testbed for Malicious Code Detection: A
`Synthesis of Static and Dynamic Analysis
`Techniques*
`
`R. Crawford, R. Lo, J. Crossley, G. Fink, P. Kerchen, W. Ho,
`K. Levitt, R. Olsson, and M. Archer
`
`Division of Computer Science
`University of California, Davis
`Davis, CA 95616
`
`Abstract
`
`1
`
`Introduction
`
`In the past five years, there has been an explosion in the number ofTro-
`jan horses, time bombs, and viruses that have been found on computers,
`Furthermore, the ease with which one may write a virus or trapdooris
`certainly cause for concern: In his Turing Award lecture, Ken Thompson
`demonstrated an elegant yet simple trapdoor program that was quite ef-
`fective in subverting the security of a UNIX system [5]. The situation is
`re
`even less encouraging in the personal computer arena — literally hundreds
`“SPONSORS:National Comput-r Security Center, Lawrence Livermore National Labora-
`tory, Deloitte Touche
`
`translating it into our internal form laneuage. Becanse not all assembler
`
`This paper proposes an environment for detecting many types of ma-
`licious code,
`including computer viruses, Trojan horses, and time/logic
`bombs. This Malicious Code Testbed (MCT)is based upon both static and
`dynamic analysis tools developed at the University of California, Davis,
`that have been shown to be effective against certain types of malicious code.
`The testbed enhances the power of these tools by using them in a com-
`plementary fashion to detect more general cases of malicious code. The
`MCTallows administrators and security analysts to check a program be-
`fore installation, thereby avoiding any damage a malicious program might
`inflict.
`Keywords: Detection of Malicious Code, Static Analysis, Dynamic
`Analysis.
`
`Departmentof Energy Computer Security Group
`14th Annual Conference Proceedings
`
`(fo
`
`CS-1011
`CS-1011
`Cisco Systems, Inc. v. Finjan, Inc.
`Cisco Systems, Inc. v. Finjan, Inc.
`17-1
`
`4.2.1 Disassembly — Translation into Internal Form
`To statically analyze the behavior of an executable machine code program,
`we must first “disassemble” it, that is, translate its code into our internal
`form. We have designed a set of procedures that, given a 2-tuple (Mem-
`ory-Address, Memory.Contents), will disassemble its Memory.Contents,
`
`

`

`time bombs, and Trojan horses exist for all of the
`of computer viruses,
`major personal computers in use today.
`Malicious code detection — in the most general case — is known to be
`an undecidable problem. However, a number of techniques already exist
`for coping with certain restricted forms of malicious code. Although no
`algorithm thatidentifies malicious code in all its guises can exist, various
`heuristics may be applied to detect, e.g., viruses that are not very cleverly
`hidden in legitimate code. Such an approach to managing the problem
`is valid toward all formsof malicious code: stopping a large percentage
`of destructive programs is a considerable improvement over not stopping
`any of them.
`This idea forms the basis for a Malicious Code Testbed (MCT) ca-
`pable of detecting a large majority of current and future malicious pro-
`grams. Given the absence of a decision procedure for malicious code, such
`a testbed would allow one to examine a program to ascertain whether or
`notit is suspicious. The MCT will employ heuristics to analyze and sim-
`plify a program. Although these techniques may not always succeed, they
`should cover all “tricks” thought to be employed by malicious code.
`We first discuss some of the known methods of coping with malicious
`code, and then summarize previous work at UC Davis aimed at provid-
`ing defenses against malicious code. Finally, we explore in greater detail
`the idea of the Malicious Code Testbed that builds on our previous work,
`and melds several different heuristic techniques into a more effective, in-
`tegrated system.
`
`that only the (Memory_Address, Memory.Contents) fields are needed.
`
`stored in a table in the MCT.
`
`4.2.3 Memory Model for the Data Segment
`Cells in a suspicious Pprogram’s data memory can be represented by the
`same structures as are used for its code, although at first it might appear
`
`2 A Sample of Current Methods for Cop-
`ing with Malicious Code
`Presently,
`the majority of malicious code defenses are concerned with
`computerviruses. However, someare morebroadly applicable to malicious
`code in general. Table 1 shows the applicabiity of some of these methods.
`These methods can be divided into two distinct classes depending on when
`they are applied: as a pre-ezecution check or at run lime. Pre-execution
`techniques are applied to a suspicious program before it can be executed
`by a user.
`In contrast, run time methods are actually applied to the
`program as it executes, in hopes of stopping the program before it can
`cause damage orallow a virus to propagate.
`Many of the more sophisticated pre-execution methods rely on the
`prior existence of a copy of the program tha;is assumed to be “clean”,
`perhaps because it was originally written by a trusted programmer and
`then translated into an executable file by a trusted compiler on a secure
`system. One such method computes cryptographic checksums that are
`characteristic of that trusted executable file, and embeds them inthatfile.
`Thefile is then copied to an insecure environment, whose operating system
`will not allow a user to execute any program until it has recomputed
`what those checksums should be and compared those values with the ones
`
`14
`
`Department of Energy Computer Security Group
`14th Arnual Conference Proceedings
`
`

`

`actually embedded in the program. In this way, most alterations made to
`a trusted executable file after it leaves the secure system can be detected
`before the program is executed in the insecure environment.
`It is importantto note that this technique shares one important char-
`acteristic in common with most other sophisticated pre-execution methods
`— ultimately, they depend on the prior application ofdetection (or formal
`verification) techniques in order to certify an executable file as “trusted”
`in the first place.
`
`Table 1. Applicability of Defenses.
`
`treated as data (the anestion of whether » civen Trine machine will aver
`
`2.1 Simple Scanners and Monitors
`Simple scanners such as McAfee’s Scanv or Norstad’s Disinfectant are by
`and large the most common Pre-execution method in use today. Typically,
`the user will invoke a scanner to search a binary program for patterns (bit-
`strings) that match those of known malicious programs.
`If none of those
`bitstrings are found, the user then proceedsto execute the program. Thus
`these scanners boast a very good record in defending against known ma-
`licious programs but they cannot be applied in gencral to finding new or
`mutated malicious code. Another popular approach uses simple monitors
`to observe program execution and detect malicious behavior at run time
`Such monitors usually watch all disk accesses to ensure that no unautho-
`tized writes are made. Unfortunately such techniques incur a substantial
`speed penalty during execution.
`In addition,
`to be effective these pro-
`grams must err on the conservative side, resulting in many false alarms
`which require user interaction.
`
`Departmentof Energy
`Computer Security Group
`14th Annual Conference Proceedings
`
`4.3.1 Distinguishing “Code” from “Data”
`In
`Contrary to popular belief, disassembly can be a formidable task.
`general,it is formally undecidable to determine whether a region of “data”
`could be executed as code (the so-called “state-entry problem” for Turing
`machines), or to determine whether a region of “code” could ever be
`
`

`

`2.2 Program Access Control Lists
`The next approach, program access control lists (PACL’s) [6], consists of
`associating with each file in a system an access control list that specifies
`what programs can modify the file. This preventive approach has the
`effect of limiting damage that can be done by many malicious programs,
`especially Trojan Horses unknowingly executed by the user. In contrast,
`conventional ACL’s would be ineffective in such situations because they
`only limit which users can modify the file. Although the PACL approach
`can help to ensure integrity, it is ineffective against attacks exploiting
`covert channels that only violate information security, not integrity.
`
`2.3 Encryption & Watchdog Processors
`Encryptionis another method of coping with the threat of malicious code.
`Lapid, Ahituv, and Neumann (2) use encryption to defend against Tro-
`jan horses and trapdoors. When correctly implemented, encryption tech-
`niques are quite effective against many types of malicious code, but the
`cost of such a system is high due to the required hardware. Similarly,
`watchdog processors(3] also require additional hardware. Such processors
`are capable of detecting invalid reads/writes from/to memory, but they
`require additional support to effectively combat viruses. Also, both of
`these methods are dependent on the prior existence of a “clean” version
`of every program that is to be executed. As mentioned, to certify such
`copies as “clean” in the first place requires either formal verification or
`a malicious code detection capability, which is the subject of the present
`paper.
`
`and record the resulting internal form syntax in the appropriate
`
`3 Ongoing Work at UC Davis
`We are pursuing two families of analytical
`techniques that, unlike most
`current virus prevention and detection methods, attemptto dissect a pro-
`gram to determine whatit does and how. By examining the code, static
`analysis techniques can determine certain properties for some types of
`programs. Dynamic analysis techniques attempt to learn about a pro-
`gram’s behavior by actually running it in a “cleanroom” environment or
`by simulating its execution.
`At UC Davis,
`three analysis tools have been developed to help in
`determining whether a program has any potentially malicious codein it:
`VF1, Snitch, and Dalek. VF1 uses data flow algorithms during static
`analysis to determine the names ofall files that a program can access.
`Snitch statically examines a program for duplication of operating system
`services. Dalek is a debugger that forms the basis for a dynamic analyzer.
`
`3.1 Static Analysis
`From Table 1, one can see that static analysis [1] can be applied to a broad
`class of security problems. By closely examining the binary or source code
`
`14 3
`
`Department of Energy Computer Security Group
`14th Annual Conference Proceedings
`
`data. As with many programming languages, this concept of “fail-
`ure” may need to propagate outward (to other Memory_Addresses),
`perhaps even requiring certain aspects of the static analysis to be
`re-computed from scratch the next time the interpreter needs to rely
`onafield computed by static analysis.
`. Invoke the translator routines to disassemble the Memory.Contents
`
`

`

`of a program,static analysis attempts to detect the presence of malicious
`sections in that program. However,
`in the most general case such de-
`tection is known to be an undecidable problem, resulting in a need for
`more selective analysis techniques aimed at limited subclasses of problem
`instances. A generic virus designed to infect an arbitrary program will be
`unlikely to evade detection.
`In contrast, malicious code intended for one
`specific program can be more smoothly integrated with other sections of
`that program, therebyeffectively camouflaging its presence. In such cases,
`it is more cost-effective for the detection to focus on the services provided
`by the operating system that might be exploited by the malicious code,
`and on the strategic vulnerabilities of that operating system and underly-
`ing architecture. Thus our approach avoids the prohibitive cost of formal
`program verification in favor of slicing [1] and other static and dynamic
`analysis methods that reduce the problem space to a more tractable size.
`3.1.1 VFI
`
`3.1.2 Snitch
`
`Snitch is a prototype that detects duplication of operating system access
`routines in a program.
`Its strategy relies on the fact that most UNIX
`programs contain at most one instance of any operating system service
`(¢.g-, open, write, close). Since a simple virus cannotrely onall programs
`possessing the services it needs, it usually carries each of these services
`with it, inserting them into every program it infects. This will most likely
`result in 3 duplication of some operating system services. When Snitch
`is used to analyze an infected program, it will report this duplication
`
`VF1is a prototype system that was implemented to determine the vi-
`ability of applying static analysis to the detection of malicious code; it
`uses a technique called slicing. Slicing involves isolating the portions of a
`program related to a particular property in which one is interested. The
`sliced program (which is greatly reduced in size compared to the original)
`can then be analyzed to give information about that particular property.
`VF1's target property is filename generation — in particular, whichfiles
`can be opened and written to by a given program. By knowing whatfiles
`& program can write to, one can determineif there is a possibility of the
`program being a virus. For example,
`if a program that does not need
`to write to files (e.g., Is, the UNIX directory listing program) possesses
`code to open and write anyfile, then one might suspect that the program
`contains a virus.
`VFi translates a program written in the C programming language to
`@ program expressed in a Lisp-like internal form thatis easier to analyze
`This resultant program can then be sliced with respect to any given line
`of its code. That is, one can select a line of the resultant program that
`performs an interesting action (such as openingafile for writing), and have
`VF1 determine which statements of the resultant program have bearing
`on that selected line.
`
`ory.Address that was the target of a WRITE by the suspicious program
`
`Department of Energy Computer Security Group
`14th Annual Conference Proceedings
`
`| 44
`
`behavior would be grounds for more extensive dynamic analysis. How
`could dynamic analysis detect self-modifying code? Primarily for expos-
`itory purposes, we first introduce a special case of self-modifying code
`that builds on the two examples of the last section. Later, we describe
`self-modifying code in a more general context.
`Suppose thatin the second example of the previous section, the Mem-
`
`

`

`as being suspicious. The Snitch prototype is specific to Sun-3’s running
`SunOS, but many ofthe concepts underlying the prototype can be applied
`to other architectures and operating systems.
`Snitch consists of two major modules. Thefirst module, the disassem-
`bler, takes an executable program as input and produces the equivalent
`Motorola 68020 assembly language representation as output. The second
`module, the analyzer, examines the output from the disassembler for du-
`plication of operating system services, reporting any such duplications as
`well as the number of occurrences ofall system calls.
`
`3.2.1 A Fully Programmable Debugger
`Dalek is a prototypical debugger originally designed to demonstrate that
`significant further progress in debugging sequential programs was possible.
`For brevity in our discussion, we will refer to the suspicious program being
`executed under Dalek’s control as “the inferior”. Although Dalek has a
`number of unusual features, one of the most importantis thatits debugger
`language is fully programmable; that is, it includes mechanisms typically
`found only in general-purpose programming languages. Dalek's debugger
`language supports mechanisms such as conditional and looping constructs,
`
`3.2 Dynamic Analysis
`Dynamic analysis offers a reasonable potential for detecting a large class of
`malicious code. By closely observing a program at run timein a controlled
`environment, one may determine exactly what it is doing and how it
`accomplishes its goals.
`It is important to bear in mind one essential difference between static
`and dynamic analysis: In those cases where static analysis is able to assert
`something about a program, that assertion will be valid for every possible
`execution of the program.
`In contrast, in order for dynamic analysis to
`detect behavior that has eluded static analysis,
`the suspicious program
`may need to be run on a particular set of input data that causes that
`behavior to manifest. Thus the testing strategy (i.c., the selection of input
`data for the suspicious program)is as vital for the detection problem as
`it is for the general problem of debugging.
`Like static analysis techniques, dynamic analysis is best used “off-line”
`to avoid burdening the user with an execution-time penalty. Asa result,
`clever programs may be able to elude the analyzer by only executing
`malicious sections of code when they “know” that they are not being
`watched.
`One obvious approach to dynamic analysis is to base the analysis on a
`debugger. Over the last two years, an event-based debugger called Dalek
`has been developed at UC Davis [4] and is now being used to detect
`malicious code.
`
`a different location, thus successfully Propagating itself to a section of
`
`\45
`
`Department ofEnergy Computer Security Group
`14th Annual Conference Proceedings
`
`guage, Lisp itself provides ample evidence that there are often good rea-
`sons to (i.c., benign programs that) treat code as data. At the level of
`a machine language, an in-memory virus is a shining (albeit malicious)
`example of an application that may first execute a region of “code”, then
`READ that sameregion as “data” in order to WRITE that “code” to
`
`

`

`the ability to dynamically declare new global convenience variables’, and
`procedures and functions with local convenience variables.
`In Dalek, these general-purpose language constructs have been fully in-
`tegrated with the traditional application-specific debugging features such
`as breakpoints and single-stepping. Thus, for example, Dalek permits a
`single-step command inside a WHILE loop. Such a construct might be
`employed to programmatically observe the sequence of changes made to
`a particular variable.
`In any monitoring activity,
`there is concern for whether the act of
`observation could disturb the normal execution of the suspicious program.
`It is important to note that Dalek code (e.g., for debugger procedures)
`and convenience variables do not reside in the address space ofthe inferior
`(i.e., the suspicious program). Thus thelikelihood that Dalek’s monitoring
`activities might be noticed by the inferior is minimal.
`Dalek also provides support for detecting hierarchical events — occur-
`rences of interesting activities during the execution of the inferior — as
`a way to form higher-level abstractions explaining the inferior’s behavior.
`Dalek employs a novel, coarse-grained dataflow approach for combining
`events in which the dataflow graph’s nodes contain fragments of code
`written in the Dalek language.
`Thus Dalek code can be executed in a variety of different situations:
`interactively from the debugger’s commandlevel, when read in fromafile,
`and automatically when theinferior hits a breakpoint or when an event
`triggers the dataflow graphinto activity.
`
`mhnthne 2 Mam anes Addncce mee DDAT La 2 en antatanw. ------ = tea
`
`3.2.2 Primitive Events in Dalek
`Under Dalek, one can define an event to capture details of the occurrence
`of any interesting activity during the execution of the inferior. To define
`an event, one must declare names for it and for its at¢ributes and write a
`block of Dalek code. How does Dalek know when to execute the code for
`an event? The simplest way to effect this is to explicitly issue an event
`raise command (giving the name of that event) once Dalek has suspenued
`the inferior program’s execution. Events whose code is activated in this
`mannerare called primitive events, and include those activated by an event
`raise command executed automatically at a breakpoint. When activated,
`the code associated with an eventwill try to recognize a valid occurrence of
`that event in the inferior. The attributes associated with an event should
`contain information sufficient to characterize a particular occurrence of
`that event, allowingit to be distinguished from other instances of the same
`event. The Dalek code one writes for an event’s definition can cause it,
`uponactivation, to assign values to these attributes from inferior variables,
`debugger convenience variables, or computation based on those variables.
`1A “convenience variable” is a variable that the user defines and manipulates within the
`debugger.
`
`DepartmentofEnergy Computer Security Group
`14th Annual Conference Proceedings
`
`\4%
`
`point or eventfield. Depending on its value, this field would cause
`the suspicious program’s execution to be suspended whenever it
`attempted to READ, WRITE,and/or execute the corresponding
`Memory.Address. The MCT interpreter would then automatically
`execute the Lisp code defined for this event by the user.
`. The defined field would be used to detect access anomalies, ¢.g.,
`
`

`

`3.2.3. Hierarchical Events in Dalek
`Dalek also supports high-level events. When defining a high-level event,
`one must specify the names of other events on whichit depends. A high-
`level event is not explicitly raised; instead, Dalek can automatically exe-
`cute a high-level event’s code whenever an occurrence of a primitive event
`on which it dependsis successfully recognized. Dalek provides a coarse-
`grained dataflow view of events. In most models of dataflow, tokens flow
`between the nodes of a directed graph. In Dalek, the leaves of this graph
`Tepresent primitive events (independent sources of tokens) and the inte-
`tior nodes represent higher-level events whose activation and subsequent
`behavior depend on instances of those primitive events. Thus the struc-
`ture of the dataflow graphis implicit in the events that are defined by
`the user. It is derived automatically by Dalek from the dependencies that
`were explicitly specified between primitive and higher-level events.
`
`3.2.4 A Dataflow Model of Event Recognition
`Activity within the inferior — such as reaching a breakpoint — can raise
`an instanceof a primitive event, which if recognized will generate a token.
`This token carries copies of the attributes characterizing that instance
`of the primitive event, allowing that particular occurrence to be distin-
`guished from other occurrences of the same Primitive event. The token(s)
`generated by recognizing a valid instance of a primitive event flows to
`any high-level event(s) that depends on the primitive event. The arrival
`of such a token at the high-level event can trigger that high-level event,
`causing it to execute its own Dalek code so it can decide whetherto rec-
`ognize a valid instance of that high-level event. In making that decision,
`the Dalek code for a high-level event node has access to the tokens, and
`their attribute values, from all occurrences of the lower-level constituent
`events on which the high-level event depends.
`An event history is also maintained to record all recognized event
`occurrences and their attributes.
`It may be browsed selectively by the
`user in interactive mode or accessed programmatically via pre-defined
`functions in the Dalek language.
`
`detected by the MCT.
`
`3.2.5 Effectivencss of a Debugger Against Viruses
`Many programmers findit difficult to debug even non-malicious software
`they have written themselves.
`In comparison,
`the task of detecting a
`malicious section of code in a program written by another appears truly
`daunting. Yet with the support Dalek provides for detecting primitive
`events and for recognizing patterns of lower-level events as higher-level
`abstractions, much of the burdenis offloaded from the user.
`Figure 1 illustrates how high-level events were used to correlate at-
`tributes captured by lower-level events to provide a characterization of a
`Suspicious program’s behavior, represented in termsof the semantic mod-
`els the user had determined to be most relevant, For that demonstration,
`Dalek controlled a stripped executable under the UNIX operating system.
`
`\49 DepartmentofEnergy ComputerSecurityGroup
`14th Annual Conference Proceedings
`
`instructions. It could proceed to execute for an arbitrarily long pe-
`tiod without ever “re-synchronizing” with the framing assumptions
`made by the disassembler. Without the sequence field, such a faulty
`framing assumption might allow highly malicious code to escape un-
`
`

`

` Mes", *irwstt*"}
`
`
`
`surites
`
`(5, *Virwstt")
`
`
`Figure 1. Interaction of Events in Dalek.
`
`
`
`Figure 1 indicates that the inferior made a series of system calls, in-
`
`
`voking operating system services to access thefilesystem. For example,
`
`
`the attributes of the first ‘open event ? record the fact that the inferior
`
`
`requested Write access to “file!”, and that the operating system permit-
`
`
`ted that access by returningafile-descriptor of 5 for the inferior to use.
`Subsequently,
`the inferior used thatfile-descriptor to write “string!” to
`
`
`“filet”. The high-level event ‘write-to-file was able to deduce this abstract
`
`behavioral description by correlating the file-descriptor, returned by the
`
`
`initial
`‘open event with that later used by the ‘write event.
`It was nec-
`
`
`
`essary to make this deduction because under UNIX, one cannot write to
`24 leading backquote character allows names of events and their attributes to be distin-
`
`
`guished from objects in the inferior program (e.g., variables and procedures) having otherwise
`identical names.
`
`
`17-
`
` DepartmentofEnergy Computer Security Group
`i49
`17-9
`
`14th Annual Conference Proceedings [4] R. Olsson, R. Crawford, and W. Ho. “A Dataflow Approach to Event-
`
`
`
`
`based Debugging”, SOFTWARE—Practice and Experience, Vol. 21,
`No.2, Feb. 1991, pp. 209-229.
`(5) K. Thompson. “Reflections on Trusting Trust”, Comm. ACM,Vol. 27,
`No. 8, 1984, pp. 761-763.
`[6] D. Wichers, D. Cook, R. Olsson, J. Crossley, P. Kerchen, K. Levitt,
`and R. Lo. “PACL’s: An Access Control List Approach to Anti-Viral
`Security", Proc. of NIST/NCSC 13th Nat'l Computer Security Conf..
`
`
`
`
`
`
`
`

`

`a file by name, but only througha file-descriptor previously returned by
`the operating system via an open system call. As shown in Figure 1,
`once theinferior closes “file”, UNIX recycles file-descriptor 5 so that af-
`ter the inferior’s next open system call, file-descriptor 5 denotes “files”
`instead. Fortunately, the expressiveness of Dalek’s general-purpose lan-
`guage features allow one easily to write debugger code for the ‘write-to-file
`high-level event so thatit is not confused by the reuse of a file-descriptor.
`From the example, one can imagine other ways in which Dalek’s ca-
`pabilities could be applied to the detection and understanding of viruses
`or other malicious code. Yet it might seem that in real-world situations,
`such event-based methods would be ineffective against hostile or secretive
`Programs.
`In the first place, one would expect that the malicious code
`would have been stripped ofall (correct) symbolic information. Thus the
`debugger would not know the names,sizes, or locations of procedures or
`data structures. However, most operating systems offer some assistance in
`this regard, allowing a relatively complete behavioral trace of all system-
`related activity initiated by a suspicious program to be obtained. Indeed,
`although the previous example involved a stripped executable file, we were
`nevertheless able to monitor its behavior in terms of certain extremely rel-
`evant, high-level abstractions, as shown in thefigure.
`Secondly, a malicious program may alter its own code, making analysis
`difficult. Under Dalek, however, one may define events to recognize such
`self-modifying behavior. Therefore, self-modification does not present in-
`surmountable difficulties for the debugger but it does increase the com-
`plexity of the events one must define.
`If Dalek were to be used as an isolated tool for dynamic analysis, it
`should be equipped with a library of predefined events to capture sus-
`picious and malicious behavior, similar in concept to the events shown
`in Figure 1. For example, attempting to open (or change/inspect
`the
`permissions on)all files in the current directory might be considered sus-
`picious. Writing the same block of “data” to several different executable
`files would appear even more suspicious.
`
`14th Annual Conference Proceedings
`
`3.3 Test Data Generation for Dynamic Analysis
`Successful dynamic analysis is dependent on a nearly complete test set of
`input data. Without a high level of confidence that most or all situations
`that may be suspicious are reached, the sample runs are unlikely to be
`useful. By using and adapting classical test data generation algorithms
`(7)[8], we can gain sufficient coverage to activate all suspicious code.
`Generation of test data typically involves extracting a set of possible
`paths through a program, and for each path selecting a set of inputs
`that activates that path. Path generation algorithms fall into three main
`categories [9]:
`¢ Control flow - Ensuring each statement of the program is executed.
`@ Data flow ~ Ensuring all paths containing certain relationships be-
`tween variables are traversed.
`
`{99
`
`Department ofEnergy Computer Security Group
`
`

`

` e Expression coverage - Ensuring that each variable is bound to a
`
`
`variety of nontrivial values during executions.
`
`Of these, data flow methods are most commonly used. For programs of
`
`
`significant size, the set of paths is far too large for an actual testing scheme
`[9]. Some work has been done to identify easily computable subsets of
`
`
`possible paths that cover in a reasonably complete way the activities of a
`
`
`program [7}[8], but these subsets vsually comprise a significeut percentage
`
`
`of the full set of paths (8), and so offer little improvement.
`
`
`For our purposes, it is possible that by focusing our test data gen-
`eration algorithm on specific suspicious events or “critical points” in the
`
`
`Program,(¢.g., requests for operating system services), we will be able to
`
`
`generate dramatically smaller test sets of input data. By thus reducing
`
`
`the size of the test set, such an approach would necessarily reduce the ab-
`
`
`solute percentage of path Coverage, yet it would retain the property that
`
`
`the malicious code lies on one of the paths that are covered. To explore
`this conjecture, the general algorithm in [8] for “critical points” appears
`promising.
`
`
`
`
`Testing techniques generally assume an “oracle” for determining the
`validity of results over some set of input data. Our oracle need not be as
`
`comprehensive as ai: oracle for debugging purposes, as a consequence of
`our specific interest in security properties.
`
`
` 4 A Testbedto Integrate Static and Dy-
`namic Analysis
`
`
`Our current efforts are directed towards devgloping a Malicious Code
`Testbed (MCT) that will allow us to explore andrefine our ideas for inte-
`
`
`
`grating static and dynamic analysis. The goal of the MCT Project is to
`
`
`provide a set of tools — operating in a seamless, integrated environment
`
`
`— that will assist a user in detecting various forms of malicious code,
`
`
`and in identifying programs that exploit security flaws within developed
`
`
`software. The MCT will incorporate features from the tools mentioned
`
`
`above (Dalek, VF1, and Snitch), whose capabilities were demonstrated in
`
`
`our previous work. It is expected that the power of the resulting toolset
`
`
`to detect and understand malicious code will exceed the sum of that of
`its componentparts when operated in relative isolation.
`
`
`
`
`Both static and dynamic analysis techniques are necessary because
`
`
`they can be applied in different sitnations, and their strengths complement
`one another. Compared with static analysis, dy’ amic analysis can follow
`
`
`any execution sequence — even if the program modifies its code on the fly.
`
`
`However, dynamic analysis can follow only a finite number of execution
`sequences inafinite time period. Thus, in the absence of a test set of input
`
`
`data known to provide complete coverage, dynamic analysis might be
`
`able to certify the presence of certain activities (ie., violations of security
`policy), but it could not unconditionally guarantee their absence.
`
`
`
`
`
` Department of Energy Computer Security Group14th Annual Conference Proceedings
`
`
`
`
`
`
`
`
`An Architecture for a Distributed
`Intrusion Detection System
`
`James Brentano, Steven R. Snapp,
`Gihan V. Dias, Terrance L. Goan,
`L. Todd Heberlein, Che-Lin Ho,
`Karl N. Levitt, Biswanath Mukherjee,
`
`
`
`
`
`
`

`

`
`
`4.1 Architecture of the MCT
`Onerequirement for the MCT is that it be as universal as possible. That
`is, the testbed should in Principle be capable of analyzing executable code
`from different processors and different operating systems. We propose to
`meet this requirement by developing various fr

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket