Proceedings of the
12th USENIX Security Symposium

August 4-8, 2003
Washington, D.C., USA

USENIX Association

For additional copies of these proceedings contact:

USENIX Association
2560 Ninth Street, Suite 215
Berkeley, CA 94710 USA
Phone: 510 528 8649
FAX: 510 548 5738
Email: office@usenix.org
URL: http://www.usenix.org
Past USENIX Security Proceedings

Security XI     Washington, D.C., USA          August 2002     $30/38
Security X      Washington, D.C., USA          August 2001     $30/38
Security IX     Denver, Colorado, USA          August 2000     $27/35
Security VIII   Washington, D.C., USA          August 1999     $27/35
Security VII    San Antonio, Texas, USA        January 1998    $27/35
Security VI     San Jose, California, USA      July 1996       $27/35
Security V      Salt Lake City, Utah, USA      June 1995       $27/35
Security IV     Santa Clara, California, USA   October 1993    $15/20
Security III    Baltimore, Maryland, USA       September 1992  $30/39
Security II     Portland, Oregon, USA          August 1990     $13/16
Security        Portland, Oregon, USA          August 1988     $7/7
© 2003 by The USENIX Association
All Rights Reserved

This volume is published as a collective work. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks herein.

ISBN 1-931971-13-7

Printed in the United States of America on 50% recycled paper, 10-15% post-consumer waste.

Static Analysis of Executables to Detect Malicious Patterns*

Mihai Christodorescu
mihai@cs.wisc.edu

Somesh Jha
jha@cs.wisc.edu

Computer Sciences Department
University of Wisconsin, Madison

Abstract

Malicious code detection is a crucial component of any defense mechanism. In this paper, we present a unique viewpoint on malicious code detection. We regard malicious code detection as an obfuscation-deobfuscation game between malicious code writers and researchers working on malicious code detection. Malicious code writers attempt to obfuscate the malicious code to subvert the malicious code detectors, such as anti-virus software. We tested the resilience of three commercial virus scanners against code-obfuscation attacks. The results were surprising: the three commercial virus scanners could be subverted by very simple obfuscation transformations! We present an architecture for detecting malicious patterns in executables that is resilient to common obfuscation transformations. Experimental results demonstrate the efficacy of our prototype tool, SAFE (a static analyzer for executables).
1 Introduction

In the interconnected world of computers, malicious code has become an omnipresent and dangerous threat. Malicious code can infiltrate hosts using a variety of methods, such as attacks against known software flaws, hidden functionality in regular programs, and social engineering. Given the devastating effect malicious code has on our cyber infrastructure, identifying malicious programs is an important goal. Detecting the presence of malicious code on a given host is a crucial component of any defense mechanism.

Malicious code is usually classified [30] according to its propagation method and goal into the following categories:
• viruses are programs that self-replicate within a host by attaching themselves to programs and/or documents that become carriers of the malicious code;
• worms self-replicate across a network;
• trojan horses masquerade as useful programs, but contain malicious code to attack the system or leak data;
• back doors open the system to external entities by subverting the local security policies to allow remote access and control over a network;
• spyware is a useful software package that also transmits private user data to an external entity.

*This work was supported in part by the Office of Naval Research under contracts N00014-01-1-0796 and N00014-01-1-0708. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes, notwithstanding any copyright notices affixed thereon. The views and conclusions contained herein are those of the authors, and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the above government agencies or the U.S. Government.

Combining two or more of these malicious code categories can lead to powerful attack tools. For example, a worm can contain a payload that installs a back door to allow remote access. When the worm replicates to a new system (via email or other means), the back door is installed on that system, thus providing an attacker with a quick and easy way to gain access to a large set of hosts. Staniford et al. have demonstrated that worms can propagate extremely quickly through a network, and thus potentially cripple the entire cyber infrastructure [43]. In a recent outbreak, the Sapphire/SQL Slammer worm reached its peak infection rate in about 10 minutes after launch, doubling every 8.5 seconds [31]. Once the back-door tool gains a large installed base, the attacker can use the compromised hosts to launch a coordinated attack, such as a distributed denial-of-service (DDoS) attack [5].

In this paper, we develop a methodology for detecting malicious patterns in executables. Although our method is general, we have initially focused our attention on viruses. A computer virus replicates itself by inserting a copy of its code (the viral code) into a host program. When a user executes the infected program, the virus copy runs, infects more programs, and then the original program continues to execute. To the casual user, there is no perceived difference between the clean and the infected copies of a program until the virus activates its malicious payload.

The classic virus-detection techniques look for the presence of a virus-specific sequence of instructions (called a virus signature) inside the program: if the signature is found, it is highly probable that the program is infected. For example, the Chernobyl/CIH virus is detected by checking for the hexadecimal sequence [47]:
E800 0000 005B 8D4B 4251 5050
0F01 4C24 FE5B 83C3 1CFA 8B2B

This corresponds to the following IA-32 instruction sequence, which constitutes part of the virus body:

E8 00000000     call 0h
5B              pop ebx
8D 4B 42        lea ecx, [ebx + 42h]
51              push ecx
50              push eax
50              push eax
0F01 4C 24 FE   sidt [esp - 02h]
5B              pop ebx
83 C3 1C        add ebx, 1Ch
FA              cli
8B 2B           mov ebp, [ebx]
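As a sketch of how this classic approach works in practice (our illustration, not part of the paper's tooling), a fixed-signature scanner reduces to a byte-string search over the program image:

```python
# Illustrative sketch only: a fixed-signature scanner is a byte search.
# The 24-byte signature below is the Chernobyl/CIH sequence quoted above.
CIH_SIGNATURE = bytes.fromhex(
    "E800 0000 005B 8D4B 4251 5050 0F01 4C24 FE5B 83C3 1CFA 8B2B"
)

def is_infected(program_image: bytes) -> bool:
    """Flag the program if the signature occurs anywhere in its bytes."""
    return program_image.find(CIH_SIGNATURE) != -1
```

Precisely because the scan is this literal, any byte inserted inside the matched range changes the needle, which is what the obfuscation attacks discussed below exploit.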
This classic detection approach is effective when the virus code does not change significantly over time. Detection is also easier when viruses originate from the same source code, with only minor modifications and updates. Thus, a virus signature can be common to several virus variants. For example, Chernobyl/CIH versions 1.2, 1.3, and 1.4 differ mainly in the trigger date on which the malicious code becomes active and can be effectively detected by scanning for a single signature, namely the one shown above.

The virus writers and the antivirus software developers are engaged in an obfuscation-deobfuscation game. Virus writers try to obfuscate the "vanilla" virus so that signatures used by the antivirus software cannot detect these "morphed" viruses. Therefore, to detect an obfuscated virus, the virus scanners first must undo the obfuscation transformations used by the virus writers. In this game, virus writers are obfuscators and researchers working on malicious code detection are deobfuscators. A method to detect malicious code should be resistant to common obfuscation transformations. This paper introduces such a method. The main contributions of this paper include:

• The obfuscation-deobfuscation game and attacks on commercial virus scanners
We view malicious code detection as an obfuscation-deobfuscation game between the virus writers and the researchers working to detect malicious code. Background on some common obfuscation techniques used by virus writers is given in Section 3. We also have developed an obfuscator for executables. Surprisingly, the three commercial virus scanners we considered could be easily thwarted by simple obfuscation transformations (Section 4). For example, in some cases the Norton antivirus scanner could not even detect insertions of nop instructions.

• A general architecture for detecting malicious patterns in executables
We introduce a general architecture for detecting malicious patterns in executables. An overview of the architecture and its novel features is given in Section 5. External predicates and uninterpreted symbols are two important elements in our architecture. External predicates are used to summarize results of various static analyses, such as points-to and live-range analysis. We allow these external predicates to be referred to in the abstraction patterns that describe the malicious code. Moreover, we allow uninterpreted symbols in patterns, which makes the method resistant to renaming, a common obfuscation transformation. Two key components of our architecture, the program annotator and the malicious code detector, are described in Sections 6 and 7 respectively.

• Prototype for x86 executables
We have implemented a prototype for detecting malicious patterns in x86 executables. The tool is called a static analyzer for executables, or SAFE. We have successfully tried SAFE on multiple viruses; for brevity we report on our experience with four specific viruses. Experimental results (Section 8) demonstrate the efficacy of SAFE. There are several interesting directions we intend to pursue as future work, which are summarized in Section 9.

• Extensibility of analysis
SAFE depends heavily on static analysis techniques. As a result, the precision of the tool directly depends on the static analysis techniques that are integrated into it. In other words, SAFE is as good as the static analysis techniques it is built upon. For example, if SAFE uses the result of points-to analysis, it will be able to track values across memory references. In the absence of a points-to analyzer, SAFE makes the conservative assumption that a memory reference can access any memory location (i.e., everything points to everything). We have designed SAFE so that various static analysis techniques can be readily integrated into it. Several simple static analysis techniques are already implemented in SAFE.

2 Related Work

2.1 Theoretical Discussion

The theoretical limits of malicious code detection (specifically of virus detection) have been the focus of many researchers. Cohen [10] and Chess-White [9]
showed that in general the problem of virus detection is undecidable. Similarly, several important static analysis problems are undecidable or computationally hard [28, 35].

However, the problem considered in this paper is slightly different than the one considered by Cohen [10] and Chess-White [9]. Assume that we are given a vanilla virus V which contains a malicious sequence of instructions σ. Next we are given an obfuscated version O(V) of the virus. The problem is to find whether there exists a sequence of instructions σ′ in O(V) which is "semantically equivalent" to σ. A recent result by Vadhan et al. [3] proves that in general program obfuscation is impossible. This leads us to believe that a computationally bounded adversary will not be able to obfuscate a virus to completely hide its malicious behavior. We will further explore these theoretical issues in the future.
2.2 Other Detection Techniques

Our work is closely related to previous results on static analysis techniques for verifying security properties of software [1, 4, 8, 7, 25, 29]. In a larger context, our work is similar to existing research on software verification [2, 13]. However, there are several important differences. First, viewing malicious code detection as an obfuscation-deobfuscation game is unique. The obfuscation-deobfuscation viewpoint led us to explore obfuscation attacks upon commercial virus scanners. Second, to our knowledge, all existing work on static analysis techniques for verifying security properties analyzes source code. On the other hand, our analysis technique works on executables. In certain contexts, such as virus detection, source code is not available. Finally, we believe that using uninterpreted variables in the specification of the malicious code is unique (Section 6.2).

Currie et al. looked at the problem of automatically checking the equivalence of DSP routines in the context of verifying the correctness of optimizing transformations [15]. Their approach is similar to ours, but they impose a set of simplifying assumptions for their simulation tool to execute with reasonable performance. Feng and Hu took this approach one step further by using a theorem prover to determine when to unroll loops [19]. In both cases the scope of the problem is limited to VLIW or DSP code and there is limited support for user-specified analyses. Our work is applied to x86 (IA-32) assembly and can take full advantage of static analyses available to the user of our SAFE tool. Necula adopts a similar approach based on comparing a transformed code sequence against the original code sequence in the setting of verifying the correctness of the GNU C compiler [38]. Using knowledge of the transformations performed by the compiler, equivalence between the compiler input and the compiler output is proven using a simulation relation. In our work, we require no a priori knowledge of the obfuscation transformations performed, as it would be unrealistic to expect such information in the presence of malicious code.

We plan to enhance our framework by using the ideas from existing work on type systems for assembly code. We are currently investigating Morrisett et al.'s Typed Assembly Language [32, 33]. We apply a simple type system (Section 6) to the binaries we analyze by manually inserting the type annotations. We are unaware of a compiler that can produce Typed Assembly Language, and thus we plan to support external type annotations to enhance the power of our static analysis.

Dynamic monitoring can also be used for malicious code detection. Cohen [10] and Chess-White [9] propose a virus detection model that executes code in a sandbox. Another approach rewrites the binary to introduce checks driven by an enforceable security policy [17] (known as the inline reference monitor or the IRM approach). We believe static analysis can be used to improve the efficiency of dynamic analysis techniques, e.g., static analysis can remove redundant checks in the IRM framework. We construct our models for executables similar to the work done in specification-based monitoring [21, 46], and apply our detection algorithm in a context-insensitive fashion. Other research used context-sensitive analysis by employing push-down systems (PDSs). Analyses described in [7, 25] use the model checking algorithms for pushdown systems [18] to verify security properties of programs. The data structures used in interprocedural slicing [23], interprocedural DFA [40], and Boolean programs [2] are hierarchically structured graphs and can be translated to pushdown systems.
2.3 Other Obfuscators

While deciding on the initial obfuscation techniques to focus on, we were influenced by several existing tools. Mistfall (by z0mbie) is a library for binary obfuscation, specifically written to blend malicious code into a host program [49]. It can encrypt, morph, and blend the virus code into the host program. Our binary obfuscator is very similar to Mistfall. Unfortunately, we could not successfully morph binaries using Mistfall, so we could not perform a direct comparison between our obfuscator and Mistfall. Burneye (by TESO) is a Linux binary encapsulation tool. Burneye encrypts a binary (possibly multiple times), and packages it into a new binary with an extraction tool [45]. In this paper, we have not considered encryption-based obfuscation techniques. In the future, we will incorporate encryption-based obfuscation techniques into our tool, by incorporating or extending existing libraries.
3 Background on Obfuscating Viruses

To detect obfuscated viruses, antivirus software has become more complex. This section discusses some common obfuscation transformations used by virus writers and how antivirus software has historically dealt with obfuscated viruses.

A polymorphic virus uses multiple techniques to prevent signature matching. First, the virus code is encrypted, and only a small in-clear routine is designed to decrypt the code before running the virus. When the polymorphic virus replicates itself by infecting another program, it encrypts the virus body with a newly-generated key, and it changes the decryption routine by generating new code for it. To obfuscate the decryption routine, several transformations are applied to it. These include: nop-insertion, code transposition (changing the order of instructions and placing jump instructions to maintain the original semantics), and register reassignment (permuting the register allocation). These transformations effectively change the virus signature (Figure 1), inhibiting effective signature scanning by an antivirus tool.

The obfuscated code in Figure 1 will behave in the same manner as before, since the nop instruction has no effect other than incrementing the program counter¹. However, the signature has changed. Analysis can detect simple obfuscations, like nop-insertion, by using regular expressions instead of fixed signatures. To catch nop insertions, the signature should allow for any number of nops at instruction boundaries (Figure 2). In fact, most modern antivirus software use regular expressions as virus signatures.
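As an illustration (ours, not the paper's implementation), such a nop-tolerant signature can be built mechanically from the instruction boundaries of the fixed signature:

```python
import re

# Illustrative sketch: derive a nop-tolerant regular expression from the
# instruction boundaries of a fixed signature, in the spirit of Figure 2.
# The byte groups are the Chernobyl/CIH fragment used in this section.
INSTRUCTIONS = [
    b"\xE8\x00\x00\x00\x00",          # call 0h
    b"\x5B",                          # pop ebx
    b"\x8D\x4B\x42",                  # lea ecx, [ebx + 42h]
    b"\x51", b"\x50", b"\x50",        # push ecx / push eax / push eax
    b"\x0F\x01\x4C\x24\xFE",          # sidt [esp - 02h]
    b"\x5B",                          # pop ebx
    b"\x83\xC3\x1C",                  # add ebx, 1Ch
    b"\xFA",                          # cli
    b"\x8B\x2B",                      # mov ebp, [ebx]
]

# Permit any run of 0x90 (nop) bytes at every instruction boundary.
NOP_TOLERANT = re.compile(b"(?:\x90)*".join(re.escape(i) for i in INSTRUCTIONS))

def matches(code: bytes) -> bool:
    return NOP_TOLERANT.search(code) is not None
```

Regular-expression signatures of this kind handle nop-insertion but nothing stronger; the transformations discussed in Section 4 defeat them as well.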
Antivirus software deals with polymorphic viruses by performing heuristic analyses of the code (such as checking only certain program locations for virus code, as most polymorphic viruses attach themselves only at the beginning or end of the executable binary [37]), and even emulating the program in a sandbox to catch the virus in action [36]. The emulation technique is effective because at some point during the execution of the infected program, the virus body appears decrypted in main memory, ready for execution; the detection comes down to frequently scanning the in-memory image of the program for virus signatures while the program executes.

Metamorphic viruses attempt to evade heuristic detection techniques by using more complex obfuscations. When they replicate, these viruses change their code in a variety of ways, such as code transposition, substitution of equivalent instruction sequences, and register reassignment [44, 51]. Furthermore, they can "weave" the virus code into the host program, making detection by traditional heuristics almost impossible, since the virus code is mixed with program code and the virus entry point is no longer at the beginning of the program (these are designated as entry point obscuring (EPO) viruses [26]).

As virus writers employ more complex obfuscation techniques, heuristic virus-detection techniques are bound to fail. Therefore, there is a need to perform a deeper analysis of malicious code based upon more sophisticated static-analysis techniques. In other words, inspection of the code to detect malicious patterns should use structures that are closer to the semantics of the code, as purely syntactic techniques, such as regular expression matching, are no longer adequate.
3.1 The Suite of Viruses

We have analyzed multiple viruses using our tool, and discuss four of them in this paper. Descriptions of these viruses are given below.

3.1.1 Detailed Description of the Viruses

Chernobyl (CIH)
According to the Symantec Antivirus Research Center (SARC), Chernobyl/CIH is a virus that infects 32-bit Windows 95/98/NT executable files [41]. When a user executes an infected program under Windows 95/98/ME, the virus becomes resident in memory. Once the virus is resident, CIH infects other files when they are accessed. Infected files may have the same size as the original files because of CIH's unique mode of infection: the virus searches for empty, unused spaces in the file². Next it breaks itself up into smaller pieces and inserts its code into these unused spaces. Chernobyl has two different payloads: the first one overwrites the hard disk with random data, starting at the beginning of the disk (sector 0) using an infinite loop. The second payload tries to cause permanent damage to the computer by corrupting the Flash BIOS.

z0mbie-6.b
The z0mbie-6.b virus includes an interesting feature: the polymorphic engine hides every piece of the virus, and the virus code is added to the infected file as a chain of differently-sized routines, making standard signature detection techniques almost useless.

f0sf0r0
The f0sf0r0 virus uses a polymorphic engine combined with an EPO technique to hide its entry point. According to Kaspersky Labs [27], when an infected file is run and the virus code gains control, it searches for portable executable files in the system directories and infects them. While infecting, the virus encrypts itself with a polymorphic loop and writes the result to the end of the file. To gain control when the infected file is run, the virus does not modify the program's start address, but instead writes a "jmp (virus_entry)" instruction into the middle of the file.
Original code
E8 00000000     call 0h
5B              pop ebx
8D 4B 42        lea ecx, [ebx + 42h]
51              push ecx
50              push eax
50              push eax
0F01 4C 24 FE   sidt [esp - 02h]
5B              pop ebx
83 C3 1C        add ebx, 1Ch
FA              cli
8B 2B           mov ebp, [ebx]

Obfuscated code
E8 00000000     call 0h
5B              pop ebx
8D 4B 42        lea ecx, [ebx + 42h]
90              nop            (added)
51              push ecx
50              push eax
50              push eax
90              nop            (added)
0F01 4C 24 FE   sidt [esp - 02h]
5B              pop ebx
83 C3 1C        add ebx, 1Ch
90              nop            (added)
FA              cli
8B 2B           mov ebp, [ebx]

Signature
E800 0000 005B 8D4B 4251 5050
0F01 4C24 FE5B 83C3 1CFA 8B2B

New signature
E800 0000 005B 8D4B 4290 5150
5090 0F01 4C24 FE5B 83C3 1C90
FA8B 2B

Figure 1: Original code and obfuscated code from Chernobyl/CIH, and their corresponding signatures. Newly added instructions are highlighted.
E800 0000 00(90)* 5B(90)* 8D4B 42(90)*
51(90)* 50(90)* 50(90)* 0F01 4C24 FE(90)*
5B(90)* 83C3 1C(90)* FA(90)* 8B2B

Figure 2: Extended signature to catch nop-insertion.

Hare
Finally, the Hare virus infects the bootloader sectors of floppy disks and hard drives, as well as executable programs. When the payload is triggered, the virus overwrites random sectors on the hard disk, making the data inaccessible. The virus spreads by polymorphically changing its decryption routine and encrypting its main body.

The Hare and Chernobyl/CIH viruses are well known in the antivirus community, with their presence in the wild peaking in 1996 and 1998, respectively. In spite of this, we discovered that current commercial virus scanners could not detect slightly obfuscated versions of these viruses.

4 Obfuscation Attacks on Commercial Virus Scanners

We tested three commercial virus scanners against several common obfuscation transformations. To test the resilience of commercial virus scanners to common obfuscation transformations, we have developed an obfuscator for binaries. Our obfuscator supports four common obfuscation transformations: dead-code insertion, code transposition, register reassignment, and instruction substitution. While there are other generic obfuscation techniques [11, 12], those described here seem to be preferred by malicious code writers, possibly because implementing them is easy and they add little to the memory footprint.

4.1 Common Obfuscation Transformations
4.1.1 Dead-Code Insertion

Also known as trash insertion, dead-code insertion adds code to a program without modifying its behavior. Inserting a sequence of nop instructions is the simplest example. More interesting obfuscations involve constructing challenging code sequences that modify the program state, only to restore it immediately.

Some code sequences are designed to fool antivirus software that solely relies on signature matching as its detection mechanism. Other code sequences are complicated enough to make automatic analysis very time-consuming, if not impossible. For example, passing values through memory rather than registers or the stack requires accurate pointer analysis to recover values. The example shown in Figure 3 should clarify this. The code marked by (*) can be easily eliminated by automated analysis. On the other hand, the second and third insertions, marked by (**), do cancel out, but the analysis is more complex. Our obfuscator supports dead-code insertion.

Not all dead-code sequences can be detected and eliminated, as this problem reduces to program equivalence (i.e., is this code sequence equivalent to an empty program?), which is undecidable. We believe that many common dead-code sequences can be detected and eliminated with acceptable performance. To quote the documentation of the RPME virus permutation engine [50]:

    [T]rash [does not make the] program more complex [...]. If [the] detecting algorithm will be written such as I think, then there is no difference between NOP and more complex trash.

Our detection tool, SAFE, identifies several kinds of such dead-code segments.
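As a toy sketch (our own, not SAFE's actual algorithm), a peephole pass over a simplified instruction list can already remove the two easiest kinds of dead code, nops and adjacent self-cancelling pairs; the memory-based insertions discussed above would additionally require pointer analysis:

```python
# Toy sketch of dead-code elimination over a simplified instruction list.
# Instructions are tuples like ("inc", "eax") or ("nop",). This is an
# illustration of the idea, not SAFE's implementation.
def strip_dead_code(instrs):
    out = []
    for ins in instrs:
        if ins == ("nop",):
            continue  # a nop only advances the program counter
        # An "inc r" immediately followed by "dec r" (or vice versa)
        # restores the register it changed, so the pair is dead code.
        # (This ignores the flags register, which a real pass must model.)
        if (out and {out[-1][0], ins[0]} == {"inc", "dec"}
                and out[-1][1:] == ins[1:]):
            out.pop()
            continue
        out.append(ins)
    return out
```

A real pass would iterate to a fixed point and carry a much larger catalogue of cancelling patterns; the undecidability argument above bounds what any such catalogue can achieve.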
4.1.2 Code Transposition

Code transposition shuffles the instructions so that the order in the binary image is different from the execution order, or from the order of instructions assumed in the signature used by the antivirus software. To achieve the first variation, we randomly reorder the instructions and insert unconditional branches or jumps to restore the original control flow. The second variation swaps instructions if they are not interdependent, similar to compiler code generation, but with the different goal of randomizing the instruction stream.

The two versions of this obfuscation technique differ in their complexity. The code transposition technique based upon unconditional branches is relatively easy to implement. The second technique that interchanges independent instructions is more complicated, because the independence of instructions must be ascertained. On the analysis side, code transposition can complicate matters only for a human. Most automatic analysis tools (including ours) use an intermediate representation, such as the control flow graph (CFG) or the program dependence graph (PDG) [23], that is not sensitive to superfluous changes in control flow. Note that an optimizer acts as a deobfuscator in this case by finding the unnecessary unconditional branches and removing them from the program code. Currently, our obfuscator supports only code transposition based upon inserting unconditional branches.
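The deobfuscation direction can be sketched as follows (our illustration, over a deliberately simplified program representation): walking unconditional jumps from the entry point recovers the execution order regardless of how the blocks are laid out in the binary image:

```python
# Illustrative sketch: undo branch-based code transposition by walking
# unconditional jumps from the entry label, recovering execution order.
# blocks maps a label to (instructions, successor label or None).
def linearize(blocks, entry):
    order, seen, label = [], set(), entry
    while label is not None and label not in seen:
        seen.add(label)  # guard against jump cycles
        instrs, successor = blocks[label]
        order.extend(instrs)
        label = successor
    return order
```

This mirrors what an optimizer's jump-threading pass does when it removes the superfluous branches, which is why the text notes that an optimizer acts as a deobfuscator here.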
4.1.3 Register Reassignment

The register reassignment transformation replaces usage of one register with another in a specific live range. This technique exchanges register names and has no other effect on program behavior. For example, if register ebx is dead throughout a given live range of the register eax, it can replace eax in that live range. In certain cases, register reassignment requires insertion of prologue and epilogue code around the live range to restore the state of various registers. Our binary obfuscator supports this code transformation.

The purpose of this transformation is to subvert the antivirus software analyses that rely upon signature matching. There is no real obfuscatory value gained in this process. Conceptually, the deobfuscation challenge is equally complex before or after the register reassignment.
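This is the transformation that uninterpreted symbols (Section 6.2) are designed to absorb. A sketch of the idea (ours, not SAFE's code): match patterns whose register operands are variables, binding each variable to whatever concrete register the binary uses, consistently across the whole pattern:

```python
# Illustrative sketch: pattern matching with uninterpreted register
# variables (names starting with "R"), so register reassignment alone
# cannot defeat the match. Instructions are tuples: (opcode, operands...).
def match(pattern, instrs):
    """Return the variable binding if pattern matches instrs, else None."""
    if len(pattern) != len(instrs):
        return None
    binding = {}
    for (p_op, *p_args), (i_op, *i_args) in zip(pattern, instrs):
        if p_op != i_op or len(p_args) != len(i_args):
            return None
        for p, a in zip(p_args, i_args):
            if p.startswith("R"):                 # uninterpreted variable
                if binding.setdefault(p, a) != a:
                    return None                   # inconsistent binding
            elif p != a:                          # literal operand
                return None
    return binding
```

The same pattern then matches both the original register allocation and a renamed copy, while a rename that breaks the data flow between the two instructions fails to match.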
4.1.4 Instruction Substitution

This obfuscation technique uses a dictionary of equivalent instruction sequences to replace one instruction sequence with another. Since this transformation relies upon human knowledge of equivalent instructions, it poses the toughest challenge for automatic detection of malicious code. The IA-32 instruction set is especially rich, and provides several ways of performing the same operation. Coupled with several architecturally ambivalent features (e.g., a memory-based stack that can be accessed both as a stack using dedicated instructions and as a memory area using standard memory operations), the IA-32 assembly language provides ample opportunity for instruction substitution.

To handle obfuscation based upon instruction substitution, an analysis tool must maintain a dictionary of equivalent instruction sequences, similar to the dictionary used to generate them. This is not a comprehensive solution, but it can cope with the common cases. In the case of IA-32, the problem can be slightly simplified by using a simple intermediate language that "unwinds" the complex operations corresponding to each IA-32 instruction. In some cases, a theorem prover such as Simplify [16] or PVS [39] can also be used to prove that two sequences of instructions are equivalent.
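A dictionary-based normalizer of the kind described can be sketched as follows (our illustration; the entries are classic IA-32 equivalences, ignoring condition-flag side effects that a faithful tool must model, and a real dictionary would be far larger):

```python
# Illustrative sketch: rewrite known-equivalent instruction sequences to a
# canonical form before matching. Instructions are (opcode, operands...)
# tuples; each dictionary key is a sequence, each value its canonical form.
REWRITES = {
    (("xor", "eax", "eax"),): (("mov", "eax", "0"),),   # xor r,r == mov r,0
    (("sub", "eax", "eax"),): (("mov", "eax", "0"),),
    (("push", "eax"), ("pop", "ebx")): (("mov", "ebx", "eax"),),
}

def normalize(instrs):
    out, i = [], 0
    while i < len(instrs):
        for seq, canonical in REWRITES.items():
            if tuple(instrs[i:i + len(seq)]) == seq:
                out.extend(canonical)
                i += len(seq)
                break
        else:
            out.append(instrs[i])  # no dictionary entry applies here
            i += 1
    return out
```

After normalization, differently substituted variants of the same fragment compare equal, which is the property a signature or pattern matcher needs.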
4.2 Testing Commercial Antivirus Tools

We tested three commercial virus scanners using obfuscated versions of the four viruses described earlier. The results were quite surprising: a combination of nop-insertion and code transposition was enough to create obfuscated versions of the viruses that the commercial virus scanners could not detect. Moreover, the Norton antivirus software could not detect an obfuscated version of the Chernobyl virus using just nop-insertions. SAFE was resistant to the two obfuscation transformations. The results are summarized in Table 1. A ✓ indicates that the antivirus software detected the virus. A ✗ means that the software did not detect the virus. Note that unobfuscated versions of all four viruses were detected by all the tools.
5 Architecture

This section gives an overview of the architecture of SAFE (Figure 4). Subsequent sections provide detailed descriptions of the major components of SAFE.

To detect malicious patterns in executables, we build an abstract representation of the malicio
