`
United States Patent
Bertman et al.

(10) Patent No.: US 7,287,279 B2
(45) Date of Patent: Oct. 23, 2007

(54) SYSTEM AND METHOD FOR LOCATING MALWARE

(75) Inventors: Justin R. Bertman, Erie, CO (US);
     Bryan M. Liston, Longmont, CO (US);
     Matthew L. Boney, Longmont, CO (US)

(73) Assignee: Webroot Software, Inc., Boulder, CO (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 10/956,274

(22) Filed: Oct. 1, 2004

(65) Prior Publication Data

US 2006/0075500 A1    Apr. 6, 2006

(51) Int. Cl.
     G08B 23/00 (2006.01)
(52) U.S. Cl.: 726/23; 726/24; 382/103; 715/736
(58) Field of Classification Search: 713/165; 709/224, 232, 238; 715/736, 749, 760, 238, 715/700; 726/22-24; 382/100, 103
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,623,600 A         4/1997   Ji et al.
5,920,696 A *       7/1999   Brandt et al.            709/218
6,069,628 A         5/2000   Farry et al.
6,073,241 A         6/2000   Rosenberg et al.
6,092,194 A         7/2000   Touboul
6,154,844 A        11/2000   Touboul
6,167,520 A        12/2000   Touboul
6,310,630 B1       10/2001   Kulkarni et al.
6,397,264 B1        5/2002   Stasnick et al.
6,460,060 B1       10/2002   Maddalozzo, Jr. et al.
6,480,962 B1       11/2002   Touboul
6,535,931 B1        3/2003   Celi, Jr.
6,611,878 B2        8/2003   De Armas et al.
6,633,835 B1       10/2003   Moran et al.
6,667,751 B1       12/2003   Wynn et al.
6,701,441 B1 *      3/2004   Balasubramaniam et al.   726/25
6,785,732 B1 *      8/2004   Bates et al.             709/232
6,804,780 B1       10/2004   Touboul
6,813,711 B1       11/2004   Dimenstein
6,829,654 B1 *     12/2004   Jungck                   709/246
6,965,968 B1       11/2005   Touboul
7,058,822 B2        6/2006   Edery et al.
2003/0217287 A1    11/2003   Kruglenko
2004/0030914 A1     2/2004   Kelley et al.
2004/0034794 A1     2/2004   Mayer et al.
2004/0064736 A1     4/2004   Obrecht et al.
2004/0080529 A1     4/2004   Wojcik

(Continued)

OTHER PUBLICATIONS

U.S. Appl. No. 10/956,818, filed Oct. 1, 2004, Matthew Boney.

(Continued)

Primary Examiner: T. B. Truong
(74) Attorney, Agent, or Firm: Cooley Godward Kronish LLP

(57) ABSTRACT

A system and method for managing malware is described. One embodiment is designed to receive an initial URL associated with a Web site; download content from that Web site; identify any obfuscation techniques used to hide malware or pointers to malware; interpret those obfuscation techniques; identify a new URL as a result of interpreting the obfuscation techniques; and add the new URL to a URL database.
`
`12 Claims, 5 Drawing Sheets
`
`
`Juniper Ex. 1012-p.1
`Juniper v Huawei
`
`
`
U.S. PATENT DOCUMENTS

2004/0143763 A1     7/2004   Radatti
2004/0187023 A1     9/2004   Alagna et al.
2004/0225877 A1    11/2004   Huang
2005/0038697 A1 *   2/2005   Aaron                    705/14
2005/0138433 A1     6/2005   Linetsky
`
`OTHER PUBLICATIONS
`
U.S. Appl. No. 10/956,575, filed Oct. 1, 2004, Matthew Boney.
U.S. Appl. No. 11/079,417, filed Mar. 14, 2005, Justin Ryan Bertman.
U.S. Appl. No. 11/171,924, filed Jul. 1, 2005, Paul Piccard.
U.S. Appl. No. 11/199,468, filed Aug. 8, 2005, Philip Maddaloni.
U.S. Appl. No. 11/180,161, filed Jul. 13, 2005, Paul L. Piccard et al.
U.S. Appl. No. 11/408,215, filed Apr. 20, 2006, Matthew Boney.
U.S. Appl. No. 11/408,146, filed Apr. 20, 2006, Matthew Boney.
U.S. Appl. No. 11/408,145, filed Apr. 20, 2006, Matthew Boney.
U.S. Appl. No. 11/462,781, filed Aug. 7, 2006, Harry M. McCloy et al.

Codeguru, Three Ways to Inject Your Code Into Another Process, by Robert Kuster, Aug. 4, 2003, 22 pgs.
Codeguru, Managing Low-Level Keyboard Hooks With The Windows API for VB .Net, by Paul Kimmel, Apr. 18, 2004, 10 pgs.
Codeguru, Hooking The Keyboard, by Anoop Thomas, Dec. 13, 2001, 6 pgs.
Illusive Security, Wolves In Sheep’s Clothing: malicious DLLs Injected Into trusted Host Applications, Author Unknown, http://home.arcor.de/scheinsicherheit/dll.htm, 13 pgs.
DevX.com, Intercepting Systems API Calls, by Seung-Woo Kim, May 13, 2004, pgs.
Microsoft.com, How To Subclass A Window in Windows 95, Article ID 125680, Jul. 11, 2005, 2005, 2 pgs.
MSDN, Win32 Hooks, by Kyle Marsh, Jul. 29, 1993, 15 pgs.
PCT Search Report, PCT/US05/34874, Jul. 5, 2006, 7 pgs.
`
`* cited by examiner
`
`
`
`
FIGURE 2 (Sheet 2 of 5): flowchart of method 145
150: Retrieve URL
155: Download HTML
160: Search HTML for targets
165: Add HTML to HTML list
170: Add URL targets to URL list
175: Apply URL frequency rules

FIGURE 3 (Sheet 3 of 5): flowchart of method 180
185: Retrieve URL
190: Download HTML
195: Search HTML for targets
200: Determine if targets include potential malware
205: Upload potential malware to clean system
210: Monitor system changes caused by malware
215: Identify as malware
220: Generate definition
225: Provide definition to protected systems

FIGURE 4 (Sheet 4 of 5): flowchart of method 230
235: Record configuration information for a clean system
240: Record installation information for approved programs
245: Download and run potential malware
250: Record new configuration information
255: Determine differences between the original and new configurations
260: Determine whether the differences indicate malware was installed
265: Create a definition for the malware
270: Provide definition to users

FIGURE 5 (Sheet 5 of 5)
280: Parse JavaScript
285: Identify obfuscation techniques for URLs
290: Parse Forms
295: Populate Form Fields and submit
300: Generate malware definition

SYSTEM AND METHOD FOR LOCATING MALWARE
`
`RELATED APPLICATIONS
`
The present application is related to commonly owned and assigned application Ser. No. 10/956,818, entitled System and Method for Actively Operating Malware to Generate a Definition, which is incorporated herein by reference.
The present application is related to commonly owned and assigned application Ser. No. 10/956,575, entitled System and Method for Locating Malware to Generate a Definition, which is incorporated herein by reference.
`
`COPYRIGHT
`
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
`
`FIELD OF THE INVENTION
`
The present invention relates to systems and methods for locating and identifying malware. In particular, but not by way of limitation, the present invention relates to systems and methods for searching out malware and generating corresponding malware definitions.
`
`BACKGROUND OF THE INVENTION
`
Personal computers and business computers are continually attacked by trojans, spyware, and adware, collectively referred to as “malware” or “spyware.” These types of programs generally act to gather information about a person or organization, often without the person or organization’s knowledge. Some malware is highly malicious. Other malware is non-malicious but may cause issues with privacy or system performance. And yet other malware is actually beneficial or wanted by the user. Unless specified otherwise, “malware” as used herein refers to any of these programs that collects information about a person or an organization.
Software is presently available to detect and remove malware. But as malware evolves, the software to detect and remove it must also evolve. Accordingly, current techniques and software for removing malware are not always satisfactory and will most certainly not be satisfactory in the future. Additionally, because some malware is actually valuable to a user, malware-detection software should, in some cases, be able to handle differences between wanted and unwanted malware.
Current malware removal software uses definitions of known malware to search for and remove files on a protected system. These definitions are often slow and cumbersome to create. Additionally, it is often difficult to initially locate the malware in order to create the definitions. Accordingly, a system and method are needed to address the shortfalls of present technology and to provide other new and innovative features.
`
`SUMMARY OF THE INVENTION
`
Exemplary embodiments of the present invention that are shown in the drawings are summarized below. These and other embodiments are more fully described in the Detailed Description section. It is to be understood, however, that there is no intention to limit the invention to the forms described in this Summary of the Invention or in the Detailed Description. One skilled in the art can recognize that there are numerous modifications, equivalents and alternative constructions that fall within the spirit and scope of the invention as expressed in the claims.
The present system provides a system and method for managing malware. One embodiment is designed to receive an initial URL associated with a Web site; download content from that Web site; identify any obfuscation techniques used to hide malware or pointers to malware; interpret those obfuscation techniques; identify a new URL as a result of interpreting the obfuscation techniques; and add the new URL to a URL database.
As previously stated, the above-described embodiments and implementations are for illustration purposes only. Numerous other embodiments, implementations, and details of the invention are easily recognized by those of skill in the art from the following descriptions and claims.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Various objects and advantages and a more complete understanding of the present invention are apparent and more readily appreciated by reference to the following Detailed Description and to the appended claims when taken in conjunction with the accompanying Drawings wherein:
FIG. 1 is a block diagram of one embodiment of the present invention;
FIG. 2 is a flowchart of one method for identifying URLs that may be associated with malware;
FIG. 3 is a flowchart of one method for generating malware definitions;
FIG. 4 is a flowchart of one method for actively browsing a Web site to identify targets; and
FIG. 5 is a flowchart of one method for searching for malware targets in JavaScript and forms.
`
`DETAILED DESCRIPTION
`
Referring now to the drawings, where like or similar elements are designated with identical reference numerals throughout the several views, and referring in particular to FIG. 1, it is a block diagram of one embodiment 100 of the present invention. This embodiment includes a database 105, a downloader 110, a parser 115, an active browser 120, and a definition module 125. These components, which are described below, are connected through a network 130 to Web servers 135 and protected computers 140. These components are described briefly with regard to FIG. 1, and their operation is further described in the description accompanying FIGS. 2 through 5.
The database system 105 of FIG. 1 can be built on an ORACLE platform or any other database platform and can include several tables or be divided into separate database systems. But assuming that the database 105 is a single database with multiple tables, the tables can be generally categorized as URLs to search, downloaded HTML, downloaded targets, and definitions. (As used herein, “targets” refers to any program, program trace, file, object, exploit, malware activity, or URL that corresponds to malware.)
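
The four table categories named above could be laid out roughly as follows. This is a minimal sketch that substitutes Python's built-in sqlite3 module for the ORACLE platform mentioned in the text; every table name, column name, and the `open_database` helper are illustrative assumptions, not part of the described embodiment.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE urls (
    url           TEXT PRIMARY KEY,
    last_accessed TEXT,               -- time stamp of the last visit
    priority      INTEGER DEFAULT 0   -- higher means visit sooner
);
CREATE TABLE html (
    url          TEXT REFERENCES urls(url),
    content_hash TEXT,
    body         TEXT
);
CREATE TABLE targets (
    target_id  INTEGER PRIMARY KEY,
    source_url TEXT,
    code       BLOB
);
CREATE TABLE definitions (
    target_id INTEGER REFERENCES targets(target_id),
    detail    TEXT
);
"""


def open_database(path=":memory:"):
    """Create the four tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping the tables in one database matches the single-database assumption in the paragraph above; splitting them across separate systems would only change where each `CREATE TABLE` statement runs.
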
The URL tables store a list of URLs that should be searched for malware. The URL tables can be populated by crawling the Internet and storing any found links. When searching for URLs linked to malware, the techniques used to identify those URLs sometimes differ from those used by popular search engines such as GOOGLE. For example, malware distributors often try to hide their URLs rather than have them pushed out to the public. GOOGLE’s crawling techniques and similar techniques look for these high-traffic links and often miss deliberately-hidden URLs. Embodiments of the present invention, however, specifically seek out hidden URLs, and these techniques are described in more detail below.
In one embodiment, the URLs stored in the URL tables can be stored in association with corresponding data such as a time stamp identifying the last time the URL was accessed, a priority level indicating when to access the URL again, etc. For example, the priority level corresponding to CNN.COM would likely be low because the likelihood of finding malware on a trusted site like CNN.COM is low. On the other hand, the likelihood of finding malware on a pornography-related site is much higher, so the priority level for the related URL could be set to a high level.
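
One way to act on stored priority levels is to keep pending URLs in a priority queue. The sketch below is an assumption about how such ordering might be done, using Python's heapq; the `UrlQueue` class, its method names, and the numeric priority scale are hypothetical.

```python
import heapq
import itertools


class UrlQueue:
    """Orders URLs so that the highest-priority entry is returned first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def add(self, url, priority, last_accessed=None):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._counter), url, last_accessed))

    def pop(self):
        """Remove and return (url, priority, last_accessed) for the top entry."""
        neg_priority, _, url, last_accessed = heapq.heappop(self._heap)
        return url, -neg_priority, last_accessed

    def __len__(self):
        return len(self._heap)
```

With this ordering, a low-priority trusted site is only visited after every higher-priority URL has been taken from the queue.
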
Another table in the database can store HTML code or pointers to the HTML code downloaded from a URL in the URL table. This downloaded HTML code can be used for statistical purposes and for analysis purposes. For example, a hash value can be calculated and stored in association with the HTML code corresponding to a particular URL. When the same URL is accessed again, the HTML code can be downloaded again and the new hash value calculated. If the hash value for both downloads is the same, then the content at that URL has not changed and further processing is not necessarily required.
Two other tables in the database relate to identified malware or targets. (Collectively referred to as a “target.”) One table can store the code and/or URL associated with any identified target. And the other table can store the definitions related to a target. These definitions, which are discussed in more detail below, can include a list of the activity caused by the target, a hash function of actual malware code, the actual malware code, etc.
Referring now to the downloader 110 in FIG. 1, it retrieves the code associated with a particular URL. For example, the downloader 110 selects a URL from the database and identifies the IP address corresponding to the URL. The downloader 110 then forms and sends a request to the IP address for the URL. For speed reasons, the downloader 110 may focus its efforts on the HTML, JavaScript, applets, and objects corresponding to the URL. Although this document often discusses HTML, JavaScript, and Java applets, those of skill in the art can understand that embodiments of the present invention can operate on any object within a Web page, including other types of markup languages, other types of script languages, any applet programs such as ACTIVEX from MICROSOFT, and any other downloaded objects. When these specific terms are used, they should be understood to also include generic versions and other vendor versions.
Once the requested information from the URL is received by the downloader 110, the downloader 110 can send it to the database for storage. In certain embodiments, the downloader 110 can open multiple sockets to handle multiple data paths for faster downloading.
Referring now to the parser 115 shown in FIG. 1, it is responsible for searching downloaded material for malware and possible pointers to other malware. And when the parser 115 discovers potential malware, the relevant information is provided to the active browser 120 for verification of whether or not it is actually malware.
This embodiment of the parser 115 includes three individual parsers: an HTML parser, a JavaScript parser, and a form parser. The HTML parser is responsible for crawling HTML code corresponding to a URL and locating embedded URLs. The JavaScript parser parses JavaScript, or any script language, embedded in downloaded Web pages to identify embedded URLs and other potential malware. And the form parser identifies forms and fields in downloaded material that require user input for further navigation.
Referring first to the URL parser, it can operate much as a typical Web crawler and traverse links in a Web page. It is generally handed a top level link and instructed to crawl starting at that top level link. Any discovered URLs can be added to the URL table in the database 105.
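
The link-traversal step can be illustrated with Python's standard html.parser module. This is a sketch under assumptions: the `LinkExtractor` class and `extract_links` function are hypothetical names, and only `href` and `src` attributes are inspected, which is narrower than what a production crawler would examine.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs found in href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html_text):
    """Return every linked URL in html_text, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    return parser.links
```

Each returned URL could then be written to the URL table, and the crawl repeated starting from those entries.
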
The parser 115 can also store a priority indication with any URL. The priority indication can indicate the likelihood that the URL will point to content or other URLs that include malware. For example, the priority indication could be based on whether malware was previously found using this URL. In other embodiments, the priority indication is based on whether a URL included links to other malware sites. And in yet other embodiments, the priority indication can indicate how often the URL should be searched. Trusted sites such as CNN.COM, for example, do not need to be searched regularly for malware.
As for the JavaScript parser, it parses (decodes) JavaScript, or other scripts, embedded in downloaded Web pages so that embedded URLs and other potential malware can be more easily identified. For example, the JavaScript parser can decode obfuscation techniques used by malware programmers to hide their malware from identification.
In one embodiment, the JavaScript parser uses a JavaScript interpreter such as the Mozilla browser to identify embedded URLs or hidden malware. For example, the JavaScript interpreter could decode URL addresses that are obfuscated in the JavaScript through the use of ASCII characters or hexadecimal encoding. Similarly, the JavaScript interpreter could decode actual JavaScript programs that have been obfuscated. In essence, the JavaScript interpreter is undoing the tricks used by malware programmers to hide their malware. And once the tricks have been removed, the interpreted code can be searched for text strings and URLs related to malware.
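
The kind of decoding described above can be approximated without a full interpreter. The sketch below is not a JavaScript interpreter; it is a regex-based stand-in that undoes only three simple encodings (`\xNN` escapes, `%NN` escapes, and `String.fromCharCode` lists), and the function names are assumptions made for illustration.

```python
import re
from urllib.parse import unquote

_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")
_FROM_CHAR_CODE = re.compile(r"String\.fromCharCode\(([\d\s,]+)\)")
_URL = re.compile(r"https?://[^\s'\"<>)]+")


def decode_obfuscated(script):
    """Undo hex escapes, percent escapes, and String.fromCharCode lists."""
    def from_char_code(match):
        codes = [int(c) for c in match.group(1).split(",") if c.strip()]
        return '"' + "".join(chr(c) for c in codes) + '"'

    text = _FROM_CHAR_CODE.sub(from_char_code, script)
    text = _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    return unquote(text)


def find_hidden_urls(script):
    """Return URLs that become visible only after decoding."""
    plain = set(_URL.findall(script))
    decoded = _URL.findall(decode_obfuscated(script))
    return [u for u in decoded if u not in plain]
```

A URL that appears only in the decoded text was deliberately hidden, which is the signal the surrounding paragraphs treat as suspicious. Scripts that build strings at run time would still need a real interpreter, as the embodiment describes.
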
Obfuscation techniques, such as using hexadecimal or ASCII codes to represent text strings, generally indicate the presence of malware. Accordingly, obfuscated URLs can be added to the URL database and indicated as a high priority URL for subsequent crawling. These URLs could also be passed to the active browser immediately so that a malware definition can be generated if necessary. Similarly, other obfuscated JavaScript can be passed to the active browser 120 as potential malware or otherwise flagged.
The form parser identifies forms and fields in downloaded material that require user input for further navigation. For some forms and fields, the form parser can follow the branches embedded in the JavaScript. For other forms and fields, the parser passes the URL associated with the forms or field to the active browser 120 for complete navigation.
The form parser’s main goal is to identify anything that could be or could contain malware. This includes, but is not limited to, finding submit forms, button click events, and evaluation statements that could lead to malware being installed on the host machine. Anything that is not able to be verified by the form parser can be sent to the active browser 120 for further inspection. For example, button click events that run a function rather than submitting information could be sent to the active browser 120. Similarly, if a field is checked by server side JavaScript and requires formatted input, like a phone number that requires parentheses around the area code, then this type of form could be sent to the active browser 120.
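
A form parser along these lines can be sketched with html.parser. The `FormFinder` class, the `needs_browser` flag, and the rule that any `onclick` handler defers the form to the active browser are illustrative assumptions; the text above leaves the exact criteria open.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Records each form's action, method, and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
                "needs_browser": False,
            }
            self.forms.append(self._current)
        elif self._current is not None and tag in ("input", "select", "textarea", "button"):
            if attrs.get("name"):
                self._current["fields"].append(attrs["name"])
            # A click handler runs script instead of submitting, so defer to the browser.
            if "onclick" in attrs:
                self._current["needs_browser"] = True

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_forms(html_text):
    """Return one dictionary per form found in html_text."""
    finder = FormFinder()
    finder.feed(html_text)
    return finder.forms
```

Forms with `needs_browser` set would be handed to the active browser 120, while the rest could be followed directly from their `action` URL.
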
Referring now to the active browser 120 shown in FIG. 1, it is designed to automatically surf a Web site associated with a URL retrieved from the URL database or passed from the parser 115. In essence, the active browser 120 surfs a Web site as a person would. The active browser 120 generally follows each possible path on the Web site and, if necessary, populates any forms, fields, or check boxes to fully navigate the site.
The active browser 120 generally operates on a clean computer system with a known configuration. For example, the active browser 120 could operate on a WINDOWS-based system that operates INTERNET EXPLORER. It could also operate on a Linux-based system operating a Mozilla browser.
As the active browser 120 navigates a Web site, any changes to the configuration of the active browser’s computer system are recorded. “Changes” refers to any type of change to the computer system, including changes to an operating system file, addition or removal of files, changing file names, changing the browser configuration, opening communication ports, etc. For example, a configuration change could include a change to the WINDOWS registry file or any similar file for other operating systems. For clarity, the term “registry file” refers to the WINDOWS registry file and any similar type of file, whether for earlier WINDOWS versions or other operating systems, including Linux.
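
Recording changes amounts to comparing a snapshot taken before navigation with one taken afterward. The following sketch assumes each snapshot is a dictionary mapping an item name (a file path, registry key, or port) to some value such as a content hash; that representation and the `diff_snapshots` name are assumptions, not details from the text.

```python
def diff_snapshots(before, after):
    """Compare two {name: value} snapshots of a system's configuration."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

The three result groups correspond to added items, removed items, and altered items, which covers the file, registry, and port changes listed above.
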
And finally, the definition module 125 shown in FIG. 1 is responsible for generating malware definitions that are stored in the database and eventually pushed to the protected computers 140. The definition module 125 can determine which of these changes are associated with malware and which are associated with acceptable activities. For example, the malware definition module 125 could use a series of shields to detect suspicious activities on the active browser 120. The potential malware associated with acceptable activities can be discarded.
Referring now to FIG. 2, it is a flowchart of one method 145 for identifying URLs that may be associated with malware. Although this method is not necessarily tied to the architecture shown in FIG. 1, for convenience and clarity, that architecture is sometimes referred to when describing the method.
For the method of FIG. 2, the downloader initially retrieves or otherwise obtains a URL from the database. (Block 150) Typically, the downloader retrieves a high-priority URL or a batch of high-priority URLs. The downloader then retrieves the material, usually a Web page or HTML, associated with the URL. (Block 155) Before further processing the downloaded material, the downloader can compare the material against previously downloaded material from the same URL. For example, the downloader could calculate a cyclic redundancy code (CRC), or some other hash function value, for the downloaded material and compare it against the CRC for the previously downloaded material. If the CRCs match, then the newly downloaded material can be discarded without further processing. But if the two CRCs do not match, then the newly downloaded material is different and should be passed on for further processing.
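
The CRC comparison can be sketched with the `zlib.crc32` function from Python's standard library. The `ChangeDetector` class and its in-memory dictionary are assumptions for illustration; the embodiment stores this data in the database 105.

```python
import zlib


class ChangeDetector:
    """Remembers one CRC-32 per URL and reports whether new content differs."""

    def __init__(self):
        self._crc_by_url = {}

    def needs_processing(self, url, content):
        """Return False when content (bytes) matches the last download of url."""
        crc = zlib.crc32(content)
        if self._crc_by_url.get(url) == crc:
            return False  # same checksum: discard without further work
        self._crc_by_url[url] = crc
        return True
```

A 32-bit CRC can collide, so a match is strong evidence rather than proof that the page is unchanged; a longer hash reduces that risk at some extra cost.
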
Assuming that the downloaded page requires further processing, the downloaded material, usually HTML and JavaScript, can be stored in the database 105. (Block 165) It can also be searched for targets such as embedded URLs, JavaScript, potential targets, etc. (Block 160) When it discovers new URLs, they can be stored and a priority indicator can also be calculated for those URLs. (Blocks 170 and 175) For example, URLs mined from trusted Web sites could be given a low priority. Similarly, URLs that were obfuscated in downloaded material or found at a pornography Web site could be given a high priority. The identified URLs and the corresponding priority data can be stored in the URL table in the database 105. These URLs can subsequently be downloaded and searched.
Referring now to FIG. 3, it is a flowchart of one method 180 for generating malware definitions. This method is similar to the one described with respect to FIG. 2. For example, this method begins by retrieving a URL or batch of URLs and the associated material. (Blocks 185 and 190) The retrieved material is then searched for potential targets. (Block 195) For example, the material can be searched for JavaScript and/or obfuscation techniques. (Block 200)
Any potential targets are uploaded and executed on the active browser. (Block 205) If the potential malware makes changes to the active browser, then those changes are recorded and used to determine whether the potential malware is actually malware. (Blocks 210 and 215) For example, the changes could be compared against approved changes from approved software applications. (Discussed in detail with relation to FIG. 4.) In a second method, any changes to the active browser could be scanned by a series of shields that monitor for basic behavior indicative of malware. For example, shields can watch for the installation of programs, alteration of the registry file, attempts to access email programs, etc. Typical shields include:
Favorites Shield: The favorites shield monitors for any changes to a browser’s list of favorite Web sites.
Browser-Hijack Shield: The browser-hijack shield monitors the WINDOWS registry file for changes to any default Web pages or other user preferences. For example, the browser-hijack shield could watch for changes to the default search page stored in the registry file.
Host-File Shield: The host-file shield monitors the host file for changes to DNS addresses. For example, some malware will alter the address in the host file for yahoo.com to point to an ad site. Thus, when a user types in yahoo.com, the user will be redirected to the ad site instead of yahoo’s home page.
Cookie Shield: The cookie shield monitors for third-party cookies being placed on the protected computer. These third-party cookies are generally the type of cookie that relays information about Web-surfing habits to an ad site.
Homepage Shield: The homepage shield monitors the identification of a user’s homepage and detects any attempt to change it.
Plug-in Shield: This shield monitors for the installation of plug-ins. For example, the plug-in shield looks for processes that attach to browsers and then communicate through the browser. Plug-in shields can monitor for the installation of any plug-in or can compare a plug-in to a malware definition. For example, this shield could monitor for the installation of INTERNET EXPLORER Browser Helper Objects.
Zombie Shield: The zombie shield monitors for malware activity that indicates a protected computer is being used unknowingly to send out spam or email attacks. The zombie shield generally monitors for the sending of a threshold number of emails in a set period of time. For example, if ten emails are sent out in a minute, then the user could be notified and user approval required for further emails to go out. Similarly, if the user’s address book is accessed a threshold number of times in a set period, then the user could be notified and any outgoing email blocked until the user gives approval. And in another implementation, the zombie shield can monitor for data communications when the system should otherwise be idle.
Startup Shield: The startup shield monitors the run folder in the WINDOWS registry for the addition of any program. It can also monitor similar folders, including Run Once, Run OnceEX, and Run Services in WINDOWS-based systems. And those of skill in the art can recognize that this shield can monitor similar folders in Unix, Linux, and other types of systems.
WINDOWS-Messenger Shield: The WINDOWS-messenger shield watches for any attempts to turn on WINDOWS messenger.
Installation Shield: The installation shield intercepts the CreateProcess operating system call that is used to start up any new process. This shield compares the process that is attempting to run against the definitions for known malware.
Memory Shield: The memory shield is similar to the installation shield. The memory shield scans through running processes, matching each against the known definitions, and notifies the user if there is a spy running.
Communication Shield: The communication shield scans for and blocks traffic to and from IP addresses associated with a known malware site. The IP addresses for these sites can be stored on a URL/IP blacklist. This shield can also scan packets for embedded IP addresses and determine whether those addresses are included on a blacklist or white list. In another implementation, the communication shield checks for certain types of communications being transmitted to an outside IP address. For example, the shield may monitor for information that has been tagged as private.
The communication shield could also inspect packets that are coming in from an outside source to determine if they contain any malware traces. For example, this shield could collect packets as they are coming in and compare them to known definitions before letting them through. The shield would then block any that are associated with known malware.
Key-Logger Shield: The key-logger shield monitors for malware that captures and reports out keystrokes by comparing programs against definitions of known key-logger programs. The key-logger shield, in some implementations, can also monitor for applications that are logging keystrokes, independent of any malware definitions. In these types of systems, the shield stores a list of known good programs that can legitimately log keystrokes. And if any application not on this list is discovered logging keystrokes, it is targeted for shut down and removal. Similarly, any key-logging application that is discovered through the definition process is targeted for shut down and removal. The key-logger shield could be incorporated into other shields and does not need to be a stand-alone shield.
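
Two of the shields listed above reduce to small, testable checks. The sketch below shows a sliding-window counter for the zombie shield's email threshold and a before-and-after comparison for the host-file shield. All class and function names are hypothetical, the default of ten emails per sixty seconds simply mirrors the example in the text, and timestamps are assumed to be seconds supplied by the caller.

```python
from collections import deque


class ZombieShield:
    """Flags a burst of outgoing emails inside a sliding time window."""

    def __init__(self, max_emails=10, window_seconds=60.0):
        self.max_emails = max_emails
        self.window_seconds = window_seconds
        self._sent = deque()

    def record_email(self, timestamp):
        """Record one send; return True when the threshold has been reached."""
        self._sent.append(timestamp)
        # Drop sends that have fallen out of the window.
        while self._sent and timestamp - self._sent[0] > self.window_seconds:
            self._sent.popleft()
        return len(self._sent) >= self.max_emails


def parse_hosts(text):
    """Map each host name in hosts-file text to its address."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # strip comments, then tokenize
        for name in fields[1:]:
            mapping[name] = fields[0]
    return mapping


def host_file_changes(before_text, after_text):
    """Return {host: (old_address, new_address)} for remapped or new hosts."""
    before, after = parse_hosts(before_text), parse_hosts(after_text)
    return {h: (before.get(h), a) for h, a in after.items() if before.get(h) != a}
```

When `record_email` returns True, a caller could notify the user and hold further mail, as the zombie shield description suggests. A non-empty result from `host_file_changes` would be the host-file shield's trigger.
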
Still referring to FIG. 3, once potential malware has been identified as actual malware, a malware definition can be generated. (Block 220) The definition can be based on the changes that the malware caused at the active browser 120. For example, if the malware made certain changes to the registry file, then those changes can be added to the definition for that exploit. Protected computers can then be told to look for this type of registry change. Text strings associated with offending JavaScript can also be stored in the definition. Similarly, applets, executable files, objects, and similar files can be added to the definitions.
Once a definition is generated for certain malware, that definition can be stored in the database and then pushed to the protected computer systems. (Block 225) This process of generating definitions is described with regard to FIG. 4.
`
Referring now to FIG. 4, it is a flowchart of one method 230 for actively browsing a Web site to identify potential malware. In this method, the active browser 120, or another clean computer system, is initially scanned and the configuration information recorded. (Block 235) For example, the initial scan could record the registry file data, installed files, programs in memory, browser setup, operating system (OS) setup, etc. Next, changes to the configuration information caused by installing approved programs can be identified and stored as part of the active-browser baseline. (Block 240) For example, the configuration changes caused by installing ADOBE ACROBAT could be identified and stored. And when the change information is aggregated together for each of the approved programs, the baseline for an approved system is generated.
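
The baseline aggregation described above can be sketched as a set union, with later observations filtered against it. This is a minimal illustration assuming each change is represented as a string label; the function names and the label format are hypothetical.

```python
def build_baseline(approved_changes):
    """Aggregate the change sets recorded for each approved program."""
    baseline = set()
    for changes in approved_changes.values():
        baseline |= set(changes)
    return baseline


def unexplained_changes(observed, baseline):
    """Return observed changes that no approved program accounts for."""
    return sorted(set(observed) - baseline)
```

Anything left after the subtraction is a candidate indication of malware, since no approved installation explains it.
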
The baseline for the clean system can be compared against changes caused by malware programs. For example, when the parser passes a URL to the active browser, the active browser 120 browses the associated Web site as a
`p