`
Case 3:17-cv-05659-WHA Document 1-8 Filed 09/29/17 Page 1 of 23
`
`EXHIBIT 8
`
`
`
(12) United States Patent
Rubin et al.

(10) Patent No.: US 8,225,408 B2
(45) Date of Patent: Jul. 17, 2012

(54) METHOD AND SYSTEM FOR ADAPTIVE RULE-BASED CONTENT SCANNERS

(75) Inventors: Moshe Rubin, Jerusalem (IL); Moshe Matitya, Jerusalem (IL); Artem Melnick, Beit Shemesh (IL); Shlomo Touboul, Kefar-Haim (IL); Alexander Yermakov, Beit Shemesh (IL); Amit Shaked, Tel Aviv (IL)

(73) Assignee: Finjan, Inc., San Jose, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1298 days.

(21) Appl. No.: 10/930,884

(22) Filed: Aug. 30, 2004

(65) Prior Publication Data
US 2005/0108554 A1, May 19, 2005

Related U.S. Application Data
(63) Continuation-in-part of application No. 09/539,667, filed on Mar. 30, 2000, now Pat. No. 6,804,780, which is a continuation of application No. 08/964,388, filed on Nov. 6, 1997, now Pat. No. 6,092,194.

(51) Int. Cl.: H04L 29/06 (2006.01)
(52) U.S. Cl.: 726/25; 713/153; 726/22
(58) Field of Classification Search: None
See application file for complete search history.

Primary Examiner: Eleni Shiferaw
Assistant Examiner: Jeffery Williams
(74) Attorney, Agent, or Firm: Dawn-Marie Bey; King & Spalding LLP
(57) ABSTRACT

A method for scanning content, including identifying tokens within an incoming byte stream, the tokens being lexical constructs for a specific language, identifying patterns of tokens, generating a parse tree from the identified patterns of tokens, and identifying the presence of potential exploits within the parse tree, wherein said identifying tokens, identifying patterns of tokens, and identifying the presence of potential exploits are based upon a set of rules for the specific language. A system and a computer readable storage medium are also described and claimed.
`
`35 Claims, 7 Drawing Sheets
`
[FIG. 1, Sheet 1 of 7: block diagram. Labels recoverable from the scan: NETWORK GATEWAY, PRE-SCANNER, CONTENT SCANNER, CONTENT CACHE, CORPORATE INTRANET.]
`
[FIG. 2, Sheet 2 of 7: block diagram, scanned rotated. Labels recoverable from the scan: BYTE SOURCE, DECODER, NORMALIZER, TOKENIZER, PARSE TREE, PATTERN MATCHING ENGINE, ANALYZER, PARSER RULES, ANALYZER RULES, SUB-SCANNER.]
`
[FIG. 3, Sheet 3 of 7: finite state machine diagram. Transition labels recoverable from the scan include [punctuation], [^a] and [^punctuation].]
`
[FIG. 4, Sheet 4 of 7: finite state machine diagram. Labels recoverable from the scan: NUMBER, EQUALS.]
`
[FIG. 5, Sheet 5 of 7: flowchart. Reference numerals 500, 510, 560 and 570 are legible. Box text recoverable from the scan, in order of appearance:
  CALL TOKENIZER TO RETRIEVE NEXT TOKEN
  ADD TOKEN TO PARSE TREE
  IS THERE A PATTERN MATCH WITH A PARSER RULE?
  DOES THE RULE HAVE A NONODE ATTRIBUTE?
  (NO) PERFORM ACTION ASSOCIATED WITH MATCHED PARSER RULE: CREATE A NEW NODE, CALLED [RULE-NAME] AND PLACE THE MATCHING NODES UNDER THE NEW NODE
  DOES THE RULE HAVE A NOANALYZE ATTRIBUTE?
  CALL ANALYZER TO DETERMINE IF A POTENTIAL EXPLOIT IS PRESENT
  DOES ANALYZER FIND AN ANALYZER RULE MATCH?
  PERFORM ACTION ASSOCIATED WITH MATCHED ANALYZER RULE: RECORD ANALYZER RULE AT CURRENT NODE, AS LEVEL 0
  PROPAGATE ANALYZER RULE UPWARD THROUGH NODE PARENTS, AS SUCCESSIVELY INCREASING LEVEL]
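The flowchart steps of FIG. 5 can be read as a bottom-up parsing loop. The following Python sketch is one possible reading, offered as an illustration only and not as the patented implementation. The names Node, record, analyze, reduce_tail and parse, the dictionary layout of the rules, and the choice to copy a child's recorded matches into a newly created parent at the next level are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    name: str
    text: str = ""
    children: list = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)
    matches: dict = field(default_factory=dict)  # analyzer rule name -> level


def record(node, rule_name):
    """Record a match at `node` as level 0, then at each ancestor with increasing level."""
    level = 0
    while node is not None:
        node.matches[rule_name] = min(level, node.matches.get(rule_name, level))
        node, level = node.parent, level + 1


def analyze(node, analyzer_rules):
    """Call every analyzer rule (a predicate on a node) and record the ones that match."""
    for rule_name, predicate in analyzer_rules.items():
        if predicate(node):
            record(node, rule_name)


def reduce_tail(root, parser_rules, analyzer_rules):
    """Apply parser rules to the most recent nodes until no rule creates a new node."""
    reduced = True
    while reduced:
        reduced = False
        for rule in parser_rules:
            width = len(rule["pattern"])  # patterns are assumed non-empty
            tail = root.children[-width:]
            if [c.name for c in tail] != rule["pattern"]:
                continue
            if rule.get("nonode"):
                current = tail[-1]  # NoNode: keep the tree as it is
            else:
                current = Node(rule["name"], children=tail, parent=root)
                for child in tail:
                    child.parent = current
                    # Assumption: earlier matches move up one level into the new parent.
                    for rule_name, lvl in child.matches.items():
                        current.matches[rule_name] = min(
                            lvl + 1, current.matches.get(rule_name, lvl + 1))
                root.children[-width:] = [current]
                reduced = True
            if not rule.get("noanalyze"):
                analyze(current, analyzer_rules)
            if reduced:
                break  # restart rule matching against the new tail


def parse(tokens, parser_rules, analyzer_rules):
    root = Node("ROOT")
    for kind, text in tokens:  # retrieve next token
        root.children.append(Node(kind, text, parent=root))  # add token to parse tree
        reduce_tail(root, parser_rules, analyzer_rules)
    return root


# Hypothetical rules: an assignment pattern and an analyzer rule that flags it.
parser_rules = [{"name": "assignment", "pattern": ["IDENT", "EQUALS", "NUMBER"]}]
analyzer_rules = {"assign-number": lambda node: node.name == "assignment"}
tree = parse([("IDENT", "x"), ("EQUALS", "="), ("NUMBER", "5")],
             parser_rules, analyzer_rules)
print([c.name for c in tree.children], tree.children[0].matches, tree.matches)
```

In this sketch the match is recorded at the new "assignment" node as level 0 and at the root as level 1, mirroring the "successively increasing level" wording of the flowchart.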
`
[FIG. 6, Sheet 6 of 7: block diagram, scanned rotated and largely illegible. Labels recoverable with reasonable confidence: SERIALIZED RULE DATA, BUILDER, ARB SCANNER FACTORY, SCANNER REPOSITORY, RULE-TO-XML CONVERTER.]
`
[FIG. 7, Sheet 7 of 7: object hierarchy. A BUILDER creates an ARB SCANNER FACTORY and a SCANNER REPOSITORY holding three ARB SCANNER objects (URI, HTML, JAVASCRIPT), each with its own TOKENIZER, PARSER and ANALYZER.]
METHOD AND SYSTEM FOR ADAPTIVE RULE-BASED CONTENT SCANNERS

CROSS REFERENCES TO RELATED APPLICATIONS

This application is a continuation-in-part of assignee’s application U.S. Ser. No. 09/539,667, filed on Mar. 30, 2000, now U.S. Pat. No. 6,804,780, entitled “System and Method for Protecting a Computer and a Network from Hostile Downloadables,” which is a continuation of assignee’s patent application U.S. Ser. No. 08/964,388, filed on 6 Nov. 1997, now U.S. Pat. No. 6,092,194, also entitled “System and Method for Protecting a Computer and a Network from Hostile Downloadables.”

FIELD OF THE INVENTION

The present invention relates to network security, and in particular to scanning of mobile content for exploits.

BACKGROUND OF THE INVENTION

Conventional anti-virus software scans a computer file system by searching for byte patterns, referred to as signatures, that are present within known viruses. If a virus signature is discovered within a file, the file is designated as infected.

Content that enters a computer from the Internet poses additional security threats, as such content executes upon entry into a client computer, without being saved into the computer’s file system. Content such as JavaScript and VBScript is executed by an Internet browser, as soon as the content is received within a web page.

Conventional network security software also scans such mobile content by searching for heuristic virus signatures. However, in order to be as protective as possible, virus signatures for mobile content tend to be over-conservative, which results in significant over-blocking of content. Over-blocking refers to false positives; i.e., in addition to blocking of malicious content, prior art technologies also block a significant amount of content that is not malicious.

Another drawback with prior art network security software is that it is unable to recognize combined attacks, in which an exploit is split among different content streams. Yet another drawback is that prior art network security software is unable to scan content containers, such as URI within JavaScript.

All of the above drawbacks with conventional network security software are due to an inability to diagnose mobile code. Diagnosis is a daunting task, since it entails understanding incoming byte source code. The same malicious exploit can be encoded in an endless variety of ways, so it is not sufficient to look for specific signatures.

Nevertheless, in order to accurately block malicious code with minimal over-blocking, a thorough diagnosis is required.

SUMMARY OF THE DESCRIPTION

The present invention provides a method and system for scanning content that includes mobile code, to produce a diagnostic analysis of potential exploits within the content. The present invention is preferably used within a network gateway or proxy, to protect an intranet against viruses and other malicious mobile code.

The content scanners of the present invention are referred to as adaptive rule-based (ARB) scanners. An ARB scanner is able to adapt itself dynamically to scan a specific type of content, such as inter alia JavaScript, VBScript, URI, URL and HTML. ARB scanners differ from prior art scanners that are hard-coded for one particular type of content. In distinction, ARB scanners are data-driven, and can be enabled to scan any specific type of content by providing appropriate rule files, without the need to modify source code. Rule files are text files that describe lexical characteristics of a particular language. Rule files for a language describe character encodings, sequences of characters that form lexical constructs of the language, referred to as tokens, patterns of tokens that form syntactical constructs of program code, referred to as parsing rules, and patterns of tokens that correspond to potential exploits, referred to as analyzer rules. Rules files thus serve as adaptors, to adapt an ARB content scanner to a specific type of content.

The present invention also utilizes a novel description language for efficiently describing exploits. This description language enables an engineer to describe exploits as logical combinations of patterns of tokens.

Thus it may be appreciated that the present invention is able to diagnose incoming content. As such, the present invention achieves very accurate blocking of content, with minimal over-blocking as compared with prior art scanning technologies.

There is thus provided in accordance with a preferred embodiment of the present invention a method for scanning content, including identifying tokens within an incoming byte stream, the tokens being lexical constructs for a specific language, identifying patterns of tokens, generating a parse tree from the identified patterns of tokens, and identifying the presence of potential exploits within the parse tree, wherein said identifying tokens, identifying patterns of tokens, and identifying the presence of potential exploits are based upon a set of rules for the specific language.

There is moreover provided in accordance with a preferred embodiment of the present invention a system for scanning content, including a tokenizer for identifying tokens within an incoming byte stream, the tokens being lexical constructs for a specific language, a parser operatively coupled to the tokenizer for identifying patterns of tokens, and generating a parse tree therefrom, and an analyzer operatively coupled to the parser for analyzing the parse tree and identifying the presence of potential exploits therewithin, wherein the tokenizer, the parser and the analyzer use a set of rules for the specific language to identify tokens, patterns and potential exploits, respectively.

There is further provided in accordance with a preferred embodiment of the present invention a computer-readable storage medium storing program code for causing a computer to perform the steps of identifying tokens within an incoming byte stream, the tokens being lexical constructs for a specific language, identifying patterns of tokens, generating a parse tree from the identified patterns of tokens, and identifying the presence of potential exploits within the parse tree, wherein said identifying tokens, identifying patterns of tokens, and identifying the presence of potential exploits are based upon a set of rules for the specific language.

There is yet further provided in accordance with a preferred embodiment of the present invention a method for scanning content, including expressing an exploit in terms of patterns of tokens and rules, where tokens are lexical constructs of a specific programming language, and rules are sequences of tokens that form programmatical constructs, and parsing an incoming byte source to determine if an exploit is present therewithin, based on said expressing.

There is additionally provided in accordance with a preferred embodiment of the present invention a system for scanning content, including a parser for parsing an incoming byte source to determine if an exploit is present therewithin, based on a formal description of the exploit expressed in terms of patterns of tokens and rules, where tokens are lexical constructs of a specific programming language, and rules are sequences of tokens that form programmatical constructs.

There is moreover provided in accordance with a preferred embodiment of the present invention a computer-readable storage medium storing program code for causing a computer to perform the steps of expressing an exploit in terms of patterns of tokens and rules, where tokens are lexical constructs of a specific programming language, and rules are sequences of tokens that form programmatical constructs, and parsing an incoming byte source to determine if an exploit is present therewithin, based on said expressing.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be more fully understood and appreciated from the following detailed description, taken in conjunction with the drawings in which:

FIG. 1 is a simplified block diagram of an overall gateway security system that uses an adaptive rule-based (ARB) content scanner, in accordance with a preferred embodiment of the present invention;

FIG. 2 is a simplified block diagram of an adaptive rule-based content scanner system, in accordance with a preferred embodiment of the present invention;

FIG. 3 is an illustration of a simple finite state machine for detecting tokens “a” and “ab”, used in accordance with a preferred embodiment of the present invention;

FIG. 4 is an illustration of a simple finite state machine for a pattern, used in accordance with a preferred embodiment of the present invention;

FIG. 5 is a simplified flowchart of operation of a parser for a specific content language within an ARB content scanner, in accordance with a preferred embodiment of the present invention;

FIG. 6 is a simplified block diagram of a system for serializing binary instances of ARB content scanners, transmitting them to a client site, and regenerating them back into binary instances at the client site, in accordance with a preferred embodiment of the present invention; and

FIG. 7 illustrates a representative hierarchy of objects created by a builder module, in accordance with a preferred embodiment of the present invention.

LIST OF APPENDICES

Appendix A is a source listing of an ARB rule file for the JavaScript language, in accordance with a preferred embodiment of the present invention.

DETAILED DESCRIPTION

The present invention concerns scanning of content that contains mobile code, to protect an enterprise against viruses and other malicious code.

Reference is now made to FIG. 1, which is a simplified block diagram of an overall gateway security system that uses an adaptive rule-based (ARB) content scanner, in accordance with a preferred embodiment of the present invention. Shown in FIG. 1 is a network gateway 110 that acts as a conduit for content from the Internet entering into a corporate intranet, and for content from the corporate intranet exiting to the Internet. One of the functions of network gateway 110 is to protect client computers 120 within the corporate intranet from malicious mobile code originating from the Internet. Mobile code is program code that executes on a client computer. Mobile code can take many diverse forms, including inter alia JavaScript, Visual Basic script, HTML pages, as well as a Uniform Resource Identifier (URI).

Mobile code can be detrimental to a client computer. Mobile code can access a client computer’s operating system and file system, can open sockets for transmitting data to and from a client computer, and can tie up a client computer’s processing and memory resources. Such malicious mobile code cannot be detected using conventional anti-virus scanners, which scan a computer’s file system, since mobile code is able to execute as soon as it enters a client computer from the Internet, before being saved to a file.

Many examples of malicious mobile code are known today. Portions of code that are malicious are referred to as exploits. For example, one such exploit uses JavaScript to create a window that fills an entire screen. The user is then unable to access any windows lying underneath the filler window. The following sample code shows such an exploit.

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <HTML>
    <HEAD>
    <TITLE>BID-3469</TITLE>
    <SCRIPT>
    op=window.createPopup( );
    s='<body>foobar</body>';
    op.document.body.innerHTML=s;
    function oppop( )
    {
      if (!op.isOpen)
      {
        w = screen.width;
        h = screen.height;
        op.show(0,0,w,h,document.body);
      }
    }
    function doit ( )
    {
      oppop( );
      setInterval("window.focus( ); {oppop( );}",10);
    }
    </SCRIPT>
    </HEAD>
    <BODY>
    <H1>BID-3469</H1>
    <FORM method=POST action="">
    <INPUT type="button" name="btnDoIt" value="DoIt" onclick="doit( )">
    </FORM>
    </BODY>
    </HTML>

Thus it may be appreciated that the security function of network gateway 110 is critical to a corporate intranet.

In accordance with a preferred embodiment of the present invention, network gateway 110 includes a content scanner 130, whose purpose is to scan mobile code and identify potential exploits. Content scanner 130 receives as input content containing mobile code in the form of byte source, and generates a security profile for the content. The security profile indicates whether or not potential exploits have been discovered within the content, and, if so, provides a diagnostic list of one or more potential exploits and their respective locations within the content.

Preferably, the corporate intranet uses a security policy to decide whether or not to block incoming content based on the content’s security profile. For example, a security policy may block content that may be severely malicious, say, content that accesses an operating system or a file system, and may permit content that is less malicious, such as content that can consume a user’s computer screen as in the example above. The diagnostics within a content security profile are compared with the intranet security policy, and a decision is made to allow or block the content. When content is blocked, one or more alternative actions can be taken, such as replacing suspicious portions of the content with innocuous code and allowing the modified content, and sending a notification to an intranet administrator.
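The comparison of a security profile against a security policy can be sketched as follows. This is a minimal Python illustration under stated assumptions: the names Finding, SecurityProfile and decide, and the severity labels "severe" and "minor", are hypothetical and do not come from the patent, which does not specify a data format for profiles or policies.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    exploit: str    # name of the potential exploit
    location: int   # position of the exploit within the content
    severity: str   # hypothetical label: "severe" or "minor"


@dataclass
class SecurityProfile:
    findings: list = field(default_factory=list)  # empty list: nothing discovered


def decide(profile, blocked_severities):
    """Return "block" if any finding has a severity the policy blocks, else "allow"."""
    for finding in profile.findings:
        if finding.severity in blocked_severities:
            return "block"
    return "allow"


# A hypothetical policy that blocks only severe exploits.
policy = {"severe"}
fullscreen = SecurityProfile([Finding("full-screen popup", 120, "minor")])
fs_access = SecurityProfile([Finding("file system access", 48, "severe")])
print(decide(fullscreen, policy))
print(decide(fs_access, policy))
```

Under this policy the full-screen popup content is allowed and the file system access content is blocked, matching the example in the text.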
`
Scanned content and their corresponding security profiles are preferably stored within a content cache 140. Preferably, network gateway 110 checks if incoming content is already resident in cache 140, and, if so, bypasses content scanner 130. Use of cache 140 saves content scanner 130 the task of re-scanning the same content.
Alternatively, a hash value of scanned content, such as an MD5 hash value, can be cached instead of caching the content itself. When content arrives at scanner 130, preferably its hash value is computed and checked against cached hash values. If a match is found with a cached hash value, then the content does not have to be re-scanned and its security profile can be obtained directly from cache.
Consider, for example, a complicated JavaScript file that is scanned and determined to contain a known exploit therewithin. An MD5 hash value of the entire JavaScript file can be stored in cache, together with a security profile indicating that the JavaScript file contains the known exploit. If the same JavaScript file arrives again, its hash value is computed and found to already reside in cache. Thus, it can immediately be determined that the JavaScript file contains the known exploit, without re-scanning the file.
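The hash-keyed cache lookup can be sketched in a few lines of Python. This is an illustration only: the names ProfileCache and scan_with_cache, and the dictionary used as a stand-in security profile, are assumptions made for the example. The only library call used is hashlib.md5 from the Python standard library.

```python
import hashlib


class ProfileCache:
    """Maps the MD5 hash of scanned content to its security profile."""

    def __init__(self):
        self._profiles = {}

    @staticmethod
    def key(content: bytes) -> str:
        return hashlib.md5(content).hexdigest()

    def lookup(self, content: bytes):
        return self._profiles.get(self.key(content))

    def store(self, content: bytes, profile):
        self._profiles[self.key(content)] = profile


def scan_with_cache(content: bytes, cache: ProfileCache, scanner):
    """Return (profile, cache_hit); `scanner` is called only on a cache miss."""
    cached = cache.lookup(content)
    if cached is not None:
        return cached, True
    profile = scanner(content)
    cache.store(content, profile)
    return profile, False


def demo_scanner(content: bytes):
    # Stand-in for content scanner 130: flags one hypothetical exploit.
    return {"exploits": ["full-screen popup"] if b"createPopup" in content else []}


cache = ProfileCache()
script = b"op=window.createPopup();"
print(scan_with_cache(script, cache, demo_scanner))  # first arrival: scanned
print(scan_with_cache(script, cache, demo_scanner))  # second arrival: from cache
```

Only the hash and the profile are kept, so the cache stays small even when the scanned files are large.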
It may be appreciated by those skilled in the art that cache 140 may reside at network gateway 110. However, it is often advantageous to place cache 140 as close as possible to the corporate intranet, in order to transmit content to the intranet as quickly as possible. However, in order for the security profiles within cache 140 to be up to date, it is important that network gateway 110 notify cache 140 whenever content scanner 130 is updated. Updates to content scanner 130 can occur inter alia when content scanner 130 is expanded (i) to cover additional content languages; (ii) to cover additional exploits; or (iii) to correct for bugs.
Preferably, when cache 140 is notified that content scanner 130 has been updated, cache 140 clears its cache, so that content that was in cache 140 is re-scanned upon arrival at network gateway 110.
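The notify-and-clear behavior can be sketched as follows. This Python fragment is a simplified illustration; the class names ContentCache and Gateway, the method on_scanner_updated, and the integer scanner version are hypothetical and are not taken from the patent.

```python
class ContentCache:
    """Stand-in for cache 140: stores security profiles by content key."""

    def __init__(self):
        self._entries = {}

    def put(self, key, profile):
        self._entries[key] = profile

    def get(self, key):
        return self._entries.get(key)

    def on_scanner_updated(self):
        # Profiles produced by the old scanner may be stale, so drop them all.
        self._entries.clear()


class Gateway:
    """Stand-in for network gateway 110: notifies the cache on scanner updates."""

    def __init__(self, cache):
        self.cache = cache
        self.scanner_version = 1

    def update_scanner(self):
        self.scanner_version += 1
        self.cache.on_scanner_updated()


cache = ContentCache()
gateway = Gateway(cache)
cache.put("hash-of-script", {"exploits": []})
gateway.update_scanner()
print(cache.get("hash-of-script"))  # None: the content is re-scanned on next arrival
```

Clearing the whole cache is the simplest policy and is the one the text describes; a finer-grained scheme would be a design choice beyond what is stated here.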
Also shown in FIG. 1 is a pre-scanner 150 that uses conventional signature technology to scan content. As mentioned hereinabove, pre-scanner 150 can quickly determine if content is innocuous, but over-blocks on the safe side. Thus pre-scanner 150 is useful for recognizing content that poses no security threat. Preferably, pre-scanner 150 is a simple signature matching scanner, and processes incoming content at a rate of approximately 100 mega-bits per second. ARB scanner 130 performs much more intensive processing than pre-scanner 150, and processes incoming content at a rate of approximately 1 mega-bit per second.
In order to accelerate the scanning process, pre-scanner 150 acts as a first-pass filter, to filter content that can be quickly recognized as innocuous. Content that is screened by pre-scanner 150 as being potentially malicious is passed along to ARB scanner 130 for further diagnosis. Content that is screened by pre-scanner 150 as being innocuous bypasses ARB scanner 130. It is expected that pre-scanner 150 filters 90% of incoming content, and that only 10% of the content requires extensive scanning by ARB scanner 130. As such, the combined effect of ARB scanner 130 and pre-scanner 150 provides an average scanning throughput of approximately 9 mega-bits per second.
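The figure of approximately 9 mega-bits per second can be checked with a short calculation. The sketch below assumes, as the text suggests, that all content passes through the pre-scanner and that the stated fraction continues to the ARB scanner; the function name combined_throughput is hypothetical.

```python
def combined_throughput(pre_rate, arb_rate, fraction_to_arb):
    """Average rate in Mbit/s when all content passes the pre-scanner
    and `fraction_to_arb` of it is also scanned by the ARB scanner."""
    seconds_per_mbit = 1.0 / pre_rate + fraction_to_arb / arb_rate
    return 1.0 / seconds_per_mbit


# 100 Mbit/s pre-scanner, 1 Mbit/s ARB scanner, 10% of content needs the ARB scanner.
rate = combined_throughput(100.0, 1.0, 0.10)
print(round(rate, 2))  # 9.09
```

Each mega-bit costs 0.01 s in the pre-scanner plus, on average, 0.1 s in the ARB scanner, for 0.11 s in total, which is about 9.09 mega-bits per second.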
Use of security profiles, security policies and caching is described in applicant’s U.S. Pat. No. 6,092,194 entitled SYSTEM AND METHOD FOR PROTECTING A COMPUTER AND A NETWORK FROM HOSTILE DOWNLOADABLES, in applicant’s U.S. Pat. No. 6,804,780 entitled SYSTEM AND METHOD FOR PROTECTING A COMPUTER AND A NETWORK FROM HOSTILE DOWNLOADABLES, and in applicant’s U.S. Pat. No. 7,418,731 entitled METHOD AND SYSTEM FOR CACHING AT SECURE GATEWAYS.
`
Reference is now made to FIG. 2, which is a simplified block diagram of an adaptive rule-based content scanner system 200, in accordance with a preferred embodiment of the present invention. An ARB scanner system is preferably designed as a generic architecture that is language-independent, and is customized for a specific language through use of a set of language-specific rules. Thus, a scanner system is customized for JavaScript by means of a set of JavaScript rules, and is customized for HTML by means of a set of HTML rules. In this way, each set of rules acts as an adaptor, to adapt the scanner system to a specific language. A sample rule file for JavaScript is provided in Appendix A, and is described hereinbelow.
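The idea of one generic scanner customized by per-language rule sets can be sketched as follows. This Python example is a deliberately reduced illustration: it matches flat token sequences and builds no parse tree, the rule dictionaries are hypothetical and do not use the rule-file syntax of Appendix A, and the class name RuleBasedScanner is an assumption.

```python
import re

# Hypothetical rule sets: token definitions plus exploit patterns, expressed as data.
JAVASCRIPT_RULES = {
    "tokens": [
        ("IDENT", r"[A-Za-z_]\w*"),
        ("NUMBER", r"\d+"),
        ("PUNCT", r"[().,;=]"),
    ],
    "exploits": {
        "create-popup": [("IDENT", "window"), ("PUNCT", "."), ("IDENT", "createPopup")],
    },
}

HTML_RULES = {
    "tokens": [
        ("TAG_OPEN", r"<[A-Za-z][A-Za-z0-9]*"),
        ("TAG_END", r">"),
        ("TEXT", r"[^<>\s]+"),
    ],
    "exploits": {
        "script-tag": [("TAG_OPEN", "<script")],
    },
}


class RuleBasedScanner:
    """Generic scanner; all language knowledge comes from the rule set."""

    def __init__(self, rules):
        alternation = "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in rules["tokens"])
        self.token_re = re.compile(alternation)
        self.exploits = rules["exploits"]

    def tokenize(self, text):
        # Characters that match no token rule (such as whitespace) are skipped.
        return [(m.lastgroup, m.group()) for m in self.token_re.finditer(text)]

    def scan(self, text):
        """Return (exploit name, token index) for every exploit pattern found."""
        tokens = self.tokenize(text)
        findings = []
        for name, pattern in self.exploits.items():
            width = len(pattern)
            for start in range(len(tokens) - width + 1):
                if tokens[start:start + width] == pattern:
                    findings.append((name, start))
        return findings


js_scanner = RuleBasedScanner(JAVASCRIPT_RULES)
html_scanner = RuleBasedScanner(HTML_RULES)
print(js_scanner.scan("op=window.createPopup();"))
print(html_scanner.scan("<script>alert(1)</script>"))
```

The same RuleBasedScanner class handles both inputs; only the rule dictionary passed to the constructor changes, which is the adaptor role the text assigns to rule sets.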
`
Moreover, in accordance with a preferred embodiment of the present invention, security violations, referred to as exploits, are described using a generic syntax, which is also language-independent. It is noted that the same generic syntax used to describe exploits is also used to describe languages. Thus, referring to Appendix A, the same syntax is used to describe the JavaScript parser rules and the analyzer exploit rules.
It may thus be appreciated that the present invention provides a flexible content scanning method and system, which can be adapted to any language syntax by means of a set of rules that serve to train the content scanner how to interp