(12) United States Patent
Thomas

US006401118B1
(10) Patent No.: US 6,401,118 B1
(45) Date of Patent: Jun. 4, 2002

(54) METHOD AND COMPUTER PROGRAM PRODUCT FOR AN ONLINE MONITORING SEARCH ENGINE

(75) Inventor: Jason B. Thomas, Arlington, VA (US)

(73) Assignee: Online Monitoring Services, Alexandria, VA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/133,374
(22) Filed: Aug. 13, 1998

Related U.S. Application Data
(60) Provisional application No. 60/091,164, filed on Jun. 30, 1998.
(51) Int. Cl.7: G06F 11/30; G06F 17/00
(52) U.S. Cl.: 709/224; 709/203; 709/217; 709/219; 709/226; 707/5; 707/4
(58) Field of Search: 709/223, 222, 225, 226, 203, 217, 219; 707/5, 4, 3, 2, 1
`
(56) References Cited

U.S. PATENT DOCUMENTS

5,864,845 A *   1/1999  Voorhees et al.        707/5
5,864,846 A *   1/1999  Voorhees et al.        707/501
5,873,107 A *   2/1999  Borovoy et al.         707/3
5,913,208 A *   6/1999  Brown et al.           707/10
5,913,215 A *   6/1999  Rubinstein et al.      707/5
5,920,859 A *   7/1999  Li                     707/5
5,924,090 A *   7/1999  Krellenstein           707/5
5,933,822 A *   8/1999  Braden-Harder et al.   707/5
5,983,216 A *  11/1999  Kirsch et al.          707/2
5,987,446 A *  11/1999  Corey et al.           707/3
5,991,751 A *  11/1999  Rivette et al.         707/1
5,995,961 A *  11/1999  Levy et al.            707/4
6,006,217 A *  12/1999  Lumsden                707/2
6,009,422 A *  12/1999  Ciccarelli             707/4
6,009,459 A *  12/1999  Belfiore et al.        709/203
6,018,733 A *   1/2000  Kirsch et al.          707/3
6,092,182 A *   2/2000  Nehab et al.           707/523
6,041,326 A *   3/2000  Am t l.                707/10
6,078,914 A *   6/2000  Redrferil              707/3
6,078,917 A *   6/2000  Paulsen, Jr. et al.    707/6
6,094,649 A *   7/2000  B t l.                 707/3
6,098,065 A *   8/2000  Z121.                  707/3
6,102,969 A *   8/2000  Christianson et al.    717/8
6,112,202 A *   8/2000  Kleinberg              707/5

* cited by examiner
Primary Examiner: Glenton B. Burgess
Assistant Examiner: Abdullah E. Salad
(74) Attorney, Agent, or Firm: Piper Marbury Rudnick & Wolfe, LLP; Steven B. Kelber
(57) ABSTRACT

An online monitoring search engine. The invention is a system, method and computer program product that allows an organization, company, or the like to monitor the Internet (or any computer network) for violations of their intellectual property (e.g., patent, trademark or copyright infringement), or monitor how persons on the Internet view their business, products and/or services. The system includes a Web server for receiving search requests and criteria from users on a Web client and a server for searching the Internet for URL's that contain contents matching the search criteria, thereby compiling a list of offending URL's. The system also includes a file system for storing contents from each of the offending URL's and a relational database for allowing the server to perform queries of the content in order to produce a report. The method involves receiving search criteria from a user, searching the Internet, downloading offending contents, and then archiving and scoring the contents. The method also obtains contact information for each registrant of the offending URL's and produces a report for the user.
`
`18 Claims, 12 Drawing Sheets
`
`
`Plaid Technologies Inc.
`Exhibit 1008
`
`Ex 1008-1
`
`

`
U.S. Patent    Jun. 4, 2002    Sheet 1 of 12    US 6,401,118 B1

[FIG. 1: block diagram of the system architecture of monitoring system 100, showing network connectivity among the components.]
`
`

`
Sheet 2 of 12

[FIG. 2: software architecture 200 of monitoring system 100. IPIS (back end), C++; relational database product 102, reached via ODBC; physical file system disk 104, reached via the native Windows file system; Web server 108 (Active Server Pages, JavaScript, VB Script); Web clients 202 (Java, JavaScript, Dynamic HTML), connected via HTTP.]
`
`

`
Sheet 3 of 12

[FIG. 3: flowchart 300. Start (302); define search criteria (304); get probable URL's, search (306); download pages (308); score pages (310); full archive (312); group pages (314); prioritize sites (316); get contact information (318); generate report (320); client action (322).]
`
`

`
Sheet 4 of 12

[FIG. 4: IPIS 106. Queue thread (402); URL thread (404); database thread (406); archive thread (408); contact thread (410).]
`
`

`
Sheet 5 of 12

[FIG. 5: flowchart, steps 504-524. Select search engine(s) (504); translate search criteria into keywords (506); identify main topic keyword (508); identify set of related topic keywords (510); query for {topic} (512); decision comparing hits to the engine limit (514); decision on unused related topic keywords; construct 2 new search terms with unused related topic keywords (518); query with 2 new terms (520); collect hits into preliminary list of URL's (522); end (524).]
`
`

`
Sheets 6 to 12 of 12

[FIGS. 6-10: exemplary output report pages. FIG. 11: block diagram of an exemplary computer system.]
`
`

`
METHOD AND COMPUTER PROGRAM PRODUCT FOR AN ONLINE MONITORING SEARCH ENGINE

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Patent Application No. 60/091,164, filed Jun. 30, 1998, which is incorporated herein by reference in its entirety.
`
`BACKGROUND OF THE INVENTION
`
1. Field of the Invention
The present invention relates generally to computer network search engines, and more particularly to search engines for performing online monitoring activities.
2. Related Art
Over the past several years, there has been an explosion of computers, and thus people, connected to the global Internet and the World-Wide Web (WWW). This increase of connectivity has allowed computer users to access various types of information, disseminate information, and be exposed to electronic commerce activities, all with a great degree of freedom. Electronic commerce includes large corporations, small businesses, individual entrepreneurs, organizations, and the like who offer their information, products, and/or services to people all over the world via the Internet.
The rise in the usage of the Internet, however, has also had a negative side. Given the Internet's vastness and freedom, many unscrupulous individuals have taken the opportunity to profit by violating the intellectual property of others. For example, it has been estimated that billions of dollars in profits are lost each year due to piracy of copyrighted materials over the Internet. These lost profits result from unscrupulous individuals making available through the Internet, either free or for their own profit, copyrighted materials such as music, movies, magazines, software, and pictures. Also, an individual, a company, an organization, or the like may be concerned with other intellectual property violations such as the illegal sale of their products, or the sale of inferior products using their brand names, that is, patent and trademark infringements. Furthermore, an individual, a company, an organization, or the like may be concerned with false information (i.e., "rumors") that originate and spread quickly over the Internet, resulting in the disparagement of the individual, company, organization, or the like. Such entities may also be interested in gathering data about how they and their products and/or services are perceived on the Internet (i.e., a form of market research).
Individual artists, writers, and other owners of intellectual property are currently forced to search Internet Web sites, File Transfer Protocol (FTP) sites, chat rooms, etc. by visiting thousands of sites in order to detect piracy or disparagement at offending sites. Such searching is currently done either by hand or using commercial search engines. Each of these methods is costly because a great amount of time is required to do such searching, time that detracts from positive, profit-earning activities. Adding to the frustration of detecting infringements is the fact that commercial search engines are infrequently updated and typically limit the resulting number of sites ("hits") that a search request returns. Furthermore, the task of visiting each site to determine whether there is indeed an infringement or disparagement and, if so, the extent and character of it, also demands a great deal of time.
Therefore, in view of the above, what is needed is a system, method and computer program product for an online (i.e., Internet or intranet) monitoring search engine. Such online monitoring would enhance the ability of intellectual property owners and business owners to detect and prioritize their response to infringements and disparagements. Further, what is needed is a system, method and computer program product that searches the Internet's Web pages, FTP sites, FSP sites, Usenet newsgroups, chat rooms, etc. for data relevant to the intellectual property and goodwill owned by an entity and produces a detailed, customized report of offending sites.
`SUMMARY OF THE INVENTION
The present invention is a system, method and computer program product for an online monitoring search engine that satisfies the above-stated needs. The method involves receiving search criteria from a user, where the search criteria reflect the user's intellectual property infringement or disparagement concerns. Then a search of the Internet (or intranet) is done for uniform resource locators (URL's) (i.e., addresses) that specify sites which contain contents matching the search criteria. After a list of URL's containing probable infringements or disparagement is obtained, the pages of each URL are downloaded, archived, and scored. The method also obtains contact information for each registrant of the offending URL's. The method then produces a report listing the offending URL's and the score for each of the URL's. The report may then be utilized by the user to plan intellectual property infringement or disparagement enforcement activities. In a preferred embodiment of the present invention, before generating a report, the pages are also grouped into "actual sites" to reduce the magnitude of information contained in the report. The method may also list the highest scoring page for each of the actual sites, as well as the highest ranking actual site.
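The grouping and ranking step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes that an "actual site" can be approximated by the host name of each page's URL, and the names `ScoredPage` and `group_by_site`, as well as the numeric scores, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple
from urllib.parse import urlsplit


class ScoredPage(NamedTuple):
    url: str
    score: float  # hypothetical score; higher means more likely offending


def group_by_site(pages):
    """Group scored pages by host; return (host, best_page) pairs, best site first."""
    sites = defaultdict(list)
    for page in pages:
        # Assumption: the host name stands in for the "actual site".
        sites[urlsplit(page.url).hostname].append(page)
    # Keep only the highest scoring page of each site.
    best = {host: max(group, key=lambda p: p.score) for host, group in sites.items()}
    # Rank the sites by the score of their best page.
    return sorted(best.items(), key=lambda item: item[1].score, reverse=True)
```

Under these assumptions, the first pair returned corresponds to the highest ranking actual site, and each pair carries that site's highest scoring page.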
The online monitoring system of the present invention includes a Web server for receiving search criteria, search setup, and management inputs from users, and an intellectual property infringement server (IPIS) for searching the Internet (or any computer network) for URL's that contain contents matching the search criteria to thereby compile a list of offending URL's. The system also includes a file system for storing pages from each of the offending URL's and a relational database for allowing the IPIS to perform queries of the pages in order to produce a report. In a preferred embodiment, the system also includes a plurality of Web clients that provide a graphical user interface (GUI) for users to enter their search criteria, as well as view pages of the offending URL's by communicating with the Web server.
One advantage of the present invention is that intellectual property owners may quickly and efficiently search and find infringements and disparagements contained on Web, FTP, and FSP sites, as well as chat rooms and Usenet newsgroups within the Internet.
Another advantage of the present invention is that detailed and customizable reports listing offending sites and associated metrics are produced, allowing intellectual property owners to focus their enforcement activities.
Another advantage of the present invention is that its back-end (search engine) and front-end (user interface) are designed to operate independently of each other, thus allowing greater throughput and availability of the system as a whole.
Yet another advantage of the present invention is that lists of probable offending URL's may be grouped and prioritized, both in an automated and manual fashion, in order to arrive at a manageable set of data to focus intellectual property enforcement activities.
Further features and advantages of the invention as well as the structure and operation of various embodiments of the present invention are described in detail below with reference to the accompanying drawings.
`
`BRIEF DESCRIPTION OF THE FIGURES
`
The features and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit of a reference number identifies the drawing in which the reference number first appears.
FIG. 1 is a block diagram illustrating the system architecture of an embodiment of the present invention, showing network connectivity among the various components;
FIG. 2 is a block diagram illustrating the software architecture of an embodiment of the present invention, showing communications among the various components;
FIG. 3 is a flowchart showing the overall operation of an embodiment of the present invention;
FIG. 4 is a block diagram illustrating the software architecture of an intellectual property infringement server according to an embodiment of the present invention;
FIG. 5 is a flowchart showing the operation of a meta search engine, according to an embodiment of the present invention;
FIGS. 6-10 are exemplary output report pages according to an embodiment of the present invention; and
FIG. 11 is a block diagram of an exemplary computer system useful for implementing the present invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
Table of Contents

I. Overview
II. System Architecture
III. Software Architecture
IV. Overall Monitoring System Operation
A. Inputs and Searching
B. Web Crawling
C. FTP Crawling
D. Processing
E. Output
F. Downloading Non-FTP and Non-HTTP Contents
V. Graphical User Interface (Front-End)
VI. Search Engine (Back-End)
A. Multi-Threaded Execution Environment
B. Meta Search Engine Mode
C. Standard Search Engine Mode
VII. Output Reports
VIII. Front-End and Back-End Severability
IX. Environment
X. Conclusion
I. Overview
The present invention is directed to a system, method, and computer program product for an online monitoring search engine. In a preferred embodiment of the present invention, an organization provides monitoring services for clients that would include, for example, individuals, companies, consortiums, organizations, and the like who are interested in protecting their intellectual property and/or goodwill from infringement or disparagement on the Internet.
Such a monitoring organization would employ an intelligent search engine that spans the entire Internet (Web pages, FTP sites, FSP sites, chat rooms, Usenet newsgroups, etc.) and returns links to Internet sites that, with a high probability of certainty, contain infringing or disparaging content. The input of the monitoring organization's search engine would be customized for each client based on, for example, their products, services, business activity, and/or the form of intellectual property owned. The monitoring organization's search engine would also provide detailed reports, also customized to fit each client's monitoring needs, so that the client's legal personnel may prioritize their enforcement activities. In a preferred embodiment, the monitoring organization also provides a Web server so that clients may remotely utilize the search engine.
While the present invention is described in terms of the above example, this is for convenience only and is not intended to limit its application. In fact, after reading the following description, it will be apparent to one skilled in the relevant art(s) how to implement the following invention in alternative embodiments (e.g., providing online monitoring for a corporate intranet or extranet).
Furthermore, while the following description focuses on the monitoring of Web sites and FTP sites, and thus employs such terms as URL's (addresses) and Web pages (contents), it is not intended to limit the application of the present invention. It will be apparent to one skilled in the relevant art how to implement the following invention, where appropriate, in alternative embodiments. For example, the present invention may be applied to monitoring Internet addresses (URL's, URN's, and the like) that specify the contents of chat rooms, or Usenet newsgroups, FSP sites, etc.
`II. System Architecture
FIG. 1 is a block diagram illustrating the physical architecture of a monitoring system 100, according to an embodiment of the present invention, showing network connectivity among the various components. It should be understood that the particular monitoring system 100 in FIG. 1 is shown for illustrative purposes only and does not limit the invention. As will be apparent to one skilled in the relevant art(s), all of the components "inside" of the monitoring system 100 are connected and communicate via a local area network (LAN) 101.
The monitoring system 100 includes an intellectual property infringement server 106 (shown as "IPIS" 106) that serves as the "back-end" (i.e., search engine) of the present invention. Connected to the IPIS 106 are a relational database 102 (shown as "DB" 102), a file system 104, and a Web server 108. As is well-known in the relevant art(s), a Web server is a server process running at a Web site which sends out Web pages in response to Hypertext Transfer Protocol (HTTP) requests from remote browsers. The Web server 108 serves as the "front end" of the present invention. That is, the Web server 108 provides the graphical user interface (GUI) to users of the monitoring system 100 in the form of Web pages. Such users may access the Web server 108 at the monitoring organization's site via a plurality of internal search workstations 110 (shown as workstations 110a-n).
A firewall 112 (shown as "FW" 112) serves as the connection and separation between the LAN 101, which includes the plurality of network elements (i.e., elements 102-110) "inside" of the LAN 101, and the global Internet 103 "outside" of the LAN 101. Generally speaking, a firewall, which is well-known in the relevant art(s), is a dedicated gateway machine with special security precaution software. It is typically used, for example, to service Internet 103 connections and dial-in lines, and protects a cluster of more loosely administered machines hidden behind it from an external invasion.
The global Internet 103, outside of the LAN 101, includes a plurality of various FTP sites 114 (shown as sites 114a-n) and the WWW 116. Within the WWW 116 are a plurality of Web sites 120 (shown as sites 120a-n). The search space for the IPIS 106 includes the WWW 116 and the plurality of FTP sites 114. As mentioned above, it will be apparent to one skilled in the relevant art(s) that the search space (i.e., Internet 103) of the monitoring system 100, although not shown, will also include chat rooms, Usenet newsgroups, FSP sites, etc.
A plurality of external search workstations 118 (shown as workstations 118a-n) are also located within the WWW 116. The external search workstations 118 allow clients of the monitoring organization to remotely perform searches using their own personnel and equipment.
While only one database 102, file system 104, and IPIS 106 computer are shown in FIG. 1, it will be apparent to one skilled in the relevant art(s) that monitoring system 100 may be run in a distributed fashion over a plurality of the above-mentioned network elements connected via LAN 101. For example, both the IPIS 106 "back-end" application and the Web server 108 "front-end" may be distributed over several computers, thereby increasing the overall execution speed of the monitoring system 100. More detailed descriptions of the monitoring system 100 components, as well as their functionality, are provided below.
III. Software Architecture
Referring to FIG. 2, a block diagram illustrating a software architecture 200 according to an embodiment of monitoring system 100, showing communications among the various components, is shown. The software architecture 200 of monitoring system 100 includes software code that implements the IPIS 106 in a high level programming language such as the C++ programming language. Further, in an embodiment, the IPIS 106 software code is an application running on an IBM™ (or compatible) personal computer (PC) in the Windows NT™ operating system environment.
In a preferred embodiment of the present invention, the database 102 is implemented using a high-end relational database product (e.g., Microsoft™ SQL Server, IBM™ DB2, ORACLE™, INGRES™, etc.). As is well-known in the relevant art(s), relational databases allow the definition of data structures, storage and retrieval operations, and integrity constraints, where data and relations between them are organized in tables.
In a preferred embodiment of the present invention, the IPIS 106 application communicates with the database 102 using the Open Database Connectivity (ODBC) interface. As is well-known in the relevant art(s), ODBC is a standard for accessing different database systems from high level programming language applications. It enables these applications to submit statements to ODBC using an ODBC structured query language (SQL) and then translates these to the particular SQL commands the underlying database product employs.
The physical file system 104, in a preferred embodiment of the present invention, is any physical memory device that includes a storage media and a cache (e.g., the hard drive and primary cache, respectively, of the same PC that runs the IPIS 106 application). In an alternative embodiment, the file system 104 may be a memory device external to the PC hosting the IPIS 106 application. In yet another alternative embodiment, the file system 104 may encompass a storage media physically separate from the cache, where the storage media may also be distributed over several elements within LAN 101. Further, in a preferred embodiment of the present invention, the file system 104 communicates with the IPIS 106 application and Web server 108 using the native file commands of the operating system in use (e.g., Windows NT™).
The Web server 108 provides the GUI "front-end" for monitoring system 100. In a preferred embodiment of the present invention, it is implemented using the Active Server Pages (ASP), Visual BASIC (VB) script, and JavaScript™ server-side scripting environments that allow the creation of dynamic Web pages. The Web server 108 communicates with the plurality of external search workstations 118 and the plurality of internal search workstations 110 (collectively shown as "Web Clients" 202) using the Hypertext Transfer Protocol (HTTP). The Web clients 202 user interface is a browser implemented using Java, JavaScript™, and Dynamic Hypertext Markup Language (DHTML). In a preferred embodiment of the present invention, as will be described in detail below in Section VIII, the Web clients 202 may also communicate directly with the IPIS 106 application via HTTP.
`IV. Overall Monitoring System Operation
`A. Inputs and Searching
Referring to FIG. 3, a flowchart 300 showing the overall operation of the monitoring system 100, according to an embodiment of the present invention, is shown. Flowchart 300 begins at step 302 with control passing immediately to step 304. In step 304, a user (on one of the Web client 202 workstations) defines search criteria. The search criteria, as explained in detail below in Section V, are customized according to a particular client's intellectual property infringement or disparagement concerns. In step 306, a search of the Internet 103 is performed. This search returns a list of probable uniform resource locators (URL's). As is well-known in the relevant art(s), a URL is the standard for specifying the location of an object on the Internet 103. The URL standard addressing scheme is specified as "protocol://hostname" (e.g., "http://www.a_company.com", "ftp://organization/pub/files" or "news:alt.topic"). A URL beginning with "http" specifies a Web site 120, a URL beginning with "ftp" specifies an FTP site 114, and a URL beginning with "news" specifies a Usenet newsgroup, etc. The probable URL's indicate a first (preliminary) set of locations (i.e., addresses) on the Internet 103, based on the search criteria, where infringements or disparagements may occur. The search in step 306 is described in detail below in Section V.
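The mapping from URL scheme to kind of location can be illustrated with a short sketch. It is not taken from the patent; the function name `classify_url` and the table `LOCATION_KINDS` are hypothetical, and only the three schemes named in the text are covered.

```python
from urllib.parse import urlsplit

# Scheme-to-location mapping for the three schemes named in the text.
LOCATION_KINDS = {
    "http": "Web site",
    "ftp": "FTP site",
    "news": "Usenet newsgroup",
}


def classify_url(url):
    """Return the kind of Internet location a URL names, or "other"."""
    scheme = urlsplit(url).scheme.lower()
    return LOCATION_KINDS.get(scheme, "other")
```

A crawler could use such a classification to decide whether a probable URL should be handled by Web crawling or by FTP crawling.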
B. Web Crawling
In step 308, each of the probable URL's is visited and the contents downloaded locally to the cache of the file system 104. The aim of the download step 308 is to allow subsequent processing steps of the monitoring system 100 to be performed on "local" copies of the visited URL's. This eliminates the need for re-visiting (and thus, re-establishing a connection to) each of the URL's Web servers, thus increasing the overall performance of the monitoring system 100.
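The idea of working on local copies can be sketched as a small on-disk cache. This is an illustration under stated assumptions, not the patent's file system 104: the class name `PageCache` is hypothetical, the downloader is passed in as a callable so the sketch makes no network calls itself, and hashing the URL to obtain a file name is a design choice made here for simplicity.

```python
import hashlib
from pathlib import Path


class PageCache:
    """Store downloaded contents on disk so later steps read local copies."""

    def __init__(self, root, fetch):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch  # hypothetical downloader: callable taking a URL, returning bytes

    def _path(self, url):
        # Hash the URL so that any address maps to a safe file name.
        return self.root / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url):
        """Return the contents for url, invoking the downloader at most once per URL."""
        path = self._path(url)
        if not path.exists():
            path.write_bytes(self.fetch(url))
        return path.read_bytes()
```

With such a cache, scoring, grouping, and reporting can call `get` repeatedly without re-establishing a connection to the remote server.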
If any of the URL's within the preliminary set contains files, those files may contain potentially infringing materials (e.g., a "*.mp3" music file, or a "*.gif" or "*.jpg" image file). This is in contrast to actual text located on a Web page of a particular Web site 120. The files may be located (1) on a different Web site 120 accessible via a hyperlink on the Web page the monitoring system 100 is currently accessing; (2) on a different Web page of the same Web site 120 the monitoring system 100 is currently accessing; or (3) in a different directory of the FTP site 114 than the monitoring system 100 is currently accessing. In these instances, the monitoring system 100 employs a Web crawling technique in order to locate the files. After the original URL is visited and the link to the file is identified, the monitoring system 100 truncates the link URL at the rightmost slash ("/"), thus generating a new link URL. This process is repeated until a reachable domain is generated. This technique takes advantage of the fact that most designers of Web sites 120 allow "default" documents to be returned by their Web servers in response to such URL (via HTTP) requests. An example of the IPIS 106 Web crawling technique is shown in Table 1 below.
`
TABLE 1

EXAMPLE OF IPIS 106 WEB CRAWLING TECHNIQUE

Original Web Page URL:
    http://www.links-to-interesting-files-all-over-the-net.com
Interesting Links Found on the Original Web Page Identified by Client's Search Criteria:
    http://www.really-good-music-not-yet-released.com/future-hit.mp3
    ftp://www.company-trades-secrets.com/july/tradeseceret.doc
Truncated URL's:
    http://www.really-good-music-not-yet-released.com/
    ftp://www.company-trades-secrets.com/july/
    ftp://www.company-trades-secrets.com/
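The truncation at the rightmost slash can be sketched in a few lines. This is an illustrative reading of the technique, not the IPIS 106 code: the function name `truncated_urls` is hypothetical, and the sketch only generates the candidate URL's; testing whether a candidate is reachable is left to the caller.

```python
from urllib.parse import urlsplit, urlunsplit


def truncated_urls(link):
    """Return the URLs obtained by repeatedly cutting the path at its rightmost slash."""
    parts = urlsplit(link)
    path = parts.path
    results = []
    while "/" in path and path != "/":
        # Drop the last path segment; a trailing slash is ignored first.
        path = path.rstrip("/").rsplit("/", 1)[0] + "/"
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results
```

Applied to the two interesting links in the table, the sketch yields the three truncated URL's listed there, from the deepest directory up to the host root.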
`
For any Web site 120 where the site's server is not currently responding (i.e., "down" or "off-line"), the IPIS 106 application, before removing the URL corresponding to the site from the preliminary set, implements a "re-try" timer and mechanism.
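The text does not specify how the re-try timer works, so the following is only one plausible shape for it. The function name `fetch_with_retries`, the number of attempts, and the fixed delay are all assumptions made for illustration; the downloader and the sleep function are passed in so the sketch can be exercised without a network or real waiting.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, delay_seconds=60.0, sleep=time.sleep):
    """Try fetch(url) up to `attempts` times; return the contents, or None on failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError:
            # Server "down" or "off-line": wait, then try again unless out of attempts.
            if attempt < attempts:
                sleep(delay_seconds)
    # All attempts failed; the caller may now remove the URL from the preliminary set.
    return None
```

Catching `OSError` covers Python's built-in connection and timeout errors, which is why it is used here as the signal that a server is not responding.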
C. Nice FTP Crawling
When any of the URL's within the preliminary set is an FTP site 114 (or FSP site), the normal steps of visiting and downloading the sites are not practical and thus, not used. Therefore, the present invention contemplates a method for "FTP crawling" in order to accomplish step 308 for such URL's. First, the IPIS 106 application attempts to log into the FTP site 114 specified by the URL. As is well known in the relevant art(s), there are two types of FTP sites 114: password protected sites and anonymous sites. If the site 114 is password protected and the password is not published in a reference linked page, it is passed over and the URL is removed from the preliminary set. If the FTP site 114 has a published password, the IPIS 106 attempts to login using that password. If the FTP site 114 is an anonymous site, the IPIS 106 application attempts to log in. As is well known in the relevant art(s), an anonymous FTP site allows a user to login using a user name such as "ftp" or "anonymous" and then use their electronic mail address as the password.
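The login decision described above can be summarized as a small function. This is a sketch under stated assumptions: the function name `choose_ftp_credentials`, the representation of a published login as a (user, password) pair, and the placeholder e-mail address are all hypothetical.

```python
def choose_ftp_credentials(is_anonymous, published=None, email="monitor@example.com"):
    """Return the (user, password) pair to try, or None when the site is passed over.

    published: optional (user, password) pair found on a reference linked page.
    """
    if is_anonymous:
        # Anonymous sites accept "anonymous" (or "ftp") with an e-mail address as password.
        return ("anonymous", email)
    if published is not None:
        # Password protected, but the login was published: try it.
        return published
    # Password protected and nothing published: skip the site.
    return None
```

In Python, a returned pair could then be handed to `ftplib.FTP.login(user, passwd)`; a `None` result corresponds to removing the URL from the preliminary set.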
In any event, if a connection can be established, the IPIS 106 application has access to the directory hierarchy containing the publicly accessible files (e.g., a "pub" subdirectory). The IPIS 106 application may then "nicely" crawl the relevant portions of the FTP site 114 by mapping the directory structure and then visiting certain directories based on keywords derived from the defined search criteria (step 304).
The purpose of nice FTP crawling is to capture the relevant contents of the FTP site 114 as it relates to the client without burdening the host's resources by crawling the entire FTP site 114. This is especially important due to the large size of a typical FTP site 114 (e.g., a university's site or someone's entire PC hard disk drive), and due to the lack of crawl restriction standards like the "robots.txt" file commonly found on Web sites 120.
Suppose the IPIS 106 is searching for the directory "ftp://ftp.stuff.com/~user/music/famous_artist" in the context of a music and copyright infringement related search. First, the nice FTP crawling technique involves establishing a single connection to the FTP site 114 (even if multiple content is needed from the site) and then going to the root directory. Second, a counter is set to zero and a directory listing and snapshot of the current directory is taken. For each directory, if the directory name is "interesting," then the IPIS 106 enters the directory, sets the counter to a positive number (e.g., C=2), then repeats the listing and snapshot step. If the counter is greater than zero or the directory is on the way to the destination directory, then the directory is entered and the listing and snapshot step is repeated.
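The counter-guided walk can be sketched on an in-memory directory tree. This is a hedged illustration, not the patented procedure: nested dictionaries stand in for the FTP directory hierarchy, the name `nice_crawl` is hypothetical, and the rule that the counter drops by one for each level entered is an assumption, since the text above does not say when the counter decreases.

```python
def nice_crawl(tree, destination, is_interesting, budget=2):
    """Return the directories visited, in depth-first order.

    tree: nested dicts standing in for an FTP directory hierarchy.
    destination: path of the sought directory, e.g. "/~user/music/".
    """
    visited = []

    def visit(path, node, counter):
        visited.append(path)  # stands in for the "listing and snapshot" step
        for name, child in node.items():
            child_path = path + name + "/"
            if is_interesting(name):
                # Interesting name: enter and reset the counter.
                visit(child_path, child, budget)
            elif counter > 0 or destination.startswith(child_path):
                # Assumption: the counter drops by one per level entered.
                visit(child_path, child, max(counter - 1, 0))

    visit("/", tree, 0)  # start at the root with the counter at zero
    return visited
```

Under this assumption, a positive counter lets the crawl look a bounded number of levels below the last interesting directory, while directories that are neither interesting nor on the way to the destination are never entered.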
To simulate human behavior, it is best if the IPIS 106 performs a depth-first search, and introduces slight pauses between directory l
