Wang Baldonado

(10) Patent No.: US 6,704,722 B2
(45) Date of Patent: Mar. 9, 2004

(54) SYSTEMS AND METHODS FOR PERFORMING CRAWL SEARCHES AND INDEX SEARCHES

(75) Inventor: Michelle Q. Wang Baldonado, Palo Alto, CA (US)

(73) Assignee: Xerox Corporation, Stamford, CT (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/442,339
(22) Filed: Nov. 17, 1999

(65) Prior Publication Data
US 2002/0147880 A1 Oct. 10, 2002

(51) Int. Cl.: G06F 17/30
(52) U.S. Cl.: 707/3; 707/10; 715/513
(58) Field of Search: 707/1-10; 715/513

Primary Examiner: Greta Robinson
Assistant Examiner: Sathyanarayan Pannala
Attorney, Agent, or Firm: Oliff & Berridge, PLC; Eugene Palazzo
ABSTRACT

The systems and methods described herein allow a user to
perform localized searching from a standard web browser. In
particular, the systems and methods of this invention use a
two-prong approach to accomplish both a dynamic breadth-first
crawl search and a contextualized index search to generate
search results. The search results are then assembled in
a unified results page and displayed to a user.

32 Claims, 7 Drawing Sheets
`
[Representative drawing: browser window showing results page 600 with separate lists of index results and crawl results.]
`
`IPR2020-00686
`Apple EX1026 Page 1
`
`
`
`
`
`
[FIG. 1 (Sheet 1 of 7): block diagram of the search system. A user device 100 (controller, memory, I/O interface, browser interface, query development circuit, result development circuit) is connected to a display device 170 and a user input device 180, and through links 50 and one or more distributed networks to a search server 300 (controller, memory, I/O interface, database, and a query management circuit containing a crawl search management circuit and an index search management circuit).]
`
`
`
`
`
`
[Sheet 6 of 7, FIG. 6: flowchart of the overall method. Begin; receive query; perform crawl search; perform index search; assemble results of searches; display results; end.]

[Sheet 6 of 7, FIG. 7: flowchart of the index search. Retrieve context information; decide whether to edit the context information and, if yes, edit it; perform search on index.]
`
`
`
[Sheet 7 of 7, FIG. 8: flowchart of the crawl search. Retrieve context information; decide whether to edit the context information and, if yes, edit it; define crawl boundaries; add context information to crawl queue; perform crawl on context within crawl queue; remove context from crawl queue; add to result list results from crawl that match query; add to crawl queue contexts corresponding to link(s) found during crawl that are within crawl boundaries; if the crawl queue is empty, the crawl search ends, otherwise the crawl repeats.]
`
`
`
SYSTEMS AND METHODS FOR
PERFORMING CRAWL SEARCHES AND
INDEX SEARCHES

BACKGROUND OF THE INVENTION

1. Field of Invention
This invention relates to search systems for distributed
networks.
2. Description of Related Art
A plethora of “search engines” are available on
the Internet for locating information about a particular topic.
Specifically, a user, after typing in a Uniform Resource
Locator (URL) of a “search engine,” for example, Yahoo®,
Infoseek®, Lycos® or AltaVista®, will typically arrive at a
screen at which the user can enter one or more keywords.
These keywords generally correspond to a distillation of the
important concepts pertaining to the particular piece of
information the user is seeking. Upon entering these
keywords, and pressing the “search” button, for example,
with the click of a mouse, the user is returned a result list of
information sources or “hits” which the search engine found
in its index and determined to be relevant to the user's query.
The user then typically scans the result list, determining
which of the particular results is most relevant. The user then
can click on a result, or a “hit,” and be taken, via hyperlink,
to the actual information source, e.g., web page, that
corresponds to the hit.
Once at the web page, the user can then browse the page
looking for the specific information item that corresponds to
the submitted query. Upon completion of the review of this
particular web page, a user generally presses the “back”
button on their browser interface to return to the result page
generated by the search engine. The user then again selects
a result and follows that result's hyperlink in the same
manner as described above. This process continues until the
user locates the desired information.

SUMMARY OF THE INVENTION
Existing search engines are fast and produce ranked
results. However, the accuracy of their ranked results is
based on the internal indices generated at the specific search
engine. If the indices are not routinely maintained,
incomplete indices produce inaccurate results, the indices may
contain broken links to web pages that may have moved
location and the indices may be missing links that have been
updated since the last regeneration of the index.
Furthermore, existing search engines do not take into
account the user's current context, e.g., the current virtual
location that the user is browsing. Accordingly, if a user
wants to find information within the currently viewed web
site about a particular topic, the user must choose from five
options. First, the user can use a global search engine and
supplement the query with words that are likely to be
associated with the current web site, e.g., the name of the
company to which the web site belongs. This requires
expertise on behalf of the user and is not guaranteed to
produce only results from the site in question. For example,
in an exemplary index based search engine, such as Yahoo®,
AltaVista® or Excite®, the search engine receives the user's
input keyword. This input keyword or words is then
compared to the search engine's index. A correlation is then
made between the keyword and the frequency of occurrence
within the index. This correlation produces a result list that
can then be organized, or ranked, based on this correlation.
Second, the user can perform an “advanced search” at
some global search engine and specify that results must be
from the current web site. In this case, the results will indeed
be guaranteed to come from the site in question, but the user
may not receive a satisfactory set of results due to the
incompleteness and staleness of most search engine indices.
In addition, this type of search requires expertise on the part
of the user.
Third, the user can look for a locally provided search
interface on the web site itself. The locally provided search
interface may be hard to find, i.e., not available at the current
location the user is browsing, it may have an idiosyncratic
syntax and it may not be up-to-date.
Fourth, the user can manually browse the site searching
for specific information. At a complex site, this could be
time consuming and error prone.
Finally, the user can contact the administrator of the web
site. This is a slow process, is not always possible and may
not produce any results.
The systems and methods of this invention enable a user
to perform a search more easily by combining index
searching and crawl-based searching. Furthermore, the systems
and methods of this invention enable context information to
be included with either or both of the index search and the
crawl search to further refine the scope of the search.
Specifically, by recognizing the user's current context, e.g.,
virtual location or Uniform Resource Locator (URL), by
performing a contextualized index search on behalf of the
user, and by performing a contextualized crawl looking for
results that match the user's query, this invention provides a
non-expert user with localized search results in a timely and
comprehensive fashion.
Specifically, in a crawl type search, a combination of
keywords, context and boundary information is used to
conduct a search within a specified area of a distributed
network. Since this approach operates in real-time or near
real-time, a number of the drawbacks encountered with an
index type search are overcome.
The systems and methods of this invention combine index
type searching and crawl type searching.
This invention separately provides systems and methods
for assisting users in conducting a search of one or more
distributed networks.
This invention separately provides systems and methods
that allow a user to interface with a search tool via a user
interface.
This invention separately provides systems and methods
that allow users to customize search strategies to be applied
to one or more distributed networks.
The search systems and methods of this invention use a
combination of index based search strategies, crawl based
search strategies and context information to provide a
comprehensive list of results to a user. In particular, a user
enters one or more keywords corresponding to information
on a desired topic. The systems and methods of this
invention receive the query and perform, either serially or in
parallel, an index search of a preexisting index and a crawl
search within a particular context. The results of these
queries are then assembled and displayed to the user. Thus,
the results displayed to the user are comprehensive and the
two queries complement each other in
overcoming their individual shortcomings.
These and other features and advantages of this invention
are described in or are apparent from the following detailed
description of the preferred embodiments.
`
`
`
BRIEF DESCRIPTION OF THE DRAWINGS
The preferred embodiments of the invention will be
described in detail, with reference to the following figures,
wherein:
FIG. 1 is a functional block diagram showing a first
embodiment of a search system according to this invention;
FIG. 2 shows an exemplary web page tree structure;
FIG. 3 shows a first exemplary web page implementing
the systems and methods of this invention;
FIG. 4 is a second exemplary web page illustrating the
results of an exemplary search performed by the search
systems and methods of this invention;
FIG. 5 is a third exemplary web page illustrating the result
of selecting a “hit” in the result web page of FIG. 3;
FIG. 6 is a flowchart outlining one exemplary embodiment
of the method for performing crawl searches and index
searches according to this invention;
FIG. 7 is a flowchart outlining in greater detail the index
search step of FIG. 6; and
FIG. 8 is a flowchart outlining in greater detail the
contextualized crawl search step of FIG. 6.
DETAILED DESCRIPTION OF PREFERRED
EMBODIMENTS
By combining crawl type searches and index type
searches in an amalgamated “search engine,” a user is
provided with a unique list of results. Furthermore, by
combining contextualized crawl-based searches and
contextualized index-based searches in an amalgamated
“context-aware search engine,” a user is provided with a unique
contextualized list of results no matter what site is currently
being visited.
Specifically, a crawl type search is more likely to find high
quality results, but generally requires more time to execute
and greater network bandwidth. On the other hand, index
type searching is generally likely to return results quickly,
but some of the results may point to items that are no longer
in existence, and not all relevant results may be found. For
example, search engines do not currently have the ability to
index the entirety of a distributed network, such as the
Internet. Furthermore, any one of these indices is generally
updated less often than a given web site changes.
The systems and methods of this invention allow users to
perform searching which minimizes disruption to the real
task at hand. Specifically, by providing a context-aware
search tool, the boundaries between searching and browsing
become more fluid. The systems and methods of this
invention also enable users to retrieve search results quickly, even
if the machine from which the search is initiated is a
relatively “slow” machine, e.g., because the machine has a
slow processor, a single thread of execution or a slow
network connection. The dual-prong search strategy of this
invention allows users to quickly obtain matches within the
context that are available in a global index, while at the same
time finding matches on pages within the current context
that are not in the index, e.g., newly introduced pages, newly
edited pages, pages in an obscure location that are not
indexed, pages that may be present behind a firewall, or the
like.
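The dual-prong strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `INDEX` and `LIVE_PAGES` dictionaries, the function names and the use of a thread pool are all hypothetical, and a plain prefix test stands in for real context matching.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two data sources. A real system would
# query a search-engine index and fetch live pages over the network.
INDEX = {
    "www.example.com/trademarks.html": "trademarks overview",
    "www.other.com/news.html": "trademarks in the news",
}
LIVE_PAGES = {
    "www.example.com/trademarks.html": "trademarks overview",
    "www.example.com/page1.html": "new page about trademarks",
}


def index_search(keyword, context):
    """Fast lookup in a prebuilt index, restricted to the context prefix."""
    return [url for url, text in INDEX.items()
            if url.startswith(context) and keyword in text]


def crawl_search(keyword, context):
    """Scan of live pages, which can see pages missing from the index."""
    return [url for url, text in LIVE_PAGES.items()
            if url.startswith(context) and keyword in text]


def dual_prong_search(keyword, context):
    """Run both prongs in parallel and assemble one result set."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        index_future = pool.submit(index_search, keyword, context)
        crawl_future = pool.submit(crawl_search, keyword, context)
        return {"index": index_future.result(),
                "crawl": crawl_future.result()}


results = dual_prong_search("trademarks", "www.example.com")
```

In this toy data set the index prong misses `page1.html` because the page was never indexed, while the crawl prong finds it, which is the complementary behavior the paragraph describes.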
FIG. 1 illustrates one embodiment of the components of
a contextualized search system 10 used during a distributed
network search. The contextualized search system 10
includes a user device 100. The user device 100 comprises
a controller 110, a memory 120, an I/O interface 130, a
browser interface 140, a query development circuit 150 and
a result development circuit 160. These elements are linked
via link 50. Additionally, the user device 100 is connected to
a display device 170 and a user input device 180 via link
50. The user device 100 is also connected to at least one
distributed network 200 which may or may not also be
connected to one or more other user devices, servers,
databases, or other distributed networks 210, 220.
The contextualized search system 10 also comprises a
search server 300. The search server 300 comprises a
controller 310, a memory 320, an I/O interface 330, at least
one database 340, and a query management circuit 350. The
query management circuit 350 comprises a crawl search
management circuit 360 and an index search management
circuit 370.
While the exemplary embodiment illustrated in FIG. 1
shows the user device 100 and the search server 300 located
at distant portions of a distributed network, such as a local
area network, a wide area network, an intranet and/or the
Internet, it should be appreciated that the components of the
search server 300 and the user device 100 could be
combined into one device or collocated on a particular node of
a distributed network. As will be appreciated from the
following description, and for reasons of computational
efficiency, the components of the user device 100 and the
search server 300 can be arranged at any location within a
distributed network without affecting the operation of the
system.
Furthermore, each link 50 can be a wired or wireless link
or any known or later developed element(s) capable
of supplying electronic data to and from the connected
elements.
In operation, a user determines that information regarding
a particular topic within a current context is desired. For
example, a user could start a web browser, such as Netscape
Navigator® or Microsoft® Internet Explorer®, which is managed
by the browser interface 140, for browsing the
Internet. Upon preliminary browsing of a web site with a
web browser, the user determines additional information
regarding a particular topic within the current context is
desired. Instead of clicking on a hyperlink to, or entering the
URL of, a traditional search engine, such as AltaVista®, a
user invokes a search in accordance with this invention. For
example, the search can be invoked by a clickable
button included in a toolbar of a web browser, such as Netscape
Navigator®, by a program such as a JavaScript
routine, by a dedicated button within the operating system
graphical user interface, by a dedicated hardwired button, or
by any other well-known method of triggering execution of a
program. For example, invocation of the search system can
be accomplished by a user selecting, for example, with the
click of a mouse, a button on a toolbar of a web browser, that
in turn executes the search systems and methods of this
invention.
Upon initialization of the search, the user device 100, in
cooperation with the query development circuit 150,
generates a keyword entry dialog box on the display device 170.
This keyword entry dialog box generally operates in a
similar fashion to the keyword entry dialog boxes seen on
conventional Internet search engines. Thus, the keyword(s)
generally correspond to a distillation of the important
concepts pertaining to the particular piece of information the
user is seeking.
A user, via user input device 180, then enters one or more
keywords into the keyword entry dialog box. Alternatively,
instead of a user entering one or more keywords through a
keyword entry dialog box, the browser interface 140 can
detect highlighted or selected portions within a document,
such as a web page, displayed in the web browser. For
example, if a user highlights text, for example, by holding
down the left mouse button and traversing a portion of text
within a web page, the highlighted portion can be
automatically copied and used as the keyword information when the
initialize search button is selected. These keywords are
transferred, via link 50 and I/O interface 130, with the aid
of controller 110 and memory 120, to the query development
circuit 150.
The query development circuit 150 performs a number of
tasks. First, the query development circuit 150 receives the
one or more keywords from the user input device 180 and
stores them in the memory 120. Additionally, the query
development circuit 150 communicates with the browser
interface 140 to determine the current virtual location, or
context, of the user within the distributed network.
Alternatively, the context information can be forwarded
directly with the one or more keywords. For example, as
previously discussed, the keyword entry dialog box can also
have a portion that allows entry of the context information
for the search. Thus, this context information could include,
but is not limited to, a Uniform Resource Locator (URL), an
Internet Protocol address (IP address), a File Transfer
Protocol address (FTP address), a directory, a domain name, a
universal resource name, or the like.
Having the context and keyword information, the query
development circuit 150 initiates the search. In particular,
the query development circuit 150 assembles two different
queries which are submitted to the search server 300. The first
query is a crawl search. The crawl search comprises the
context information as well as the keyword information
entered by the user or detected in cooperation with the
browser interface 140 and the user input device 180. As
previously discussed, this context information can correspond to
the URL of, for example, the web page at which the user
requested the search services.
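One way to picture the two queries is as a pair of small records that share the same keywords and context. The sketch below is only illustrative: the `SearchQuery` class, its field names and the `build_queries` helper are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchQuery:
    kind: str        # "crawl" or "index"
    keywords: tuple  # keywords typed or highlighted by the user
    context: tuple   # one or more locations, e.g. URLs


def build_queries(keywords, current_url, edited_context=None):
    """Return the crawl query and the index query for one user request."""
    # Use the context edited by the user if given, else the current page.
    context = tuple(edited_context) if edited_context else (current_url,)
    words = tuple(keywords)
    return (SearchQuery("crawl", words, context),
            SearchQuery("index", words, context))


crawl_query, index_query = build_queries(["trademarks"], "www.example.com")
```

Both records would then be submitted to the search server, which dispatches each one to the matching management circuit.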
Alternatively, the context information can be edited by the
user in order to more explicitly delineate the context. For
example, if the user, while surfing, browsed to a web site
having a URL of www.example.com, the context information
could be the URL itself, i.e., www.example.com.
Alternatively, the context information could include one or
more wildcards to account for varying structures in the
example.com web site. For example, the context information
could be:
*.example.com
In this example, the “*” represents a wildcard that indicates
that any prefix within the URL “example.com” would also
be queried during the crawl search. Additionally, it should be
appreciated that highly specialized context information can
also be directly entered by a user and combined with the
keyword information to customize a particular query,
without the need of a user actually browsing to a particular web
page.
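A wildcard context of this kind can be checked with ordinary shell-style pattern matching. The following sketch assumes the wildcard applies to the host name only; the `host_in_context` helper is hypothetical, and the patent does not specify how wildcards are evaluated.

```python
from fnmatch import fnmatch
from urllib.parse import urlsplit


def host_in_context(url, pattern):
    """True if the URL's host matches a wildcard context such as *.example.com."""
    # urlsplit only recognises the host when a scheme or "//" is present,
    # so bare locations like "www.example.com/a.html" get a "//" prefix.
    if not url.startswith(("http://", "https://", "//")):
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return fnmatch(host, pattern.lower())
```

Under this reading, `www.example.com` and `support.example.com` both fall inside the context `*.example.com`, while the bare host `example.com` does not, because the pattern requires some prefix before the dot.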
For example, if the example.com web site had a special
section on trademarks, and the trademark section was
broken into a “recent developments” section and a “historical”
section, the user may edit the context information to
specifically target a particular area of the web site. For example,
the context information could be:
www.example.com/trademarks/current/
This context information would allow a search for the
keywords within the “current” portion or directory of the
trademark portion of the example.com web site.
Alternatively, a combination of web sites could be specified
as the context information. For example, a user may specify
the context information as “www.example.com” and
“www.example2.com.” In general, any information pointing
to one or more locations in a distributed network can be used
as the context information.
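The directory and multi-site contexts above amount to a containment test: a page is in scope if it lies under at least one context location. The sketch below shows one possible test; the `normalize` and `within_context` helpers are hypothetical, and simple string prefixes are an assumption rather than the patent's stated method.

```python
def normalize(location):
    """Drop the scheme and lower-case the host part of a location."""
    for scheme in ("http://", "https://"):
        if location.lower().startswith(scheme):
            location = location[len(scheme):]
    host, slash, path = location.partition("/")
    return host.lower() + slash + path


def within_context(url, contexts):
    """True if the URL falls under at least one context location."""
    target = normalize(url)
    for context in contexts:
        prefix = normalize(context)
        if prefix.endswith("/"):
            if target.startswith(prefix):
                return True   # inside the named directory
        elif target == prefix or target.startswith(prefix + "/"):
            return True       # the site itself or a page on it
    return False
```

With the context `www.example.com/trademarks/current/`, a page under `/trademarks/historical/` is excluded, and listing both `www.example.com` and `www.example2.com` admits pages from either site.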
The combination of the keyword and context information
is then submitted, via link 50 and the network 200, to the
search server 300. The search server 300 receives the query
and context, via I/O interface 330, in the query management
circuit 350. The query management circuit 350 forwards the
query and context information to the crawl search
management circuit 360. The crawl search management circuit 360
analyzes the received keywords and context information. In
accordance with the context information, the crawl search
management circuit 360 determines the crawl boundaries
corresponding to the context information. These crawl
boundaries regulate the breadth of the crawl search within
the distributed network.
Alternatively, the crawl search management circuit 360
can allow changes to the context information. For example,
the crawl search management circuit 360 can return a
prompt to the user, prior to or during the course of the
search, asking whether the determined context information
is acceptable, or whether changes or custom crawl context
information are desired.
Having established the crawl boundaries, the context is
added to a crawl queue. At the direction of the crawl search
management circuit 360, and in conjunction with the
controller 310 and the memory 320, the crawl search is executed
on the documents or the information, e.g., the web pages,
within the context of the crawl queue. Specifically, the
submitted keywords are searched for within the context
stored in the crawl queue. Once the context in the crawl
queue has been searched, the context in the crawl queue is
removed.
The results that match both the context information and
the keyword(s) are then added to a result list stored in
memory 320. The crawl search management circuit 360 then
adds to the crawl queue the contexts, if any, that correspond
to the one or more links found during the crawl that are
within the crawl boundaries. The crawl search management
circuit 360 then determines if the crawl queue is empty. If
the crawl queue is empty, the crawl search is complete. If the
crawl queue is not empty, the crawl search management
circuit 360 continues searching within the context added to
the crawl queue as described above.
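The queue-driven loop described above can be sketched as a breadth-first crawl. The example is a simplified illustration under stated assumptions: the in-memory `SITE` dictionary replaces real page fetches, the boundary is a plain URL prefix, and the `seen` set that prevents revisiting pages is an addition not mentioned in the text.

```python
from collections import deque

# Hypothetical in-memory stand-in for a web site: URL -> (page text, links).
SITE = {
    "www.example.com/index.html": (
        "welcome",
        ["www.example.com/patents.html", "www.example.com/trademarks.html",
         "www.other.com/index.html"]),
    "www.example.com/patents.html": ("patents information", []),
    "www.example.com/trademarks.html": (
        "trademarks information",
        ["www.example.com/page1.html", "www.example.com/index.html"]),
    "www.example.com/page1.html": ("more on trademarks", []),
    "www.other.com/index.html": ("trademarks elsewhere", []),
}


def breadth_first_crawl(keyword, start_url, boundary):
    """Breadth-first crawl from start_url, limited to URLs under boundary."""
    queue = deque([start_url])      # the crawl queue
    seen = {start_url}              # assumption: avoids revisiting pages
    results = []                    # the result list
    while queue:                    # stop once the crawl queue is empty
        url = queue.popleft()       # take and remove the next context
        text, links = SITE.get(url, ("", []))
        if keyword in text:
            results.append(url)     # page matches the query
        for link in links:
            if link.startswith(boundary) and link not in seen:
                seen.add(link)
                queue.append(link)  # in-bounds link joins the queue
    return results


hits = breadth_first_crawl("trademarks", "www.example.com/index.html",
                           "www.example.com/")
```

The link to `www.other.com` is never queued because it lies outside the crawl boundary, so its matching page does not appear in the result list.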
For example, FIG. 2 illustrates an exemplary web site tree
structure for the web site “example.com.” The web site has
a “home” page, e.g., www.example.com/index.html, a
patents page, e.g., www.example.com/patents.html, a
trademarks page, e.g., www.example.com/trademarks.html, a
copyright page, e.g., www.example.com/copyright.html,
and a plurality of supplemental pages with information on
trademarks, e.g., page1.html-page3.html.
For example, assume a user is looking for information on
trademarks. Additionally, assume the user is located at the
www.example.com home page upon execution of the search.
The crawl search would develop as follows. The initial
context information could correspond to the web page from
which the query was invoked, i.e., www.example.com.
Thus, all web pages within the example.com web site could
be queried. This context would be added to the crawl queue.
The crawl search would then be executed on the context
information stored in the crawl queue. In this illustrative
example, results would be returned that corresponded to the
patents, trademarks, and copyright web pages. Upon
completion of the crawl search, the context within the crawl
queue is removed.
`Each of the resu