(12) United States Patent
Wang Baldonado

(10) Patent No.: US 6,704,722 B2
(45) Date of Patent: Mar. 9, 2004

(54) SYSTEMS AND METHODS FOR PERFORMING CRAWL SEARCHES AND INDEX SEARCHES

(75) Inventor: Michelle Q. Wang Baldonado, Palo Alto, CA (US)

(73) Assignee: Xerox Corporation, Stamford, CT (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/442,339
(22) Filed: Nov. 17, 1999
(65) Prior Publication Data
US 2002/0147880 A1 Oct. 10, 2002
(51) Int. Cl.: G06F 17/30
(52) U.S. Cl.: 707/3; 707/10; 715/513
(58) Field of Search: 707/1-10; 715/513

(56) References Cited

U.S. PATENT DOCUMENTS
5,493,667 A 2/1996 Huck et al. 711/125
5,553,281 A * 9/1996 Brown et al. 395/600
5,778,372 A * 7/1998 Cordell et al. 707/100
5,842,206 A * 11/1998 Sotomayor 707/5
5,855,015 A 12/1998 Shoham 707/5
5,875,446 A 2/1999 Brown et al. 707/3
5,890,170 A * 3/1999 Sidana 707/501
5,913,208 A * 6/1999 Brown et al. 707/3
5,924,105 A * 7/1999 Punch et al. 704/7
6,006,217 A * 12/1999 Lumsden 707/2
6,029,195 A * 2/2000 Herz 709/219
6,035,330 A * 3/2000 Astiz et al. 709/218
6,038,668 A * 3/2000 Chipman et al. 713/201
6,061,682 A * 5/2000 Agrawal et al. 707/6
6,101,503 A * 8/2000 Cooper et al. 707/104
6,182,063 B1 * 1/2001 Woods 709/223
6,182,091 B1 * 1/2001 Pitkow et al. 707/501.1
6,195,696 B1 * 2/2001 Baber et al. 709/223
6,301,614 B1 * 10/2001 Najork et al. 709/223
6,411,952 B1 6/2002 Bharat et al. 704/243

FOREIGN PATENT DOCUMENTS
CA 2243724 1/1999
EP 0 457 705 A2 11/1991

OTHER PUBLICATIONS
Lawrence et al., Searching the World Wide Web, Apr. 1998, vol. 280, pp. 98-100.*
Sheldon et al., Discover: a resource discovery system based on content routing, Apr. 1995, vol. 27, pp. 953-972.*
“Sphinx: a framework for creating personal, site-specific Web crawlers”, Robert C. Miller et al., School for Computer Science, Carnegie Mellon University, Pennsylvania, Sep. 16, 1999, Sep. 17, 1999, pp 1-12.
“Autonomous Interface Agents”, Henry Lieberman, Proceedings of the ACM Conference on Computers and Human Interface, CHI '97, Georgia, Mar. 1997, Sep. 17, 1999, pp 1-12.
“Information Retrieval in Distribution Hypertexts”, Paul De Bra et al., RIAO-94 Conference, New York, Sep. 17, 1999,
(List continued on next page.)

Primary Examiner: Greta Robinson
Assistant Examiner: Sathyanarayan Pannala
Attorney, Agent, or Firm: Oliff & Berridge, PLC; Eugene Palazzo

(57)
`
`ABSTRACT
`
The systems and methods described herein allow a user to perform localized searching from a standard web browser. In particular, the systems and methods of this invention use a two-prong approach to accomplish both a dynamic breadth-first crawl search and a contextualized index search to generate search results. The search results are then assembled in a unified results page and displayed to a user.
`32 Claims, 7 Drawing Sheets
`
Representative drawing: results page 600 displayed in a web browser, showing a list of search results and a list of crawl results (“Crawl Results”) for the Oliff & Berridge web site, each with a “Search done” status and a “STOP SEARCH” control.
`IPR2020-00686
`Apple EX1026 Page 1
`
`

`

OTHER PUBLICATIONS
“WebCutter: A System for Dynamic and Tailored Site Mapping”, Yoelle S. Maarek et al., Sixth International World Wide Web Conference, pp. 714–722, Sep. 17, 1999, Nov. 16, 1999.
“Searching for Arbitrary Information in the WWW: the Fish-Search for Mosaic”, P.M.E. De Bra et al., Eindhoven University of Technology, Department of Computing Science, the Netherlands, Sep. 17, 1999, Nov. 16, 1999, pp 1-10.
“The shark-search algorithm-An application: tailored Web site mapping”, Michael Hersovic et al., IBM Haifa Research Laboratory, Israel, Sep. 17, 1999, Nov. 16, 1999, pp 1-12.
“Information Retrieval in the World-Wide Web: Making Client-based searching feasible”, P.M.E. De Bra et al., First World Wide Web Conference, Geneva, Sep. 17, 1999, Nov. 16, 1999, pp 1-14.
“Finding Information on the Web”, P.M.E. De Bra et al., Information Systems Section, Department of Computing Science, Eindhoven University of Technology, the Netherlands, Nov. 16, 1999, pp 1-14.
“Sphinx: a framework for creating personal, site-specific Web crawlers”, Robert C. Miller et al., School for Computer Science, Carnegie Mellon University, Pennsylvania, Sep. 17, 1999, pp 1-12.
“Autonomous Interface Agents”, Henry Lieberman, Proceedings of the ACM Conference on Computers and Human Interface, CHI '97, Georgia, Sep. 17, 1999, pp 1-12.
“Information Retrieval in Distribution Hypertexts”, Paul De Bra et al., RIAO-94 Conference, New York, Sep. 17, 1999, pp. 1-12.
“Finding Information on the Web”, P.M.E. De Bra et al., Information Systems Section, Department of Computing Science, Eindhoven University of Technology, the Netherlands, Sep. 17, 1999.
“Letizia: An Agent That Assists Web Browsing”, Henry Lieberman, Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, Aug. 1995, pp 1-2.
“WebGlimpse-Combining Browsing and Searching”, Udi Manber et al., http://glimpse.cs.arizona.edu/, Jan. 10, 1997, pp. 1-14.
“Search Utilities”, Danny Sullivan, Search Engine Watch, http://searchenginewatch.com/, 1996, pp 1-4.
“Specialty Search Engines”, Danny Sullivan, Search Engine Watch, http://searchenginewatch.com/, 1996, pp 1-6.
“Bookmarklets-free tools for power surfing”, http://www.bookmarklets.com/, Dec. 14, 1998.
“Creating Bookmarklets”, Yehuda Shiran et al., http://www.webreference.com/s/column35/creating.html, Sep. 17, 1999.
“Writing embedded date bookmarklets”, John Barger, http://www.robotwisdom.com/web/bookmarklets.html, Sep. 17, 1999.

* cited by examiner
`
`

`

FIG. 1 (Sheet 1 of 7): functional block diagram of the search system 10. The user device 100 contains a controller, a memory, an I/O interface, a browser interface, a query development circuit and a result development circuit, and is connected to a display device and a user input device. Links 50 connect the user device 100 through a distributed network to the search server 300, which contains a controller, a memory, an I/O interface, a database and a query management circuit made up of a crawl search management circuit and an index search management circuit.
`
`

`

FIG. 2 (Sheet 2 of 7): exemplary web page tree structure.
FIG. 3 (Sheet 3 of 7): first exemplary web page implementing the systems and methods of this invention.
FIG. 4 (Sheet 4 of 7): second exemplary web page illustrating the results of an exemplary search.
FIG. 5 (Sheet 5 of 7): third exemplary web page illustrating the result of selecting a “hit”.
`
`

`

FIG. 6 (Sheet 6 of 7): flowchart of the overall method. Steps, in order: BEGIN; RECEIVE QUERY; PERFORM CRAWL SEARCH; PERFORM INDEX SEARCH; ASSEMBLE RESULTS OF SEARCHES; DISPLAY RESULTS; END.
FIG. 7 (Sheet 6 of 7): flowchart of the index search. Steps, in order: RETRIEVE CONTEXT INFORMATION; decision EDIT CONTEXT INFORMATION? (YES: EDIT CONTEXT INFORMATION; NO: continue); PERFORM SEARCH ON INDEX.
`
`

`

FIG. 8 (Sheet 7 of 7): flowchart of the crawl search. Steps, in order: RETRIEVE CONTEXT INFORMATION; decision EDIT CONTEXT INFORMATION? (YES: EDIT CONTEXT INFORMATION; NO: continue); DEFINE CRAWL BOUNDARIES; ADD CONTEXT INFORMATION TO CRAWL QUEUE; PERFORM CRAWL ON CONTEXT WITHIN CRAWL QUEUE; REMOVE CONTEXT FROM CRAWL QUEUE; ADD TO RESULT LIST RESULTS FROM CRAWL THAT MATCH QUERY; ADD TO CRAWL QUEUE CONTEXTS CORRESPONDING TO LINK(S) FOUND DURING CRAWL THAT ARE WITHIN CRAWL BOUNDARIES; decision CRAWL QUEUE EMPTY? (YES: return).
`
`

`

SYSTEMS AND METHODS FOR PERFORMING CRAWL SEARCHES AND INDEX SEARCHES

BACKGROUND OF THE INVENTION

1. Field of Invention
This invention relates to search systems for distributed networks.
2. Description of Related Art
A plethora of “search engines” are available on the Internet for locating information about a particular topic. Specifically, a user, after typing in a Uniform Resource Locator (URL) of a “search engine,” for example, Yahoo®, Infoseek®, Lycos® or AltaVista®, will typically arrive at a screen at which the user can enter one or more keywords. These keywords generally correspond to a distillation of the important concepts pertaining to the particular piece of information the user is seeking. Upon entering these keywords, and pressing the “search” button, for example, with the click of a mouse, the user is returned a result list of information sources or “hits” which the search engine found in its index and determined to be relevant to the user's query. The user then typically scans the result list, determining which of the particular results is most relevant. The user then can click on a result, or a “hit,” and be taken, via hyperlink, to the actual information source, e.g., web page, that corresponds to the hit.
Once at the web page, the user can then browse the page looking for the specific information item that corresponds to the submitted query. Upon completion of the review of this particular web page, a user generally presses the “back” button on their browser interface to return to the result page generated by the search engine. The user then again selects a result and follows that result's hyperlink in the same manner as described above. This process continues until the user locates the desired information.

SUMMARY OF THE INVENTION
Existing search engines are fast and produce ranked results. However, the accuracy of their ranked results is based on the internal indices generated at the specific search engine. If the indices are not routinely maintained, incomplete indices produce inaccurate results, the indices may contain broken links to web pages that may have moved location and the indices may be missing links that have been updated since the last regeneration of the index.
Furthermore, existing search engines do not take into account the user's current context, e.g., the current virtual location that the user is browsing. Accordingly, if a user wants to find information within the currently viewed web site about a particular topic, the user must choose from five options. First, the user can use a global search engine and supplement the query with words that are likely to be associated with the current web site, e.g., the name of the company to which the web site belongs. This requires expertise on behalf of the user and is not guaranteed to produce only results from the site in question. For example, in an exemplary index based search engine, such as Yahoo®, AltaVista® or Excite®, the search engine receives the user's input keyword. This input keyword or words is then compared to the search engine's index. A correlation is then made between the keyword and the frequency of occurrence within the index. This correlation produces a result list that can then be organized, or ranked, based on this correlation.
Second, the user can perform an “advanced search” at some global search engine and specify that results must be from the current web site. In this case, the results will indeed be guaranteed to come from the site in question, but the user may not receive a satisfactory set of results due to the incompleteness and staleness of most search engine indices. In addition, this type of search requires expertise on the part of the user.
Third, the user can look for a locally provided search interface on the web site itself. The locally provided search interface may be hard to find, i.e., not available at the current location the user is browsing, it may have an idiosyncratic syntax and it may not be up-to-date.
Fourth, the user can manually browse the site searching for specific information. At a complex site, this could be time consuming and error prone.
Finally, the user can contact the administrator of the web site. This is a slow process, is not always possible and may not produce any results.
The systems and methods of this invention enable a user to perform a search more easily by combining index searching and crawl-based searching. Furthermore, the systems and methods of this invention enable context information to be included with either or both of the index search and the crawl search to further refine the scope of the search. Specifically, by recognizing the user's current context, e.g., virtual location or Uniform Resource Locator (URL), by performing a contextualized index search on behalf of the user, and by performing a contextualized crawl looking for results that match the user's query, this invention provides a non-expert user with localized search results in a timely and comprehensive fashion.
Specifically, in a crawl type search, a combination of keywords, context and boundary information is used to conduct a search within a specified area of a distributed network. Since this approach operates in real-time or near real-time, a number of the drawbacks encountered with an index type search are overcome.
The systems and methods of this invention combine index type searching and crawl type searching.
This invention separately provides systems and methods for assisting users in conducting a search of one or more distributed networks.
This invention separately provides systems and methods that allow a user to interface with a search tool via a user interface.
This invention separately provides systems and methods that allow users to customize search strategies to be applied to one or more distributed networks.
The search systems and methods of this invention use a combination of index based search strategies, crawl based search strategies and context information to provide a comprehensive list of results to a user. In particular, a user enters one or more keywords corresponding to information on a desired topic. The systems and methods of this invention receive the query and perform, either serially or in parallel, an index search of a preexisting index and a crawl search within a particular context. The results of these queries are then assembled and displayed to the user. Thus, the results displayed to the user are comprehensive, and the two queries complement each other in overcoming their individual shortcomings.
These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.
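The two-prong search summarized above, in which an index search and a crawl search run serially or in parallel and their results are then assembled, can be illustrated with a minimal Python sketch. The names `INDEX`, `LIVE_PAGES`, `index_search`, `crawl_search` and `two_prong_search`, and the toy data, are assumptions introduced only for illustration; the patent does not prescribe any particular implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pre-built index (keyword -> URLs). It may be stale:
# "old.html" no longer exists among the live pages below.
INDEX = {
    "trademark": ["www.example.com/trademarks.html", "www.example.com/old.html"],
}

# Hypothetical live pages (URL -> text), standing in for what a crawl sees now.
LIVE_PAGES = {
    "www.example.com/trademarks.html": "trademark basics",
    "www.example.com/page1.html": "recent trademark developments",
}


def index_search(keyword, context):
    """Look the keyword up in the pre-built index, keeping hits inside the context."""
    return [url for url in INDEX.get(keyword, []) if url.startswith(context)]


def crawl_search(keyword, context):
    """Scan the live pages inside the context for the keyword."""
    return [url for url, text in sorted(LIVE_PAGES.items())
            if url.startswith(context) and keyword in text]


def two_prong_search(keyword, context):
    """Run both searches in parallel and assemble one set of results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        index_future = pool.submit(index_search, keyword, context)
        crawl_future = pool.submit(crawl_search, keyword, context)
        return {"index": index_future.result(), "crawl": crawl_future.result()}


results = two_prong_search("trademark", "www.example.com")
print(results)
```

In this toy data the index returns a page that no longer exists, while the crawl finds a newly added page that the index has not yet seen, which is the complementary behavior the summary describes.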
`
`

`

BRIEF DESCRIPTION OF THE DRAWINGS
The preferred embodiments of the invention will be described in detail, with reference to the following figures, wherein:
FIG. 1 is a functional block diagram showing a first embodiment of a search system according to this invention;
FIG. 2 shows an exemplary web page tree structure;
FIG. 3 shows a first exemplary web page implementing the systems and methods of this invention;
FIG. 4 is a second exemplary web page illustrating the results of an exemplary search performed by the search systems and methods of this invention;
FIG. 5 is a third exemplary web page illustrating the result of selecting a “hit” in the result web page of FIG. 3;
FIG. 6 is a flowchart outlining one exemplary embodiment of the method for performing crawl searches and index searches according to this invention;
FIG. 7 is a flowchart outlining in greater detail the index search step of FIG. 6; and
FIG. 8 is a flowchart outlining in greater detail the contextualized crawl search step of FIG. 6.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
By combining crawl type searches and index type searches in an amalgamated “search engine,” a user is provided with a unique list of results. Furthermore, by combining contextualized crawl-based searches and contextualized index-based searches in an amalgamated “context-aware search engine,” a user is provided with a unique contextualized list of results no matter what site is currently being visited.
Specifically, a crawl type search is more likely to find high quality results, but generally requires more time to execute and greater network bandwidth. On the other hand, index type searching is generally likely to return results quickly, but some of the results may point to items that are no longer in existence, and not all relevant results may be found. For example, search engines do not currently have the ability to index the entirety of a distributed network, such as the Internet. Furthermore, any one of these indices is generally updated more slowly than the average web site it covers.
The systems and methods of this invention allow users to perform searching which minimizes disruption to the real task at hand. Specifically, by providing a context-aware search tool, the boundaries between searching and browsing become more fluid. The systems and methods of this invention also enable users to retrieve search results quickly, even if the machine from which the search is initiated is a relatively “slow” machine, e.g., because the machine has a slow processor, a single thread of execution or a slow network connection. The dual-prong search strategy of this invention allows users to quickly obtain matches within the context that are available in a global index, while at the same time finding matches on pages within the current context that are not in the index, e.g., newly introduced pages, newly edited pages, pages in an obscure location that are not indexed, pages that may be present behind a firewall, or the like.
FIG. 1 illustrates one embodiment of the components of a contextualized search system 10 used during a distributed network search. The contextualized search system 10 includes a user device 100. The user device 100 comprises a controller 110, a memory 120, an I/O interface 130, a browser interface 140, a query development circuit 150 and a result development circuit 160. These elements are linked via link 50. Additionally, the user device 100 is connected to a display device 170 and a user input device 180 via link 50. The user device 100 is also connected to at least one distributed network 200 which may or may not also be connected to one or more other user devices, servers, databases, or other distributed networks 210, 220.
The contextualized search system 10 also comprises a search server 300. The search server 300 comprises a controller 310, a memory 320, an I/O interface 330, at least one database 340, and a query management circuit 350. The query management circuit 350 comprises a crawl search management circuit 360 and an index search management circuit 370.
While the exemplary embodiment illustrated in FIG. 1 shows the user device 100 and the search server 300 located at distant portions of a distributed network, such as a local area network, a wide area network, an intranet and/or the Internet, it should be appreciated that the components of the search server 300 and the user device 100 could be combined into one device or collocated on a particular node of a distributed network. As will be appreciated from the following description, and for reasons of computational efficiency, the components of the user device 100 and the search server 300 can be arranged at any location within a distributed network without affecting the operation of the system.
Furthermore, the links 50 can be a wired or wireless link or any known or later developed element(s) that is capable of supplying electronic data to and from the connected elements.
In operation, a user determines that information regarding a particular topic within a current context is desired. For example, a user could start a web browser, which is managed by the browser interface 140, such as Netscape Navigator® or Microsoft's® Internet Explorer®, for browsing of the Internet. Upon preliminary browsing of a web site with a web browser, the user determines additional information regarding a particular topic within the current context is desired. Instead of clicking on a hyperlink to, or entering the URL of, a traditional search engine, such as AltaVista®, a user invokes a search in accordance with this invention. For example, the search can be invoked by including a clickable button in a toolbar of a web browser, such as Netscape Navigator®, by executing a program such as a JavaScript routine, by a dedicated button within the operating system graphical user interface, by a dedicated hardwired button, or by any other well-known method of triggering execution of a program. For example, invocation of the search system can be accomplished by a user selecting, for example, with the click of a mouse, a button on a toolbar of a web browser, that in turn executes the search systems and methods of this invention.
Upon initialization of the search, the user device 100, in cooperation with the query development circuit 150, generates a keyword entry dialog box on the display device 170. This keyword entry dialog box generally operates in a similar fashion to the keyword entry dialog boxes seen on conventional Internet search engines. Thus, the keyword(s) generally correspond to a distillation of the important concepts pertaining to the particular piece of information the user is seeking.
A user, via user input device 180, then enters one or more keywords into the keyword entry dialog box. Alternatively, instead of a user entering one or more keywords through a keyword entry dialog box, the browser interface 140 can detect highlighted or selected portions within a document, such as a web page, displayed in the web browser. For example, if a user highlights text, for example, by holding down the left mouse button and traversing a portion of text within a web page, the highlighted portion can be automatically copied and used as the keyword information when the initialize search button is selected. These keywords are transferred, via link 50, and I/O interface 130, with the aid of controller 110 and memory 120, to the query development circuit 150.
The query development circuit 150 performs a number of tasks. First, the query development circuit 150 receives the one or more keywords from the user input device 180 and stores them in the memory 120. Additionally, the query development circuit 150 communicates with the browser interface 140 to determine the current virtual location, or context, of the user within the distributed network. Alternatively, the context information can be forwarded directly with the one or more keywords. For example, as previously discussed, the keyword entry dialog box can also have a portion that allows entry of the context information for the search. Thus, this context information could include, but is not limited to, a Uniform Resource Locator (URL), an Internet Protocol address (IP address), a File Transfer Protocol address (FTP address), a directory, a domain name, a universal resource name, or the like.
Having the context and keyword information, the query development circuit 150 initiates the search. In particular, the query development circuit assembles two different queries which are submitted to the search server 300. The first query is a crawl search. The crawl search comprises the context information as well as the keyword information entered by the user or detected in cooperation with the browser interface 140 and the user input device. As previously discussed, this context information can correspond to the URL of, for example, the web page at which the user requested the search services.
Alternatively, the context information can be edited by the user in order to more explicitly delineate the context. For example, if the user, while surfing, browsed to a web site having a URL of www.example.com, the context information could be the URL itself, i.e., www.example.com. Alternatively, the context information could include one or more wildcards to account for varying structures in the example.com web site. For example, the context information could be:
*.example.com
In this example, the “*” represents a wildcard that indicates any prefix within the URL “example.com” would also be queried during the crawl search. Additionally, it should be appreciated that highly specialized context information can also be directly entered by a user and combined with the keyword information to customize a particular query, without the need of a user actually browsing to a particular web page.
For example, if the example.com web site had a special section on trademarks, and the trademark section was broken into a “recent developments” section and a “historical” section, the user may edit the context information to specifically target a particular area of the web site. For example, the context information could be:
www.example.com/trademarks/current/
This context information would allow a search for the keywords within the “current” portion or directory of the trademark portion of the example.com web site. Alternatively, a combination of web sites could be specified as the context information. For example, a user may specify the context information as “www.example.com” and “www.example2.com.” In general, any information pointing to one or more locations in a distributed network can be used as the context information.
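The three forms of context information described above (a wildcard such as `*.example.com`, a directory such as `www.example.com/trademarks/current/`, and a combination of sites) can be checked with a small Python sketch. The function name `in_context` and its matching rules are assumptions made for illustration: a pattern containing `*` is matched against the host name only, and any other pattern is treated as a URL prefix. The patent does not specify how patterns are evaluated.

```python
from fnmatch import fnmatchcase


def in_context(url, contexts):
    """Return True if the URL falls inside any of the context patterns.

    Assumed rules (not taken from the patent):
    - a pattern containing "*" is a wildcard matched against the host name;
    - any other pattern is a prefix that must end on a "/" boundary.
    """
    host = url.split("/", 1)[0]
    for pattern in contexts:
        if "*" in pattern:
            if fnmatchcase(host, pattern):
                return True
        else:
            base = pattern.rstrip("/")
            if url == base or url.startswith(base + "/"):
                return True
    return False


# A directory context targets one section of the site.
current = ["www.example.com/trademarks/current/"]
print(in_context("www.example.com/trademarks/current/page1.html", current))
print(in_context("www.example.com/patents.html", current))

# A wildcard context and a combination of two sites.
print(in_context("docs.example.com/index.html", ["*.example.com"]))
print(in_context("www.example2.com/a.html", ["www.example.com", "www.example2.com"]))
```

Matching prefixes on a `/` boundary keeps a context such as `www.example.com` from also admitting `www.example.com.other.org`, which a plain string prefix test would let through.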
The combination of the keyword and context information is then submitted, via link 50 and the network 200, to the search server 300. The search server 300 receives the query and context, via I/O interface 330, in the query management circuit 350. The query management circuit 350 forwards the query and context information to the crawl search management circuit 360. The crawl search management circuit 360 analyzes the received keywords and context information. In accordance with the context information, the crawl search management circuit 360 determines the crawl boundaries corresponding to the context information. These crawl boundaries regulate the breadth of the crawl search within the distributed network.
Alternatively, the crawl search management circuit 360 can allow changes to the context information. For example, the crawl search management circuit 360 can return a prompt to the user, prior to or during the course of the search, asking whether the determined context information is acceptable, or if changes, or a custom crawl context information, are desired.
Having established the crawl boundaries, the context is added to a crawl queue. At the direction of the crawl search management circuit 360, and in conjunction with the controller 310 and the memory 320, the crawl search is executed on the documents or the information, e.g., the web pages, within the context of the crawl queue. Specifically, the submitted keywords are searched for within the context stored in the crawl queue. Once the context in the crawl queue has been searched, the context in the crawl queue is removed.
The results that match both the context information and the keyword(s) are then added to a result list stored in memory 320. The crawl search management circuit 360 then adds to the crawl queue the contexts, if any, that correspond to the one or more links found during the crawl that are within the crawl boundaries. The crawl search management circuit 360 then determines if the crawl queue is empty. If the crawl queue is empty, the crawl search is complete. If the crawl queue is not empty, the crawl search management circuit 360 continues searching within the context added to the crawl queue as described above.
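The crawl loop described above (search the context at the head of the queue, remove it, record matches, enqueue in-boundary links, stop when the queue is empty) can be sketched in Python as a breadth-first traversal. The in-memory `SITE` dictionary, the function name `crawl_search`, the prefix test for crawl boundaries and the `seen` set are all assumptions added for illustration. In particular, the `seen` set is not mentioned in the text; it is added here so that pages linking back to each other do not keep the queue from emptying.

```python
from collections import deque

# Hypothetical in-memory site standing in for live pages: URL -> (text, links).
SITE = {
    "www.example.com/index.html": (
        "welcome",
        ["www.example.com/patents.html",
         "www.example.com/trademarks.html",
         "www.other.com/index.html"],
    ),
    "www.example.com/patents.html": ("patent basics", []),
    "www.example.com/trademarks.html": (
        "trademark basics",
        ["www.example.com/page1.html"],
    ),
    "www.example.com/page1.html": (
        "recent trademark developments",
        ["www.example.com/index.html"],
    ),
    "www.other.com/index.html": ("trademark news elsewhere", []),
}


def crawl_search(keyword, start_url, boundary):
    """Breadth-first crawl from start_url, staying inside the boundary prefix."""
    queue = deque([start_url])   # the crawl queue
    seen = {start_url}           # assumption: avoid revisiting pages
    results = []
    while queue:                 # stop when the crawl queue is empty
        url = queue.popleft()    # take the next context and remove it
        text, links = SITE.get(url, ("", []))
        if keyword in text:      # add matches to the result list
            results.append(url)
        for link in links:       # enqueue only links within the boundaries
            if link.startswith(boundary) and link not in seen:
                seen.add(link)
                queue.append(link)
    return results


hits = crawl_search("trademark", "www.example.com/index.html", "www.example.com")
print(hits)
```

Using a first-in, first-out queue is what makes the traversal breadth-first: pages one link away from the starting context are searched before pages two links away.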
For example, FIG. 2 illustrates an exemplary web site tree structure for the web site “example.com.” The web site has a “home” page, e.g., www.example.com/index.html, a patents page, e.g., www.example.com/patents.html, a trademarks page, e.g., www.example.com/trademarks.html, a copyright page, e.g., www.example.com/copyright.html, and a plurality of supplemental pages with information on trademarks, e.g., page1.html-page3.html.
For example, assume a user is looking for information on trademarks. Additionally, assume the user is located at the www.example.com home page upon execution of the search. The crawl search would develop as follows. The initial context information could correspond to the web page from which the query was invoked, i.e., www.example.com. Thus, all web pages within the example.com web site could be queried. This context would be added to the crawl queue. The crawl search would then be executed on the context information stored in the crawl queue. In this illustrative example, results would be returned that corresponded to the patents, trademarks, and copyright web pages. Upon completion of the crawl search, the context within the crawl queue is removed.
`Each of the resu
