throbber
United States Patent
`Bowman et al.
`
`[19]
`
`US006006225A
`[11] Patent Number:
`[45] Date of Patent:
`
`6,006,225
`Dec. 21, 1999
`
`[54] REFINING SEARCH QUERIES BY THE
`SUGGESTION OF CORRELATED TERMS
`FROM PRIOR SEARCHES
`
`[75] Inventors; Dwayne E. Bowman, Woodinville;
`Ruben E. Ortega; Michael L.
`Hamrick, both of Seattle; Joel R.
`Spiegel, Woodinville; Timothy R.
`Kohn, Seattle, all of Wash.
`[73] Assignee: Amazon.Oom, Seattle, Wash.
`
`Quarter?)eck Web Page, Downloaded Sep. 9, 1996, http://
`aracnid.gdeck.com/qdeck/products/webcompass.
`Towell et al. “Learning Collection Fusion Strategies for
`Information Retrieval”, Proceedings of the 12th Annual
`Machine Learning Conference, Jul. 1995, pp. 540–548.
`Voorhees et al., “Learning Collection Fusion Strategies”,
`Proceedings of SIGIR '95, Jul. 1995, pp. 172–179.
`Voorhees et al., “The Collection Fusion Problem” Proceed
`ings of TREC–3, NIST Special Publication 500–225, Apr.
`1995, pp. 95–104.
`Abstract of Generating Advanced Query Interfaces, Lee,
`Srivastava and Vista, Computer Networks and ISDN Sys
`tems Conference Title: Comput. Netw. ISDN Syst. (Neth
`[21] Appl. No.: 09/145,360
`erlands) vol. 30, No. 1–7, pp. 656–657 (1998).
`[22] Filed:
`Sep. 1, 1998
`Abstract of Using Combination of Evidence for Term Expan
`sion, Wilkinson, Information Retrieval Research, Proceed
`Related U.S. Application Data
`ings of the 19th Annual BCS-IRSG Colloquium on IR
`[60] Provisional application No. 60/089,244, Jun. 15, 1998.
`Research (1997).
`[51] Int. Cl." … G06F 17/30
`(List continued on next page.)
`[52] U.S. Cl. ......................... 707/5; 707/2; 707/4; 707/10
`Primary Examiner—Paul R. Lintz
`[58] Field of Search ................................... 707/5, 2, 10, 4
`Attorney, Agent, or Firm—Knobbe, Martens, Olson & Bear,
`LLP
`[56]
`References Cited
`ABSTRACT
`[57]
`A search engine is disclosed which suggests related terms to
`the user to allow the user to refine a search. The related terms
`are generated using query term correlation data which
`reflects the frequencies with which specific terms have
`previously appeared within the same query. The correlation
`data is generated and stored in a look-up table using an
`off-line process which parses a query log file. The table is
`regenerated periodically from the most recent query sub
`missions (e.g., the last two weeks of query submissions), and
`thus strongly reflects the current preferences of users. Each
`related term is presented to the user via a respective hyper
`link which can be selected by the user to submit a modified
`query. In one embodiment, the related terms are added to and
`selected from the table so as to guarantee that the modified
`queries will not produce a NULL query result.
`28 Claims, 10 Drawing Sheets
`
`U.S. PATENT DOCUMENTS
`
`5,675,819 10/1997 Schuetze ................................... 704/10
`5,721,897 2/1998 Rubinstein ...
`... 707/2
`5,787,422 7/1998 Tukey et al. ..
`... 707/5
`5,794,233 8/1998 Rubinstein .......
`... 707/4
`5,864,845
`1/1999 Voorhees et al.
`... 707/5
`5,911,140 6/1999 Tukey et al. .....
`... 707/5
`5,913,215 6/1999 Rubinstein ................................ 707/10
`OTHER PUBLICATIONS
`Bartell et al., “Automatic Combination of Multiple Ranked
`Retrieval Systems”, Proceedings of SIGIR '94, Jul. 1994,
`pp. 173–181.
`Belkin et al., “The Effect of Multiple Query Representations
`on Information System Performance” Proceedings of SIGIR
`’93, Jun. 1993, pp. 339–346.
`Shaw et al., “Combination of Multiple Searches”, Proceed
`ings of TREC–3, Apr. 1995, pp. 105–108.
`
`BEGIN )
`
`PARSE DAILY LOG FILE
`TO EXTRACT OUERY
`SUBMISSIONS FOR WHICH
`UND > 0
`
`4.3%
`
`CORRELATE TERMS BASED
`QN FREQUENCY OF OCCURRENCE
`WITHIN SAME QUERY
`
`CREATE DALY
`RESULTS FILE
`
`4.32
`
`47.4%%
`
`MERGE DAILY
`RESULTS FILES FOR
`| AST M DAYS
`
`
`
`QWERWRITE EXISTING CORRELATION
`TABLE WITH NEW CORRELATION TABLE
`
`
`
`3. (52
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 1
`
`

`

`6,006,225
`Page 2
`
`OTHER PUBLICATIONS
`Abstract of Inquirus, the NECI Meta Search Engine,
`Lawrence and Giles, Computer Networks and ISDN Sys
`tems Conference Title: Comput. NetwISDN Syst. (Nether
`lands) vol. 30, No. 1–7, pp. 95–105 (1998).
`Abstract of Facilitating Complex Web Queries Through
`Visual User Interfaces and Query Relaxtion, Li and Shim,
`Computer Networks and ISDN Systems Conference Title:
`Comput. Netw. ISDN Syst. (Netherlands) vol. 30, No. 1–7,
`pp. 149–159 (1998).
`A User-centred Evaluation of Ranking Algorithms for Inter
`active Query Expansion, Efthimiadis, Proceedings of the
`16th Annual International ACM SIGIR Conference, Pitts
`burgh, pp. 146–159 (1993).
`Concept Based Query Expansion, Qiu and Frei, Proceedings
`of the 16th Annual International ACM SIGIR Conference,
`Pittsburgh, pp. 160–169 (1993).
`Improving Retrieval Performance by Relevance Feedback,
`Salton and Buckley, J. of Am. Society for Info. Science
`41(4):288–297 (1990).
`
`Query Expansion Using Domain—Adapted, Weighted The
`saurus in an Extended Boolean Model, Kwon, Kim and
`Choi, Proceedings of the 3rd International Conference on
`Information and Knowledge Management (CIKM’94), pp.
`140–146 (1994).
`Browsing Through Querying: Designing for Electronic
`Books, Charoenkitkarn, Tam, Chignell and Golovichinsky,
`at the 5th ACM Conference on Hypertext, Seattle, WA
`206–216 (1993).
`A Survey of Information Retrieval and Filtering Methods,
`Faloutsos and Oard, Univ. of Maryland, 22 pages (undated).
`A Corpus Analysis Approach for Automatic Query Expan
`sion, Gauch and Wang, Proceedings of the 6th International
`Conference on Information and Knowledge Mangement, pp.
`278–284 (1997).
`Discovering Web Acess Patterns and Trends by Applying
`OLAP and Data Mining Technology on Web Logs, Zaiane,
`Xin and Han, Proceedings of the IEEE Forum on Research
`and Technology Advances in Digital Libraries (IEEE
`ADL’98), pp. 19–29 (1998).
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 2
`
`

`

`US. Patent
`6,006,225
`Sheet 1 0f 10
`l)ec.21,1999
`____________________________
`
`MMK
`
`”Emmm;Nfix
`_llllllllllllllllllllllllllllllllll
`
`IFSQ
`
`o_Im<moo:m_mmmmwwnmm
`
`mm<m<~<o2mgom._.<._mm
`
`
`
`mm>mmm
`
`
`
`mmmoommzo:.om._mm
`
`\Mx
`
`QNx
`
`aygnwx
`
`Ann-quuuv
`
`42F:
`
`HES.
`
`mmmooE
`
`zofi<mm2mo
`-——>e<o
`
`>mu=o
`
`\M.\
`
`%M\
`
`ownozzz<moommumoF¢>2ozomhm<lm
`
`
`
`m._m<._.zo_._.<._mmmoo>530
`
`<><TméKE
`
`o
`
`o._
`
`1.
`
`z...mwm_m<|mmommmmmzzznm
`
`
`
`«mmmummooum0mmz<o<mu<
`
`\HQK
`
`BloomReach, Inc. EX1026 Page 3
`
`|PR2019—01304
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 3
`
`
`
`
`
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 2 of 10
`
`6,006,225
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`2/7/7
`
`<= => & = (\ Q. - (6 A &
`Font Mail
`Bock Forw... Stop Refresh Home Search Favorite
`Address http://www.omazon.com/book_search
`NZ
`
`Print
`
`2/62
`
`amazon.com Book Search
`—-
`Enter Author and/or Title
`Author: |
`
`(C) Exact Name
`
`OLast, First Name
`
`OStart of Last Name
`
`Title:
`O Exact Title
`
`©Title Word(s)
`
`OStart(s) of Title Words
`
`22/7
`
`Author Search Tips / Title Search Tips
`
`Search by Subject
`
`24/7
`
`Subject:
`O Exact Subject
`
`OStart of Subject
`
`©Subject Word(s) O Start(s) of Subject Word(s)
`
`
`
`
`
`Search. Now
`
`|Clear the Form
`
`Subject Search Tips
`
`Other Search Methods:
`ISBN, Publisher/Date, Quick Search
`
`Amazon.com Home | Music Search | Your Account
`
`A/22
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 4
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 3 of 10
`
`6,006,225
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`J.2/2
`
`J.3/7
`
`
`
`J4/7
`
`
`
`/35
`
`Friday, 13-Feb-98 02:23:52
`User Identifier = 29.3847.19.287
`HTTP_REFERRER= http://www.amazon.com/book_search_page
`PATH_INFO=/book_search
`title = Snow Crdsh
`items found = 2
`
`Friday, 13-Feb-98 02:24:11
`User Identifier = 29.3847.19.287
`HTTP_REFERRER= http://www.amazon.com/book_search
`PATH INFO=/ISBN = 0553562614
`
`Friday, 13-Feb-98 06:15:03
`User Identifier = 54730543.261
`HTTP_REFERRER= http://www.amazon.com/music_search_page
`PATH_INFO=/music_search
`artist = This dnd That
`items found = 0
`
`Friday, 13-Feb-98 10:07:34
`User Identifier = 0.273859 18272
`HTTP_REFERRER= http://www.amazon.com/book_search_page
`PATH_INFO=/book_search
`subject = outdoor trail
`items found = 22
`
`
`
`
`
`
`
`A/2J
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 5
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 4 of 10
`
`6,006,225
`
`BEGIN
`
`4//7
`
`PARSE DAILY LOG FILE
`TO EXTRACT QUERY
`SUBMISSIONS FOR WHICH
`|TEMS FOUND > 0
`
`4.267
`
`
`
`CORRELATE TERMS BASED
`ON FREQUENCY OF OCCURRENCE
`WITHIN SAME QUERY
`
`CREATE DALY
`RESULTS FILE
`
`45/7
`
`4.4/7
`
`
`
`MERGE DAILY
`RESULTS FILES FOR
`LAST M DAYS
`
`OVERWRITE EXISTING CORRELATION
`TABLE WITH NEW CORRELATION TABLE
`
`45/2
`
`END
`
`A/24
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 6
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 5 of 10
`
`6,006,225
`
`Suuu 91 N
`
`
`
`
`
`
`
`
`
`þº39//
`
`(yg)
`(yç)
`(z ) ESIONBOX3-S
`
`(G9) 8000100-S
`
`(ç) NITXINW83–W
`
`(z) NOSTJV0-V
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 7
`
`

`

`US. Patent
`
`Dec. 21, 1999
`
`Sheet 6 0f 10
`
`6,006,225
`
`2532
`
`%\M.\
`
`.__<E.lm
`
`
`
`33
`
`:3A3
`
`ca
`
`éa5iv33
`
`ozfifimlm
`
`Cmmm<01<
`
`mvzmlm
`
`EVmxalm
`
`oz_N<._mIH
`so50225;
`8322.288;
`a:“mammoxmum3zomééé
`$3EDI;52323:;
`
`moooHDOIm
`
`zo_._.<o<>lm
`
`mEOmmlm
`
`zoxa>lm
`
`xilm
`
`2.3mEommum
`
`$3425%
`33zo:<o<>|mE4.5:;
`ac62055;,
`38mooEDOIm
`avSHEA?535¢at457m355$;
`
`mum.SQ
`
`km“.
`
`NV\
`
`BloomReach, Inc. EX1026 Page 8
`
`|PR2019-01304
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 8
`
`
`
`
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 7 of 10
`
`6,006,225
`
`Date
`
`Daily Log
`File
`
`Daily Results
`File
`
`Dates of Query
`Correlation Toble
`
`Query Correlation
`Toble
`
`
`
`1 –Feb-98
`
`2–Feb-98
`
`3-Feb-98
`
`4-Feb-98
`
`5–Feb-98
`
`6–Feb-98
`
`7—Feb-98
`
`8—Feb-98
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 9
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 8 of 10
`
`6,006,225
`
`BEGIN
`
`77/2
`FOR EACH TERM
`IN THE QUERY
`
`Z2/7
`LOOK UP TERM IN THE
`CORRELATION TABLE
`
`Z3/7
`RETRIEVE THE TERM'S
`RELATED TERMS LIST
`
`ZZ/7
`
`| NEXT TERM II
`
`Z5/7
`
`
`
`
`
`MULTI—TERM
`QUERY
`2
`
`COMBINE ALL RELATED
`TERMS LISTS
`
`
`
`Z77
`SELECT X TERMS WITH
`HIGHEST VALUES
`
`A/2 2.
`
`END
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 10
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 9 of 10
`
`6,006,225
`
`Top 3 Related Terms:
`&//7
`TRAIL – MIX
`TRAIL – YUKON --"
`TRAIL – BIKE
`
`A/(2 &A
`
`Intersecting Terms:
`S – BIKE
`S – SPORTS
`S — VACATION
`
`&_3/2
``--
`
`Top 3 Related Terms:
`OUTDOOR TRAIL – BIKE
`OUTDOOR TRAIL – SPORTS
`OUTDOOR TRAIL – VACATION
`
`A/22%’
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 11
`
`

`

`U.S. Patent
`
`Dec. 21, 1999
`
`Sheet 10 of 10
`
`6,006,225
`
`$200
`
`File
`
`Edit View Go Favorite
`
`Help
`
`
`
`<= = @ E ?º Q - 6 A GR
`Font
`Bock Forw... Stop Refresh Home Search Favorite
`Address http://www.dmozon.com/book_search
`NZ
`
`Print
`
`Mcil
`
`amazon.com Book Search
`_--~
`
`Your Search Results
`for: the subject words include "OUTDOOR TRAIL"
`
`Related Query Terms: Click on a link below to narrow your
`search by adding a related term.
`OUTDOOR TRAIL – BIKE
`
`9/62
`
`OUTDOOR TRAIL – SPORTS
`
`OUTDOOR TRAIL – WACATION
`
`Top Matches for this search:
`* Outdoor & Trail Guide to Yosemite
`* Outdoor Training on Mountian Bike Trails
`
`
`
`
`
`
`
`
`
`
`
`Full Results: The First 100 are shown below. to see
`more results scroll down and click the "More" button.
`
`The blessed trail; outdoor diary
`T he Great Outdoor Cookbook: Trail Recipies
`
`Outdoor & Trail Guide to Yosemite
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`A/2.9
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 12
`
`

`

`1
`REFINING SEARCH QUERIES BY THE
`SUGGESTION OF CORRELATED TERMS
`FROM PRIOR SEARCHES
`
`RELATED APPLICATION
`This application claims the benefit of U. S. Provisional
`Application No. 60/089,244, filed Jun. 15, 1998, the disclo
`sure of which is hereby incorporated by reference.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of Invention
`This present invention relates to query processing, and
`more specifically relates to techniques for facilitating the
`process of refining search queries.
`2. Description of Related Art
`With the increasing popularity of the Internet and the
`World Wide Web, it is common for on-line users to utilize
`search engines to search the Internet for desired information.
`Many web sites permit users to perform searches to identify
`a small number of relevant items among a much larger
`domain of items. As an example, several web index sites
`permit users to search for particular web sites among known
`web sites. Similarly, many on-line merchants, such as
`booksellers, permit users to search for particular products
`among all of the products that can be purchased from the
`merchant. Other on-line services, such as Lexis" and
`Westlaw", allow users to search for various articles and
`court opinions.
`In order to perform a search, a user submits a query
`containing one or more query terms. The query may also
`explicitly or implicitly identify a record field or segment to
`be searched, such as title, author, or subject classification of
`the item. For example, a user of an on-line bookstore may
`submit a query containing terms that the user believes
`appear within the title of a book. A query server program of
`the search engine processes the query to identify any items
`that match the terms of the query. The set of items identified
`by the query server program is referred to as a “query
`result.” In the on-line bookstore example, the query result is
`a set of books whose titles contain some or all of the query
`terms. In the web index site example, the query result is a set
`of web sites or documents. In web-based implementations,
`the query result is typically presented to the user as a
`hypertextual listing of the located items.
`If the scope of the search is large, the query result may
`contain hundreds, thousands or even millions of items. If the
`user is performing the search in order to find a single item
`or a small set of items, conventional approaches to ordering
`the items within the query result often fail to place the
`sought item or items near the top of the query result list. This
`requires the user to read through many other items in the
`query result before reaching the sought item. Certain search
`engines, such as Excite" and AltaVista", suggest related
`query terms to the user as a part of the “search refinement”
`process. This allows the user to further refine the query and
`narrow the query result by selecting one or more related
`query terms that more accurately reflect the user’s intended
`request. The related query terms are typically generated by
`the search engine using the contents of the query result, such
`as by identifying the most frequently used terms within the
`located documents. For example, if a user were to submit a
`query on the term “FOOD,” the user may receive several
`thousand items in the query result. The search engine might
`then trace through the contents of some or all of these items
`and present the user with related query terms such as
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6,006,225
`
`2
`“RESTAURANTS,” “RECIPIES,” and “FDA* to allow the
`user to refine the query.
`The related query terms are commonly presented to the
`user together with corresponding check boxes that can be
`selectively marked or checked by the user to add terms to the
`query. In some implementations, the related query terms are
`alternatively presented to and selected by the user through
`drop down menus that are provided on the query result page.
`In either case, the user can add additional terms to the query
`and then resubmit the modified query. Using this technique,
`the user can narrow the query result down to a more
`manageable set consisting primarily of relevant items.
`One problem with existing techniques for generating
`related query terms is that the related terms are frequently of
`little or no value to the search refinement process. Another
`problem is that the addition of one or more related terms to
`the query sometimes leads to a NULL query result. Another
`problem is that the process of parsing the query result items
`to identify frequently used terms consumes significant pro
`cessor resources, and can appreciably increase the amount of
`time the user must wait before viewing the query result.
`These and other deficiencies in existing techniques hinder
`the user’s goal of quickly and efficiently locating the most
`relevant items, and can lead to user frustration.
`SUMMARY OF THE INVENTION
`The present invention addresses these and other problems
`by providing a search refinement system and method for
`generating and displaying related query terms (“related
`terms”). In accordance with the invention, the related terms
`are generating using query term correlation data that is based
`on historical query submissions to the search engine. The
`query term correlation data (“correlation data”) is preferably
`based at least upon the frequencies with which specific terms
`have historically been submitted together within the same
`query. The incorporation of such historical query informa
`tion into the process tends to produce related terms that are
`frequently used by other users in combination with the
`submitted query terms, and significantly increases the like
`lihood that these related terms will be helpful to the search
`refinement process. To further increase the likelihood that
`the related terms will be helpful, the correlation data is
`preferably generated only from those historical query sub
`missions that produced a successful query result (at least one
`match).
`In accordance with one aspect of the invention, the
`correlation data is stored in a correlation data structure
`(table, database, etc.) which is used to look up related terms
`in response to query submissions. The data structure is
`preferably generated using an off-line process which parses
`a query log file, but could alternatively be generated and
`updated in real-time as queries are received from users. In
`one embodiment, the data structure is regenerated periodi
`cally (e.g., once per day) from the most recent query
`submissions (e.g., the last M days of entries in the query
`log), and thus strongly reflects the current tastes of the
`community of users; as a result, the related terms suggested
`by the search engine strongly reflect the current tastes of the
`community. Thus, for example, in the context of a search
`engine of an online merchant, the search engine tends to
`suggest related terms that correspond to the current best
`selling products.
`In a preferred embodiment, each entry in the data struc
`ture is in the form of a key term and a corresponding related
`terms list. Each related terms list contains the terms which
`have historically appeared together (in the same query) with
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 13
`
`

`

`6,006,225
`
`3
`the respective key term with the highest degree of frequency,
`ignoring unsuccessful query submissions (query submis
`sions which produced a NULL query result). The data
`structure thus provides an efficient mechanism for looking
`up the related terms for a given query term.
`To generate a set of related terms for refining a submitted
`query (the “present query”), the related terms list for each
`term in the present query is initially obtained from the
`correlation data structure. If this step produces multiple
`related terms lists (as in the case of a multiple-term query),
`the related terms lists are preferably combined by taking the
`intersection between these lists (i.e., deleting the terms that
`are not common to all lists). The related terms which remain
`are terms which have previously appeared, in at least one
`successful query submission, in combination with every
`term of the present query. Thus, assuming items have not
`been deleted from the database being searched, any of these
`related terms can be individually added to the present query
`while guaranteeing that the modified query will not produce
`a NULL query result. To take advantage of this feature, the
`related terms are preferably presented to the user via a user
`interface that requires the user to add no more than one
`related term per query submission. In other embodiment, the
`related terms are selected and displayed without guarantee
`ing a successful query result.
`Because the related terms are identified from previously
`generated correlation data, without the need to parse docu
`ments or correlate terms, the related terms can be identified
`and presented to the user with little or no added delay.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`These and other features will now be described with
`reference to the drawings summarized below. These draw
`ings and the associated description are provided to illustrate
`a preferred embodiment of the invention, and not to limit the
`scope of the invention.
`Throughout the drawings, reference numbers are re-used
`to indicate correspondence between referenced elements. In
`addition, the first digit of each reference number indicates
`the figure in which the element first appears.
`FIG. 1 illustrates a system in which users access web site
`information via the Internet, and illustrates the basic web site
`components used to implement a search engine which
`operates in accordance with the invention.
`FIG. 2 illustrates a sample book search page of the web
`site.
`FIG. 3 illustrates sample log entries of a daily query log
`file.
`FIG. 4 illustrates the process used to generate the corre
`lation table of FIG. 1.
`FIG. 5A illustrates a sample mapping before a query is
`added.
`FIG. 5B illustrates a sample mapping after a query is
`added.
`FIG. 6 illustrates a process for generating the correlation
`table from the most recent daily query log files.
`FIG. 7 illustrates a process for selecting the related query
`terms from the correlation table.
`FIG. 8A illustrates a set of related query terms from a
`single-term query.
`FIG. 8B illustrates a set of intersecting terms and a set of
`related query terms from a multiple-terrn query.
`FIG. 9 illustrates a sample search result page of the web
`site.
`
`4
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`The present invention provides a search refinement sys
`tem and method for generating related query terms (“related
`terms”) using a history of queries submitted to a search
`engine by a community of users. Briefly, the system gener
`ates query term correlation data which reflects the frequency
`with which specific terms have previously occurred together
`within the same query. The system uses the query term
`correlation data in combination with the query term(s)
`entered by the user to recommend additional query terms for
`refining the query. The incorporation of such historical query
`information into the process tends to produce related terms
`that are frequently used by other users in combination with
`the submitted query terms, and significantly increases the
`likelihood that these related terms will be helpful to the
`search refinement process. To further increase the likelihood
`that the related terms will be helpful, the correlation data is
`preferably generated only from those historical query sub
`missions that produced a successful query result (at least one
`match).
`In the preferred embodiment, the query term correlation
`date is regenerated periodically from recent query
`submissions, such as by using the last M days of entries in
`a query log, and thus heavily reflects the current tastes of
`users. As a result, the related terms suggested by the search
`engine tend to be terms that correspond to the most fre
`quently searched items during the relevant time period.
`Thus, for example, in the context of a search engine of an
`online merchant, the search engine tends to suggest related
`terms that correspond to the current best-selling products. In
`one embodiment, the technique used to generate the related
`terms and present these terms to the user guarantees that the
`modified query will not produce a NULL query result.
`The search refinement methods of the invention may be
`implemented, for example, as part of a web site, an Internet
`site, an on-line services network, a document retrieval
`system, or any other type of computer system that provides
`searching capabilities to a community of users. In addition,
`the method may be combined with other methods for
`suggesting related terms, such as methods which process the
`contents of located documents.
`A preferred web-based implementation of the search
`refinement system will now be described with reference to
`FIGS. 1–9. For purposes of illustration, the system is
`described herein in the context of a search engine that is used
`to assist customers of Amazon.com Inc. in locating items
`(e.g., books, CDs, etc.) from an on-line catalog of products.
`Throughout the description, reference will be made to vari
`ous implementation-specific details of the Amazon.com
`implementation. These details are provided in order to fully
`illustrate a preferred embodiment of the invention, and not
`to limit the scope of the invention. The scope of the
`invention is set forth in the appended claims.
`I. Overview of Web Site and Search Engine
`FIG. 1 illustrates the Amazon.com web site 130, including
`components used to implement a search engine in accor
`dance with the invention.
`As it is well known in the art of Internet commerce, the
`Amazon.com web site includes functionality for allowing
`users to search, browse, and make purchases from an online
`catalog of book titles, music titles, and other types of items
`via the Internet 120. Because the catalog contains millions
`of items, it is important that the site provide an efficient
`mechanism for assisting users in locating items.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`IPR2019-01304
`BloomReach, Inc. EX1026 Page 14
`
`

`

`6,006,225
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`5
`As illustrated by FIG. 1, the web site 130 includes a web
`server application 131 (“web server”) which processes user
`requests received from user computers 110 via the Internet
`120. These requests include queries submitted by users to
`search the on-line catalog for products. The web server 131
`records the user transactions, including query submissions,
`within a query log 135. In the embodiment depicted in FIG.
`1, the query log 135 consists of a sequence of daily query log
`files 135(1)–135(M), each of which represents one day of
`transactions.
`The web site 130 also includes a query server 132 which
`processes the queries by searching a bibliographic database
`133. The bibliographic database 133 includes information
`about the various products that users may purchase through
`the web site 130. This information includes, for example, the
`titles, authors, publishers, subject descriptions, and ISBNs
`(International Standard Book Numbers) of book titles, and
`the titles, artists, labels, and music classifications of music
`titles. The information for each item is arranged within fields
`(such as an “author” field and a “title” field), enabling the
`bibliographic database 133 to be searched on a field
`restricted basis. The site also includes a database 134 of
`HTML (Hypertext Markup Language) content which
`includes, among other things, product information pages
`which show and describe the various products.
`The query server 132 includes a related term selection
`process 139 which identifies related query terms based on
`query term correlation data stored in a correlation table 137.
`As depicted in FIG. 1 and described below, the correlation
`table 137 is generated periodically from the M most recent
`daily query log files 135(1)–135(M) using an off-line table
`generation process 136.
`The web server 131, query server 132, table generation
`process 136, and database software run on one or more
`Unix"-based servers and workstations (not shown) of the
`web site 130 although other types of platforms could be
`used. The correlation table 137 is preferably cached in RAM
`(random access memory) on the same physical machine as
`that used to implement the query server 132. To accommo
`date large numbers of users, they query server 132 and the
`correlation table 137 can be replicated across multiple
`machines. The web site components that are invoked during
`the searching process are collectively referred to herein as a
`“search engine.”
`FIG. 2 illustrates the general format of a book search page
`200 of the Amazon.com web site 130 that can be used to
`search the bibliographic database 133 for book titles. Users
`have access to other search pages that can be used to locate
`music titles and other types of products sold by the on-line
`merchant. The book search page 200 includes author, title,
`and subject fields 210, 220, 240 and associated controls that
`allow the user to initiate fieldrestricted searches for book
`titles. Users can perform searches by first typing in the
`desired information into a search field 210, 220, 240 and
`then clicking on the appropriate search button 230,250. The
`term or string of terms submitted to the search engine is
`referred to herein as the “query.” Other areas of the web site
`ask the user to submit queries without limiting the terms to
`specific fields.
`When the user submits a query from the book search page
`200 to the web site 130, a the query sever 132 applies the
`query to the bibliographic database 133, taking into account
`any field restrictions within the query. If the query result is
`a single item, the item’s product information page is pre
`sented to the user. If the query result includes multiple items,
`the list of items is presented to the user through a query
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`result page which contains hypertextual links to the items’
`respective product information pages.
`For multiple-term queries, the query server 132 effec
`tively logically ANDs the query terms together to perform
`the search. For example, if the user enters the terms “JAVA’
`and “PROGRAMMING” into the title field 220, the query
`server 132 will search for and return a list of all items that
`have both of these terms within the title. Thus, if any query
`term does not produce a match (referred to herein as a
`“non-matching term”), the query will produce a NULL
`query result. Presenting a NULL query result to the user can
`cause significant user frustration. To reduce this problem, in
`this event, the user may be presented with a list of items that
`are deemed to be “close matches.” Although the search
`engine described herein logically ANDs the query terms
`together, it will be recognized that the invention can be
`applied to search engines that use other methods for pro
`cessing queries.
`In accordance with the invention, the search engine uses
`the query term correlation data stored in the correlation table
`137 to select the related terms that best match the user’s
`query. The search engine then presents the related terms to
`the user, allowing the user to refine the search and enhance
`discovery of relevant information. The query term correla
`tion data indicates relationships between query terms, and is
`used to effectively predict query terms that are likely to be
`helpful to the search refinement process. In accordance with
`another aspect of the invention, the correlation table 137
`preferably contains or reflects historical information about
`the frequencies with which specific query terms have
`appeared together within the same query.
`The general format of the correlation table 137 is illus
`trated in FIG. 1. In the embodiment depicted in FIG. 1 and
`described in detail herein, the correlations between query
`terms are based solely on frequency of occurrence within the
`same query. As described below, other types of query tern
`correlations can additionally be used. In addition, although
`the disclosed implementation uses a table to store the query
`term correlation data, other types of databases can be used.
`As illustrated by FIG. 1, each entry within the correlation
`table 137 (two entries shown) has two primary components:
`a “key” term 140, and a “related terms” list 142 for that key
`term. The related terms list 142 is a list of the N (e.g. 50)
`query terms that have appeared within the same query as the
`keyword with the highest degree of frequency, and is
`ordered according to frequency. For example, the entry for
`the key term COSMOS (ignoring the single-term prefixes,
`which are discussed below) is:
`COSMOS: ASTRONOMY, SAGAN, UNIVERSE,
`indicating that ASTRONOMY has appeared together with
`COSMOS with the highest degree of frequency; SAGAN
`has appeared with COSMOS with the second highest degree
`of frequency, and so on. Each term that appears within the
`related terms list 142 is deemed to

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket