throbber
111111111111111111111111111111111111111111111111111111111111111111111111111
`US007260573Bl
`
`c12) United States Patent
`Jeh et al.
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 7,260,573 Bl
`Aug. 21, 2007
`
`(54) PERSONALIZING ANCHOR TEXT SCORES
`IN A SEARCH ENGINE
`
`(75)
`
`lnventors: Glen Jeh, San Francisco, CA (US);
`Taher H. Haveliwala, Mountain View,
`_______ j::.~_(lTS); ~~ll~<lar I>.J{.IIII!Var,J'.a!<>._ __ _
`Alto, CA (US)
`
`(73) Assignee: Google Inc., Mountain View, CA (US)
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted wider 35
`U.S.C. 154(b) by 543 days.
`
`(21) Appl. No.: 10/848,599
`
`(22) Filed:
`
`May 17, 2004
`
`(51)
`
`Int. Cl.
`(2006.01)
`G06F 17130
`(52) U.S. Cl. ............... 70717; 707/3; 707/10; 715/501.1
`(58) Field of Classification Search .................... 707/7,
`707/3, 10; 715/501.1
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,920,859 A *
`6,285,999 Bl
`6,636,848 Bl *
`2002i0169770 Al *
`2003/0208482 Al *
`2004/0215606 Al *
`2005/0071741 AI*
`2005/0114324 Al *
`2005/0165781 Al *
`2005/0240580 Al *
`OTHER PUBLIC.ATIONS
`
`7/1999 Li ..•..........•.......•..•........ 707/5
`9/2001 Page ............................. 707/5
`10/2003 Aridor et al ................... 707/3
`11/2002 Kim et al. . .................... 707/5
`11/2003 Kim et al. . ... . . ... . .. ... .. . . .. 707/3
`10/2004 Cossock ... . . .. . .. . .. .... .. .. ... 70713
`3/2005 Acharya ct al.
`. ........... 715/500
`5/2005 Mayer ........................... 707/3
`712005 Kraft et al ..................... 70717
`10/2005 Zamir et al .................... 707/4
`
`Tsukasa Hira.shima et al, "Context-Sensitive Filtering for Browsing
`in Hypertext", ACM !998, pp. 119·126.*
`
`Brin, S., et al., "The Anatomy of a Large-Scale Hypertextual Web
`Search Engine," In Proc. of the 7th Int'l World Wide Web Conf.,
`1998, 26 pages.
`Haveliwala, T.H., "Topic-Sensitive PageRank," In Proc. of the lith
`Int'l Worl~ Wide Web Conf., May 2002, 10 pages.
`Jeh;·G.; eta!., -"Scaling-Personalized· Web-Search;''-ln-Procoofthe--
`12th lnt'l World Wide Web Conf., 2003, 9 pages.
`Kamvar, S.D., eta!., "Exploiting the Block Structure of the Web for
`Computing PageRank," Stanford Univ. Technical Report, 2003, 13
`pages.
`* cited by examiner
`Primary Examiner-Uyen Le
`(74) Attorney, Agent, or Firm-Morgan, Lewis & Bockius
`LLP
`
`(57)
`
`ABSTRACT
`
`A search engine identifies a list of documents from a set of
`documents in a database in response to a set of query terms.
`For each document in the list, the search engine determines
`an information retrieval score based on its content and the
`query terms, and also identifies a set of source documents
`that have Jinks to the document and that also have anchor
`text satisfYing a predefined requirement with respect to the
`query terms. The search engine calculates a personalized
`page importance score for each of the identified source
`documents according to a set of user-specific parameters and
`accumulates the personalized page importance scores to
`produce a personalized anchor text score for the document.
`The personalized anchor text score is then combined with
`the document's information retrieval score to generate a
`personalized ranking for the document. The documents are
`ordered according to their respective personalized rankings.
`
`51 Claims, 6 Drawing Sheets
`
`EXHIBIT 2103
`Facebook, Inc. et al.
`v.
`Software Rights Archive, LLC
`CASE IPR2013-00479
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 1 of 6
`
`US 7,260,573 Bl
`
`110-1
`
`1 http:iiwww.stantord.edu/bentindex.html
`
`Ben's Homepage
`
`<a.:_:firef=http://www.gostanford.com>Sports News~/~?.>
`-~--------------------------------------------------------------------------
`
`c 110-2
`
`;(]
`
`1 http://www.gostanford.com
`
`I
`
`I
`
`(
`
`120-1
`
`I http://www.geocity.com/cindy/index.html
`
`Cindy's Homepage
`c_~a=fire~t;ii;;:ii~-.~~~~~~~-~~sp~rt~--N~~s-~ia·>--{
`------------------------------------------------------------------------------
`
`(
`
`120-2
`
`I v I http://www.espn.com
`
`Fig.1
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 2 of 6
`
`US 7,260,573 Bl
`
`Search Engine System
`
`200 J
`
`Search Query
`
`User Info
`204
`
`,---------- ----------·----·----------- --".-· -----
`: Server
`210
`
`User
`14-------1~ Info DB
`206
`
`Inverted
`Content
`Index
`208
`
`' ' ' '
`' ' ' '
`
`'
`
`'
`'
`'
`' '
`'
`'
`'
`
`'
`
`' ' ' ' '
`' ' ' ' '
`' ' ' '
`' ' ' ' '
`
`'
`'
`'
`L---- --------- ------------------ ---· -- -- ·- --------------+- -----------------------------------------------------
`
`Page
`Importance
`Ranker
`215
`
`Indexer
`217
`
`Network Crawler 220
`
`Fig. 2
`(Prior Art)
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 3 of 6
`
`US 7,260,573 Bl
`
`(203
`
`Search_..
`Query
`
`Indexes/
`Databases
`
`208,209,211
`
`(302
`
`Dl (IR score, URL, PR, List of Source Pages)
`~ D2 (IR score, URL, PR, List of Source Pages)
`
`Dl:
`Source Page 1 (URL, PR, Anchor Text
`of Link)
`t . - ._ Source Page 2 (URL, PR, Anchor Text
`ofLink)
`~~
`
`~~~.
`
`_r312
`
`3M I
`
`User
`Pro fi I e f----l~
`
`Personalized
`AT Score
`Generator
`310
`
`Dl:
`Source Page 1 LA Score
`Source Page 2 LA Score
`
`02: ...
`
`_r320
`
`Accumulator
`
`.----....L.._--.v 322
`D1: AT Score
`D2: AT Score
`
`.-------._r 324
`Search Result
`IR Scores
`Ranking
`Function
`
`326 ~ Personalized
`Document
`Rankings
`
`328 \.._,-------'----,
`Order Search
`Results
`
`330~ Search Results,
`Ordered by
`Personalized
`Rankings
`
`Fig. 3
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 4 of 6
`
`US 7,260,573 Bl
`
`Determine an IR score for a L-3
`
`document based on its content and a
`set of query terms
`
`0
`
`'
`having links to the document and l-/
`
`Identify a set of source documents
`
`42
`0
`
`anchor text matching at least one of
`the query terms
`
`,,
`Calculate personalized page 0
`
`importance scores for the identified
`source documents according to a set
`of user-specified parameters
`
`30
`
`,
`
`Accumulate a personalized AT score
`for the document based on the
`personalized page importance
`scores of the identified source
`documents
`
`/
`
`4
`50
`
`Generate a personalized ranking for __)
`
`the document based on its I R score
`and personalized AT score
`
`60
`
`r
`
`End
`
`Fig. 4
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 5 of 6
`
`US 7,260,573 Bl
`
`Identify a list of documents that
`satisfy a set of query terms
`
`510
`
`.------------------------------------------------------ ------------------------------------------------------~:
`For each
`candidate
`document
`
`410
`
`Determine an IR score for a
`document based on its content
`
`Identify a set of source documents
`having links to the document and
`anchor text matching at least one of
`the query terms
`
`Calculate personalized page
`importance scores for the identified
`source documents according to a set
`of user-specified parameters
`
`Accumulate a personalized AT score
`for the document based on the
`personalized page importance
`scores of the identified source
`documents
`
`430
`
`450
`
`' ' '
`
`Generate a personalized ranking for
`the document based on its IR score
`and personalized AT score
`'
`.._ ______________________________________________________ ------------------------------------------------------
`
`460
`
`Order the list of documents
`according to their personalized
`ran kings
`
`590
`
`End
`
`Fig. 5
`
`

`
`U.S. Patent
`
`Aug. 21, 2007
`
`Sheet 6 of 6
`
`US 7,260,573 Bl
`
`Memory 612 ~
`
`Operating System
`
`Network Communication Module
`
`System Initialization Module
`
`Search Engine 600
`
`Query Processor
`
`~
`CPf]
`'----I
`614 ~
`
`602
`
`604\
`
`User interface
`
`I Display ~ t-606
`II Keyboard I
`~608
`
`61(>__
`
`Network
`interface
`
`Fig. 6
`
`Search Result Ranking Function
`
`Personalized Score Generator
`
`Personalized AT Score Generator
`
`Persorv=!!iz:ed !R Score GP.nP.rator
`
`.
`
`.
`
`(System) Page Importance Score Database
`
`Inverted Content Index Database
`
`Inverted Anchor Text Index Database
`
`User Information Database
`
`User Profile (User1)
`
`.
`
`.
`
`Network Crawler
`
`Content Indexer
`
`Page Importance Ranker
`
`Anchor Text Indexer
`
`.
`
`Search Results
`
`Document#1
`
`Information Retrieval Score
`
`Pagelmportance~core
`(Personalized or System)
`Personalized Anchor Text Score
`: .
`Document#N
`
`Information Retrieval Score
`Page Importance Score
`(Personalized or System)
`Personalized Anchor Text Score
`
`616
`
`618
`
`v
`v
`v
`v
`v
`v
`622
`v
`310
`622
`Lr
`
`620
`
`205
`
`324
`
`v
`v
`v
`v
`v
`v
`v
`v
`v
`
`J
`
`v
`v
`v
`v
`v
`
`209
`
`208
`
`211
`
`206
`
`306
`
`306
`
`220
`
`213
`
`215
`
`217
`
`302/330
`
`640
`
`642
`
`644
`
`646
`
`

`
`US 7,260,573 Bl
`
`1
`PERSONALIZING ANCHOR TEXT SCORES
`IN A SEARCH ENGINE
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
`This application is related to patent application Ser. No.
`10/646,331, filed Aug. 22, 2003, "Improved Methods For
`Ranking Nodes In Large Directed Graphs," which is hereby
`incorporated by reference.
`
`FIELD OF THE INVENTION
`
`The present invention relates generally to the field of
`search engines for locating documents in a computer net(cid:173)
`work system, and in particular, to a system and method of
`personalizing search results produced by a search engine in
`accordance with user-specific parameters.
`
`BACKGROUND OF THE INVENTION
`
`Search engines provide a powerful tool for locating
`documents in a large database of documents, such as the
`documents on the World Wide Web (WWW) or the docu(cid:173)
`ments stored on the computers of an Intranet. The docu(cid:173)
`ments are located in response to a search query submitted by
`a user. A typical search query includes only two to three
`terms. As the number of documents accessible via the
`Internet grows, the number of documents that match the
`search query may also increase. However, not every docu(cid:173)
`ment matching the search query is equally important from
`the user's perspective. As a result, a user might be over(cid:173)
`whelmed by the enormous number of documents retrieved
`by a search engine, if the search engine did not order the 35
`search results based on their relevance to the user's query.
`One approach to improving the relevance of search results
`to a search query is to use the link structure of the documents
`in the database, such as the links between documents on the
`WWW, to compute global "importance" scores for the
`documents in the database. These scores are used to affect
`the order of search results when they are presented to the
`user. This approach is sometimes referred to as the PageR(cid:173)
`ank algorithm. Amore detailed description of the PageRank
`algorithm can be found in the article "The Anatomy of a 45
`Large-Scale Hypertextual Search Engine" by S. Brin and L.
`Page, 7'h International World Wide Web Conference, Bris(cid:173)
`bane, Australia and U.S. Pat. No. 6,285,999, both of which
`are hereby incorporated by reference as background infor-
`mation.
`An important assumption of the PageRank algorithm is
`that there is a "random surfer" who starts his web surfing at
`a randomly selected web page and keeps clicking on the
`links embedded in the web pages, never clicking on the
`"back" button. Occasionally, the random surfer re-starts his
`surfing by randomly picking another web page. The prob(cid:173)
`ability that the random surfer visits (i.e., views or down(cid:173)
`loads) a web page is a function of its PageRank. A web page
`may have a high PageRank if there are many other web
`pages pointing to it, or if some of the web pages pointing to
`it have a high PageRank. For example, www.espn.com is a
`famous website reporting sports-related news. It is conceiv(cid:173)
`able that there are many web pages over the Internet having
`links to www.espn.com. In contrast, www.gostanford.com is
`a website that only reports news about the sports teams of
`Stanford University. For the purposes of this explanation, we
`will assume that www.espn.com is more frequently visited
`
`2
`by WWW users than www.gostanford.com, and we will
`further assume that www.espn.com has a higher PageRank
`than www.gostanford.com.
`For each link in the link structure (representing links
`between the documents in the database), there is a pair of
`source and destination web pages. Source pages are also
`sometimes called "referring" pages. Further, many links in
`source web pages are associated with text that describes the
`destination web page of the link. Such text, commonly
`10 referred to as anchor text, often provides a more concise and
`accurate description than the destination web page itself and
`therefore can be used in determining the relevance of the
`destination web page to a particular query. FIG. 1 provides
`two examples of the link structure between different web
`15 pages. Each of the source web pages 110-1 and 120-1 has an
`embedded link pointing to one of the two destination web
`pages 110-2 and 120-2, respectively. An anchor text "Sports
`News" is associated with each link, characterizing the key
`feature of the corresponding destination page. When a user
`20 submits a query for "sports news" to a search engine (such
`as the Google search engine) that considers a web page's
`PageRank and anchor text, the engine may return both web
`pages 110-2 and 120-2. If so, the www.espn.com web page
`120-2 would likely be displayed higher in the search results
`25 than the www.gostanford.com web page 110-2 because page
`120-2 has a higher PageRank than page 110-2. It is noted
`that the Google search engine, as of late 2003, determines
`the position of a document in a set of search results as a
`function of the PageRanks of the documents in the search
`30 results, the query terms, the documents in the search results,
`and the anchor text of links to those documents. For pur(cid:173)
`poses of this discussion, we have assumed that large differ(cid:173)
`ences in the PageRanks of two documents often determine
`their relative position in a set of search results.
`When using a conventional search engine, the ordering of
`documents in a set of search results may be less than optimal
`for a user with specific personal preferences. In particular,
`documents of highest interest to the user may be positioned
`lower in the search results than one or more other docu-
`40 ments. It would be desirable to have a system and method of
`making the order of documents in a set of search results
`more attuned to a user's personal preferences, and it would
`be desirable for such a system to be computationally fea(cid:173)
`sible.
`
`SUMMARY
`
`In a method of personalizing the search results produced
`by a search engine, in accordance with a set of user-specific
`50 parameters, a search engine produces a set of search results
`in response to a query. The search results identify a set of
`documents, each of which is assigned an information
`retrieval score based on its content and the query terms. In
`some embodiments, the information retrieval score is a
`55 query dependent score that does not take into account the
`user-specific parameters. For a document in the identified set
`of documents, the method identifies a set of source docu(cid:173)
`ments having links to the document. The anchor text of each
`of the identified source documents is examined to determine
`60 if the anchor text satisfies a predefined requirement with
`respect to the query terms. After identifYing the source
`documents whose anchor text satisfies the predefined
`requirement, a personalized page importance score is com(cid:173)
`puted for each of the identified source documents according
`65 to the set of user-specific parameters.
`A personalized anchor text score is generated for the
`document by accumulating the personalized page impor-
`
`

`
`US 7,260,573 Bl
`
`3
`tance scores of the identified source documents. The per(cid:173)
`sonalized anchor text score of a document is combined with
`the document's information retrieval score to produce a
`personalized ranking for the document. The personalized
`ranking can be used in ordering the document in the search
`results.
`In one embodiment, the personalized page importance
`score of a document is a personalized link analysis score (a
`personalized PageRank is sometimes used for this score),
`which is based on an analysis of the linkages between
`documents that are directly or indirectly linked to this
`document. In some embodiments, the user-specific param(cid:173)
`eters includes a list of use favored websites, or includes URL
`keywords suitable for identifYing user favored websites. The
`user-specific parameters may be provided by the user, col(cid:173)
`lected from a third-party having such information, or
`derived by analysis of the user's previous search queries and
`the documents selected by the user from the search results of
`the user's previous search queries.
`In another aspect, a search engine system is configured to
`personalize the search results produced by a search engine,
`in accordance with a set of user-specific parameters, using
`the methodologies summarized above.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The aforementioned features and advantages of the inven(cid:173)
`tion as well as additional features and advantages thereof
`will be more clearly understood hereinafter as a result of a
`detailed description of preferred embodiments of the inven(cid:173)
`tion when taken in conjunction with the drawings.
`FIG. 1 provides two examples of the link structure
`between different web pages, the two source web pages of
`the two examples having the same anchor text.
`FIG. 2 is a simplified diagram illustrating a search engine
`system that utilizes user specific information to produce
`personalized search results.
`FIG. 3 is a conceptual diagram of a method for ordering
`search results in accordance with a user profile.
`FIG. 4 is a flowchart of a method of determining a
`personalized ranking for a document in a set of search
`results.
`FIG. 5 is a flowchart of a method of ordering a set of
`search results in accordance with a user profile.
`FIG. 6 is a block diagram of a search engine configured
`to generate personalized search results.
`Like reference numerals refer to corresponding parts
`throughout the several views of the drawings.
`
`DESCRIPTION OF EMBODIMENTS
`
`Referring to FIG. 1, assume that a user named Adam is
`looking for a website covering Stanford's sports teams. For
`the purposes of this explanation, we will assume that Adam
`would prefer that the search engine return www.gostanford- 55
`.com 110-2 ahead ofwww.espn.com 120-2. To achieve this
`goal, one approach would be to allow a user like Adam to
`instruct the search engine to personalize the rankings of
`search results by providing appropriate user information
`such as the user's background information or a plurality of 60
`favorite websites. For example, Adam may register with the
`search engine that he prefers web pages whose URL
`includes the term "Stanford" over other web pages.
`In FIG. 1, source page 110-1 (www.stanford.edu/ben/
`index.html) has a URL that includes "stanford", while 65
`source page 120-1 (www.geocity.com/cindy/index.html)
`does not. If Adam enters a search query of"sports news", the
`
`25
`
`4
`query terms will be found in the anchor text of both of these
`source pages 110-1, 120-1. Furthermore, based on Adam's
`registered search preferences, source page 110-1 should
`receive a higher score than source page 120-1. Described
`below are embodiments of search engine systems and meth(cid:173)
`ods for ranking and ordering search results in accordance
`with a user's preferences. Using these systems and methods,
`the web page www.gostandford.com 110-2 may be ranked
`higher and ordered before web age www.espn.com 120-2,
`10 depending on a variety of factors that are taken into account
`by the search result ranking function of the search engine.
`FIG. 2 is a simplified diagram illustrating a search engine
`system 200. The search engine system 200 is implemented
`in a client-server network environment, which comprises
`15 one or more client computers 201 and one or more server
`computers 210. Prior to any searches being performed on
`behalf of users, a network crawler 220 (sometimes called a
`web crawler) locates and downloads documents from a
`document database or network 219 (e.g., the Internet or an
`20 Intranet). In some embodiments, these documents are pro(cid:173)
`cessed by a content indexer to produce a set of indexes and
`databases are generated
`The documents downloaded by the network crawler 220
`are stored in the server and analyzed by different compo(cid:173)
`nents of the server 210. For instance, when a document
`arrives at the server 210, a content indexer 213 generates
`inverted content index entries for the document, which are
`stored in or added to the inverted content index 208. A page
`30 importance ranker 215 computes the document's page
`importance score. In some embodiments, the page impor(cid:173)
`tance score is the document's PageRank, which is a score
`generated using a specific link analysis methodology that
`propagates rank through links. A document's PageRank is
`35 based on the PageRanks of the documents that have links to
`the document. The resulting page importance scores are
`stored in a database 209. In other embodiments, the page
`importance scores could be replaced by another set of
`scores, such as scores produced using another link analysis
`40 methodology or scores produced using yet another page
`importance determination methodology.
`An anchor text indexer 217 is responsible for generating
`an inverted anchor text index 211 from the links in each
`document received by the server 210, including the text
`45 surrounding the links. The links and corresponding text are
`extracted from each document and stored in records identi(cid:173)
`:tying the source document, the target document associated
`with a link, and the anchor text associated with the link.
`When a sufficient number of such records have been accu-
`50 mulated, an inverted anchor text index 211 is generated,
`mapping anchor text terms to the documents that are the
`target of the corresponding links. In some embodiments, the
`inverted anchor text index 211 is merged with or incorpo(cid:173)
`rated in the inverted content index 208. More information
`about anchor text indexing is provided in U.S. patent appli(cid:173)
`cation Ser. No. 10/614,113, filed Jul. 3, 2003, "Anchor Text
`Indexing in a Web Crawler System", which is hereby
`incorporated by reference.
`A user of the system 200 first submits a search query 203
`through a client 201. A search query typically includes a set
`of query terms, which identifY terms to be included in
`documents that satisfY the search query. The search query is
`processed by a query processor 205 on the server side. Based
`on the search query, the query processor 205 generates
`search results 207, typically a list of documents that satisfy
`the search query 203, and returns the search results to client
`201.
`
`

`
`US 7,260,573 Bl
`
`10
`
`5
`Within server 201, the query processor 205 communicates
`with various databases to identifY documents that satisfY the
`search query and to determine how to order the search
`results. In some embodiments these includes the inverted
`content index 208, the page importance scores database 209
`and the inverted anchor text index 211. For example, the
`database that stores the inverted content index 208 first
`returns a set of documents identifiers, which identifY docu(cid:173)
`ments that contain the query terms of the search query.
`Optionally, the query processor 205 may submit the same
`query to the database storing the inverted anchor text index
`211 to find another set of documents that satisfY the search
`query. It is possible that a document may appear in both sets
`of documents. Finally, the two sets of documents are sub(cid:173)
`mitted to the page importance scores database 209 and 15
`ordered in accordance with their respective page importance
`scores. In some systems, the ordering of documents is query
`dependent, taking into account the documents' query inde(cid:173)
`pendent page importance scores (one example of which is
`PageRank) as well as the position(s) in which the query 20
`terms are found in the documents within the search result.
`The search engine system 200 further includes a user
`information database 206, with personalization information
`for its users. The personalization information for each
`respective user is herein called a user profile. The term user 25
`profile is used here without limitation on the particular data
`structures and methodology used to store the personalization
`information.
`In some embodiments, in addition to submitting a search
`query 203 to the server 210, a user can also submit user 30
`information 204 to the server 210. The user information 204
`may be in the form of a user profile or a set of user-specific
`parameters characterizing a user's background and prefer(cid:173)
`ences. In one embodiment, user information 204 is submit(cid:173)
`ted to the server together with the search query 302. In 35
`another embodiment, user information is submitted to the
`server separately. In yet another embodiment, the user
`information 204, is derived by the server 210 at least in part
`from search queries previously submitted by the user and by
`the documents in the search results that the user chooses to 40
`view or use. In some embodiments, when the server 210
`receives (or derives) user information 204, it stores such
`information in a user information database 206 and associ(cid:173)
`ates the user information with a unique user ID. In other
`embodiments, the server 210 receives the user information 45
`204 with each search query and does not retain such user
`information for use when processing subsequent search
`queries.
`In some embodiments, the user information is used to
`compute personalized page importance scores for at least a
`subset of the documents retrieved by the network crawler
`220. The Page Importance Ranker 215 generates a system
`page importance score for each document, as well as a set of
`user-specific (personalized) page importance scores for at
`least a subset of the retrieved documents. The Page Impor(cid:173)
`tance Ranker 215 utilizes the user-specific parameters to
`compute a personalized page importance score for at least a
`subset of the documents retrieved by the server 210 from the
`network or document database 219. In one embodiment, the
`Page Importance Ranker 215 accomplishes this task using
`an efficient link analysis calculation method taught in greater
`detail in the co-pending application Ser. No. 10/646,331,
`filed Aug. 22, 2003, which is hereby incorporated by refer(cid:173)
`ence. Conceptually, when computing personalized page
`importance scores, the Page Importance Ranker 215 boosts
`the page importance scores of documents that are deemed to
`match the user-specific parameters, which in tum boosts the
`
`6
`downstream documents linked to those documents. From
`another viewpoint, the Page Importance Ranker 215 boosts
`the page importance scores of documents of each host whose
`URL matches one or more of the user-specific parameters.
`In some embodiments, a document is be deemed to match
`(or not match) user-specific parameters solely based on the
`URL of the document. In other embodiments, a document is
`deemed to match the user-specified parameters based not
`only on the URL of the document, but also based on the
`content of the document, and/or based on the anchor text
`content of links to the document. When a document is
`deemed to match the user-specific parameters in a user
`profile (e.g., if the URL of the document includes any of
`URL keywords in the user profile), the document is assigned
`a personalized page importance score specified by a param-
`eter in the user profile. For example, the user profile may
`specifY for each URL keyword a particular page importance
`score adjustment that is to be applied to matching docu(cid:173)
`ments. When a document matches more than one URL
`keyword, the largest such page importance score adjustment
`is applied to the document. In other embodiments, the user
`profile may specifY the adjustment or assigrnnent of person(cid:173)
`alized page importance scores in other ways.
`Note that the personalized page importance score of a
`document is only a function of the document, the user(cid:173)
`specific parameters, and the link structure through which the
`document is related to other documents. In other words, the
`personalized page importance score of a document is a
`ranking factor that is independent of any individual search
`query submitted by the user to the search engine. However,
`it should be noted that in some embodiments the server 210
`is configured to generate at least some of the user-specific
`parameters based on previous search queries submitted by
`the user, and thus a user's search queries may indirectly
`affect a document's personalized page importance score. For
`example, if a user has submitted many queries related to the
`standard aptitude test (SAT), the server 210 may update his
`user information and incorporate this information into the
`set of user-specific parameters.
`In some embodiments, due to storage and computational
`limitations, the number of documents for which personal(cid:173)
`ized page importance scores are stored by the server 210 is
`limited. For instance, the server 210 may store up to N
`personalized page importance scores for each user, and
`furthermore will store personalized page importance scores
`only for those documents where the personalized page
`importance scores differs form the system (non-personal-
`50 ized) page importance scores. In another example, the server
`210 may store up to N personalized page importance score
`adjustment values, each for documents whose address or
`URL has a prefix indicating a respective web host. When the
`server 210 is processing a search query from a user for
`55 which it has a user profile that includes a stored set person(cid:173)
`alized page importance scores, it retrieves page importance
`scores from both the page importance scores database 209
`and the user information database 206. Where a page
`importance score for a particular document is found in both
`60 databases 215, 206, the page importance score from the user
`information database 206 is used (or, in some embodiments,
`the adjustment value from the user information database is
`applied to the system page importance score). Alternately,
`the server 210 first retrieves page importance scores from
`65 the user information database 206, and then retrieves page
`importance scores from the page importance scores database
`209 only for those documents for which a personalized page
`
`

`
`US 7,260,573 Bl
`
`7
`importance score (for the user who submitted the search
`query being processed) is not found in the user information
`database 206.
`In some embodiments, a personalized page importance
`score is determined for a respective document at runtime,
`while the server 210 is processing a search query. In
`particular, the personalized page importance score is gener(cid:173)
`ated by determining the set of documents that have links
`referring to the respective document, determining personal(cid:173)
`ized page importance scores for the referring documents 10
`based on the user profile of the user, and then computing a
`personalized page importance score for the respective docu(cid:173)
`ment as a function of the personalized page importance
`scores of the referring documents. This methodology may be
`extended to the "grandparent" documents that refer to the 15
`referring documents. While the personalized page impor(cid:173)
`tance scores produced by this runtime methodology may
`differ from those produced by a Page Importance Ranker
`using a full network link analysis, it avoids or largely
`reduces the persistent storage of personalized page impor- 20
`tance scores.
`In the remainder of this document, when a document's
`page importance score is retrieved or otherwise determined,
`the source of the page importance score will be understood
`to be either a page importance scores database 209, or a
`personalized page importance score associated with the user
`whose search query is being processed.
`The personalized page importance scores of source docu(cid:173)
`ments having links to a destination document can be used in
`many different ways. For example, the personalized page
`importance scores can be used in a method for generating a
`personalized and query-dependent ranking for the destina(cid:173)
`tion document. Such ranking can be used for ordering the
`destination document in the search results or for other
`further analysis.
`FIG. 3 is a conceptual representation of one embodiment
`of a method for personalizing the ordering of documents in
`a s

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket