throbber
To appear, IJCAI Workshop on Intelligent Techniques for Web Personalization (ITWP), 2001
`
`Web Site Personalizers for Mobile Devices
`
`Corin R. Anderson, Pedro Domingos, Daniel S. Weld
`{corin, pedrod, weld}@cs.washington.edu
`University of Washington, Seattle, WA, USA
`
`Abstract
`The fastest growing community of web users is that
`of mobile visitors who browse with wireless PDAs,
`cell phones, and pagers. Unfortunately, most web
`sites today are optimized exclusively for desktop,
`broadband clients, and deliver content poorly suited
`for mobile devices — devices that can display only
`a few lines of text using slow wireless networks.
`To best serve the needs of this growing community,
`we propose building web site personalizers that ob-
`serve the behavior of web visitors and automati-
`cally customize and adapt sites for each individual
`mobile visitor. In this paper, we give an overview of
`our approach to web site personalization as utility-
`maximizing search through the space of person-
`alized web sites. Following this framework we
`have implemented two personalizers: PROTEUS
`and MINPATH. PROTEUS allows changes to site
`navigation (adding or removing links) as well as
`content manipulation (rearranging or eliding con-
`tent), and evaluates the result with a learned model
`of the current visitor. MINPATH concentrates ex-
`clusively on adding “shortcut” links, but uses a
`model learned by clustering visitors based on their
`sequences of page requests. We introduce PRO-
`TEUS and MINPATH, and outline our current and
`future directions for these personalizers.
`
`Introduction
`1
`The fastest growing community of web users is that of mobile
`visitors — people who browse the web with wireless PDAs,
`cell phones, and pagers. Ninety-five percent of cell phones
`sold today are “web-ready” and authorities predict that the
`number of wireless Internet devices will outnumber desktop
`computers by 2003. Despite this trend, however, few web
`sites today cater to mobile visitors, instead optimizing their
`content for desktop clients. Unfortunately, mobile devices
`are not as capable as their desktop counterparts, being lim-
`ited by small screens, low-bandwidth networks and slower
`processors. Thus the user experience for mobile visitors at
`these “one-size-fits-all” sites suffers. To address this prob-
`lem, we propose building web site personalizers that auto-
`matically adapt and personalize a web site to each individual
`
`mobile visitor.
`Mobile web visitors exhibit a variety of browsing behav-
`iors: random surfing, task completion (e.g., buying stocks),
`information-goal seeking (i.e., answering questions), etc.
`Information-goal seeking is of particular interest because it
`is generally predictable: visitors tend to have similar infor-
`mation goals in the future as in the past. Some example goals
`include: “What is the current stock price of MSFT?”; “Are
`there any Pentax K-mount zoom lenses on auction at eBay?”;
`“What office is Dan Weld in?”. This behavior is predictable
`because visitors generally follow the same set of links, view
`the same set of pages, to achieve these goals each time, and
`attempt to do so in a direct and efficient manner.
`In addi-
`tion, visitors tend to view pages with similar content as pages
`viewed in the past (e.g., a photographer may frequently view
`pages containing words “zoom lens” and “f-stop”, although
`the URLs requested may differ). By mining past interactions
`with the web site for these behaviors, we can automatically
`personalize the web content for each individual visitor. We
`envision web site personalizers that act on behalf of a mo-
`bile visitor to adapt web content as the visitor browses. A
`web site personalizer is an intermediary between the web site
`and the visitor and may be situated on the web server, on the
`visitor’s device, or at a proxy server in between. A web site
`personalizer can:
`• Make frequently-visited destinations easier to find, by
`highlighting relevant links or adding new links to a page.
`• Highlight content that interests the visitor, by rearrang-
`ing content on the page, or by adding visual cues.
`• Elide uninteresting content and structure, replacing them
`simply with links to the omitted material.
`In
`Web site personalization follows a two-step process.
`the first step, the personalizer builds a model of each visitor
`by mining the access logs and site content. The model in-
`cludes information about navigational browsing behavior as
`well as content interests. This model could also include “out-
`of-band” information, such as visitors’ geographic location or
`demographics. In the second step, the personalizer transforms
`the site to maximize the expected utility [9] for a given visitor.
`The expected utility of a personalized web site is a measure
`of how much benefit the visitor will receive by browsing the
`site; the personalizer computes this value based on the visitor
`model derived in the first step.
`
`SCEA Ex. 1046 Page 1
`
`

`
`a link at the top of the current page may not be difficult, but
`reaching a page many links away will require scrolling (to
`find the links) and waiting for intermediate pages to down-
`load over the wireless network. We transform this intuition
`into practice by recursively defining the utility of a page as
`the sum of its intrinsic utility — the utility of the page, in
`isolation — and its extrinsic utility — the utility of the linked
`pages2. We then calculate the utility of the site by evaluat-
`ing this recursion beginning with the page requested by the
`visitor. We make these concepts more precise below.
`Web site model for evaluation
`We find it advantageous to transform the search state model
`slightly when calculating expected utility. Specifically, we
`into a sequence of “screens”
`now decompose a page pi
`(cid:104)si0, . . . , sim(cid:105), each of which represents the web content
`that can be seen in one window of the visitor’s browser. A
`screen sij is composed of web content (i.e., text and graph-
`ics), which we denote as Tij, and a set of links lij1, ..., lijk.
`Expected utility
`Let ˆp be the personalized page for the requested URL u and
`let UV (ˆp) be the utility of ˆp for visitor V . The evaluation
`of W is the result of a recursive traversal through the site
`beginning with ˆp. Because only the first screen of ˆp is initially
`visible to the visitor, the expected utility of ˆp (or any pi, in
`fact) is the expected utility of its first screen:
`E[UV (pi)] = E[UV (si0)]
`The expected utility of a screen sij, in turn, is the sum of
`its intrinsic and extrinsic utilities:
`E[UV (sij)] = E[IUV (sij)] + E[EUV (sij)]
`The intrinsic utility of a screen measures how useful the
`screen’s content is towards fulfilling the visitor’s information
`goal, independent of the rest of the web site. Typically, the in-
`trinsic utility depends on the visitor model — past history and
`demographics. A more detailed description of intrinsic utility
`depends on particular assumptions regarding visitor interests
`and goals; section 3 discusses the method used in PROTEUS.
`The extrinsic utility measures the value of a screen by its
`connections to the rest of the web site. To reach the rest of the
`site, we model the visitor as choosing from a fixed set of nav-
`igation actions, any number of which the visitor may select
`(i.e., the actions are not mutually exclusive). Specifically, if
`the visitor is at screen sij, then the visitor may: scroll down
`to the next screen (assuming that sij is not the last screen of
`the page); or follow any link that appears on the screen. We
`maintain independent probabilities that the visitor will take
`each action, denoted as P (action), as well as the cost (i.e.,
`negative utility) each action imposes on the visitor, denoted
`γs and γl for scrolling and following a link, respectively. If
`we let dijk be the destination page of link lijk, then the ex-
`trinsic utility of screen sij is a sum weighted by probabilities:
`E[EUV (sij)] = P (scroll)(E[UV (si,j+1)] − γs) +
`[P (lijk)(E[UV (dijk)] − γl)]
`
`(cid:88)k
`
`2In many ways, intrinsic and extrinsic utilities are analogous to
`Kleinberg’s authority and hub weights [10].
`
`The focus of our work is in defining this framework of
`personalization as search and exploring how to instantiate
`this framework in a practical manner. This paper discusses
`our recent efforts along these lines and details current work.
`This paper draws heavily from previous papers describing our
`work [1; 2].
`
`2 Personalization as search
`We first describe our approach briefly. Our web site person-
`alizer performs a search through the space of possible web
`sites. The initial state is the original web site of unmodified
`pages. The state is transformed by any of a number of adap-
`tation functions, which can create pages, remove pages, add
`links between pages, etc. The value of the current state (i.e.,
`web site) is measured as the expected utility of the site for
`the current visitor. The search continues either until no bet-
`ter state can be found, or until computational resources (e.g.,
`time) expire.
`
`2.1 State representation
`Each state in our search space is an entire web site, W .
`Although an actual implemented system (such as those dis-
`cussed in sections 3 and 4) may choose to personalize only a
`single page at a time, we model the entire web site to allow
`adaptations to be made anywhere in the site. The web site
`W is modeled as a directed graph whose nodes are pages,
`p0, . . . , pn, and whose arcs are hypertext links, l0, . . . , lm.
`A link lk is a triple (ps, pd, a) where ps is the source page
`(i.e., the page on which the link appears), pd is the destina-
`tion page, and a is the anchor text. Each page pi is modeled as
`a hierarchy of web content, much in the same way the parse
`tree of an HTML document confers a hierarchy of HTML
`tags. pi is thus represented as the root of this hierarchy, and
`is a content node. A content node c is a pair (C, B) where C
`is a sequence of children (cid:104)c1, . . . , ck(cid:105) of c and B is a behav-
`ior that c imparts on its children. The elements of C may be
`either plain text or (recursively) content nodes. The behavior
`B is the action that affects the human-viewable content. For
`example, if c were a “<strong>” node, then B would ren-
`der its children in boldface; or if c were an “<a>” node, then
`B would render them as a hypertext link. Summarizing:
`S = {W0, W1, . . .} Each state in the space is a web site
`W = ({p0, . . . , pn}, A web site is a directed graph of
`{l0, . . . lm})
`pages and links
`lk = (ps, pd, a)
`A link has a source, destination,
`and anchor
`pi = ci
`A page is a root content node
`ci = ((cid:104)ci1, . . . , cik(cid:105), A content node is a sequence of
`B)
`children and node behavior; or
`ci = text
`A content node is plain text
`
`2.2 State Evaluation
`We estimate the quality of the personalized web site as its
`expected utility from the point of view of the requested page.
`Intuitively, the expected utility is the sum of the utility the vis-
`itor receives by browsing each page in the site, discounted by
`the difficulty of reaching each page1. For example, following
`
`1We actually compute utility at a finer granularity than a page.
`
`SCEA Ex. 1046 Page 2
`
`

`
`This equation recursively references the utility of other
`pages and screens. The recursion is halted when the ex-
`pected utility of a screen or page is less than the cost of
`reaching that content (i.e., when E[UV (si,j+1)] < γs or
`E[UV (dijk)] < γl).
`If evaluated na¨ıvely, expected utility is not computationally
`tractable – it would require a screen-by-screen decomposition
`of potentially every page in the entire web site. Fortunately,
`the evaluation is made computationally much simpler with a
`few assumptions. First, when the cost of scrolling dominates
`the cost of following a link (γl + γs ≈ γl), then we treat all
`pages but ˆp as single-screen pages and can ignore the recur-
`sion due to scrolling on these pages. Second, we limit how
`deeply our recursion proceeds by setting a minimum thresh-
`old on the probability of viewing a screen or page — if the
`probability of it being visited is lower than the threshold, then
`it is excluded from the calculation. The threshold allows us to
`trade off performance for accuracy: the lower the threshold,
`the more accurate the evaluation, at the expense of recurring
`through more of the site.
`
`3 Proteus
`We have implemented this search framework in the web site
`personalizer PROTEUS. While most of the details of the im-
`plementation should be clear from the framework, we discuss
`a few implementation-specific issues here.
`
`3.1 Search operators
`To reduce the complexity of PROTEUS, we require that search
`operators directly affect the requested page ˆp in some way,
`e.g., adding links to ˆp or manipulating content on ˆp. We ex-
`clude operators that, for instance, generate new pages or add
`links between two other pages.
`PROTEUS supports two transformation operators3: elide-
`content and add-shortcut. elide-content replaces a block
`of content on ˆp with a link to the original content in a fash-
`ion similar to Digestor [3] (Figure 1). The add-shortcut
`operator creates a new link from ˆp to some other page that
`can be reached by following at most k links from ˆp. Thus,
`add-shortcut creates a link that “shortcuts” a longer path of
`links. For example, if the visitor previously followed the path
`ˆp → pa → pb → pc → pd, add-shortcut may create a link
`directly from ˆp to pd. Furthermore, add-shortcut places the
`new link ˆp → pd next to the original link ˆp → pa, anticipating
`that, when the visitor wants to find pd again, the visitor will
`look towards the link to pa first. This placement is possible
`only if the visitor actually followed the path previously, oth-
`erwise, add-shortcut places the link near paths other visitors
`at the site have taken. The anchor text of the link is chosen
`heuristically as either the destination’s <title> or <h1>.
`
`3.2 Expected Utility
`Our implementation measures intrinsic utility of a screen as a
`weighted sum of two terms: how well the screen’s content
`matches the visitor’s previously viewed content, according
`
`3PROTEUS supports a third operator, swap-siblings, but we omit
`its discussion for space considerations.
`
`to a text-similarity measure, and how frequently the visitor
`viewed this screen. See earlier work [2] for more detail.
`PROTEUS estimates the action probabilities by measuring
`the frequency with which the visitor took each action in the
`past. For example, the probability that the visitor follows a
`link ps → pd is the quotient of the number of sessions in
`which the visitor viewed pd sometime after ps divided by the
`number of sessions in which the visitor viewed ps. The prob-
`ability for scrolling is derived empirically and is held con-
`stant at 0.85, although we are in the process of determining
`this number as part of the visitor model. Also through em-
`pirical evaluation, we set the cost of scrolling, γs, at 0.01 and
`the cost of following a link, γl, at 0.05. These values work
`acceptably in practice, although our results are largely insen-
`sitive to the exact values. In practice, the dominant term in
`the expected utility equation is the product of probabilities of
`taking chains of actions. For example, for all but the most
`probable links, the contribution of a remote page to expected
`utility is already vanishingly small, irrespective of the cost of
`following the link.
`
`3.3 Results
`In this section we present the results of a small user study
`of PROTEUS. We track ten test subjects’ browsing habits on
`their desktop workstations and then measure how effectively
`they use a suite of personalized and non-personalized web
`sites on a wireless Palm Connected Organizer. We measure
`visitor effort in terms of both time to attain the goal and the
`amount of navigation (number of scrolling actions and links
`followed) required.
`To collect training data, we asked the subjects to perform
`a suite of information-seeking tasks that we provided daily.
`We directed the subjects to attain their goals by browsing ex-
`clusively at the given site starting from a given page. An ex-
`ample question is: “Find the current stock price for MSFT,
`starting at finance.yahoo.com”. The tasks in the seed
`suite were drawn randomly from a distribution of parametric
`questions and represent a coherent model of visitor interest.
`Following the training phase, we asked the subjects to an-
`swer another suite of questions using a wireless Palm Con-
`nected Organizer. We asked them to answer the questions
`twice: once, on the unmodified web site, and again on a per-
`sonalized version of the target site. PROTEUS personalized
`the target site for each visitor by first building a model of that
`visitor, based on the subject’s past browsing data, and then
`creating adapted pages for the testing suite. Because our cur-
`rent implementation is not yet fast enough to adapt a single
`page in real-time, we personalized the sites before the sub-
`jects performed their tests. We chose which pages for PRO-
`TEUS to adapt by using our human judgment of where the
`subjects would likely visit during the test. Note that we have
`not influenced the personalization at all — we have merely
`selected the subset of pages that PROTEUS personalizes, for
`efficiency.
`Figure 2 compares links followed to attain each goal on
`the personalized versus unmodified web sites (the graphs of
`the time required and scrolling actions are similar). The y-
`axis shows the number of links while the x-axis shows the
`location of each goal listed chronologically. The graph shows
`
`SCEA Ex. 1046 Page 3
`
`

`
`Figure 1: Elided content. On the left is an unmodified web page. On the right a number of blocks of content have been elided
`and replaced with hypertext links.
`
`4.1 The MinPath algorithm
`A trail [16] T = (cid:104)p0, p1, . . . pn(cid:105) is a sequence of page re-
`quests such that each request occurs within some fixed time
`window of the previous request and is the destination of a
`link on the previous page. The personalizer watching a vis-
`itor’s behavior midway through the trail sees only a prefix,
`(cid:104)p0, . . . , pi(cid:105). The trail suffix, (cid:104)pi+1, . . . , pn(cid:105), must be hypoth-
`esized by the personalizer.
`If one had knowledge of the complete trail (cid:104)p0, . . . ,
`pi, . . . pn(cid:105), selecting the best shortcut at any page pi is easy:
`simply, pi → pn. Of course, given only a trail prefix, the per-
`sonalizer must infer the remaining pages. Our approach uses
`a model of the visitor’s behavior to compute a probability for
`every possible trail suffix (cid:104)qi+1, . . . , qn(cid:105) on the site.
`Intu-
`itively, these suffixes are all possible trails originating from
`pi. Given a suffix and its probability, we assign an expected
`savings to the shortcut pi → qj for each qj in the suffix as
`the product of the probability of the suffix and the number of
`links saved by the shortcut. Note that many trail suffixes may
`pass through the same page qj, and so the expected savings
`of a shortcut pi → qj is summed over all suffixes.
`For example, suppose that a visitor requests the trail prefix
`(cid:104)A, B, C(cid:105) and we wish to find shortcuts to add to page C.
`Suppose further that our model of the visitor indicates there
`are exactly two sequences of pages the visitor may complete
`the trail with: (cid:104)D, E, F, G, H(cid:105), with probability 0.6, and
`(cid:104)I, J, H, K(cid:105) with probability 0.4. The expected savings from
`the shortcut C → E would be 0.6×1 = 0.6, because the trail
`with E occurs with probability 0.6 and the shortcut saves only
`one link. The expected savings for shortcut C → H includes
`a contribution from both suffixes: 0.6 × 4 + 0.4 × 2 = 3.2.
`MINPATH constructs trail suffixes by traversing the di-
`rected graph induced by the web site’s link structure. Start-
`ing at the page last requested by the visitor, pi, MINPATH
`computes the probability of following each link and recur-
`sively traverses the graph until the probability of viewing a
`page falls below a threshold, or a depth bound is exceeded.
`
`Figure 2: Links followed.
`
`that for a majority of the sites, PROTEUS’s personalizations
`appear quite useful: the addition of shortcut links and elision
`of unnecessary content reduced the amount of visitor naviga-
`tion, and hence the amount of time spent, at the sites.
`
`4 MinPath
`
`One of PROTEUS’s weaknesses is that it builds its visitor
`models in isolation of each other. When training data is
`sparse, these models can be quite inaccurate.
`Instead, we
`would like to combine data from many, similarly-behaving
`visitors to build more robust models. To this end, we devel-
`oped MINPATH, an algorithm that finds high-quality short-
`cuts by modeling clusters of visitors, and using these mod-
`els to predict where in the site the visitor is likely to travel.
`By leveraging data from many visitors in a single cluster, the
`models MINPATH builds are more accurate and lead to bet-
`ter personalizations. Moreover, while not as general as PRO-
`TEUS’s expected utility, MINPATH’s evaluation metric — ex-
`pected savings — can be very efficiently computed.
`
`Unmodified
`Personalized
`
`0123456
`
`# links followed
`
`cs.washington.edu
`
`cnet.com
`cs.washington.edu
`
`ebay.com
`finance.yahoo.com
`
`cnn.com
`cs.washington.edu
`finance.yahoo.com
`
`cnn.com
`cs.washington.edu
`finance.yahoo.com
`
`ebay.com
`
`SCEA Ex. 1046 Page 4
`
`

`
`The savings at each page is the product of the probability of
`reaching that page and the number of links saved. MINPATH
`collates the results and returns the best shortcuts. We next
`describe how we obtain the model required by MINPATH.
`
`4.2 Predictive Models
`The key element to MINPATH’s success is the predictive
`model of web usage. The probabilistic model predicts the
`next web page request pi given a trail prefix (cid:104)p0, . . . , pi−1(cid:105)
`and the visitor’s identity V : P (pi = q|(cid:104)p0, . . . , pi−1(cid:105), V ). Of
`course, a model may condition this probability on only part
`or even none of the available data; we explore such models in
`our experiments. We fit the models to past web usage data,
`either to all the visitors to the site at once, or to clusters of
`visitors. We group visitors by clustering their sequences of
`web page requests, and use EM to simultaneously find the
`clusters and fit the models. For the en masse and cluster-
`ing approaches, we evaluate two types of models: uncondi-
`tional, which predicts the next link without regard to previ-
`ous browsing, and Markov, which predicts the next link given
`the previous request. When applied to clusters of visitors, we
`effectively have mixture models, either Na¨ıve Bayes mixture
`models [6] or mixtures of Markov models [5]. We describe
`these models in more detail in our other work [1].
`
`4.3 MinPath results
`We evaluate MINPATH’s performance on usage at our home
`institution’s web site. We use web access data for September
`2000 to produce a training set of 35,212 trails (approximately
`20 days of web usage) and a test set of 2,500 trails (1.5 days);
`the time period from which the test trails were drawn occurred
`strictly after the training period. We selected only those trails
`with link length at least two, because shorter trails cannot be
`improved. We measure MINPATH’s performance by the num-
`ber of links a visitor must follow to reach the end of the trail.
`We compare MINPATH’s performance using the models
`described earlier (see Figure 3). The first column shows the
`number of links followed in the unmodified site. In the sec-
`ond and third sets of columns, MINPATH uses, respectively,
`an unconditional and Markov model, each fitted to all the
`site’s visitors, and produces 1, 3, or 5 shortcuts per page4.
`In the last two sets, MINPATH uses mixture models of either
`10 or 25 clusters, and selects the distribution of the models
`in the mixtures based on only the current trail prefix (ignor-
`ing past visitor behavior). This graph demonstrates that the
`shortcuts MINPATH finds are of high quality — when using a
`mixture of Markov models and suggesting just three short-
`cuts, MINPATH can eliminate an average of 0.97 links, or
`40% of the possible savings. Of course, the actual effective-
`ness of adding shortcuts will depend on the visitor’s ability
`to discern whether the link will be of value, and in section 6
`we describe our ongoing work to intelligently select appro-
`priate link anchor texts. Finally, we note that MINPATH’s
`running time is quite small. The models are learned offline,
`but the process usually requires only several minutes. Given
`a model and the trail prefix, MINPATH finds a set of short-
`cuts in 0.65 seconds on an average desktop PC. MINPATH’s
`
`4We limit MINPATH to only as many shortcuts as can reasonably
`be displayed on the small screen of a wireless web browser.
`
`Figure 3: MinPath’s performance. Each column shows the
`average number of links followed in a trail. Mixture model
`columns are annotated with the number of clusters. Error-
`bars denote 95% confidence intervals.
`
`expected savings computation is substantially faster than and
`positively complements PROTEUS’s more general approach.
`
`5 Related work
`Two closely related lines of research are IndexFinder [15] and
`Digestor [3].
`IndexFinder creates singular transformations
`that appeal to all visitors at the site, in particular, generat-
`ing new index pages — hubs of links to other pages on the
`site. These pages are evaluated based strictly on the naviga-
`tional usage patterns of past visitors — the pages requested
`— irrespective of their content. Digestor optimizes pages for
`small-screen display and uses a steepest-descent search simi-
`lar to ours. However, Digestor rates the quality of a web page
`simply by how much screen space it occupies, a metric that
`encourages degenerate pages — a blank web page receives
`the highest quality value. In contrast, our approach personal-
`izes content per visitor, and evaluates the adaptations using a
`principled, utility-maximizing approach.
`The Web Browser Intelligence (WBI) [12] project pro-
`poses an architecture of pluggable intermediaries between
`web servers and clients. These intermediaries generate, trans-
`form, and monitor the content they see as visitors browse, and
`can be used either individually or in chains. An interesting
`line of future research would be to integrate PROTEUS into
`the WBI framework and investigate what what other interme-
`diaries could be usefully composed with PROTEUS.
`Countless other systems attack all or part of the web site
`personalization problem; we mention briefly several related
`systems. The Daily Learner [4] learns a Palm VII user’s pref-
`erence for news content, by monitoring exactly which stories
`the user requests from both the Palm device and the user’s
`corresponding desktop computer. Mobasher et. al. [13] mine
`web usage patterns and web content to personalize the vis-
`itor experience, specifically, to recommend new content the
`visitor may like to see. The PersonalClipper [7] allows visi-
`tors to build their own custom views of web sites by record-
`ing navigational macros using a VCR-metaphor and select-
`ing components of the target page to view with the mobile
`device. Letizia [11], WebWatcher [8], and adaptive web site
`agents [14] guide visitors by suggesting pages they may like
`
`3.42
`
`3.13
`
`2.75
`
`2.63
`
`2.74
`
`2.47
`
`2.31
`
`2.72 2.70
`
`0 shortcuts
`1 shortcut
`3 shortcuts
`5 shortcuts
`
`2.45 2.45
`
`Unmodified
`
`Unconditional
`model
`
`Markov
`model
`
`10
`25
`Naïve Bayes
`mixture model
`
`25
`
`10
`Mixture of
`Markov models
`
`4.00
`
`3.50
`
`3.00
`
`2.50
`
`2.00
`
`1.50
`
`1.00
`
`0.50
`
`0.00
`
`Average # of links per trail
`
`SCEA Ex. 1046 Page 5
`
`

`
`to see. Our work differs from this earlier agent-based work in
`that we concentrate on personalizing the web site for each vis-
`itor, instead of merely suggesting which links to follow next,
`and in the depth to which we mine the web logs for useful
`access patterns.
`
`6 Future work
`Our approach to personalization as search offers many fruit-
`ful lines of continued research; we are currently exploring
`several directions. First, we are investigating how to incor-
`porate MINPATH’s cluster-based model of visitors into PRO-
`TEUS’s expected utility framework. The models MINPATH
`and PROTEUS employ differ substantially: MINPATH builds
`a predictive model of visitor behavior, while PROTEUS builds
`a descriptive model of visitor behavior and interests. At the
`heart of this work is applying the cluster-based techniques to
`descriptive models.
`Second, we are exploring how to automatically select con-
`cise and descriptive anchor texts for shortcut links. PRO-
`TEUS’s heuristics to select either a shortcut destination’s
`<title> or <h1> text as the anchor are frequently in-
`adequate. The title text of a shortcut destination may contain
`redundant information. A page’s <h1> tag may contain too
`little information, or may not exist at all. Instead, we propose
`using an information-based approach to select anchor words
`with the highest information content.
`In a third direction we are considering incorporating guid-
`ance from both the web master and the web visitor into the
`personalizer’s search process. The web master can provide
`suggestions, either directly or in the form of annotations in
`HTML documents, on what personalizations are allowed or
`disallowed and which might be the most useful. The web
`visitor can express to the personalizer certain aspects of the
`visitor model, perhaps as a list of keywords the visitor finds
`interesting, or a set of pages that the visitor wants to be able
`to easily find.
`Finally, we are expanding the set of transformations that
`PROTEUS can make to a site. Specifically, we are investi-
`gating a transformation that can aggregate blocks of content
`from many pages into a single view for the visitor. Such a
`transformation can allow PROTEUS to effectively create “por-
`tal pages” that integrate information from several separate re-
`sources on a web site. The challenge of this work lies in ap-
`propriately melding blocks of content into a cohesive whole
`by carefully analyzing the donor and recipient documents for
`clues about their presentation.
`
`7 Conclusions
`As the community of mobile web visitors grows, so grows the
`need for web sites to cater to visitors off the desktop and off
`broadband network connections. We propose building web
`site personalizers to meet the need of this community, and
`this paper gives an overview of our approach to personalizers
`as utility-maximizing search. Initial results from our person-
`alizers, PROTEUS and MINPATH, indicate that this approach
`holds great promise for improving the mobile web experi-
`ence. In the future we plan to extend their capabilities and
`continue to improve browsing for mobile visitors.
`
`[7]
`
`[5]
`
`References
`[1] C. R. Anderson, P. Domingos, and D. S. Weld. Adaptive
`web navigation for wireless devices. in Proc. of the 17th
`Intl. Joint Conf. on Art. Int., 2001.
`[2] C. R. Anderson, P. Domingos, and D. S. Weld. Person-
`alizing web sites for mobile users. In Proc. of the 10th
`Intl. WWW Conf., 2001.
`[3] T. W. Bickmore and B. N. Schilit. Digestor: Device-
`independent access to the World Wide Web. In Proc. of
`the 6th Intl. WWW Conf., 1997.
`[4] D. Billsus, M. J. Pazzani, and J. Chen. A learning agent
`for wireless news access. In Proc. of the Conf. on Intel-
`ligent User Interfaces, 2000.
`I. V. Cadez, D. Heckerman, C. Meek, P. Smyth, and
`S. White. Visualization of navigation patterns on a web
`site using model based clustering. In Proc. of the 6th.
`Intl. Conf. on Know. Disc. and Data Mining, 2000.
`[6] P. Cheeseman, J. Kelly, M. Self, J. Stutz, W. Taylor, and
`D. Freeman. AutoClass: A Bayesian classification sys-
`tem. In Proc. of the 5th Intl. Conf. on Mac. Learning,
`1988.
`J. Freire and B. Kumar. Web services and information
`delivery for diverse environments. In Proc. of the VLDB
`Wkshp. on Tech. for E-Services, 2000.
`[8] T. Joachims, D. Freitag, and T. Mitchell. WebWatcher:
`A tour guide for the World Wide Web. In Proc. of the
`15th Intl. Joint Conf. on Art. Int., 1997.
`[9] R. L. Keeney and H. Raiffa. Decisions with Multiple
`Objectives: Preferences and Value Trade-Offs. Wiley,
`New York, NY, 1976.
`[10] J. Kleinberg. Authoritative sources in a hyperlinked en-
`vironment. In Proc. 9th ACM-SIAM Symp. on Discrete
`Alg., 1998.
`[11] H. Lieberman. Letizia: An agent that assists web brows-
`ing.
`In Proc. of the 14th Intl Joint Conf. on Art. Int.,
`1995.
`[12] P. P. Maglio and R. Barrett. Intermediaries personalize
`information streams. Comm. of the ACM, 43(8), 2000.
`[13] B. Mobasher, H. Dai, T. Luo, Y. Sun, and J. Zhu. Com-
`bining web usage and content mining for more effec-
`tive personalization.
`In Proc. of the Intl. Conf. on E-
`Commerce and Web Technologies (ECWeb), 2000.
`[14] M. J. Pazzani and D. Billsus. Adaptive web site agents.
`In Proc. of the 3rd Intl. Conf. on Auto. Agents, 1999.
`[15] M. Perkowitz and O. Etzioni. Towards adaptive web
`sites: Conceptual framework and case study. Art. Int. J.,
`118(1–2), 2000.
`[16] A. Wexelblat and P. Maes. Footprints

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket