`LENOVO ET AL. -- EX. 1011 -- Page 1
introduce some of the dimensions which can help distinguish the variety of known filtering applications and usage scenarios and proceed to describe a novel filtering model for casual users and its implementation in the LyricTime personalized music system. This model utilizes a stored long-term user profile and involves time explicitly in its selection criteria.

Filtering Application and Usage Scenarios

Imagine information-filtering systems in their full generality, operating across public and private networks and involving various kinds of information sources, delivery architectures, and user equipment. In this general scenario many of the parameters which typically have had fixed values in Information Retrieval (IR) systems now become variables with values depending on the specific application and usage scenarios. This, in effect, makes most existing IR systems individual points in a multidimensional space of possible personalized delivery systems.

In order to articulate some of the dimensions which come into play during the design of an end-to-end information-filtering system, we examine an abstract architecture of a system consisting of three logical units: the source, the filter, and the user (see Figure 1). In this system, the source presents some descriptors of the multimedia information items to the filter, and the filter forwards to the individual users a subset of the items it selected for them based on advance knowledge stored as users' profiles or queries. Users may have the option of providing the filter with feedback indicating the extent to which the selection met their current information needs. The feedback may be provided explicitly, through a feedback-gathering mechanism in the user's interface, or implicitly, by skipping over unwanted items or reexamining interesting items.
Note that in actual implementations the filter may need to consider only descriptors of the multimedia information items and may not need to access the complete items directly. Only after items have been selected does the filter provide the logical addresses of the selected items to a presentation control module, which fetches them.

In order to accommodate diverse information types, source types, and usage scenario types, the flow of information items between the source, the filter, and the user may need buffering. This buffering may not be required in all applications, but when needed it enables pre- and postprocessing of filtered information as well as temporary storage. The four buffers are shown in Figure 1: the source-to-filter buffer (buff-1), the filter-to-user buffer (buff-2), the user buffer (buff-3), and the user-to-filter buffer (buff-4).

[Figure 1. Abstract view of the flow and the temporary storage of information in a generalized filtering system. Labeled elements: SOURCE, buff-1, buff-2 (selected items), USER EQUIPMENT with buff-3, and buff-4 (feedback).]

These buffers should be viewed as enabling four distinct steps in the end-to-end filtering and delivery process. The explicit introduction of these steps enables us to generate and to reason about various options for filtering mechanisms and usage scenarios. For example:

1. The availability of buff-1 may enable the filter to perform limited "look ahead" on the prefiltered information.
2. The availability of buff-2 may enable some postfiltering "editing" to take place if necessary in order to accommodate changes in the relevance of the selected but undelivered items.
3. The availability of buff-3 should allow for a variety of extra processing to take place after the information is delivered, but before it is viewed by the end user.
4. The availability of buff-4 allows for flexibility in receiving and handling user feedback.
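The three logical units and four buffers described above can be sketched as a small data structure. This is only an illustrative sketch, not part of the original system: the names `Item`, `FilterPipeline`, and all method names are hypothetical, and the matching rule (any shared descriptor) is a deliberately naive placeholder for a real filtering model.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Item:
    """Descriptor record for one information item (not the item itself)."""
    address: str      # logical address used later to fetch the full item
    descriptors: set  # e.g. {"jazz", "1940s"}


@dataclass
class FilterPipeline:
    """Hypothetical model of Figure 1: three units joined by four buffers."""
    profile: set                                 # descriptors the user wants
    buff1: deque = field(default_factory=deque)  # source -> filter
    buff2: deque = field(default_factory=deque)  # filter -> user (selected items)
    buff3: deque = field(default_factory=deque)  # held at the user's equipment
    buff4: deque = field(default_factory=deque)  # user -> filter (feedback)

    def source_emit(self, item):
        """The source presents an item's descriptors to the filter."""
        self.buff1.append(item)

    def run_filter(self):
        """Drain buff-1, forwarding addresses of matching items to buff-2."""
        while self.buff1:
            item = self.buff1.popleft()
            # Placeholder match: any descriptor shared with the profile.
            if item.descriptors & self.profile:
                self.buff2.append(item.address)

    def deliver(self):
        """Move selected addresses from buff-2 to the user's buffer."""
        while self.buff2:
            self.buff3.append(self.buff2.popleft())

    def user_feedback(self, address, liked):
        """Queue feedback in buff-4 for later consideration by the filter."""
        self.buff4.append((address, liked))
```

Note that only addresses travel past the filter in this sketch, mirroring the remark that the filter may work from descriptors alone and leave fetching to a presentation control module.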
The complete landscape of filtering applications and usage scenarios is yet to be explored. However, at this point we can identify a subset of 11 dimensions that define it. These dimensions, described in the following subsections, entail aspects of the architecture and the dynamics of the complete end-to-end personalized delivery system when deployed in the real world. The information-filtering model appropriate for each individual realizable point in this multidimensional space may be different, enabling it to accommodate different end-to-end application needs. We believe it is important to present these dimensions here to provide the context for the new filtering model and application, which are presented later.

The 11 dimensions can be divided into four major groups: user disposition, time scale, information delivery, and information content:

1. User disposition: user type and privacy protection. These dimensions reflect users' attitudes toward the service.
2. Time scale: information lifetime, source availability patterns, filter delivery patterns, user usage patterns, and user feedback delivery mode. These dimensions reflect five time scales that are explicit in the dynamics of the end-to-end system. They are related to the timing by which information is moved among the various buffers and logical units in the model shown in Figure 1. A sixth time scale reflects the dynamics of the changes in user information needs. This time scale is captured by the information-filtering model and will be alluded to in the next section.
3. Information delivery: information media characteristics, information transport architecture, and user equipment. These are related to the actual form of the information being filtered and delivered.
4. Information content: information content attributes.

[Communications of the ACM, December 1992, Vol. 35, No. 12]

User Types--Proactive, Casual

Not all users of information-filtering systems have the same needs and expectations, and, therefore, they can be classified by the nature of their information needs and by the way they want to address them. At the two extremes along this dimension we can distinguish between two types of users, proactive and casual. The information needs of proactive users are very well defined and are usually formulated as a query or a profile. The users of libraries and other information banks usually fall into this class. Unlike the proactive users, the casual users do not have immediate and specific information needs, and they are typically the users of entertainment and daily news services. Most existing information retrieval and filtering systems are designed to address the information needs of the proactive users, who are expected to be able to accurately assess the relevance of the information they receive. When the received information fails to satisfy the proactive user, query reformulation and relevance feedback methods can be further employed to pinpoint the exact needs of the user. All of this is done in the context of prior queries in the current session without any explicit reference to time.
In contrast, casual users have drawn much less attention from information-filtering and retrieval system designers. Unlike the proactive users, the casual users are not likely to be willing to engage in lengthy interactions with the system in order to articulate current information needs and provide explicit feedback. Therefore, automating the personalized delivery of information to this class of users requires mechanisms that can cope with this fact. In particular, issues related to mechanisms for the creation of profiles for new users (e.g., by either using initial profiles based on stereotypes for users' groups or by building profiles directly from usage data) and to the detection of implicit feedback (e.g., skipped and revisited items) need further research.

Privacy Protection--Protected Profiles, Protected Usage History, Protected Information

The ability to personalize the delivery of information depends on the availability of personal information about the users and their needs and wants. Since a significant portion of the user community may be concerned with their privacy, privacy protection must be an important consideration in the design of filtering delivery systems. We can distinguish three types of privacy concerns: protection of user profiles and queries, protection of usage history, and protection of the actual information if the delivery takes place over public networks. The third concern may be shared by the information sources as well, since it is in the best interest of information providers that the information be delivered only to users who are legal subscribers, and that no eavesdropping be possible.
The privacy concerns may be addressed by the introduction of various levels of encryption on the communication lines between the logical units shown in Figure 1 and by the introduction of a third-party name translation facility that mediates between the user and other parts of the system. Loeb and Yacobi [11] describe a privacy protection scheme that covers all three concerns.

Information Lifetime--Minutes (Stock Market), Days (News Events, Mail), Decades (Technology Reports), Centuries (Entertainment)

The value of the information as it relates to users' needs may vary over time. However, in all cases, the total delay between the time the item left the source and the time it reached the users must be considerably less than the lifetime of the item. For example, the relevance of some information items such as financial news items may diminish within minutes, and hence these items are of very little value to most users if not delivered on time. Other information items such as music pieces and movies may retain value for days or years, and their delivery strategy may employ more flexible timing.

Source Availability Patterns--Stored Information, Live Information

The types of information items provided by the source and their arrival frequencies, if available in advance, may be used by filters in order to assess the value of individual items (e.g., how rare they are). For example, rare topics in news services, such as items regarding earthquakes, may have enhanced value due to their low frequency. Live sources (e.g., sports events, concerts, political events) may provide only limited advance knowledge of their contents. In some cases, buffering of the incoming information items (in buff-1 of Figure 1) may provide the filter with some localized lookahead capabilities.
Filter Delivery Patterns--Continuous, Synchronous, Asynchronous

The filter may deliver information to users continuously, as the information becomes available from the source, or it may deliver it synchronously or asynchronously following users' requests. The ability to provide synchronous delivery requires that the filtering and delivery mechanisms respond to users' requests in near real-time fashion. The ability to provide asynchronous delivery depends on the availability of some store-and-forward capabilities which can be offered by either buff-2 or buff-3 (see Figure 1) or by some other storage facility external to the system.

User Usage Patterns--Continuous, Regular, Irregular, Single Session

The session duration and frequency may vary. Users may be provided with single sessions with the filter, in which case no history of usage is recorded, or they may be provided with a dedicated service, in which
case stored information about their usage history as well as their preferences is used during filtering. For the case of dedicated services, the usage may be continuous (e.g., stock market information, email), at regular intervals (evening news), or at irregular intervals (music).

Mechanism for User Feedback--Real-Time, Off-Line

The feedback provided by users may affect the filter selection process immediately, or it may be stored in buff-4 (see Figure 1) to be considered off-line later and to be effective only on future session selections.

Information Media Characteristics--Media Composition, Size

The actual form of the information has a strong effect on the required performance of the filter. For example, if the information items are full-length motion pictures, it is unlikely that the filter will be required to make more than a few selections per session. However, due to the large size of the items and the high bit rate required for their transmission (at least a few megabits per second), storing them in either buff-2 or buff-3 may not be practical. In this case, all that the buffers may store is the address from which the motion picture can be retrieved. Furthermore, at this extreme, a special network delivery mechanism has to be employed to deliver large video-based items, due to their very high resource requirements. At the other extreme of the item size scale the situation is different. Information items such as text-based stock market updates are small enough to be stored in their entirety in either buffer.

Information Transport Architecture--Broadcast, Narrowcast, Switched

Information items can be delivered to users in a variety of ways. The delivery architecture determines where the filter is physically located and how large the needed buffers are. For example, if the information is broadcast or narrowcast to users (e.g., broadcast television or cable), the filter has to be located on the users' premises.
In this case, the equipment must provide the storage of the items and the processing power required by the filtering computations. If the information is switched and delivered point to point, the filter can be anywhere in the system because the delivery selectivity is high enough to provide personal targeting.

User Equipment--Television, PCs

The intelligence and storage capabilities of the equipment available at the users' site to store, process, and present the information may have a strong influence on the kind of information-filtering scenarios that can be designed. For example, if the equipment lacks processing and storage capabilities (e.g., television), any filtering solution involving a large buff-3 (see Figure 1), as well as any solution requiring the filter to be located on users' premises for privacy purposes, becomes impractical. In this case, all the postfiltering processing and storage that are necessary must be done elsewhere, and privacy must be protected in the network.

Information Content Attributes--Full-Content Indexing, Descriptors

Information filters must examine information items or some description of them before it can be decided whether they are relevant for any given user. The filtering model is very strongly affected by the types of descriptions of the information that are available. Since, in general, the information items that are being filtered are not necessarily text-based, powerful methods such as full-text indexing cannot always be employed. In such cases, existing matching algorithms based on descriptions of the information items need to be used. These descriptions include characterizations of the intended users of the information, characterization of the source of information, the item creation time, and the like.
A more novel matching situation arises when one tries to compare pictures, music, or speech directly using image, sound, or speech recognition techniques rather than textual descriptors. In this scheme, features such as filtering by and for a community [6] can be easily provided. Please note, however, that the scalability of this description-based approach may be crucially dependent on the availability of some standard vocabulary for such descriptions.

Describing existing information-filtering and retrieval systems in terms of the 11 dimensions presented may provide a framework for a comparative analysis of their application and usage scenarios and of the filtering mechanisms that are needed in each case. For example, existing library-retrieval systems typically occupy the niche described in Table 1.

In the following we focus our discussion on a specific type of filtering problem which involves filtering for casual users and present a novel information-filtering model which extends existing IR models. The presentation of the model is followed by a description of its implementation in the LyricTime system prototype.

A Model for Dedicated Filters for Casual Users

Casual users are defined as users who do not have specific and immediate information needs but who wish to obtain information they might like. These users are the likely recipients of news and entertainment information. Casual users are a mixed blessing for information-filtering systems. On the one hand, since users' needs are not specifically defined, their expectations are not specific, and they may be satisfied with a variety of information items. On the other hand, since casual users cannot articulate their needs, the filter cannot rely on the guidance that could have been provided by users' requests and their reformulations while searching for the information.
We focus our attention here on a filtering model in which the filter is presented with a stream of information items (see Figure 1); it has to select from this stream items that meet some acceptance criteria based on comparisons with user profiles and with previous and planned deliveries. This delivery architecture can be viewed as an abstraction of a sequential search in any database system. However, it can also be realized directly by the Datacycle system [3]. Figure 2 presents the abstract structures of the information stream and of the users' profile.

Every selected information item is put on a delivery list which is kept in
buff-2 of Figure 1. In the model we assume that the filter does not have the option of reordering this list as future items arrive. Consequently, the decision whether to accept or reject an item is based not only on the current match between the profile and each incoming item individually, but also on the union of all accepted as well as anticipated information items. The latter is obtained by probing for statistical characteristics of the information source. Taking into account the future availability of items in a given time interval allows the filter to balance the mix of selected items to best match the user profile.

The two basic extensions provided to IR models by the model presented here are:

(1) A decision step which is used to assess whether to actually deliver an information item to a user after some match between the user profile and a description of the item has been detected. In our model, not every item that has met some matching criteria actually gets delivered. Rather, the model embeds the notion of mixing of information in order not to bore the casual user with "too much of a good thing."

(2) Time- and context-sensitive user preference functions. The model assumes that user preferences and the effects of user feedback may be context- and time-sensitive. The model allows for preferences and feedback to be expressed as explicit functions of context and time. Consequently, their effect may be different in different usage settings (home, office) and moods and at different times (e.g., holidays, a certain time after a specific event). We believe that these dependencies on context and time are fundamental to the nature of casual use of information.
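To make the second extension concrete, a profile can be pictured as a mapping from descriptors to preference functions of time and context. The following is a minimal sketch under stated assumptions: the descriptors, the numeric values, and the names `w_holiday`, `w_jazz`, `PROFILE`, and `desired_frequencies` are all invented for illustration, and the dependence on user and feedback is omitted for brevity.

```python
from datetime import datetime


def w_holiday(when, context):
    """Hypothetical preference: wanted mostly in December, and only at home."""
    base = 0.30 if when.month == 12 else 0.02
    return base if context == "home" else 0.0


def w_jazz(when, context):
    """Hypothetical preference: steady, somewhat lower at the office."""
    return 0.40 if context == "home" else 0.25


# Each descriptor maps to a function of (time, context), not to a constant.
PROFILE = {"holiday": w_holiday, "jazz": w_jazz}


def desired_frequencies(profile, when, context):
    """Evaluate every preference function for the current time and context."""
    return {descriptor: w(when, context) for descriptor, w in profile.items()}
```

For example, `desired_frequencies(PROFILE, datetime(1992, 12, 24), "home")` yields a much larger value for the `holiday` descriptor than the same call made for a July date at the office, which is the kind of time and context dependence the model calls for.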
Table 1. The "niche" in the application and usage space of a typical library-retrieval system

Dimension                  Value
User type                  proactive
Privacy protection         not possible
Information lifetime       years
Source availability        known
Filter delivery            synchronous
User usage                 single sessions
User feedback mechanism    unavailable
Media composition          text
Information transport      none
User equipment             PC
Information attributes     keywords

[Figure 2. The structure of the information stream and user profile. Information stream: item-identifier k, descriptor 1, descriptor 2, ..., end; item k+1, ... User profile: descriptor 1 -> w1(user, time, context, feedback); descriptor 2 -> w2(user, time, context, feedback).]

The model which governs the decision whether to select an information item after a match with the profile has been found can be expressed in terms of delivery frequencies and item counters. We define two frequencies:

(1) $w^i$ is the frequency at which the user wishes to obtain information items with a descriptor i. In addition to being time-sensitive, $w^i$ is also assumed to be context-sensitive; namely, the user may have different values of $w^i$ assigned for the same descriptor i, depending on the context.

(2) $F^i$ is the average frequency at which information items with descriptor i are provided by the information source.

Now let us introduce four counters which operate over some predefined time period T denoted as the session. The actual value of T depends on the dynamics of changes in both the user preference and the information domain as well as on the definition of the information-filtering service and will not be discussed further here.

(1) $M_{\mathrm{source}}$ is the total number of items that were provided by the source.
(2) $M^i_{\mathrm{matched}}$ is the total number of items with descriptor i that were matched to the profile.
(3) $M^i_{\mathrm{delivered}}$ is the total number of items with descriptor i that were delivered to the user.
(4) $M_{\mathrm{session}}$ is the total number of items delivered to the user.

All of the frequencies and the counters defined here may be explicit functions of time. From their definitions, the following two relations between the frequencies and the counters should hold for any session of length T:

$$w^i = \frac{M^i_{\mathrm{delivered}}}{M_{\mathrm{session}}} \tag{1}$$

$$F^i = \frac{M^i_{\mathrm{matched}}}{M_{\mathrm{source}}} \tag{2}$$

Note that the definition of a match
may result in $M^i_{\mathrm{matched}}$ having noninteger values in some matching scheme in which the descriptors are weighted for their relevance to the individual information item. This is a subject of much research in the IR community (see [2]) and is not the main focus of this work.

For every increment of the counter $M^i_{\mathrm{matched}}$ a decision should be made whether a corresponding increment of $M^i_{\mathrm{delivered}}$ should take place; namely, whether the match should result in the item being selected for delivery. In order to quantify this decision procedure, let us examine the effect that a newly obtained match has on some of the frequencies and counters in the filter. For this purpose we use difference equations.

The frequency $w^i$ should remain unaffected by any match (i.e., increment in $M^i_{\mathrm{matched}}$) since the filter "protects" the user from the frequency of the items in the source. This is expressed by equation (3).

$$\frac{\Delta w^i}{\Delta M^i_{\mathrm{matched}}} = 0 \tag{3}$$

Using equations (1) and (3), we obtain:

$$\frac{\Delta M^i_{\mathrm{delivered}}}{\Delta M^i_{\mathrm{matched}}} - w^i \cdot \frac{\Delta M_{\mathrm{session}}}{\Delta M^i_{\mathrm{matched}}} = 0 \tag{4}$$

However, if we assume that, at any time, the selection of items which is contained in $M_{\mathrm{session}}$ reflects (on the average) the increments achieved as a result of individual matches, we obtain:

$$\frac{\Delta M_{\mathrm{session}}}{\Delta M^i_{\mathrm{matched}}} = \frac{M_{\mathrm{session}}}{M^i_{\mathrm{matched}}} \tag{5}$$

Notice that equation (5) is true when the source is reasonably uniform (namely, when $F^i$ is not a function of time). If $F^i$ does vary with time, higher-order corrections should be added to the equations. Equation (5) together with equations (2) and (4) yields:

$$\frac{\Delta M^i_{\mathrm{delivered}}}{\Delta M^i_{\mathrm{matched}}} - \frac{w^i}{F^i} \cdot \frac{M_{\mathrm{session}}}{M_{\mathrm{source}}} = 0 \tag{6}$$

A decision whether to deliver a given item with multiple descriptors has to be a function of the contributions of individual descriptors as expressed by equation (6). As a first-order approximation we chose this function to be the linear sum of equation (6) over all of the item descriptors i. This is represented by equation (7).

$$\sum_i \left( \frac{\Delta M^i_{\mathrm{delivered}}}{\Delta M^i_{\mathrm{matched}}} - \frac{w^i}{F^i} \cdot \frac{M_{\mathrm{session}}}{M_{\mathrm{source}}} \right) \ge 0 \tag{7}$$

Equation (7) represents the basic constraint which drives the filtering mechanism.
The inequality expressed by equation (7) represents the desired state the filter must maintain. While processing a given information item, every time a match for descriptor i is found, $M^i_{\mathrm{matched}}$ is incremented. As a result, the inequality in equation (7) may be potentially violated unless $M^i_{\mathrm{delivered}}$ is incremented as well. Only when the inequality is indeed violated does the filter make the decision to deliver the current information item. The delivery of the item causes $M^i_{\mathrm{delivered}}$ to be incremented and restores the validity of equation (7). Note that as a side effect of the decision to deliver any item, the delivery counters $M^j_{\mathrm{delivered}}$ for all the other descriptors j of that item have to be incremented as well. Note also that in order for the filter to maintain a realistic value of $F^i$ it must occasionally probe the incoming information stream to compute the current value of this frequency. Any big deviation between the anticipated and actual values of $F^i$ signifies that something unexpected has happened in the domain (e.g., a fast-breaking story in the news domain).

The filtering mechanism described here is, in fact, a mechanism for frequency transformation. The filter makes sure the spectrum of items which is provided by the information sources is transformed into a set of items most likely to fit the users' current needs.

Now we turn to the use the filter can make of the limited information it receives directly from the users. In the scheme developed here, users can provide two types of information. One is information about their current context or mood, and the other is feedback information indicating whether they liked or disliked the received items. The context-related information is used to select the profile appropriate for that context. The feedback information is used by an adaptation scheme to modify $w^i$ for the current context.
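The delivery decision driven by equation (7) can be sketched in code. This is a hedged illustration, not the LyricTime implementation, and it rests on two assumptions that go beyond the text: the difference ratio in equation (7) is approximated by the cumulative ratio of delivered to matched items within the session, and the running ratio of session items to source items is replaced by a fixed, configured target `selectivity` so that the filter can start from empty counters. The class and method names are hypothetical.

```python
from collections import defaultdict


class FrequencyFilter:
    """Sketch of an equation (7) style decision rule using session counters."""

    def __init__(self, w, F, selectivity):
        self.w = w                      # desired frequencies w^i per descriptor
        self.F = F                      # source frequencies F^i (assumed > 0)
        self.selectivity = selectivity  # ASSUMED stand-in for M_session / M_source
        self.matched = defaultdict(int)    # M^i_matched
        self.delivered = defaultdict(int)  # M^i_delivered
        self.m_source = 0                  # M_source
        self.m_session = 0                 # M_session

    def constraint(self, descriptors):
        """Left-hand side of equation (7), summed over matched descriptors."""
        total = 0.0
        for d in descriptors:
            ratio = self.delivered[d] / self.matched[d]
            total += ratio - (self.w[d] / self.F[d]) * self.selectivity
        return total

    def offer(self, descriptors):
        """Process one incoming item; return True if it is delivered."""
        self.m_source += 1
        hits = [d for d in descriptors if d in self.w]
        if not hits:
            return False              # no match with the profile
        for d in hits:
            self.matched[d] += 1
        if self.constraint(hits) >= 0:
            return False              # inequality still holds: do not deliver
        self.m_session += 1
        for d in descriptors:         # count delivery for every descriptor of the item
            self.delivered[d] += 1
        return True
```

With `w = {"news": 0.5}`, `F = {"news": 1.0}`, and `selectivity = 0.5`, roughly one matching item in four is delivered, which illustrates the frequency-transformation behavior: the source frequency of a descriptor is reshaped toward the frequency the user wants. Estimating $F^i$ by probing the stream, as the text suggests, is left out of this sketch.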
The rationale behind the time-dependent adaptation scheme is as follows: users' needs (in a given context) are assumed to be a "reasonably behaved" function over time. Note that without this assumption there is no point trying to provide automatic information filtering in the first place. Now $w^i$ can be approximated using a sequence of periodic functions (this is the essence of Fourier analysis). The adaptor takes the feedback information and modifies the current value of $w^i$ by adding to it a time-dependent function, hence, for every $w^i$, dynamically creating the approximation series.

In the adaptation model for casual users proposed here the effects of user feedback have two components: (1) an immediate reaction which causes an increase in $w^i$ for the case of positive feedback and a decrease when the feedback is negative and (2) a long-term recovery mechanism which eventually moderates the effect of the original short time scale reaction. The magnitudes of the effects and the time frame involved are still a subject of experimentation.

The personalized music system prototype LyricTime has been built to demonstrate the practical validity of the paradigm and the mechanism described here, as well as to study the demands that customized multimedia information delivery places on communication networks.

The LyricTime Personalized Music System

LyricTime is a personalized music system in which songs are played at a listener's workstation, using its built-in audio capability. At the same time, a still image from the album cover for the song is presented on the interface display. The listener is free to stop and start playing at any time, step forward and backward through the list of selected songs, change the volume, bias the selection of the filter by clicking on a "mood" button, and provide evaluative feedback on the current song. The LyricTime research prototype has a collection of nearly 1,000 songs that can be played for the listener.
At
its simplest level, the LyricTime prototype selects songs from a database and plays them for the listener. More specifically, to select songs from the database, it uses the information filter which implements the model described in the previous section, using descriptions of the songs, a listener profile, and feedback from the listener. The listener can step through the selected songs, look at title and artist information, or have the LyricTime prototype play them. The listener profile provides listener-specific preference information to the filter. Listeners can have different profiles for different moods. Listener feedback is used to update the profile based on the listener's opinion of songs that have been played. The listener controls the LyricTime prototype with the user interface shown in Figure 3.

The interface was built using an experimental object-oriented programming language which simplifies the construction of multimedia and multiuser interfaces to computer applications [8]. A hand-drawn color image of a jukebox is the backdrop for the interface. The drawing helps establish the listener's view of the system and is divided into three main units. The upper portion shows information on the selected songs and the song currently being played. This includes a still image representing the current song, some text indicating how many songs have been selected, and the title and artist information for, from left to right, the previous, current, and next songs.

The middle portion of the interface contains buttons for controlling the playing of songs. These buttons allow the listener to step backward and forward through the list of selected songs, play the current song, and stop playing. There is also a button to quit the program. Once the listener clicks on the play button, the songs that have been selected by the filter are played, in order, until the listener clicks on another control button.
If the listener causes playing to be stopped in the middle of a song, the volume is gently brought to zero before stopping. This avoids abrupt stops that can startle listeners.

[Figure 3. The LyricTime user interface. Visible elements: a selections-available indicator; title and artist for three songs ("Elmers Tune", Benny Goodman; "Aint Misbehavin", Louis Armstrong; "I Gotta Give", Ella Fitzgerald); PLAY and STOP buttons; Volume: Softest, Soft, Medium, Loud, Loudest; Mood: Cheerful, Romantic, Calm, Sad, Curious; Evaluation: Bad, Poor, Okay, Good, Great.]

The lower portion of the interface has three groups of five buttons. The