The Design and Implementation of
`an Intelligent Interface for
`Information Retrieval
`
`Roger Howard Thompson
`Ph.D. Thesis
`Computer and Information Science Department
`University of Massachusetts
`
`COINS Technical Report H8-RH
`
`Computer and Information Science
`
`001
`
`Facebook Inc. Ex. 1214 Part 1
`
`

`
`
`

`
`The Design and Implementation of
`an Intelligent Interface for
`Information Retrieval
`
`A Dissertation Presented
`
`by
`
`ROGER HOWARD THOMPSON
`
`Submitted to the Graduate School of the
`
`University of Massachusetts in partial fulfillment
`
`of the requirements for the degree of
`
`Doctor of Philosophy
`
`February 1989
`
`Department of Computer and Information Science
`
`
`

`
`
`© Copyright by ROGER HOWARD THOMPSON 1989
`
`All Rights Reserved
`
`
`

`
`The Design and Implementation of
`
`an Intelligent Interface for
`
`Information Retrieval
`
`A Dissertation Presented
`
`by
`
`Roger Howard Thompson
`
`W. Richards Adrion, Department Head
`Department of Computer and Information Science
`
`

`
`DEDICATION
`
`This work is dedicated to the memory of
`
`Dr. Victor Paul Wierwille.
`
`
`

`
`ACKNOWLEDGMENTS
`
`I would like to thank the following people, who helped me greatly to accomplish
`
`this work. First, I would like to thank my advisor Bruce Croft, whose constant
`
`encouragement and thoughtful, constructive criticism were instrumental in helping me see this
`
`project through. Professors Dave Stemple and Nick Belkin provided me with different
`
`perspectives that enabled me to think more clearly about the subject.
`
`I would like to thank some of the residents of the Wombat Research Lab, Larry
`
`Lefkowitz, Carol Broverman, Tom Parenty, Norm Carver, and Al Hough for making the
`
`time bearable.
`
`I would like to thank my supervisors at Hughes Aircraft, Bill King and Jim
`
`Blackburn, for their understanding while I was finishing my writing.
`
`I would like to thank my friend, Andy Zitelli, for his wise counsel throughout my
`
`entire undergraduate and graduate academic career.
`
`Finally, I would like to express my gratitude to my wife, Darlene, for her unfailing
`
`encouragement and support, and my daughter Rebeca for the joy that only a young child
`
`can bring.
`
`
`

`
`ABSTRACT
`
`THE DESIGN AND IMPLEMENTATION OF
`
`AN INTELLIGENT INTERFACE FOR
`
`INFORMATION RETRIEVAL
`
`February, 1989
`
`ROGER HOWARD THOMPSON, B.A., UNIVERSITY OF CALIFORNIA AT
`
`BERKELEY
`
`M.S., NEW MEXICO STATE UNIVERSITY
`
`Ph.D., UNIVERSITY OF MASSACHUSETTS
`
`Directed by: Professor W. Bruce Croft
`
`Commercial information (text) retrieval systems have been available since the early
`
`1960's. While they have provided a service allowing individuals to find useful documents
`
`out of the millions of documents contained in online databases, there are a number of
`
`problems that prevent the user from being more effective. The primary problems are an
`
`inadequate means for specifying information needs, a single way of responding to all users
`
`and their information needs, and an inadequate user interface.
`
`This thesis describes the design and implementation of I3R, an intelligent interface
`
`for information retrieval, the purpose of which is to overcome the limitations of current
`
`information retrieval systems by providing multiple ways of assisting the user to precisely
`
`specify his information need and to search for information. The system organization is
`
`based on a blackboard architecture and consists of a number of "experts" that work
`
`cooperatively to assist the user. The operation of the experts is coordinated by a control
`
`expert that makes its decisions based on a plan derived from the analysis of human search
`
`intermediaries, end user dialogues, and user model. The experts provide multiple formal
`
`search strategies, the use and collection of domain knowledge, and browsing assistance.
`
`The operation of the system is demonstrated by four scenarios.
`
`
`

`
`TABLE OF CONTENTS
`
`DEDICATION .......... iv
`ACKNOWLEDGEMENTS .......... v
`ABSTRACT .......... vi
`LIST OF FIGURES .......... vii
`
`CHAPTER
`
`1 Overview .......... 1
`    1.1 Introduction .......... 1
`    1.2 Retrieval Problems .......... 1
`    1.3 Intermediary Model .......... 4
`    1.4 System Analysis and Requirements .......... 5
`    1.5 Architecture .......... 7
`    1.6 Organization .......... 12
`    1.7 Contributions .......... 12
`
`2 Background and Related Work .......... 14
`    2.1 Introduction .......... 14
`    2.2 Traditional Information Retrieval .......... 14
`    2.3 Retrieval Problems .......... 34
`    2.4 Intelligent Text Retrieval .......... 39
`    2.5 Analysis of Systems .......... 43
`    2.6 Summary .......... 61
`
`3 The Basis of I3R .......... 62
`    3.1 Introduction .......... 62
`    3.2 Discussion .......... 62
`    3.3 Conclusion .......... 74
`
`4 Design and Implementation .......... 75
`    4.1 Introduction .......... 75
`    4.2 Design .......... 75
`    4.3 Implementation of the Blackboard System .......... 117
`    4.4 Summary .......... 124
`
`5 Browsing Expert .......... 125
`    5.1 Introduction .......... 125
`    5.2 Definition .......... 125
`    5.3 Browsing Operation .......... 131
`    5.4 Browsing Implementation .......... 145
`    5.5 Summary .......... 147
`
`6 Example Scenarios .......... 148
`    6.1 Introduction .......... 148
`    6.2 Evaluation .......... 148
`    6.3 Scenarios .......... 150
`    6.4 Possible Behavioral Changes .......... 197
`    6.5 Summary .......... 198
`
`7 Conclusion .......... 199
`    7.1 Summary .......... 199
`    7.2 Future Directions .......... 202
`
`BIBLIOGRAPHY .......... 207
`
`

`
`LIST OF FIGURES
`
`1.1 System organization of I3R .......... 11
`2.1 Showing how a cluster search can retrieve different documents .......... 26
`2.2 "Contingency" table for computing evaluation measures .......... 31
`2.3 Typical precision/recall graph .......... 32
`3.1 I3R/IPM functional correspondence .......... 73
`4.1 Hearsay II high level architecture .......... 80
`4.2 Hearsay II control function .......... 82
`4.3 High level I3R design .......... 85
`4.4 Basic structure of document collection, showing the relationships between the levels .......... 87
`4.5 Sample Conceptual structure .......... 89
`4.6 The user is making a connection between "concurrent processes" and "parallel processes." .......... 92
`4.7 After selecting Entry OK from the Content menu, the phrase "concurrent processes" is transferred to the Related Window .......... 93
`4.8 The user has keyed return in the Text Entry window (which then disappears), causing the word "parallel" to appear in the Phrase window, and has selected "processes" from the text, which also appears in the Phrase window .......... 94
`4.9 The user selects Entry OK from the Content menu, which causes the phrase to be transferred to the Related window .......... 94
`4.10 Representation of the domain knowledge added to the user's model .......... 95
`4.11 The Document level .......... 96
`4.12 Document neighborhood taken from the CACM collection .......... 97
`4.13 Control Expert States .......... 102
`4.14 Summary of control expert expectation values based on user stereotypes .......... 103
`4.15 Organization of Interface Manager data .......... 124
`5.1 Sample Neighborhood Map .......... 132
`5.2 Sample Context Map .......... 133
`5.3 Grid for browsing maps showing different node markings, but without labels .......... 135
`5.4 User chooses the References selection .......... 137
`5.5 Neighborhood Map showing the addition of Reference and Journal Issue nodes .......... 138
`5.6 Context Map showing configuration if the Reference and the Journal Issue Nodes are expanded .......... 139
`5.7 Example term neighborhoods .......... 142
`5.8 Expansion of a document list .......... 142
`6.1 First portion accomplished of the CE plan .......... 152
`6.2 Initial state of the interface .......... 152
`6.3 System prompting the user to answer questions that will determine the appropriate stereotypes .......... 154
`6.4 These choices determine domain knowledge expertise .......... 155
`6.5 These choices determine search orientation .......... 156
`6.6 New portion of CE Plan accomplished .......... 158
`6.7 System asks the user for the kind of input form to initially specify his query .......... 159
`6.8 The user has entered his query in a free text format .......... 160
`6.9 The CE's plan after CE operation in cycle 29 .......... 161
`6.10 User selecting phrases and important words .......... 162
`6.11 Concepts presented for user evaluation .......... 164
`6.12 Information about analysis of programs .......... 165
`6.13 CE moves from $DNC to $SD using control expert rules 21, 15, 5 and 25, making the search controller active .......... 166
`6.14 Top five documents of initial search .......... 167
`6.15 The control expert moves the system to state $ER, evaluate results, for evaluation of the search results .......... 169
`6.16 User makes relevance judgements of documents, terms and phrases in the retrieved documents .......... 170
`6.17 The exception transition back to $SD to enable the search controller .......... 171
`6.18 CE moving back to $ER .......... 173
`6.19 The CE moves the system to the $Finish state .......... 174
`6.20 CE moving back to the state $DNC to allow the DKE to search the domain knowledge for other concepts .......... 177
`6.21 Message advising the user on the next activity .......... 177
`6.22 CE moves system state to $SD to allow the SC to make another search .......... 178
`6.23 Search results windows after the user is done with the second search .......... 179
`6.24 Query elaboration with more choices for the expert user .......... 180
`6.25 Domain knowledge entry by a domain and system expert .......... 181
`6.26 Concepts from the user's domain knowledge .......... 182
`6.27 Results of the first two searches (window menus not shown) .......... 183
`6.28 Initial display on the Neighborhood Map (the document 2889 and Context window is not shown) .......... 186
`6.29 User selects a recommended node to view its contents, and selects terms that are particularly relevant or interesting .......... 188
`6.30 Neighborhood Map with expanded document neighborhood .......... 189
`6.31 User views document 2722 (text is incomplete) .......... 191
`6.32 User selects a term to examine from document 2722 .......... 192
`6.33 Display for the concept multidimensional .......... 193
`6.34 User selects Documents option .......... 193
`6.35 Context Map after examining document #2846 .......... 195
`6.36 Context Map showing crowded region around node "A," and user desires to expand node "B." .......... 196
`6.37 Use of connector to expand node "B." .......... 197
`
`

`
`
`

`
`CHAPTER 1
`
`OVERVIEW
`
`1.1 Introduction
`
`In this chapter, an overview of this dissertation is presented. We begin by dis-
`
`cussing the problems of traditional information retrieval systems and how they are usually
`
`overcome. These problems form the basis for the requirements of a more sophisticated
`
`system called I3R, an Intelligent Interface for Information Retrieval. A design is then
`
`outlined that will meet the specified requirements. The design has two major aspects: the
`
`first is facilities that should be provided; the second is how these facilities are to be
`supported in ways that allow easy modification.
`
`1.2 Retrieval Problems
`
`Commercial retrieval systems have been available since the early 1960's. At that
`
`time, they were a significant breakthrough in the use of computers for non-numeric
`
`applications. They allowed scientists and engineers to sort through the many journals, technical
`
`reports, and other written works to find information that might be useful in helping them
`
`solve their problems. The utility of these systems has been recognized in other professions
`
`such as law and medicine, where major retrieval services are now available.
`
`While developments in storage technology, such as ever increasing densities in disk
`
`storage, and developments in communications technology, such as relatively inexpensive
`
`2400 baud modems, have made these systems more widely available, the interface
`
`technology has remained for the most part stagnant, reflecting the designs of the original
`
`systems. These interfaces were designed to operate with simple input/output devices such
`
`as 110 character/second printing terminals. This significantly limits the kind of information
`
`that can be displayed. Furthermore, the operation of the system has a command-line
`
`orientation, which is reflected in the use of specialized languages for query specification.
`
`These languages are based on Boolean logic and are usually augmented with
`
`proximity operators and "don't care" or wildcard characters. The former specify how close
`
`words must be in sentences or paragraphs. The latter handle alternative spellings and
`
`inflected forms of words. The use of these languages requires specialized training to
`
`teach the user the semantics of AND, OR, and NOT. While the basic concepts are
`
`relatively simple, use of these languages is mastered only after a significant amount of
`
`experience. Furthermore, different systems have different query languages and many users
`
`do not have the time or the inclination to learn Boolean logic.
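The query operators described above can be illustrated with a minimal sketch. It is not taken from any commercial system or from I3R; the toy collection, the function names `term` and `within`, and the use of shell-style `*` wildcards are all assumptions made for illustration. AND, OR, and NOT are treated as set intersection, union, and difference over document identifiers.

```python
import fnmatch

# Toy collection: document id -> text (illustrative data, not from the thesis).
DOCS = {
    1: "concurrent processes share memory",
    2: "parallel processing of large files",
    3: "processes that run in parallel on shared memory",
}


def positions(doc_id, pattern):
    """Word positions in a document that match a wildcard pattern such as 'process*'."""
    words = DOCS[doc_id].split()
    return [i for i, w in enumerate(words) if fnmatch.fnmatchcase(w, pattern)]


def term(pattern):
    """Set of documents containing at least one word that matches the pattern."""
    return {d for d in DOCS if positions(d, pattern)}


def within(pattern_a, pattern_b, distance):
    """Proximity operator: both patterns occur at most `distance` words apart."""
    hits = set()
    for d in DOCS:
        pa, pb = positions(d, pattern_a), positions(d, pattern_b)
        if any(abs(i - j) <= distance for i in pa for j in pb):
            hits.add(d)
    return hits


all_docs = set(DOCS)
q1 = term("parallel") & term("process*")     # AND: both must occur
q2 = term("concurrent") | term("parallel")   # OR: alternatives
q3 = all_docs - term("memory")               # NOT: exclude a word
q4 = within("parallel", "process*", 1)       # adjacency-style proximity
```

Even in this small form, the user has to decide between `&`, `|`, a wildcard, and a proximity distance, which is the burden on the end user that the surrounding text describes.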
`
`Boolean logic cannot precisely specify many relationships between words. For example,
`
`AND can be used to describe phrases or words that are required; OR may specify
`
`alternative words, synonyms, or components of "higher" level concepts. In addition, AND
`
`and OR in some situations in everyday language can be used synonymously. This lack of
`
`precision or multiple meaning can be overcome by adding other operators to specify
`
`relationships more exactly or adding weights to the AND and OR to give "soft" Boolean
`
`operators [Salton 83].
`
`Both solutions, while feasible, simply add to the amount of knowledge that the user
`
`must know in order to use a system effectively. This increases the potential for confusion
`
`and, hence, frustration on the part of the end user. The casual user or "permanent novice"
`
`will, in all likelihood, never bother to learn how to use the advanced features of the query
`
`language.
`
`Compounding the problem of using the query language, which is a matter of query
`
`form, is the problem of determining precisely what is the content of the query. This is a
`
`problem of selecting the proper words to express what the user wants. Two potential
`
`problems arise here. The first is that the user may not know exactly what he wants, and the
`
`second is that he may not know the precise terminology required to express the need. In
`
`some systems, the user has recourse to an online thesaurus, which is a collection of words
`
`that is structured to show the relationships between them, to find the proper descriptive
`
`terms and to give the structure of the knowledge of a domain.
`
`In others, the best that he
`
`can do is get an alphabetical list of terms occurring in the database.
`
`The problems of query form and content are manifestations of the inflexible nature
`
`of retrieval systems. They have only one way to respond to every type of user and every
`
`type of problem.
`
`To overcome this inflexibility, end-users, the persons with the information need,
`
`often resort to using the services of a search intermediary.
`
`Intermediaries have received
`
`specialized training in the use of retrieval systems. They often have a degree in librarianship
`
`or have a degree in the field in which they search or by constant use have developed a
`
`knowledge of the terminology of a domain. For example, an intermediary that searches
`
`Chemical Abstracts might have a Ph. D. in chemistry. This background allows them to
`
`concentrate on getting the best possible results from the retrieval system by knowing the
`
`correct terminology.
`
`One of the main advantages of using intermediary services is that the intermediary,
`
`being a person, can be much more flexible than the current commercial systems. The in-
`
`termediary can adapt to the needs of different users. If the session is the end-user's first
`
`experience, the intermediary can help the user understand the search process by explaining
`
`what he is doing as he goes along. The intermediary can adjust his explanations to match
`
`the kind of user that he is dealing with. A college freshman with an orientation to the
`
`humanities would require a different kind of assistance than a medical doctor with some
`
`computing experience. Another advantage of an intermediary is that he can continue to
`
`learn about the domains that people consistently search in and he can learn about the needs
`
`of the people that consistently use his services.
`
`While the use of intermediary services removes the burden from the end-user of
`
`having to deal with the query language, and often provides him with terminological
`
`assistance, it adds a new difficulty, since the user is now often removed from participating
`
`directly in the search process. The user must now, as before, try to express his information
`
`need to the intermediary, but, in general, cannot take advantage of the recognition ability
`
`that humans have in the search process. This is due to the fact that often intermediaries will
`
`search without the user present. The preferred situation is to have the end-user present
`
`with the search intermediary while the search is taking place. This, however, often slows
`
`the intermediary down, since he often has to explain his actions to the end user. This
`
`situation is not always possible due to considerations such as scheduling, among others.
`
`Other factors such as the availability of intermediary services also come into play. These
`
`services may not be free, adding further to the cost of using the system. Furthermore, with
`
`the advent of extremely high density storage such as CD-ROM (Compact Disk-Read Only
`
`Memory), end-users may be searching for information in their own home, where search
`
`intermediaries are not available.
`
`1.3 The Intermediary Model
`
`The search intermediary provides a model that can be useful in designing systems
`
`that can help overcome the problems of using IR systems. There are two ways that this
`
`concept can be used. One way is to simulate the activity of an intermediary, that is, to
`
`attempt to provide the same services as the intermediary. This has been the basis of a
`
`number of expert systems that provide such services as a common command language to
`
`multiple retrieval systems [Marcus 81a, Marcus 81b, Marcus 83] and rudimentary query
`
`formulation assistance [Yip 79, Pollitt 84]. More sophisticated systems [Brajnik 85, Brajnik
`
`87, Chiaramella 87, Defude 85] that take this approach attempt to implement the strategies
`
`and tactics used by intermediaries for searching [Bates 79a, 79b] and attempt to
`
`incorporate a natural language dialogue with the user. All of the systems that attempt this
`
`kind of simulation have been designed to work with Boolean systems and therefore have
`
`the limitations on retrieval effectiveness [Salton 83] that plague Boolean systems.
`
`The approach taken in this thesis is to look at the intermediary concept as an intelligent
`
`interface system which is composed of the intermediary and the retrieval system.
`
`Analysis can then be made of the kinds of facilities that this system provides or should
`
`provide to assist the user in expressing his need and finding information that will meet it.
`
`The system designer can then determine how best to implement those facilities, taking
`
`advantage of the current research in information retrieval, and not be limited to ineffective,
`
`immature, or inappropriately applied technologies in an effort to exactly simulate the human
`
`intermediary.
`
`1.4 System Analysis and Requirements
`
`In analyzing the combined intermediary/retrieval system, the four basic elements of
`
`a retrieval system are the basis of the analysis. These basic elements are:
`
`1. a representation of the content or meaning of the documents and the queries,
`
`2. a process, usually called indexing, that maps the content of the document and
`the queries into the content representation,
`
`3. a decision method, usually called a search strategy, that the system uses to
`determine whether or not a document should be retrieved,
`
`4. a user interface.
`
`The user interface element in the combined intermediary/retrieval system is
`
`composed of the services that the search intermediary provides, and the actual method (i.e. how
`
`the query is typed in, how results are displayed, etc.) of interacting with the system. The
`
`essential services that the intermediary provides are:
`
`1. explanation of system operation,
`
`2. term selection assistance,
`
`3. construction of a model of the information need, which consists of the query
`and the documents that have been retrieved,
`
`4. execution of the searches,
`
`5. overall control of the course of a session.
`
`To adapt to the different kinds of end-users, the intermediary must make some as-
`
`sessment with regard to the end-user's familiarity with the domain, his familiarity with the
`
`search process, and the kind of results that he wants, such as whether he wants a few spe-
`
`cific documents or a comprehensive collection. Essentially, the intermediary forms a model
`
`of the end-user and adapts the session to that model.
`
`While the intermediary aspect of the system addresses most of the issues of
`
`inflexibility, some of them are rooted in the retrieval portion of the system. In the past,
`
`systems have been limited to a single decision method (retrieval method) for determining what
`
`documents ought to be retrieved. By having different methods for different kinds of
`
`queries the effectiveness of a system can be increased substantially [Croft 85]. A system's
`
`effectiveness can also be increased by providing direct access to the documents by
`
`browsing, a heuristically driven incremental search and evaluation technique [Oddy 77].
`
`Browsing need not be limited to just the examination of documents; it can also be used to
`
`find the appropriate concepts to describe the information need.
`
`The preceding high level analysis of the elements of the combined
`
`intermediary/retrieval system has pointed out the need for the system to support a number of
`
`facilities or functions that either provide services similar to that of an intermediary or support
`
`functions that are part of the underlying retrieval system. These functions or services can
`
`be summarized in the following modules:
`
`1. Explainer - explains system operation to the user,
`
`2. Domain Knowledge Expert - suggests additional concepts to the user and
`acquires domain knowledge from the user,
`
`3. Request Model Builder - maintains information about the current state of the
`session such as relevant concepts and relevant documents,
`
`4. Search Controller - chooses search techniques that are appropriate to the
`current state of the session and information need,
`
`5. User Model Builder - determines what kind of end-user is currently
`interacting with the system,
`
`6. Browsing Expert - provides recommendations to the user about information
`to view that is likely to be relevant when the user is browsing, and remembers
`the path that the user has taken during browsing,
`
`7. Control Module - determines the direction of the "dialogue" that the system has
`with the user.
`
`The representation for the documents must contain all the information necessary to
`
`support multiple search strategies and browsing. Traditional systems have usually
`
`maintained simple inverted files that would be inadequate in this case. In addition, thesaurus
`
`information in most systems has not been integrated into the overall retrieval process.
`
`A number of other factors come into play in determining what the requirements of
`
`the retrieval system should be. One important factor of traditional information retrieval
`
`systems that is desirable to maintain is their domain independence. This means that the
`
`system cannot depend on having a significant amount of domain specific knowledge.
`
`However, since domain knowledge is very useful in assisting the user to precisely express
`
`his information need, the system should have the ability to use whatever domain specific
`
`knowledge that is available, and should be able to acquire this knowledge from the user.
`
`1.5 Architecture
`
`In order to build a system that provides the kinds of facilities that the combined
`
`search intermediary/retrieval system does, it must have an architecture that allows it to be
`
`flexible. This flexibility is manifested in a number of different ways. First, the system
`
`must adapt itself to different kinds of users and different kinds of information needs; this is
`
`external flexibility. Second, it must be flexible enough so that it can incorporate new tech-
`
`niques as they are developed; this is internal flexibility.
`
`The first kind of flexibility requires that the system changes the way it interacts with
`
`the user as does the intermediary. For a novice user, it should offer more explanation and
`
`assistance, and it should limit his choices so that he does not get into a situation that he
`
`cannot handle; for an expert user it should not interfere with his use of the system, and
`
`should provide him access to all of the system's functionality. Another aspect of this
`
`flexibility is that different kinds of information needs require different kinds of searches. The
`
`system must be able to respond appropriately.
`
`The second kind of flexibility requires an architecture that is modular in nature.
`
`This modularity should be at two levels.
`
`It should be able to support the addition of new
`
`large pieces of functionality. This would allow it to take advantage of new developments in
`
`information retrieval research. Each large scale function should also be modular, so that it
`
`can be adjusted to operate more effectively as the pattern of system usage is established. It
`
`also allows for the integration of new developments. For example, if a new search
`
`technique is developed that is particularly good at retrieving relevant information for one kind
`
`of information need, it can be incorporated into the search function of the system.
`
`The architecture that best supports the requirements of an intelligent IR interface is a
`
`modified blackboard architecture [Erman 80, Nii 86a, Nii 86b]. A blackboard architecture,
`
`of which Hearsay II is a typical example, consists of a number of independently operating
`
`modules, called knowledge sources, that work together to solve a problem. Each works on
`
`a particular aspect of a problem. The results of their work are posted on a shared data
`
`structure called a blackboard. This blackboard is typically organized as a series of levels that
`
`represent abstraction levels of the problem. The operation of the knowledge sources is
`
`coordinated by a scheduler.
`
`The basic operation of a blackboard system is as follows. First, each expert
`
`examines the state of the blackboard in its area of interest. It then decides if it has any action that
`
`it would like to perform based on the current conditions. If it does, it places an action
`
`(called an instantiation) on the system agenda. The agenda is examined by the scheduler
`
`and is sorted in order of importance based on criteria that are problem dependent. The
`
`scheduler then takes the most important action and runs it. The cycle then begins again.
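The cycle just described (experts inspect the blackboard, post instantiations to an agenda, the scheduler sorts the agenda and runs the most important action) can be sketched as follows. This is only an illustrative outline under stated assumptions, not the I3R or Hearsay II implementation: the class names, the integer `priority` field standing in for the problem-dependent importance criteria, and the two toy experts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Instantiation:
    """An action an expert proposes for the current blackboard state."""
    expert: str
    priority: int                      # assumed stand-in for "importance"
    action: Callable[[dict], None]


@dataclass
class Expert:
    name: str
    # Returns an Instantiation if the expert wants to act, otherwise None.
    propose: Callable[[dict], Optional[Instantiation]]


def run(blackboard: dict, experts: List[Expert], max_cycles: int = 10) -> List[str]:
    """Run the blackboard cycle and return the names of the experts that acted."""
    trace = []
    for _ in range(max_cycles):
        # 1. Each expert examines the blackboard and may post to the agenda.
        agenda = []
        for expert in experts:
            inst = expert.propose(blackboard)
            if inst is not None:
                agenda.append(inst)
        if not agenda:
            break
        # 2. The scheduler sorts the agenda by importance and runs the top action.
        agenda.sort(key=lambda inst: inst.priority, reverse=True)
        chosen = agenda[0]
        chosen.action(blackboard)
        trace.append(chosen.expert)
        # 3. The cycle then begins again.
    return trace


def parse_propose(bb):
    if "query" in bb and "terms" not in bb:
        return Instantiation("parser", 2, lambda b: b.update(terms=b["query"].split()))
    return None


def search_propose(bb):
    if "terms" in bb and "results" not in bb:
        return Instantiation("searcher", 1, lambda b: b.update(results=sorted(b["terms"])))
    return None


blackboard = {"query": "parallel processes"}
trace = run(blackboard, [Expert("searcher", search_propose), Expert("parser", parse_propose)])
```

Because each expert only reads and writes the shared dictionary, a new expert can be added to the list without changing the others, which is the modularity argument made in this section.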
`
`The blackboard architecture is appropriate since it supports the easy addition of large
`
`scale functions by means of knowledge sources.
`
`In addition, the way that knowledge
`
`sources are to be implemented is not specified, so they can be implemented in the way that
`
`is most appropriate for their specific task. The knowledge sources in I3R are called experts
`
`since they are implemented as individual rule based systems. This provides a means of in-
`
`crementally developing the experts. These experts correspond to the functions that were
`
`derived in the system analysis.
`
`The basic blackboard architecture must be adapted to fit the nature of the in-
`
`formation retrieval problem. The first adaptation is to the structure of the blackboard; it is
`
`not structured into abstraction levels, since there is no single overall hierarchical rep-
`
`resentation that can be applied to IR.
`
`Instead, the blackboard, called the short term
`
`memory, consists of different models built by the experts in the course of the session.
`
`The purpose of the control function in I3R also differs from that of the scheduler in
`
`a typical blackboard system. In a typical system, the scheduler manages the system's
`
`resources to come to the solution of the problem in the shortest time possible. In I3R the
`
`control expert manages the dialogue the system has with the user, so that it is consistent
`
`and coherent. This difference stems from the fact that information retrieval is better likened
`
`to a process than to a problem to be solved. The control expert makes sure that the process
`
`is conducted correctly.
`
`The control function uses information provided by the user model builder and the
`
`request model to determine the course of a session. The information for the user model
`
`builder is based 011 the stereotypes that the UMB decides apply to the particular user for the
`
`particular session. Stereotypes are mod els of different kinds of typical users.
`
`In the
`
`023
`
`Facebook Inc. Ex. 1214 Part 1
`
`

`
`10
`
`current system three general categories are used, with two values for each category. The
`
`categories are domain expertise, search system expertise, and search type. The values are
`
`novice and expert for the first two categories, and selective or exhaustive for the third
`
`category.
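The three categories and their two values each can be written down as a small data structure. The sketch below is one possible encoding, assuming Python enumerations and a frozen dataclass; the type and field names are illustrative and are not taken from the system itself. It also shows that the categories yield eight possible combinations.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Expertise(Enum):
    """Values for the domain expertise and search system expertise categories."""
    NOVICE = "novice"
    EXPERT = "expert"


class SearchType(Enum):
    """Values for the search type category."""
    SELECTIVE = "selective"
    EXHAUSTIVE = "exhaustive"


@dataclass(frozen=True)
class Stereotypes:
    """The stereotypes in effect for one user in one session (hypothetical type)."""
    domain_expertise: Expertise
    search_system_expertise: Expertise
    search_type: SearchType


def all_stereotype_combinations():
    """Enumerate every combination of the three two-valued categories."""
    return [
        Stereotypes(domain, system, search)
        for domain, system, search in product(Expertise, Expertise, SearchType)
    ]
```

Freezing the dataclass makes a combination hashable, so it could serve as a dictionary key if some behavior were to be looked up per stereotype combination.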
`
`The documents, concepts, and user histories are kept in a long term memory. The
`
`user histories store information about the user obtained from previous sessions with the
`
`system. This includes the original query, concepts that were judged relevant, documents
`
`that were judged relevant, and the stereotypes that were in effect at the end of the session.
`
`Also included in the user histories is a model of whatever domain knowledge the
`
`user has contributed in the course of his interaction with the system.
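The contents of a user history listed above map naturally onto a record per session plus a per-user store of contributed domain knowledge. The following is a minimal sketch under that assumption. The field names, the use of string identifiers for documents, and the triple form for domain knowledge are all hypothetical choices, not details given in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SessionRecord:
    """What is kept from one previous session (field names are assumptions)."""
    original_query: str
    relevant_concepts: Set[str] = field(default_factory=set)
    relevant_documents: Set[str] = field(default_factory=set)
    # Stereotypes in effect at the end of the session, e.g. category -> value.
    final_stereotypes: Dict[str, str] = field(default_factory=dict)


@dataclass
class UserHistory:
    """Long term memory entry for a single user (hypothetical layout)."""
    user_id: str
    sessions: List[SessionRecord] = field(default_factory=list)
    # Domain knowledge contributed by the user, assumed here to be
    # (concept, relationship, concept) triples.
    domain_knowledge: Set[Tuple[str, str, str]] = field(default_factory=set)

    def close_session(self, record: SessionRecord) -> None:
        """Append the record of a finished session to the history."""
        self.sessions.append(record)

    def previously_relevant_documents(self) -> Set[str]:
        """Union of the documents judged relevant in all earlier sessions."""
        found: Set[str] = set()
        for session in self.sessions:
            found |= session.relevant_documents
        return found
```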
`
`The system also maintains a store of global domain knowledge that is derived from
`
`available sources, such as thesauri and domain experts who use the system. This store is
`
`organized as a semantic net [Quillian 68], with the concepts being the nodes and the links
`
`being the relationships. Stored with the concept nodes is their frequency of occurrence in
`
`the document collection. Included with the normal conceptual relationships is a statistical
`
`nearest neighbor relationship that reflects the occurrence of concepts together in documents
`
`of the collection.
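A small sketch may help make this organization concrete. It stores a collection frequency with each concept node, keeps labelled links between concepts, and derives nearest neighbor links from co-occurrence. The class and function names are hypothetical, and the use of raw co-occurrence counts as the statistic is an assumption, since the text does not say here how the nearest neighbor relationship is computed.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass
class SemanticNet:
    """Concept nodes with collection frequencies and labelled links (assumed layout)."""
    frequency: Dict[str, int] = field(default_factory=dict)
    links: Dict[str, Set[Tuple[str, str]]] = field(
        default_factory=lambda: defaultdict(set)
    )

    def add_concept(self, concept: str, frequency: int) -> None:
        """Record a concept node and its frequency of occurrence in the collection."""
        self.frequency[concept] = frequency

    def add_link(self, source: str, relationship: str, target: str) -> None:
        """Add a directed, labelled link from one concept to another."""
        self.links[source].add((relationship, target))

    def neighbors(self, concept: str, relationship: Optional[str] = None) -> List[str]:
        """Concepts linked from `concept`, optionally restricted to one relationship."""
        return sorted(
            target
            for label, target in self.links.get(concept, set())
            if relationship is None or label == relationship
        )


def add_nearest_neighbor_links(
    net: SemanticNet, documents: Iterable[Iterable[str]], k: int = 1
) -> None:
    """Link each concept to the k concepts it most often shares a document with.

    Assumption: raw co-occurrence counts stand in for the unspecified statistic.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for concepts in documents:
        unique = set(concepts)
        for a in unique:
            for b in unique:
                if a != b:
                    counts[a][b] += 1
    for a, co in counts.items():
        # Highest count first; ties are broken alphabetically for determinism.
        ranked = sorted(co.items(), key=lambda item: (-item[1], item[0]))
        for b, _ in ranked[:k]:
            net.add_link(a, "nearest_neighbor", b)
```

Keeping the statistical links in the same structure as the conceptual ones means a single traversal routine can follow either kind of relationship.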
`
`The documents are represented by lists of concepts that occur in them (authors are
`
`also considered concepts) and their frequency in the document. The lists are determined
`
`using a standard automatic indexing technique [Porter 80]. Additionally, citation
`
`information is retained along with the document nearest neighbors, which are links based on the
`
`similarity of the representations of two documents. Other information such as the date and
`
`journal is included. The combination of the user domain knowledge model, global domain
`
`knowledge model, and document database forms the concept/document knowledge base.
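The document representation and the document nearest neighbor link can be sketched as follows. A document is a mapping from concept to within-document frequency, and neighbors are ranked by the similarity of two such representations. The similarity measure is not named in this passage, so cosine similarity is used here purely as an assumption, and all function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List


def represent(concepts: Iterable[str]) -> Counter:
    """Represent a document as concept -> frequency within the document."""
    return Counter(concepts)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two representations (an assumed similarity measure)."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(doc_id: str, docs: Dict[str, Counter], k: int = 1) -> List[str]:
    """Return up to k other documents most similar to `doc_id`.

    Documents that share no concept with the target are never returned.
    """
    target = docs[doc_id]
    scored = [
        (cosine(target, rep), other)
        for other, rep in docs.items()
        if other != doc_id
    ]
    # Most similar first; ties are broken by document identifier.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:k] if score > 0]
```

Because authors are treated as concepts, an author name can simply appear in the concept list passed to `represent`, so two documents by the same author gain similarity without any special handling.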
`
`The concept/document knowledge base supports all of the traditional search
`
`techniques, as well as providing a structure that the user can browse. Browsing is considered
`
`an important alterna
