`
`references, to over a hundred, in the case of a review article (see [Fox 87bD. A document
`
`also may be cited by a large number of documents; for example, Dijkstra's letter [Dijkstra
`
`68/ on GOTO's has generated much controversy and still generates discussion (see
`
`editorial comment after [Klein 88]), so it is referenced many, many times. Furthermore, a
`
`document may typically have as many as twenty to thirty terms as part of its representation.
`
`This potentially large number of links cannot be effectively displayed on a graphic
`
`screen. Therefore, the number of nodes that can appear around a node of interest is
`
`reduced and the organization of the screen is made regular so that the information can be
`
`effectively managed and displayed. The maps are organized as a square grid (see figure
`
`5.3), which is not displayed. Only one object is displayed in a cell whether it is a link or a
`
`node. This limits the number of new nodes that can be displayed around a central node to a
`
`maximum of seven, since every node (but the initial node in the map) has an input link.
`
`Nodes in a neighborhood are drawn one cell away, with the links and their labels being
`
`drawn in the adjacent cells. Different symbols indicate different kinds of elements; circles
`
`represent documents, octagons represent concepts, squares represent lists of documents,
`
`hexagons represent journal issues, and diamonds arc connectors. These symbols are
`
`shaded to indicate their status. Black indicates a relevant element, gray indicates an element
`
`that has been viewed by the user, diagonal bars indicate a node that is recommended by the
`
`BE. An oversize square around another node is a mark that the user can place to indicate a
`
`node that he might wish to come back to.
`
`In addition, the nodes are labeled; document
`
`nodes have the document number, concept nodes have the concept, list nodes have the
`
`number of items that they represent.
`
`001
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`Document List Node
`
`Connectors '"""""':::::::::...-.-----t--_
`
`Relevant Node-------4-__
`
`Visited Node - - - - -
`
`Link - - - - - - -
`
`135
`
`10
`
`Reminder Box
`
`Concept Node
`
`--. --
`
`-
`
`- .- -
`
`- - --
`
`- - -- - -- - -- -- --
`
`:::"...",.,...,....".x,.,}...,.,.,.,.,."...,...,.,...,.,.;~~,;.,...".,."...,.,...,...\ ,.,.,.,.....,.,...,.,.,.,.,.,...t..,...",.,.,.,.,.",.,.,.J.....:.:.,.,...,.,..,.,.,...,.,.1=:::.;---··-------·----·-
`
`·;·;·~f·:···;···;,;·~",:~·:",:··
`
`Figure 5.3: Grid for browsing maps sho wing different nod e mark(cid:173)
`ings, but without labels .
`
`Organization of the browsin g maps in this way was done with two considerations
`
`in m ind, cognitive economy and machine efficiency.
`
`It is desirable not to overwhelm the
`
`user with a mass of nodes, such as can be the case even for an ave rage document.
`
`It is not
`
`unreasonable to assume that a document might contain 10 terms and 10 references, making
`
`20 links to other items . To display this many nodes requires a great deal of space on the
`
`screen . If a node is represented by a 1 inch diameter circl e, the 20 rel ated nod es have to be
`
`drawn on an approximately 6 inch diam eter circle around the ce nter nod e. Furtherm ore,
`
`consideration must be given to the space needed for labeli ng the links.
`
`When the user traverses a link to examine the co ntents of another node and desires
`
`to sec the neighborhood, the link must be extend ed and space must be fo und to draw the
`
`nei gh borhood . Initially, this is nOI a pro blem , but as the user co ntinues to examine differ-
`
`ent nodes, it becomes very diffi cult to man age the space. Furthermore, many of the nodes
`
`ma y not be examined. so their presence only contributes to the visual clutter.
`
`002
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`136
`
`The actual number of nodes shown depends on the connectivity of the node and the
`
`user model. For example, a term that is found in only two documents, and has no nearest
`
`neighbors or domain knowledge connections will only have two links. A system expert
`
`will have a maximum of seven nodes shown to him, where a system novice will only have
`
`four. Consequently, the browsing expert must make a choice about the nodes that are
`
`likely to be useful, and what node it considers to be the most useful. This is accomplished
`
`using heuristics that are described in section 5.3.2.
`
`There are times, due to the way that the user moves through the maps, that it may
`
`not be possible to expand the neighborhood of a node and still keep it linked to the origi-
`
`nating node. For example, if the user takes a breadth-first approach to viewing the nodes,
`
`the neighborhood around the first node will be filled quickly. In this case, a connector is
`
`used. The node to be expanded is replaced by a connector, and its neighborhood is drawn
`
`somewhere else on the map where there is sufficient open space.
`
`It is drawn with a
`
`connector on its input node. If the user selects the connector, the map is moved to the
`
`location of the neighborhood . If the connector on the input link to the new neighborhood is
`
`selected the map is moved back to the neighborhood of the originating node.
`
`As mentioned previously the user can elect to go off in a direction of his own
`
`choosing. To do so, the user selects from the choice s available in the content menus of the
`
`windows that display a document or a concept.
`
`In the case of a document, the user can
`
`look at the reference list, citation list, nearest neighbor list, or journal issue list. If one of
`
`these is selected, a node representing it will be placed on the maps. This is shown in figure
`
`5.4. The reference list for the selected paper is:
`
`1.
`
`2.
`
`3 .
`
`Friedman, J.H., Bently, J.L., Finkel, RA. An algorithm for finding best
`matches in logarithmic time. Stanford CS Rep. 75-482.
`
`Blum, M., Floyd, R.W., Pratt, Y., Rivest RL., Tarjan, RE. Time bounds
`for selection . Stanford CS Rep. 73-349.
`
`Finkel, R.A:, & Bently, J.L "Quad Trees: a data structure for retrieval on
`composite key." Acta Informatica 4, 1(1974),1-9.
`
`003
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`4.
`
`5.
`
`137
`Knuth, D.E. The Art of Computer Programming, Vol I: Fundamental Algo(cid:173)
`rithms. Addison-Wesley, Reading MA, 1969.
`
`Knuth, D.E. The Art of Computer Programming, Vol Ill: Sorting and
`Searching Addison-Wesley, Reading MA, 1973.
`
`6. McCreight, E. Computer Science 144A midterm examination, spring quarter,
`1973, Stanford University.
`
`7.
`
`Rivest, R.L. Analysis of Associative Retrieval Algorithms. Stanford CS Rep.
`74-415.
`
`NeHt
`Plreulous
`I nttlat
`Last
`Help
`
`Ref
`7
`
`_ __..__.._---
`
`-
`
`-
`
`-
`
`-
`
`- - _.. _-- _._-
`
`·-- ··----8··----_._-- ~-- ./tmlh::::
`
`_
`
`_
`
`_
`
`d1116
`
`C
`
`d2722
`-:::::~t~~tf::~::::·· -,
`N
`
`North
`North-East
`East
`South-East
`South
`South-West
`West
`North-West
`-t""cU..S..p..eAd...(cid:173)
`Help
`
`Journal: CACM Year:1975 Month:9
`
`e at Entry
`Suspend
`Multidimensional Binary Search Trees Used fo Help
`Rssociatiue Searching
`
`Bently, J.L.
`
`2722
`
`This paper deuelups the multidimensional binary
`search tree (or k-d tree, where k is the
`dimensionality of the search space) as a data
`structure for storage of information to be
`retrieued by associative searches. The k-d tree
`
`Citations
`N. Neighbors
`Brooder
`Narrnur e r
`Related
`Synonym
`Phrase
`Entry Ole
`Cancel
`Releuant
`Done
`Hel
`
`Figure 5.4: User chooses the References selection.
`
`004
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`138
`
`This list would be displayed in a document-list window with a banner labeling it as Re f-
`
`e rence Li st: Docu me n t 2722. None of these papers are available in the current
`
`collection, so the user cannot go anywhere. If any of the papers turned out to have an in-
`
`teresting title, which the user selected for viewing, the document list node would be ex-
`
`panded and the document node drawn around it. He then could choose the j . issue op-
`
`tion to find that the issue contained primarily papers that are related by the fact that they
`
`were the top entries in the 1975 Forsythe student paper competition. The only citations in
`
`the current collection are already displayed on the map.
`
`In the case of a concept, he can choose the Sel ect option from the content menu
`
`and then select the concept he wishes to examine. A window with the new concept then
`
`appears and a concept node is added to the map.
`
`Nor-th
`Nor-th-East
`East
`South-East
`South
`South-West
`West
`Nor-th-West
`Suspend
`Help
`
`Ref
`
`7 e
`
`NeHt
`Pr-elJious
`Initial
`last
`Help
`
`Figure 5.5: Neighborhood Map showin g the addition of Reference
`and Journal Issue node s.
`
`005
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`5.3.2
`
`Browsing Heuristics
`
`139
`
`The browsing heuristics make use of the evidence provided by the links and the re-
`
`quest model to recommend nodes that are strongly related to the node currently in view.
`
`Those that are related by multiple pieces of evidence are selected as the most promising to
`
`examine. None of the heuristics infer recommendations by looking more than one link
`
`away. The heuristics are divided into two kinds, concept heuristics and document heuris-
`
`tics.
`
`Figure 5.6: Context Map showing configuration if the Reference
`and the Journal Issue Nodes are expanded (Content and Window menus not
`shown, but are the same as a neighborhood map).
`
`5.3.2.1
`
`Concept Heuristics
`
`The concept heuristics contain one underlying assumption; since documents are the
`
`fundamental units of information in the system, the user sh ould be directed towards
`
`documents rather than other concepts. Consequently, wh en a concept occurs in 4 to 7
`
`006
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`140
`
`documents (depending on the user model) that have not been seen, the recommended nodes
`
`will be those documents.
`
`In determining what concepts to show if the previous condition does not hold, the
`
`most important links between concepts are the nearest neighbor and the synonym links. In
`
`the domain knowledge there should be relatively few synonym links, since these represent
`
`a very strict view of synonymy. Similarly, for any single word concept there should be
`
`few nearest neighbors, since only at most the top five most similar ones are saved. The
`
`next most important are the related-to and phrase links. The following are the heuristics of
`
`choosing concepts.
`
`I.
`
`2.
`
`3.
`
`4.
`
`5.
`
`6.
`
`7.
`
`8.
`
`For any term (single-word concept) that has been marked relevant and occurs
`in 4 to 7 documents (depending on the user model) that have not been viewed
`select those documents as the neighborhood. If this is not the case, perform
`the subsequent steps.
`
`Retrieve concepts connected by nearest-neighbor-to links. These are concepts
`on the nearest neighbor list of the current concept, and are given a weight of
`3.
`
`Retrieve concepts connected by nearest-neighbor-from links. These are con(cid:173)
`cepts that have the current concept on their nearest neighbor list, and also are
`given a weight of 3.
`
`Retrieve concepts connected by synonym links. These are given a weight of
`2.
`
`Retrieve concepts connected by related-to links. These are given a weight of
`1.
`
`Retrieve concepts connected by narrower links. These are given a weight of
`1.
`
`Retrieve concepts connected by phrase links. These are given a weight of 1.
`
`If there are any tied scores, order by document frequency giving higher pref(cid:173)
`erence to those in fewer documents.
`
`Once the list has been determined, the BE will take the top 4 to 7 as determined by the user
`
`model and send them to the interface manager for displ ay. The highest ranked will marked
`
`as the recommended one.
`
`007
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`141
`
`While a user is viewing a concept, he may decide to look at the documents in which
`
`it occurs. The BE will act differently depending on the frequency of the concept in the
`
`collection and the user specific values determined by the UMB. The selection of the
`
`documents to display as nodes is determined by the frequency of the concept in the docu(cid:173)
`
`ment and by the date of the document; the newest is given preference.
`
`If the number of
`
`documents is less than the number of nodes specified by the user model, all of the doc(cid:173)
`
`uments are displayed as nodes connected to the concept (figure 5.7b) and the links are
`
`marked by the concept frequency. If the number of documents is more than the number
`
`specifi ed by the UMB or if there is no room around the node, then some of the documents
`
`are displayed as individual nodes and the remaining are represented by a document list node
`
`(box) that is marked with the number of documents it represents (figure 5.7c). If the user
`
`decides to examine the documents represented by the document list node, then that node is
`
`extended and as many document nodes as possible are shown with the remaining docu(cid:173)
`
`ments again being represented by a box node (figure 5.7d) . .If the concept occurs in more
`
`than 13 documents and the user is a system expert, the system asks if he really wants to
`
`look at that many documents. The value of 13 is chosen because of the organization of the
`
`map . Figure 5.8 shows the number of nodes that can be displayed around a document list
`
`node and one extension of it; there are 13 nodes surrounding the two document list nodes.
`
`If any more documents are on the list a third document list node would be required. This
`
`extra node would be shown far to the right , in this case, of the original node of interest,
`
`and would lead the user away from the immediate neighborhoods of the original nod e. If
`
`the user does want to look at that many documents, the document list nodes are expanded
`
`as before.
`
`If the user is a system novice, he simply is not allowed to look at that many
`
`documents, and a message informing him of that fact will be displ ayed in the system
`
`messages window.
`
`008
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`(a)
`Sample concept neighborhood
`
`(h)
`Documents included
`
`0(30
`'"
`'R N R/
`(5-3 aaaa 2-6
`/
`/S
`
`1/
`
`10
`
`(c)
`With more documents
`than can be shown
`
`(d)
`Expansion of document list node
`
`Figure 5.7: Example term neighborhoods (node markings are:
`R = Related, N = Nearest Neighbor, <number> = occurrences of a
`concept in a document).
`
`Original
`Node
`
`Document
`List Node - - -
`
`Figure 5.8: Expansion of a document list.
`
`009
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`5.3.2.2
`
`Document Heuristics
`
`143
`
`The document heuristics are similar to the concept ones. The most important links
`
`are the nearest neighbor links (to and from, both given a weight of 3). After this is the
`
`citation link (given a weight of 1), and then the reference link (given a weight of 1).
`However, since adocument can potentially have many references and be cited many times,
`
`an evaluation of the documents connected by the citation links must be made. A document
`
`that is cited many times is given a higher value than one cited few times. This is based on
`
`studies of citation patterns [Salton 83Jindicating that citation of a document is a relatively
`
`rare occurrence. Consequently, a document that is referenced many times is significant.
`
`The quantification of this significance for the browsing heuristics is:
`
`_ _ --'1~. _ Citc_(tJ_t!L1tirnes.an.additional.weigln~o,--f.s:1_____ _ _ __ _
`
`2.
`
`3.
`
`4.
`
`Cited 4 to 5 times, an additional weight of 2
`
`Cited 6 to 9 times, an additional weight of 3
`
`Cited 10+ times, an additional weight of 4.
`
`The small numbers are due to the particular test collection being used to develop the sys(cid:173)
`
`tem. It has information only about documents in the collection. A document could be cited
`
`many times, but since the citations are from other than the CACM, they are not available to
`
`the system. In a production system they would be tuned to more accurately reflect the fre(cid:173)
`
`quency of citation in the document collection. If there are any tied scores, they are resolved
`
`by ranking on the actual number of citations.
`
`An evaluation of a document by the size of the reference list is not so straight(cid:173)
`
`forward . A document with a large reference list may be good if the user is a novice in the
`
`domain of interest, since it might be assumed that the article is a survey of some field.
`
`However, a document that has only a few references may also be valuable, if it cites the
`
`current document of interest. This implies that the citing document may be closely related .
`
`If a document with few references is cited by the current document nothing can really be
`
`010
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`144
`
`determined about its potential value. So, if the user is a domain novice, a document with a
`
`large (10, because of the test set) reference list is given an additional weight of 2.
`
`In both cases, nodes that have been viewed are not included in the list of recom(cid:173)
`
`mended nodes. It is assumed that the searcher can remember what he has examined in a
`
`particular session, and if not, then he can access them by looking at the list of documents
`
`judged relevant, the lists of search results, or by retracing his path.
`
`The user is not limited to viewing the nodes that are displayed on the neighborhood
`
`map. When he selects a node for viewing from the map, a window containing the textual
`
`information appears. He may decide, for example, to examine the reference list of a docu(cid:173)
`
`ment node. A list of titles appears, from which he can choose anyone. If he chooses one
`
`that is not on the map, a node will be put up representing the document chosen.
`
`In this
`
`way, the map always maintains the path that the user has taken. Should the user care to
`
`backtrack, he can scroll a map window back over a node marked with a reminder box,
`
`select it for view, and move in a new direction.
`
`5.3.3
`
`Browsing Model
`
`The model in the STM by the BE is the "path" that the user has taken through the
`
`network. Each node on this path represents a node that the user has viewed in the course
`
`of the session. And with each node the recommendations made by the BE when the node
`
`was first visited are saved . The path is connected, even though the user may have entered
`
`and left browsing several times, so while the entry and exit points may not be connected in
`
`the concept/document database, they are in the path model. This allows a user to retrace
`
`his steps through all the nodes that he has seen while browsing in an entire session . The
`
`1Mprovides four commands for retracing:
`
`Last - moves the user to the last node viewed; the end of the path.
`
`Next - moves the user to the next node on the path ;'i'f not at the end.
`
`011
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`145
`Previous - moves the user to the previous node on the path, if not at the beginning.
`
`First - moves the user to the first node viewed; the beginning of the path.
`
`New nodes are added to the path only when the user examines a new node. While
`
`this sounds obvious, it means that the path does not record every move that the user makes.
`
`For example, a path at some point while browsing consists of the nodes A through E, and
`
`the user has retraced his steps back to node C. From this point, the user moves to a new
`
`node, F. The path would consist of the nodes A through F and not A, B, C, D, E, D, C,
`
`F. If the user desires to trace back the exact path, he has the maps as references.
`
`When the session is finished (not suspended), the path is saved as part of the ses-
`
`sion history, except that the recommendations are not retained.
`
`If the user comes back to
`
`the particular session and there have been no changes to the concept/document database the
`
`recommendations will be similar. The documents already seen will not be considered as
`
`candidates for recommendation during the new session. If the session is suspended, all the
`
`context information including the path and the maps is retained, so the user would see ex-
`
`actly the same scenes as he did previously.
`
`5.4 Browsing Implementation
`
`The implementation aspects of browsing that have not been discussed previously in
`
`relation to the interface manager or the implementation of the system architecture lie in the
`
`construction and maintenance of the browsing maps, and the structure of the browsing
`
`model.
`
`Recall that the organizational principle behind the browsing maps is a square grid
`
`upo n which all the nodes and other symbols are drawn. These symbols are stored in a
`
`hash table that is keyed by the grid coordinates. The rea son for this is two-fold. First, it
`
`allows the interface manager to immediately determine if there is a symbol on the screen in
`
`anyone of the cells by hashing on the coordinates.
`
`If there is a hit, then the cell is already
`
`. taken. Second, there is no way ahead of time that the 1M can know how much the user will
`
`012
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`146
`
`browse and, therefore, it has no idea how much space to allocate for the grid storage.
`
`Since hash tables in Common Lisp are automatically extended when they become full. the
`
`system need not limit the user's ability to browse because of space considerations, and can
`
`allocate a small table initially.
`
`The records that are stored in the map table have one of the following structures:
`
`(Defstruct
`
`(Node
`( : Conc-Name N-)
`(:Predicate Node?)
`Icon Types: Doc, Concept, Doc-List,
`Jou rn a l - I s s u e
`
`"
`"
`Icon
`
`,
`
`r
`
`, ,
`, ,
`, ,
`
`Values: (A)uthor,
`
`(C)oncept,
`(D)ocument
`(JI)
`- Journal Issue,
`(R) e f e r e n c e s ,
`(Ci) tations
`
`Value
`"
`a pointer to the content of the node or the
`"
`ids o f a node.
`Cont ent
`"
`Wh e r e the node was developed from
`Predece ssor
`links e man a t i n g from this node
`"
`A l is t o f
`(Su c c e s s o r s
`(Make-Array '(8)))
`"
`Ha s
`the n ode been v i s i t e d , rel evant,recommended
`Stat us
`tha t goes on the icon
`"
`Te x t
`label on the neighborhood map
`;
`N-Label
`l abel on the context map
`;
`C- Labe l
`"
`Wher e t h node i s on the maps
`Coordinates
`"
`Whe ther the node has been marked as interesting
`"
`Thi s means a r eminder box is around it
`(Marked nil))
`
`013
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`147
`
`(Defstruct
`
`(Connector
`(:Conc-Name Connect-)
`(:Predicate Connector?)
`
`Coordinates
`Source
`Destination)
`
`The map table can be saved if the session is suspended, and when resumed the map can be
`
`reconstructed. When the map table is saved, the pointer to the content is replaced with the
`
`identifier of the object pointed to. For example, with a document, the document number is
`
`put in; with a document list, a list of documents is put in; with a concept, the text of the
`
`concept is put in.
`
`Since the information that determines the contents of the browsing map is stored in
`
`_______~~
`
`~_th~l!li:1P_~~I;>J~_Lthe structure of the browsing~llodel is simple. It consists of a list of c()()Ec.li~ _ _ ~__
`
`nates of the cells that the user has been to. Any information required can be obtained by
`
`using these coordinates to key into the hash table.
`
`5.5
`
`Summary
`
`Browsing is potentially an important technique for retrieving documents from large
`
`knowledge bases.
`
`Its advantages are that a user receives immediate feedback from the
`
`structure of the knowledge base and exerts complete control over the outcome of the
`
`search . The primary disadvantage is that it is easy to get lost in a complex network of
`
`nodes representing documents and concepts. Furthermore, there is no guarantee that
`
`browsing will be as effective as a more conventional search. These disadvantages can be
`
`avoided by providing facilities for controlling the browsing and for using the information
`
`derived during browsing in conventional search techniques.
`
`014
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`CHAPTER 6
`
`EXAMPLE SCENARIOS
`
`6.1
`
`Introduction
`
`In this chapter, four example scenarios are presented to demonstrate the operation
`
`of 13R, and to show how the system as a whole operates. Before these scenarios are
`
`presented, the difficulty of evaluating highly interactive systems like I3R is discussed.
`
`It
`
`should be noted that 13R is a prototype system, consequently parts of the interface are not
`
`as well developed as those of a production system would be. Furthermore, the screens
`
`shown in this chapter were composed using a drawing program to facilitate easy inclusion
`
`into the text, and are taken from screen dumps of the system while operating. The only
`
`differences between the screens seen in the text and those seen by the user are in the
`
`typefaces and in the sizes of the windows and symbols .
`
`6.2
`
`Evaluation
`
`Evaluation of a highly interactive system such as 13R presents a number of
`
`challenges. The first is a matter of retrieval effectiveness; how will the addition of the
`
`facilities provided by 13R help increase the performance of the system. One way to answer
`
`this question is to notice that, since the system relies on retrieval techniques that have a
`
`sound theoretical and empirical basis, it will not perform any worse than those techniques.
`
`Furthermore, since Boolean query retrieval systems do not perform as well as statistically
`
`based techniques, 13R will perform better than a Boolean system. This , however, is
`
`inadequate since it only tells us the minimum performance to be expected, does not
`
`consider the amount of effort that the user must expend to use the system, and provides no
`
`real way to compare the effectiveness of this system with others.
`
`148
`
`015
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`149
`
`Major difficulties arise when considering how to evaluate a system like 13R that has
`
`a variety of different functions that all contribute to the operation of the system. One
`
`difficulty is getting genuine needs and another is getting genuine users to interact with the
`
`system. Use of the standard test collections only provides a partial foundation for testing.
`
`In the test queries, there is no indication of the user's interest in recall or precision. One
`
`possibility for ascertaining this is to count the number of relevant documents; many relevant
`
`documents implies that the query is recall oriented, and few implies that the query is
`
`precision oriented. This, however, is a simplistic categorization. It ignores the possibility
`
`that a query may be recall oriented, but there is little information on the topic in the
`
`database, so there will not be not many relevant documents. Or, the opposite case. in
`
`. .--- -
`
`- - -
`
`-
`
`which the user is looking for a specific piece of information that may be contained in many
`_...._.. - - -- -_. -
`-
`_ .._- - -- - -
`docu·ments. One possible solution-to -this proble-m is to develop an new set of queries for
`
`- -_ .._---- --
`
`one of the collections, such as the CACM collection, since it is not too large, and have the
`
`query authors indicate their interest in either an exhaustive or a specific search, as well as
`
`supplying other pertinent information This would include additional domain knowledge.
`
`A possible way to get this kind of information need description would be to extend the
`
`dialogue analysis technique used by Belkin et al. past the presearch interview stage to
`
`include retrieval with the user present,with the intermediary, helping him evaluate the
`
`results of the search.
`
`The problem of getting genuine users is also very difficult. On e possibility is to
`
`form pseudo-users, much as Oddy did in evaluating T HOMAS [Oddy 19741. Th e
`
`aforementioned dialogues could be analyzed to determine how users would react to specific
`
`situations arising in the course of a session. This analysis would produce rules that
`
`describe this behavior. The advantage of this approach is that the users remain constant,
`
`making the test relatively repeatable. This approach has a number of problems. First, it is
`
`not possible to determine all behaviors of the users using different facilities. Second, it is
`
`016
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`150
`
`not possible to determine the reaction of the user to new facilities that were not present
`
`when the sessions were being developed.
`
`The alternative is to test using real users; this too has many difficulties.
`
`It is
`
`impossible to give one user the same need to use on a system as novice for testing various
`
`combinations of facilities, since the user will have learned something about the problem the
`
`first time he uses the system. This will affect how he uses the system to deal with the need
`
`in subsequent sessions. This learning process, however, can be used as training for the
`
`user.
`
`In subsequent sessions, the user can take different system experience stereotypes,
`
`progressing from novice to expert, so a test subject can participate in more than a single
`
`experiment. This points out, however, the need for a great number of test subjects for
`
`experiments to test all the possible combinations of facilities that a system like I3R has to
`
`offer. These combinations include not only the use or non-use of a particular expert, like
`
`the domain knowledge expert for example, but
`
`the use of different heuristics in
`
`implementing the expert. This situation is exacerbated by the requirement to have a
`
`statistically large enough sample to reduce the occurrence of any biases. Although this kind
`
`of testing requires a significant investment of resources (ie. perhaps thousands of test
`
`subjects), it is in the opinion of experts [Spark Jone s 88], the only legitimate way to
`
`proceed. This kind of testing is far beyond the current resources available for the I3R
`
`research.
`
`6.3 Scenarios
`
`With the difficulties of evaluating a highly interactive system in mind, this section
`
`of the chapter presents example scenarios to indicate how the system works. These sce(cid:173)
`
`narios are meant to be illustrative of the flexibility of the architecture, and show how the it
`
`assists the user to find the information he desires. The query used in scenarios one, two,
`
`and three is taken from the 52 test queries that are a part of the CACM test collection. This
`
`017
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`means that there is an established set of relevance judgements for the query. The query
`
`used in scenario four was developed by the author, who having a background in computer
`
`151
`
`systems was qualified to make relevance judgements.
`
`6.3.1
`
`Scenario One
`
`The first scenario demonstrates the basic operation of the system. The user is a
`
`domain and a system novice. In this scenario, much of the internal operation of the system
`
`will be shown. Each section will summarize what happens during the cycle. There are a
`
`number of occasions where the user will be thinking about what his response should be.
`
`In this case the system simply "spins." This points out a difference in the implementation
`
`of the experts from traditional rule based systems. Typically, when the system has nothing
`
`-----
`
`------ --
`
`- fua o, (Ie. no rures tOfireplratSignalSlne end of processing.
`
`InI3ttirTnerrn-s-r1raTthe-
`
`---- - - - -
`
`system has nothing to do for the moment, but that does not means that more information of
`
`interest to an expert is not forthcoming from the user. The system ends when it reaches the
`
`end state.
`
`6.:3.1.1 Cycle 1
`
`The first part of every system cycle is the operation of the control expert
`
`to deter(cid:173)
`
`mine the state of the system. The initial state of the system is $Start. The CE uses rule 2
`
`that recognizes that it is in $ Start and changes it to $CRS (conduct retrieval session).
`
`Since this is not a leaf state, one that has an associated priority list of experts, the CE con(cid:173)
`
`tinues to operate. Since a state has been changed, the rules that are associated with state
`
`change are active. The rule selected changes the CE state form $CRS to $CU (characterize
`
`user). This is a leaf state, so a priority list for the experts is established, which consists of
`
`one expert, the user model builder, UMB. Figure 6.1 shows schematically the portion of
`
`the plan that the scheduler has traversed .
`
`018
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`$CRS
`
`152
`
`$CIN
`
`r~•
`
`\{r~/ff!ir?
`UMB
`
`. :..:::.::.l: '.::.••..::.:.:.:"•• a.:$CU::
`
`Figure 6.1: First portion accomplished of the CE plan. The tran(cid:173)
`sitions taken are shown in bold lines and marked with the rule that they cor(cid:173)
`respond to. A state or goal that has been "satisfied" is filled with black;
`partially satisfied goals are filled with varying shades of gray,
`
`The UMB recognizes that there is no user model, so it sends a message,
`
`(IPM :Message-No 1 :Msg-Id UMB-User-Name :Choices 1
`:Value-Type Name :Values Nil),
`
`to the interface manager, 1M, telling it to ask for the user's name. Figure, 6.2, shows the
`
`initial state of the interface with the user being prompted! for his name.
`.
`.': , '.
`es~age,s.
`.r. ".
`. - " . .
`Please enter your usernarne in the space
`prouided in the window below
`
`Releuant Docs
`Show Query
`Browse
`Quit Session
`Suspend Session
`
`'.
`.
`,:-, ."::. :•.:':-
`' .
`
`'
`
`Figure 6.2: Initial state of the interface.
`
`019
`
`Facebook Inc. Ex. 1214 Part 4
`
`
`
`6.3.1.2 Cycle 2
`
`153
`
`The user enters his name and keys return. From this action, the 1M returns the
`
`message with the : Val ues field filled in wilth the user's name, Novice. The CE runs
`
`throug