throbber
32
`
`queries are used. These two measures arc usually combined into a precision/recall (P/R)
`
`graph, which shows the precision at standard recall points, usually at intervals of to
`
`percent. Figure 2.3 shows a typical P/R graph. A P/R graph for a particular strategy for a
`
`given collection represents the results of all the queries in the test collection averaged
`
`together. The details of the averaging techniques can be found in Van Rijsbergen [19791 or
`
`Salton 119S3J.
`
`o 10 20 30 40 50 60 70 80 90 100
`% RECALL
`
`100
`90
`80
`70
`60
`50
`40
`30
`20
`
`10
`o
`
`%pRECISIoN
`
`Figure 2.3: Typical precision/recall graph.
`
`A measure that can be used instead of recall is fallout, which is (from figure 2.3) n-r / N-R.
`
`This measure makes more sense in an operational system where the number of documents
`
`relevant to a particular query is unknown. However, since comparative studies are done
`
`using test coll ections, fallout is seldom used in IR literature for comparing the effectiveness
`
`of different systems or search strategies. There are a number of other reasons why fallout
`
`may be preferred, which may be found in VanRijsbergen [1979] or Salton [1983].
`
`The measures of recall and precision can be combined to give a single valued mea-
`
`sure of effe ctiveness for a particular search strategy for a given collection. One way is to
`
`001
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`take the average of the precision values at various recall points. Another composite mea-
`
`sure is the E-measure [Van Rijsbergen 79/, which is given by
`
`33
`
`1 _
`
`1
`(I-a)
`a
`-+ - -
`P
`R
`
`(2.5)
`
`where a = 1 / (~2 + 1) and ~ represents the relative importance attached to precision and
`
`recall. A value of 1 for ~ indicates equal importance to precision and recall. Values greater
`
`than 1 indicate greater emphasis on recall; less than 1 on precision. Smaller values of E
`
`indicate greater retrieval effectiveness.
`
`Another single-valued measure that has been proposed is search length, which is
`
`defined to be the number of non-relevant documents that the user must examine before his
`
`search length will be zero. This measure can used in systems that provide an ordering of
`
`the documents retrieved or as in THOMAS [Oddy 74, 771 produce one document at a time
`
`for evaluation. A variation of this mea sure is to count the number of non -relevant
`
`documents presented before the first relevant document is presented. This is used t~
`
`estimate, in THOMAS, the amount of work expended in formulating the query. Ofori
`
`[1982] also used this measure in the estimatiton of the amount of effort that was expended
`
`in the course of a search session.
`
`While testing different search methods with standard test collections gives a basis
`
`for compari ng their relative performance and for testing the effects of new strategies or
`
`wei ghting methods, a few things should be kept in mind. First, the concept of relevance
`
`for a document is subjective. Different users have different needs , which will be reflected
`
`in the kind of document that they choose as relevant. Second. the relevance judgements of
`
`test collection s were made by domain experts; each expert developed one or more queries
`
`and looked through the collection to find the documents that were relevant. Therefore, the
`
`002
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`34
`
`relevance judgements may reflect the biases of the experts or overlook what might appear to
`
`be relevant to an unsophisticated user. Another consideration is how the relevance judge(cid:173)
`
`ments were made.
`
`In some collections, the experts had a fairly comprehensive idea of the
`
`documents in the collection , so they could make reasonably knowledgeable evaluations. In
`
`other collections, the experts derived queries, submitted them to a retrieval system,
`
`evaluated the results, and then followed citations from the retrieved documents to others.
`
`Even with these considerations, the use of standard test collections represent the only way
`
`developed so far to perform repeatable experiments.
`
`2.3 Retrieval Problems
`
`The advantages of the traditionallR systems are that they are easy to implement,
`
`relatively fast, and those that use automatic indexing are domain independent. However,
`
`there is a major problem, and that is the overall performance of these systems is low. The
`
`average precision of the advanced statistical systems varies from 50% (unusual) to 25%
`
`(more typical) taken at recall intervals of 5% from 90% to 5% [Fagin 87]. One of the
`
`causes of this problem is that traditional systems are limited to only a single way of
`
`responding to every user and every type of problem. This limitation manifests itself in two
`
`important ways, in query formulation and in the search process.
`
`2.3.1
`
`Query Formulation Problems
`
`Query formulation is the process of expressing an information need, which is in the
`
`mind of the user, in the best possible way in the form required by the system. Certainly,
`
`the best way is one that provides the system with the necessary information to retrieve a
`
`high proportion of the documents that the user wants and a low number of unusable ones .
`
`The perfect search would retrieve all and only those documents that the user wants or
`
`needs. The results of a search canonly be as good as the description of the items being
`
`003
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`35
`
`sought. Therefore, if the user cannot adequately define what he is looking for, the system
`
`cannot produce adequate results. There are two major obstacles to the user in expressing
`
`his information need in an effective way. The first is the form of the query required by the
`
`system, and the second is the ability of the user to express precisely the information need.
`
`Some users may know exactly what they want and can describe it precisely, while others
`
`may not be able to do so.
`
`2.3.1.1
`
`Form of the Query
`
`Most commercially available systems require that the user cast his need into a for-
`
`mal notation using Boolean logic. Users not familiar with Boolean logic find it difficult to
`
`use. The NOT operator, particularly, causes problems, although it is essential for getting
`
`good results [Vigil 83]. Furthermore, the Boolean connectives AND and OR do not have a
`
`sufficiently precise meaning with regard to the relationships between words. This lack of
`
`precision can confuse the casual user.
`
`For example. when two words, such as,
`
`"information" and "retrieval," are ANDed together, does the user mean that the two words
`
`form the phrase "information retrieval" or, simply, that they both are required to be present.
`
`The OR connective also presents some ambiguity. Words combined with OR are usually
`
`taken to be synonymous, but OR can also be used to list the components of a higher level
`
`concept. For example, the concept of "additive primary color" can be expressed by the list:
`
`"red OR green OR blue." AND and OR can also be used interchangeably in everyday
`
`usage, as in the sentences: "I do not like peas and carrots," and " I do not like peas or car-
`
`rots."
`
`To overcome some of these problems, some systems allow the user to specify
`
`proximity information in the query, but these additional operators can be a source of con-
`
`fusion adding further difficulty to query formulation.
`
`It is clear that when two words are
`
`specified to be immediately adjacent that the user is specifying a phrase, but what the user
`
`004
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`36
`
`means is less clear if the proximity is 5 words or 20 words. Perhaps the user means that
`
`the words must occur in the same sentence or the same paragraph. Some systems that
`
`retain the full text of the document provide more me aningful sentence and paragraph
`
`proximity operators. The addition of more operators, however, requires that the user make
`
`more of an investment in learning how to formulate queries, which he may not want to do.
`
`Systems using automatic indexing are easier to use, allowing the user to enter his
`
`query as free text. This text is then indexed, generating the same internal representation as
`
`used for the documents. Although this form is easier to use, it suffers from the loss of
`
`information rega.rding relationships between words . It also can pick up spurious words,
`
`depending on how the user phrases his request. For example, if the request leads off with
`
`the phrase, "I am interested in...", the word "interested" will be part of the representation
`
`of the query, even though it has nothing to do with what the user is looking for.
`
`2.3.1.2
`
`Content of the Query
`
`Inability to express the content of the query has two causes. The first cause is that
`
`the user lacks knowledge of the terms and their interrelationships of the domain he is
`
`searching. It may be that the user has a definite idea of what he wants, but cannot find the
`
`appropriate terms. For example, a scientist or engineer may change his primary field of
`
`interest and cease to keep up with developments in his former field. At some time, he may
`
`need to look for information in his former field, but may not know new terminology. A
`
`searcher that does not have a good understanding of the area in which he is working cannot
`
`select the necessary words to compose a well defined query. For example, a person
`
`searching in a medical domain may be interested in all the information about a particular
`
`class of diseases . The documents describing those diseases mayor may not refer to the
`
`general class name when discussing a particular disease, and if the documents in the
`
`collection are automatically indexed, then that class name will not be in the document
`
`005
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`37
`
`representatives. If the user does not provide specific disease names as well as the general
`
`class name, a number of poten tially relevant documents may not be retrieved.
`
`The second cause is that the user does not have a definite idea of what he wants but
`
`perhaps, only a vague notion or hunch. This is not an unusual experience.
`
`It stems from
`
`the user trying to describe something that he does not know. An example might be some-
`
`one who wants to know about the topic of "color." Is the user interested in the physics of
`
`colored light, the psychological effects of color, the physiological aspects of color vision,
`
`or color theory as it relates to art to name a few of the possibilities. The information de-
`
`sired. may actually draw from all of these areas, but at the time the user comes to the sys-
`
`tern, he may not really know exactly what he wants. In this case, the user may need to ex-
`
`amine results of a few searches to help him clarify his need. This kind of difficulty is the
`..._-------
`-- -
`-
`_ ...-- - ---_ .__.."._- -- -_.._ - - ...._...--- --_.. " .. -,.. ----
`- _._-
`------------ -------------_.._---.-
`---.-- ----_._- _._-_ .- -- - - -_ ._ -
`-
`-
`-
`-
`-
`-
`basis for THOMAS [Oddy 77j and RABBIT/ARGON [Patel-Schneider 85). Both of these
`
`systems assist the user by showing him examples of what his current query will retrieve,
`
`,
`
`so that he may alter them to better match what he really wants.
`
`2.3.2
`
`Search Methods
`
`Search methods in Boolean retrieval are directed at gening a set of documents small
`
`enough so that the user can conveniently examine all the titles and abstracts. Since the
`
`query expression acts as a filter on the collection selecting only those that match, no order-
`
`ing of the documents is done . Each document is considered as equally relevant. Conse-
`
`quently, the user must examine all of the documents in the retrieved set to assure himself
`
`that no document of value is missed.
`
`The feedback for manipulating the query is indirect, consisting primarily of tracking
`
`the size of the retrieved document set, and occasionally looking at some of the documents
`
`in the current set. Only when the size of the set is in theneighborhood of 10 to 20 will the
`
`user, if present, look at all of the documents. If a search intermediary is doing the. search-
`
`006
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`ing without the user in attendance, he must use his judgement based on the knowledge of
`
`the topic he has gained from the user and from his own background to determine whether
`
`38
`
`or not the results are adequate.
`
`2.3.3
`
`Search Intermediaries
`
`The difficulties in using traditional systems is most often overcome by employing
`
`people called search intermediaries or search analysts [Barraclough 77, Marcus 78J. These
`
`individuals have specialized knowledge in 4 areas:
`
`Knowledge of command languages for different systems.
`
`Knowledge of the subject coverage of particular databases.
`
`Knowledge of how to use Boolean logic to formulate a query.
`
`Knowledge of how to elicit information from the user about the information
`need.
`
`•
`
`•
`
`They remove from the end user the burden of having to concern himself with the details of
`
`the command language, the selection of which database to search, and how exactly to for-
`
`mulate a query. The problem with this approach is that the person with the information
`
`need is not doing the searching. No matter how good the intermediary is, there is no way
`
`that the end user can really convey the exact nature of the need. The preferred arrangement
`
`is the person with the need working with the intermediary as the search is being carried out.
`
`In this way, the intermediary can get immediate feedback on the progress of the search.
`
`Th is, however, is often not the most economical arrangement, since it can slow down the
`
`intermediary and increase the connection charges.
`
`Another difficulty is that intermediaries are not always available at the time when the
`
`user may be looking for information. For example, university students often search for in-
`
`formation during the evening and on weekends, and people who use services such as
`
`Compuserve tend to search at night when phone rates are at their lowest. Another consid-
`
`eration is that with the advent of optical disk technologies, such as CD-ROM, large
`
`007
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`amounts of information may be available to many people at a relatively low price for use in
`
`their homes or businesses. People using search systems based on CD-ROM technology in
`
`their home or business may not have access to intermediary services, such as are available
`
`in large research institutions or corporate technical libraries.
`
`39
`
`2.4
`
`Intelligent Text Retrieval
`
`With the success of knowledge-based systems in areas such as medical diagnosis,
`
`computer system configuration, and others, the developments in natural language process-
`
`ing techniques, and computer aided instruction, it appears that artificial intelligence (AI)
`
`may have techniques to contribute that will make IR systems more effective and overcome
`
`the problems described in the previous section. This is especially attractive, since research
`
`in improving statistical systems using purely statistical means appears to have reached a
`
`peak LBelkin 87aj.
`
`Knowledge-based, often called "expert", systems technology seems to be ap-
`
`plicable since there are search intermediaries who are experts in conducting retrieval
`
`sessions and since there is no standard method to conduct a retrieval session.
`
`Intermediaries develop their own sets of heuristics or rules of thumb. These heuristics deal
`
`not only with the details of the actual search, but also with methods for eliciting the in-
`
`formation need from the user, and transforming it into a query. Bates [l979a, 1979b]
`
`presents some heuristics related to search. These heuristics can be encoded into rules that
`
`form the basis of the typical expert system IHayes-Roth 83J.
`
`The work in story understanding, fact retrieval, and translating natural language
`
`database queries also appears to be especially appropriate. The focus in story understand-
`
`ing IWilensky 841 and fact retrieval work IKolodner 831 has been to develop represcnra-
`
`lions for the meaning of narrative text so that paraphrases of the story can be generated and
`
`questions can be answered about the actions and intentions of the characters. The focus of
`
`008
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`40
`
`the database query work has been to provide user-friendly interfaces that do not require a
`
`user to know the formal query language of the database in order to get information. As an
`
`example sec [Salveter 87, Waltz 78(.
`
`An example of a sophisticated natural language interface system is UNIX Consul-
`
`rant (UC) [Wilensky 831, the purpose of which is to provide online help in using the UNIX
`
`operating system. One of the features that makes this system sophisticated is the ability to
`
`analyze the user's statements to infer his goals. For example, a user enters the query:
`
`I'm trying to get some more diskspace.
`
`An appropriate response is:
`
`Delete files that you do not need,
`
`or
`
`Ask the system manager for more space.
`
`The latter implies that the system is aware of the user's desire to save files, and that getting
`
`more space in his account will satisfy that desire.
`
`Based on these ideas, one could envision an intelligent retrieval system that would
`
`assist the user in expressing his information need so that the system could understand it and
`
`determine an appropriate response. This idea of applying Al to IR has led to several defi-
`
`nitions of what an intelligent retrieval system might be. One definition of an intelligent re-
`
`trieval system is given by Sparck Jones 11983J:
`
`...an intelligent retrieval system would have an inferential capability so that it
`could deploy prior knowledge to establish, if not by certain reasoning at any rate by
`plausible reasoning, a connection between a request and a candidate relevant docu(cid:173)
`ment.
`
`Brooks [1987J expands on this definition by incorporating more awareness of the user into
`
`the requirements of the system. She states that given the following three conditions
`
`1 .
`
`2.
`
`A user who has come to an information service with a problem for whose
`resolution or management, information is required.
`
`The user is unable to specify exactly the information needed, since this is de(cid:173)
`scribing what the user does not know.
`
`009
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`3.
`
`Access to a number of document databases.
`
`41
`
`The system must, by use of its knowledge of its world (of documents. users and topics),
`
`and its knowledge of the specific user and problem, infer documents that will enable the
`
`user to better resolve or manage the problem. The inclusion of awareness of the user in(cid:173)
`
`creases the complexity of the system dramatically, since it now must handle users that have
`
`varying levels of domain knowledge and computer experience, as well as widely differing
`
`search goals.
`
`An intelligent text retrieval system would use artificial intelligence techniques to im(cid:173)
`
`prove each of the four system elements defined previously. AI techniques for text repre(cid:173)
`
`sentation and indexing elements are intimately connected. Often, when one picks a repre(cid:173)
`
`sentation scheme the indexing or parsing method comes along with it. There are a number
`
`----- ------------- --- -- of-diffi€ulties-w ith -using the-curren tly-available-techniques,-First-i-s the-problem-of-dsaling .
`
`with enormous amounts of domain knowledge. Not onlly does one have to consider the
`
`number of possible concepts that a general information retrieval system has to deal with,
`
`but also the number and types of possible connections between them. The typical approach
`
`has been to restrict the application to a narrow domain. This reduces the volume of infor(cid:173)
`
`mation to a level at which domain knowledge representation can be built by hand. For
`
`example, DC only deals with the UNIX file system. This kind of approach is not appro(cid:173)
`
`priate in IR since real systems often must provide access to hundreds of thousands of doc(cid:173)
`
`uments. Applicable AI models for representing domain knowledge are semantic nets
`
`[Quillian 68] and rule representations [Tong 87J.
`
`Another problem is selecting the appropriate representation for document meaning.
`
`At this point there is no universally accepted method of representing the meaning of natural
`
`language. There are a number of possibilities, which include conceptual dependency the(cid:173)
`
`ory ISchank 75 j, Montague semantics IDowty XI\, and logic representations ISimmons
`
`87 J.
`
`010
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`42
`
`Besides the difficulty of representing the knowledge contained in documents and
`
`their domain is the difficulty in searching to find relevant information, The methods on
`
`how to do this are not well established. Fact retrieval systems [Kolodner 8Jluse sophisti(cid:173)
`
`cated matching algorithms on their internal representation structures that may not be effi(cid:173)
`
`cient enough to work in an IR setting . Furthermore, the intermediary knowledge that does
`
`exist about searching, that could be used in construction of an expert system, is oriented
`
`toward optimizing the performance of commercially available systems.
`
`Van Rijsbergen IVan Rijsbergen 87/ has taken some initial steps in characterizing
`
`the logical foundation necessary for intelligent retrieval decision methods, although the
`
`concept of "logical relevance" appears as early as 1971 [Cooper 71].
`
`Instead of
`
`considering retrieval as being based on a similarity or a likelihood, it is considered as an
`
`inference that allows the system to determine if a particular document implies the user's
`
`query. Associated with this inference is a measure of its certainty.
`
`If the measure of the
`
`certainty is sufficiently high, then the document will be selected for presentation to the user
`
`as a candidate relevant document.
`
`The significant point about this reformulation of the document selection process is
`
`that it relies on an accumulation of evidence to determine the certainty of the implication.
`
`This evidence may come from a variety of sources. One of the difficulties with the
`
`formulation is the definition of the semantics and its representation. This is an open re(cid:173)
`
`search problem. The traditional keyword approach represents a primitive approach to
`
`representing the semantics of text. However, keywords are only one possible source of
`
`evidence. Other possible sources are citations and other kinds of user defined links.
`
`Even if one could develop a system that retrieves documents intelligently based on
`
`the natural language description of the information need, the nagging query formulation
`
`problems remain. The use of natural language as the means of expressing the need by no
`
`means solves the problem. Each user will have a different degree of ability with natural
`
`011
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`43
`
`language. The user must still take time to carefully draft his query or the system will have
`
`to have the ability to handle ill-formed and perhaps ambiguous need statements.
`
`Furthermore, the system will have to deal with users that do not have a clear idea of what
`
`they want, besides not being able to express it.
`
`The work of intelligent retrieval has for the most part, as traditional retrieval work
`
`has, focussed on the objects of retrieval, documents, queries, and inference methods, and
`
`not on the end user. Consideration of the end user has a significant impact on the design of
`
`the interface element of the system as well as the search element.
`
`It introduces a new di-
`
`mension to the kinds of knowledge that the system must consider in order to be effective.
`
`But, by doing this the system builder can take advantage of the expertise of the search in(cid:173)
`
`termediary in helping the user refine his information need.
`
`2.5 Analysis of Systems
`
`The systems to be reviewed in this section are organized by their major emphasis,
`
`which corresponds to what element of an information retrieval system they are trying to
`
`improve. The systems are organized by what is considered to be their major feature or em(cid:173)
`
`phasis relative to the elements that compose an information retrieval system. The initial
`
`section concentrates on the representation of meaning and the organization of the text ele-
`
`ments.
`
`2.5,.1
`
`Representation of Meaning
`
`Research into the improvement of the representation of the content of a document
`
`fall into two primary categories, those that lise natural language processing (NLP) tech-
`
`niques and those that do not. One motivation for attempting to use techniques that do not
`
`usc the NLP techniques is to reduce or eliminate the need for the large amounts of domain
`
`012
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`knowledge that is required. Another is to retain the efficiency that automatic indexing
`
`44
`
`provides.
`
`2.5.1.1
`
`Non Natural Language Approaches
`
`The primary work in this area has been by Belkin [Belkin 82a, Belkin 82b, Belkin
`
`84]. The underlying motivation for this work is to model the user's "anomalous state of
`
`knowledge." The attempt to resolve this perceived anomaly is what causes the user to
`
`come to an information retrieval system. The actual representation of the user's
`
`information need and representation of the document content is based on the proximity of
`
`word pairs in the text. Words that are adjacent are given a score of 12; those that are in the
`
`same sentence are give a score of 4; and those in adjacent sentences are give a score of 3.
`
`These values are then taken and used to derive a graph representation of the top 40
`
`associations. The graphs can then be used in two ways. The first is to get feedback about
`
`the query from the user. The graphs show what the user thinks is important based on his
`
`expressed interest. The graph can be altered if it does not reflect the real information need.
`
`The second use is for search, where the graph representation is compared to documents that
`
`have been processed into the same graph representation. Various heuristics are used to
`
`match substructures in both of the graphs [Belkin 86].
`
`Other work in representation has focused on the addition of citation information as
`
`an extension of the normal vector representation [Salton 83J. Citation information, how(cid:173)
`
`ever does not contribute (Q the representation of meaning of the text, but serves to connect it
`
`to other pieces of text that are judged to be relevant by the text's author. This topic is
`
`expanded in section 2.5.2.
`
`013
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`2.5.1.2
`
`Natural Language Approaches
`
`45
`
`An approach of using NLP in text retrieval is FASIT (Fully Automatic Syntactically
`
`based Indexing of Text) [Dillon 83]. This approach attempts to be completely automatic
`
`and functional without semantic information, using only information about the structure of
`
`English to extract meaningful phrases from text. This is accomplished in a two stage
`
`process: concept selection and concept grouping. Concept selection is done in three steps,
`
`which are: assignment of words to syntactic categories (tagging), disambiguation of
`
`multiply tagged words, and concept selection. Concept grouping is a two stage process.
`
`The first is formation of canonical form and the second is grouping by canonical forms.
`
`For details see Dillion [1983]. Since FASIT does not require domain specific knowledge,
`
`__ ____ _______________jLi~~_~_ry.g;enera I La nd can b_e_u_secUn_rnan)':_domains,....whichis.its..primary-advantage. -It- -----------(cid:173)
`
`does not, however, perform significantly better than traditional stem-based systems.
`
`More recently, in a system named ADRENAL [Croft 87a] consideration has be
`
`given to the incorporation of semantics. The system does not attempt
`
`to build
`
`representations of every document as they enter the system, nor does the system require
`
`large amounts of domain knowledge.
`
`Instead, it uses a general representation of the
`
`semantics of science and technology called REST (REpresentation for Science and
`
`Technology). The system-also does not attempt to understand the content of documents
`
`and queries , but only garners enough information to make a judgement about the relevance
`
`of a document to a query. This process is made more efficient by not comparing the
`
`representation of the query to every document. Instead, a retrieval is made using traditional
`
`techniques, and the deep comparison, using the NLP techniques, is made with the results
`
`of the retrieval.
`
`Other research in applying NLP techniques to IR has been performed by Spark
`Jones r1984j, Smeaton [1986J as well as others.
`
`014
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`2.5.2
`
`Organization of Text Units
`
`46
`
`The first group of systems fall into the category of "hypertext" or "hypermedia"
`
`systems; the distinction being that latter incorporates more than just text The emphasis in
`
`these systems is the organization of the units of information, where the units may be one of
`
`a number of types, such as text, graphical, pictorial, or audio. There is little emphasis on
`
`the representation of the meaning of the unit itself.
`
`An early hypertext-like system is ZOG [McCracken 84]. In this system the basic
`
`unit of information is a 24 line by 80 column full screen of text. These textual units are
`
`then linked to form a network. ZOG has been used with success to implement a naval
`
`operations manual. Another similar system is the help system in the GNU EMACS
`
`[Stallman 87] text editor. In this system, the information is basically structured as a hierar(cid:173)
`
`chy with added cross reference links.
`
`This system was applied to an information retrieval system and called BROWSE
`
`[Fox 80].
`
`In these systems the only way to find information is to follow the links.
`
`BROWSE included a method of performing a "parameterized" content search which is car(cid:173)
`
`ried out in the context of the user's current location. The primary organizing structure is
`
`the Computing Reviews classification structure. The level that the user is at determines the
`
`constraints on the search.
`
`One difficulty with these systems is that the user, especially the novice user, has
`
`little context from which to determine where he is in the net. This can lead to a variety of
`
`behaviors, such as frequent backtracking or returning to the top of the classification struc(cid:173)
`
`ture to start over. The user is required to maintain a mental model of where he is in the
`
`system. This can be quite difficult in a large highly interconnected network.
`
`Another difficulty is that these systems are by and large hand-built. While ZOG
`
`provides an integrated editor, there is'no provision for the automatic addition of text to the
`
`system.
`
`015
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`47
`
`A more sophisticated system, which is representative of more recent imple-
`
`mentations of hypertext systems is Textnet [Trigg 86]; TIle emphasis in this system is the
`
`support of scientific authoring and information sharing. Because of the emphasis in this
`
`system, there are 80 different kinds of links. The two broad classifications of them are
`
`Normal and Commentary links. The Normal links define the structure of a document,
`
`provide citation links, and allow other users to attach their own opinions. The Commen-
`
`tary links provide the means to attach information that is related to the style of the docu-
`
`ment. While there is no provision for the automatic inclusion of text to the system, it is an
`
`open system. The intended use of Textnet is as an authoring system to develop ideas. As
`
`with ZOG, no provision is made to include a normal search facility. User's must follow
`
`links to find the information that they desire.
`
`Another similar system is the Hepatitis Knowledge Base (HKB) developed at the
`
`National Library of Medicine [Williamson 85]. Its intent is to provide quick access to
`
`textbook knowledge on viral hepatitis.
`
`It is organized as a three level hierarchy. The
`
`bottom level contains very specific detailed information, along with references to journal
`
`articles. As one moves up the hierarchy, the information becomes less detailed and of a
`
`summary nature. Built on top of this is a system called ANNOD that allows the user to
`
`make content based searches.
`
`Dynamic Book [Weyer 82] is similar to the HKB in that it provides access to text-
`
`book information. Its purpose was, however, to study how people search for information.
`
`It provided a number of methods to allow users to find the answers to factual questions. It
`
`also was organized hierarchically but was limited to only three levels that correspond to
`
`chapters , sections and subsections. These levels had simple title-like sentences or phrases
`
`describing the content of the corresponding unit of text. The significant point about the
`
`Dynamic Book is that it used a sophisticated multi-window interface to give user different
`
`016
`
`Facebook Inc. Ex. 1214 Part 2
`
`

`
`48
`
`views of the information, A user could trace down the hierarchy or could look at the titles
`
`alphabeticalIy.
`
`THOMAS [Oddy 77Jhas a unique way of organizing the text.
`
`Instea

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket