Inference Networks for Document Retrieval

Howard Turtle and W. Bruce Croft
Computer and Information Science Department
University of Massachusetts
Amherst, MA 01003

Abstract

The use of inference networks to support document retrieval is introduced. A network-based retrieval model is described and compared to conventional probabilistic and Boolean models.

1 Introduction

Network representations have been used in information retrieval since at least the early 1960's. Networks have been used to support diverse retrieval functions, including browsing [TC89], document clustering [Cro80], spreading activation search [CK87], support for multiple search strategies [CT87], and representation of user knowledge [OPC86] or document content [TS85].

Recent work suggests that significant improvements in retrieval performance will require techniques that, in some sense, "understand" the content of documents and queries [vR86,Cro87] and can be used to infer probable relationships between documents and queries. In this view, information retrieval is an inference or evidential reasoning process in which we estimate the probability that a user's information need, expressed as one or more queries, is met given a document as "evidence." Network representations show promise as mechanisms for inferring these kinds of relationships [CT89,CK87].

The idea that retrieval is an inference or evidential reasoning process is not new. Cooper's logical relevance [Coo71] is based on deductive relationships between representations of documents and information needs. Wilson's situational relevance [Wil73] extends this notion to incorporate inductive or uncertain inference based on the degree to which documents support information needs. The techniques required to support these kinds of inference are similar to those used in expert systems that must reason with uncertain information. A number of competing inference models have been developed for these kinds of expert systems [KL86,LK88] and several of these models can be adapted to the document retrieval task.

In the research described here we adapt an inference network model to the retrieval task. The use of the model is intended to:
• Support the use of multiple document representation schemes. Research has shown that a given query will retrieve different documents when applied to different representations, even when the average retrieval performance achieved with each representation is the same. Katzer, for example, found little overlap in documents retrieved using seven different representations, but found that documents retrieved by multiple representations were likely to be relevant [KMT+82]. Similar results have been obtained when comparing term- with cluster-based representations [CH79] and term- with citation-based representations [FNL88].

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. (C) 1990 ACM 0-89791-408-2/90/0009 $1.50
• Allow results from different queries and query types to be combined. Given a single natural language description of an information need, different searchers will formulate different queries to represent that need and will retrieve different documents, even when average performance is the same for each searcher [MKN79,KMT+82]. Again, documents retrieved by multiple searchers are more likely to be relevant. A description of an information need can be used to generate several query representations (e.g., probabilistic, Boolean), each using a different query strategy and each capturing different aspects of the information need. These different search strategies are known to retrieve different documents for the same underlying information need [Cro87].

• Facilitate flexible matching between the terms or concepts mentioned in queries and those assigned to documents. The poor match between the vocabulary used to express queries and the vocabulary used to represent documents appears to be a major cause of poor recall [FLGD87]. Recall can be improved using domain knowledge to match query and representation concepts without significantly degrading precision.
The resulting formal retrieval model integrates several previous models in a single theoretical framework; multiple document and query representations are treated as evidence which is combined to estimate the probability that a document satisfies a user's information need.

In what follows we briefly review candidate inference models, present an inference network-based retrieval model, and compare the network model to current retrieval models.

2 Inference networks

The development of automated inference techniques that accommodate uncertainty has been an area of active research in the artificial intelligence community, particularly in the context of expert systems [KL86,LK88]. Popular approaches include those based on purely symbolic reasoning [Coh85,Doy79], fuzzy sets [Zad83], and a variety of probability models [Nil86,Che88]. Two inference models based on probabilistic methods are of particular interest: Bayesian inference networks [Pea88,LS88] and the Dempster-Shafer theory of evidence [Dem68,Sha76].
A Bayesian inference network is a directed, acyclic dependency graph (DAG) in which nodes represent propositional variables or constants and edges represent dependence relations between propositions. If a proposition represented by a node p "causes" or implies the proposition represented by node q, we draw a directed edge from p to q. The node q contains a link matrix that specifies P(q|p) for all possible values of the two variables. When a node has multiple parents, the link matrix specifies the dependence of that node on the set of parents (π_q) and characterizes the dependence relationship between that node and all nodes representing its potential causes.¹ Given a set of prior probabilities for the roots of the DAG, these networks can be used to compute the probability or degree of belief associated with all remaining nodes.
Different restrictions on the topology of the network and assumptions about the way in which the connected nodes interact lead to different schemes for combining probabilities. In general, these schemes have two components which operate independently: a predictive component in which parent nodes provide support for their children (the degree to which we believe a proposition depends on the degree to which we believe the propositions that might cause it), and a diagnostic component in which children provide support for their parents (if our belief in a proposition increases or decreases, so does our belief in its potential causes). The propagation of probabilities through the net can be done using information passed between adjacent nodes.
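As an illustrative sketch (not code from the paper), the predictive component for a single-parent node reduces to conditioning over the link matrix; the function name and the probabilities below are invented for the example:

```python
# Minimal sketch of the predictive component in a two-node Bayesian
# network p -> q. All names and numbers are illustrative, not taken
# from the paper.

def predictive_belief(p_true: float, link: dict) -> float:
    """Compute P(q = true) by conditioning over the link matrix.

    `link` maps each parent value to P(q = true | p = value).
    """
    return link[True] * p_true + link[False] * (1.0 - p_true)

# Example: P(p) = 0.3, P(q|p) = 0.9, P(q|not p) = 0.1
belief_q = predictive_belief(0.3, {True: 0.9, False: 0.1})
print(belief_q)  # 0.9*0.3 + 0.1*0.7 = 0.34
```

With multiple parents the same conditioning runs over every combination of parent values, which is what makes the full link matrix grow as 2^n.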
The Dempster-Shafer theory of evidence, although not originally cast as a network model, can be used as an alternative method for evaluating these kinds of probabilistic inference networks. Rather than computing the belief associated with a query given a set of evidence, we can view Dempster-Shafer as computing the probability that the evidence would allow us to prove the query. The degree of support parameters associated with the arcs joining nodes are not interpreted as conditional probabilities, but as assertions that the parent node provides support for the child (is active) for some proportion p of the time and does not support the child for the remainder of the time. For an and-combination we compute the proportion of the time that all incoming arcs are active. For an or-combination we compute the proportion of the time that at least one parent node is active. To compute the provability of the query given a document, we examine all paths leading from the document to the query and compute the proportion of time that all of the arcs on at least one proof path are active. Given the structure of these networks, this computation can be done using series-parallel reduction of the subgraph joining the document and query in time proportional to the number of arcs in the subgraph.
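The and-/or-combinations and the series-parallel reduction described above can be sketched as follows, assuming independent arc activations; the graph and the proportions are hypothetical:

```python
# Hypothetical sketch of the and-/or-combinations used when evaluating
# a proof graph under the Dempster-Shafer view. Arc activation
# proportions are assumed independent; all numbers are invented.

from math import prod

def and_combination(props):
    """Proportion of time all incoming arcs are active."""
    return prod(props)

def or_combination(props):
    """Proportion of time at least one incoming arc is active."""
    return 1.0 - prod(1.0 - p for p in props)

# Series-parallel reduction of a tiny proof graph: two arcs in series
# on each of two parallel paths from document to query.
path1 = and_combination([0.8, 0.9])   # series arcs on path 1: 0.72
path2 = and_combination([0.6, 0.5])   # series arcs on path 2: 0.30
provability = or_combination([path1, path2])
print(provability)  # 1 - (1-0.72)*(1-0.30) = 0.804
```

Each reduction step replaces a series run of arcs by their product and a parallel bundle by the or-combination, which is why the whole evaluation is linear in the number of arcs.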
The Bayesian and Dempster-Shafer models are different and can lead to different results. However, under the assumption of disjunctive rule interaction (so called "noisy-OR") and the interpretation of an arc from a to b as P(b|a) = p and P(b|¬a) = 0, the Bayesian and Dempster-Shafer models will produce similar results [Pea88, page 446]. The document retrieval inference networks described here are based on the Bayesian inference network model.

The use of Bayesian inference networks for information retrieval represents an extension of probability-based retrieval research dating from the early 1960's [MK60]. It has long been recognized that some terms in a collection are more significant than others and that information about the distribution of terms in a collection can be used to improve retrieval performance. The use of these networks generalizes existing probabilistic models and allows integration of several sources of knowledge in a single framework.
¹While this probability specification is generally referred to as a link matrix, it is actually a tensor.
Figure 1: Basic document inference network

3 Basic Model

The basic document retrieval inference network, shown in Figure 1, consists of two component networks: a document network and a query network. The document network represents the document collection using a variety of document representation schemes. The document network is built once for a given collection and its structure does not change during query processing. The query network consists of a single node which represents the user's information need and one or more query representations which express that information need. A query network is built for each information need and is modified during query processing as existing queries are refined or new queries are added in an attempt to better characterize the information need. The document and query networks are joined by links between representation concepts and query concepts. All nodes in the inference network take on values from the set {false, true}.

3.1 Document network

The document network consists of document nodes (d_i's), text representation nodes (t_j's), and concept representation nodes (r_k's). Each document node represents a document in the collection. A document node corresponds to the event that a specific document has been observed. The form of the document represented depends on the collection and its intended use, but we will assume that a document is a well defined object and will focus on traditional document types (e.g., monographs, journal articles, office documents).

Document nodes correspond to abstract documents rather than their physical representations. A text representation node or text node corresponds to a specific text representation of a document. A text node corresponds to the event that a text representation has been observed. We focus here on the text content of documents, but the network model can support document nodes with multiple children representing additional component types (e.g., figures, audio, or video). Similarly, a single text might be shared by more than one document. While shared components are rare in traditional collections (an example would be a journal article that appears in both a serial issue and in a reprint collection) and are not generally represented in current retrieval models, they are common in hypertext systems. For clarity, we will consider only text representations and will assume a one-to-one correspondence between documents and texts. The dependence of a text upon the document is represented in the network by an arc from the document node to the text node.

The content representation nodes or representation nodes can be divided into several subsets, each corresponding to a single representation technique that has been applied to the document texts. For example, if a collection has been indexed using automatic phrase extraction and manually assigned index terms, then the set of representation nodes will consist of two distinct subsets or content representation types with disjoint domains. Thus, if the phrase "information retrieval" has been extracted and "information retrieval" has been manually assigned as an index term, then two representation nodes with distinct meanings will be created. One corresponds to the event that "information retrieval" has been automatically extracted from a subset of the collection, the second corresponds to the event that "information retrieval" has been manually assigned to a (presumably distinct) subset of the collection. We represent the assignment of a specific representation concept to a document by a directed arc to the representation node from each text node corresponding to a document to which the concept has been assigned. For now we assume that the presence or absence of a link corresponds to a binary assigned/not assigned distinction, that is, there are no partial or weighted assignments.
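The document-to-text-to-concept arcs just described can be sketched as a simple adjacency structure; the node names and the tiny collection below are hypothetical, not from the paper:

```python
# Hypothetical sketch of the three-level document network:
# document -> text -> representation concepts. All names are invented.
# Arcs are binary (assigned/not assigned), matching the assumption in
# the text that there are no partial or weighted assignments.

document_network = {
    # document nodes: each has exactly one text node as a child
    "d1": ["t1"],
    "d2": ["t2"],
    # text nodes: each lists the representation concepts assigned to it;
    # note the phrase and the manually assigned term are distinct nodes
    "t1": ["phrase:information retrieval", "term:indexing"],
    "t2": ["term:information retrieval"],
}

def concepts_for(doc: str) -> list:
    """Follow document -> text -> representation arcs."""
    out = []
    for text in document_network[doc]:
        out.extend(document_network[text])
    return out

print(concepts_for("d1"))
# ['phrase:information retrieval', 'term:indexing']
```

Keeping the automatically extracted phrase and the manually assigned term as separate nodes mirrors the disjoint-domain point above: the two events have distinct meanings even though the strings coincide.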
In principle, the number of representation schemes is unlimited; in addition to phrase extraction and manually assigned terms we would expect representations based on natural language processing and automatic keyword extraction. For any real document collection, however, the number of representations used will be fixed and relatively small. The potential domain of each representation scheme may also be unlimited, but the actual number of primitive representation concepts defined for a given collection is fixed by the collection. The domain for most automated representation schemes is generally bounded by some function of the collection size (e.g., the number of keywords cannot exceed the number of words in a collection). For manual representation schemes the domain size is limited by the number of documents, the representation scheme itself (e.g., a controlled vocabulary), and the amount of time a human expert can spend analyzing each document.
The basic document network shown in Figure 1 is a simple three level DAG in which document nodes are roots, text nodes are interior nodes, and representation nodes are leaves. Document nodes have exactly one text node as a child and each text node has one or more representation nodes as children.

Each document node has a prior probability associated with it that describes the probability of observing that document; this prior probability will generally be set to 1/(collection size) and will be small for real collections. Each text node contains a specification of its dependence upon its parent; by assumption, this dependence is complete: a text node is observed (t_i = true) exactly when its parent document is observed (d_i = true).

Each representation node contains a specification of the conditional probability associated with the node given its set of parent text nodes. This specification incorporates the effect of any indexing weights (e.g., term frequency for each parent text) or term weights (e.g., inverse document frequency) associated with the representation concept. While, in principle, this would require O(2^n) space for a node with n parents, in practice we use canonical representations that allow us to compute the required conditional probabilities when needed. These canonical schemes require O(n) space if we weight the contribution of each parent or O(1) space if parents are to be treated uniformly.
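A sketch of why a canonical scheme needs only O(n) space: instead of storing all 2^n link matrix entries, a weighted-sum node (one of the canonical forms developed in section 3.4) keeps one weight per parent and derives any entry on demand. The weights and names below are invented for illustration:

```python
# Illustrative sketch: a canonical weighted-sum node stores one weight
# per parent (O(n) space) and recomputes any of the 2^n link matrix
# entries when needed, instead of materializing the full matrix.

def conditional(parent_weights, child_weight, true_parents):
    """P(node = true | exactly the parents in `true_parents` are true)."""
    t = sum(parent_weights.values())
    active = sum(parent_weights[p] for p in true_parents)
    return child_weight * active / t

weights = {"A": 2.0, "B": 1.0, "C": 1.0}  # hypothetical parent weights
w_q = 0.5                                  # hypothetical child weight

# Any of the 2^3 entries can be derived on demand:
print(conditional(weights, w_q, {"A", "C"}))  # 0.5 * 3/4 = 0.375
print(conditional(weights, w_q, set()))       # 0.0
```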

3.2 Query network

The query network is an "inverted" DAG with a single leaf that corresponds to the event that an information need is met and multiple roots that correspond to the concepts that express the information need. As shown in Figure 1, a set of intermediate query nodes may be used when multiple queries express the information need. These nodes are a representation convenience; it is always possible to eliminate them by increasing the complexity of the distribution specified at the node representing the information need.

In general, the user's information need is internal to the user and is not precisely understood. We attempt to make the meaning of an information need explicit by expressing it in the form of one or more queries that have formal interpretations. These queries may be generated from a single natural language description (e.g., keywords or phrases for a probabilistic search, a Boolean representation, sample documents, ...) or they may represent additional sources of information (e.g., an intermediary's description of the user or of the information need, or feedback provided by the user). It is unlikely that any of these queries will correspond precisely to the information need, but some will better characterize the information need than others and several query specifications taken together may be a better representation than any of the individual queries.
The roots of the query network are query concepts; they correspond to the primitive concepts used to express the information need. A single query concept node may have several representation concept nodes as parents. Each query concept node contains a specification of its dependence on the set of parent representation concepts. The query concept nodes define the mapping between the concepts used to represent the document collection and the concepts used in the queries. In the simplest case, the query concepts are the same as the representation concepts so each query concept has exactly one parent. In a slightly more complex example, the query concept "information retrieval" may have as parents both the node corresponding to "information retrieval" as a phrase and the node corresponding to "information retrieval" as a manually assigned term. As we add content representations to the document network and allow query concepts that do not explicitly appear in any document representation, the number of parents associated with a single query concept will increase.
A query concept is similar to a representation concept that is derived from other representation concepts (see section 5 for a discussion of derived representation concepts) and in some cases it will be useful to "promote" a query concept to a representation concept. For example, suppose that a researcher is looking for information on a recently developed process that is unlikely to be explicitly identified in any existing representation scheme. The researcher, if sufficiently motivated, could work with the retrieval system to describe how this new concept might be inferred from other representation concepts. If this new concept definition is of general interest, it can be added to the collection of representation concepts. This use of inference to define new concepts is similar to that used in RUBRIC [TS85].

The attachment of the query concept nodes to the document network has no effect on the basic structure of the document network. None of the existing links need change and none of the conditional probability specifications stored in the nodes are modified.
A query node represents a distinct query form and corresponds to the event that the query is satisfied. Each query node contains a specification of the dependence of the query on its parent query concepts. The link matrices that describe these conditional probabilities are discussed further in section 3.4, but we note that the form of the link matrix is determined by the query type; a link matrix simulating a Boolean operator is different than a matrix simulating a probabilistic or weighted query.

The single leaf representing the information need corresponds to the event that an information need is met. In general, we cannot predict with certainty whether a user's information need will be met by a document collection. The query network is intended to capture the way in which meeting the user's information need depends on documents and their representations. Moreover, the query network is intended to allow us to combine information from multiple document representations and to combine queries of different types to form a single, formally justified estimate of the probability that the user's information need is met. If the inference network correctly characterizes the dependence of the information need on the collection, the computed probability provides a good estimate.

3.3 Use of the inference network

The retrieval inference network is intended to capture all of the significant probabilistic dependencies among the variables represented by nodes in the document and query networks. Given the prior probabilities associated with the documents (roots) and the conditional probabilities associated with the interior nodes, we can compute the posterior probability or belief associated with each node in the network. Further, if the value of any variable represented in the network becomes known we can use the network to recompute the probabilities associated with all remaining nodes based on this "evidence."
The network, taken as a whole, represents the dependence of a user's information need on the documents in a collection where the dependence is mediated by document and query representations. When the query network is first built and attached to the document network we compute the belief associated with each node in the query network. The initial value at the node representing the information need is the probability that the information need is met given that no specific document in the collection has been observed and all documents are equally likely (or unlikely). If we now observe a single document d_i and attach evidence to the network asserting d_i = true we can compute a new belief for every node in the network given d_i = true. In particular, we can compute the probability that the information need is met given that d_i has been observed in the collection. We can now remove this evidence and instead assert that some d_j, i ≠ j, has been observed. By repeating this process we can compute the probability that the information need is met given each document in the collection and rank the documents accordingly.
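The document-at-a-time ranking described above can be sketched as follows; `belief_given` stands in for whatever network evaluation an implementation provides and is an assumption of this sketch, not the paper's algorithm:

```python
# Hypothetical sketch of ranking by instantiating one document at a
# time. `belief_given` is a stand-in for the network evaluation that
# computes P(information need met | d_i = true).

def rank_documents(documents, belief_given):
    """Score each document in isolation, then sort by belief."""
    scored = [(belief_given(d), d) for d in documents]
    scored.sort(reverse=True)  # highest belief first
    return [d for _, d in scored]

# Toy evaluation function for illustration only.
toy_beliefs = {"d1": 0.12, "d2": 0.34, "d3": 0.07}
ranking = rank_documents(toy_beliefs, toy_beliefs.get)
print(ranking)  # ['d2', 'd1', 'd3']
```

Scoring each document in isolation is what makes the ranking consistent with the Probability Ranking Principle discussed next.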
In principle, we need not consider each document in isolation but could look for the subset of documents which produce the highest probability that the information need is met. While a general solution to this best-subset problem is intractable, in some cases good heuristic approximations are possible. Best-subset rankings have been considered in IR [Sti75], and similar problems arise in pattern recognition, medical diagnosis, and truth-maintenance systems. See [Pea88] for a discussion of the best-subset or belief revision problem in Bayesian networks. At present, we consider only documents in isolation because the approach is computationally simpler and because it allows comparison with earlier retrieval models that produce document rankings consistent with the Probability Ranking Principle [Rob77] in which documents are considered in isolation.

The document network is built once for a given collection. Given one or more queries representing an information need, we then build a query network that attempts to characterize the dependence of the information need on the collection. If the ranking produced by the initial query network is inadequate, we must add additional information to the query network or refine its structure to better characterize the meaning of the existing queries. This feedback process is quite similar to conventional relevance feedback.

3.4 Link matrix forms

For all non-root nodes in the inference network we must estimate the probability that a node takes on a value given any set of values for its parent nodes. If a node a has a set of parents π_a = {p_1, ..., p_n}, we must estimate P(a|p_1, ..., p_n).

The most direct way to encode our estimate is as a link matrix. Since we are dealing with binary valued propositions, this matrix is of size 2 × 2^n for n parents and specifies the probability that a takes the value a = true or a = false for all combinations of parent values. The update procedures for Bayesian networks then use the probabilities provided by the set of parents to condition over the link matrix values to compute the predictive component of our belief in a, or P(a = true). Similarly, the link matrix is used to provide diagnostic information to the set of parents based on our belief in a. As mentioned earlier, encoding our estimates in link matrix form is practical only for nodes with a small set of parents, so our estimation task has two parts: how do we estimate the dependence of a node on its set of parents and how do we encode these estimates in a usable form?
We will describe four canonical link matrix forms, three for the Boolean operators and a fourth for simple probabilistic retrieval. For illustration, we will assume that a node Q has three parents A, B, and C and that

P(A = true) = a,  P(B = true) = b,  P(C = true) = c.

For or-combinations, Q will be true when any of A, B, or C is true and false only when A, B, and C are all false. This suggests a link matrix of the form

L_or = ( 1 0 0 0 0 0 0 0 )
       ( 0 1 1 1 1 1 1 1 )
Using a closed form of the update procedures, we have

P(Q = true) = (1-a)(1-b)c + (1-a)b(1-c) + (1-a)bc + a(1-b)(1-c)
              + a(1-b)c + ab(1-c) + abc
            = 1 - (1-a)(1-b)(1-c)

which is the familiar rule for disjunctive combination of events that are not known to be mutually exclusive. Similar matrix forms can be developed for and (P(Q = true) = abc) and not (P(Q = true) = 1 - a).
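These closed forms can be checked against brute-force conditioning over the full 2^3 link matrix; this is an illustrative check under an independence assumption for the parents, not code from the paper:

```python
# Illustrative check that the closed forms for the Boolean link
# matrices agree with brute-force conditioning over all 2^3 parent
# configurations (parents assumed independent).

from itertools import product

def conditioned_belief(link_row, probs):
    """Sum P(Q=true | config) * P(config) over all parent configs."""
    total = 0.0
    for config in product([False, True], repeat=len(probs)):
        p_config = 1.0
        for bit, p in zip(config, probs):
            p_config *= p if bit else (1.0 - p)
        total += link_row[config] * p_config
    return total

a, b, c = 0.2, 0.5, 0.7

# Or-combination: Q is true unless all parents are false.
link_or = {cfg: float(any(cfg)) for cfg in product([False, True], repeat=3)}
assert abs(conditioned_belief(link_or, [a, b, c])
           - (1 - (1-a)*(1-b)*(1-c))) < 1e-12

# And-combination: Q is true only when all parents are true.
link_and = {cfg: float(all(cfg)) for cfg in product([False, True], repeat=3)}
assert abs(conditioned_belief(link_and, [a, b, c]) - a*b*c) < 1e-12

print("closed forms match")
```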
If we restrict the parent nodes for any of these logic operators to values 0 or 1 then Q must also have a value of 0 or 1. If we allow terms to take on weights in the range [0, 1] and interpret these weights as the probability that the term has been assigned to a document text, then these inference networks provide a natural interpretation for Boolean retrieval with weighted indexing. The use of these canonical forms to simulate Boolean retrieval is discussed in section 4.3.
For probabilistic retrieval each parent has a weight associated with it, as does the child. In this weighted-sum matrix, our belief in Q depends on the specific parents that are true; parents with larger weights have more influence on our belief. If we let w_a, w_b, w_c ≥ 0 be the parent weights, 0 ≤ w_q ≤ 1 the child weight, and t = w_a + w_b + w_c, then we have a link matrix of the form

L_wsum = ( 1   1 - (w_c w_q)/t   1 - (w_b w_q)/t   1 - ((w_b + w_c) w_q)/t   1 - (w_a w_q)/t   1 - ((w_a + w_c) w_q)/t   1 - ((w_a + w_b) w_q)/t   1 - w_q )
         ( 0   (w_c w_q)/t       (w_b w_q)/t       ((w_b + w_c) w_q)/t       (w_a w_q)/t       ((w_a + w_c) w_q)/t       ((w_a + w_b) w_q)/t       w_q     )

Evaluation of this link matrix form results in

P(Q = true) = (w_a a + w_b b + w_c c) w_q / t.
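The weighted-sum evaluation can be sketched directly from the closed form above; the weights below are invented for illustration:

```python
# Illustrative evaluation of the weighted-sum link matrix closed form:
# P(Q = true) = (w_a*a + w_b*b + w_c*c) * w_q / t,  t = w_a + w_b + w_c.

def weighted_sum_belief(parent_beliefs, parent_weights, w_q):
    """Belief in Q given parent beliefs and the canonical weights."""
    t = sum(parent_weights)
    s = sum(w * p for w, p in zip(parent_weights, parent_beliefs))
    return s * w_q / t

# Hypothetical numbers: only parent A is believed true (a=1, b=c=0).
belief = weighted_sum_belief([1.0, 0.0, 0.0], [2.0, 1.0, 1.0], 0.8)
print(belief)  # (2*1 + 0 + 0) * 0.8 / 4 = 0.4
```

Note that when all parents are true the belief reaches its maximum w_q, matching the role of the child weight as a cap on achievable belief.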
This link matrix can be used to implement a variety of weighting schemes, including the familiar term weighting schemes based on within-document term frequency (tf), inverse document frequency (idf), or both (tf·idf). To illustrate a tf·idf weighting, let Q be a representation node and let A, B, and C be document nodes. Let w_a, w_b, and w_c be the normalized tf values for A, B, and C, let idf_q be the normalized idf weight for Q, and let

w_q = idf_q · (w_a + w_b + w_c).                (1)
Given our basic model, when A is instantiated, belief in Q is given by

bel(Q) = w_a · w_q / (w_a + w_b + w_c)
       = tf_a · idf_q · (w_a + w_b + w_c) / (w_a + w_b + w_c)
       = tf_a · idf_q

which is a form of tf·idf weight. In general, when a document is instantiated all representation concept nodes to which it is attached take on the tf·idf weight associated with the document/term pair.
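The derivation above can be checked numerically with the weighted-sum closed form; the tf and idf values below are invented for the example:

```python
# Illustrative check of the tf.idf derivation: with parent weights set
# to normalized tf values and w_q = idf_q * (w_a + w_b + w_c),
# instantiating document A alone yields bel(Q) = tf_a * idf_q.

tf = {"A": 0.5, "B": 0.2, "C": 0.3}   # hypothetical normalized tf values
idf_q = 0.7                            # hypothetical normalized idf for Q

t = sum(tf.values())
w_q = idf_q * t                        # equation (1) in the text

# Weighted-sum closed form with A instantiated (a = 1, b = c = 0):
bel_q = (tf["A"] * 1.0 + tf["B"] * 0.0 + tf["C"] * 0.0) * w_q / t
assert abs(bel_q - tf["A"] * idf_q) < 1e-12
print(bel_q)  # 0.5 * 0.7 = 0.35
```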
The weight at Q has two distinct parts. The first part (idf_q in our example) acts to set the maximum belief achievable at a node. If, for some combination of parent values, our belief in Q is certain then this component disappears. Note that in this formulation, the idf component is dependent only upon the distribution of the term in the collection, not on the distribution of the term in relevant and non-relevant subsets. Relevance feedback is modeled as part of the query network and does not affect belief in representation concepts. The second part (w_a + w_b + w_c in our example) acts to normalize the parent weights. Equation 1 is appropriate for the basic model in which only one document is instantiated at a time. In the extended model of section 5 where multiple roots can be instantiated, this component is adjusted to normalize for the maximum achievable set of parent weights. In the general case, where all parents can take any value in the range [0, 1], this normalizing component disappears.

These canonical forms are sufficient for the retrieval inference networks described here, but many others are possible (see section 4.3 for other examples). Further, when the number of parents is small (say, less than 5 or 6) we can use the full link matrix if the dependence of a node on its parents does not fit a canonical form.

4 Comparison with other retrieval models

The inference network retrieval model generalizes both the probabilistic and Boolean models. Inference networks can be used to simulate both probabilistic and Boolean queries and can be used to combine results from multiple queries.

In this section we compare the inference network model with probabilistic (sections 4.1 and 4.2) and Boolean (section 4.3) models and show how inference networks can be used to simulate both forms of retrieval. We then consider how the probabilities required by the model can be estimated (section 4.4); the estimation problems are essentially equivalent to those encountered with probabilistic or vector-space retrieval.
Figure 2: Inference network for binary independence model

4.1 Probabilistic retrieval models

Conventional probabilistic models [vR79,SM83] rank documents by the probability that each document would be judged relevant to a given query, P(relevant|d_i).² This is, in many ways, similar to computing the probability that a user's information need is met given a specific document, P(I|d_i). The principal differences between conventional probabilistic models and the model described here are: 1) most probabilistic models do not explicitly represent the query, 2) conventional probabilistic models do not distinguish between a document and its representations but treat a document as a single vector, and 3) the inference network model depends less upon Bayesian inversion than probabilistic models; Bayesian inversion is just one way to estimate P(I|d_i) (or P(Q|d_i) in the case of a single query).

In this section we summarize the major differences between the inference network and conventional probabilistic models by comparing the network model to the binary independence model. In the next section we provide a formal comparison of the inference network model with a recent probabilistic model that explicitly represents documents and queries.

An inference network that corresponds to the binary independence model [vR79] is shown in Figure 2. A document is represented by a vector whose components are indexing or representation concepts (d_i = {r_1, ..., r_n}). The set of concepts considered is generally restricted to the subset that actually occurs in the query. Comparing this network with that shown in Figure 1, we see that in the binary independence model, the document network is represented by a single level of representation nodes and the query network consists of a single relevance node. In order to implement this network we must somehow estimate the probability of relevance given the set of parent representation concepts and this estimate must incorporate all of our judgments about the probability that a representation concept should be assigned to a document, about the semantic and stochastic relationships between representation concepts, about the relationship between concepts named in the query and assigned to documents, and about
