Query Expansion Using Local and Global Document Analysis

Jinxi Xu and W. Bruce Croft
Center for Intelligent Information Retrieval
Computer Science Department
University of Massachusetts, Amherst
Amherst, MA 01003-4610, USA
xu@cs.umass.edu  croft@cs.umass.edu

Abstract

Automatic query expansion has long been suggested as a technique for dealing with the fundamental issue of word mismatch in information retrieval. A number of approaches to expansion have been studied and, more recently, attention has focused on techniques that analyze the corpus to discover word relationships (global techniques) and those that analyze documents retrieved by the initial query (local feedback). In this paper, we compare the effectiveness of these approaches and show that, although global analysis has some advantages, local analysis is generally more effective. We also show that using global analysis techniques, such as word context and phrase structure, on the local set of documents produces results that are both more effective and more predictable than simple local feedback.

1 Introduction

The problem of word mismatch is fundamental to information retrieval. Simply stated, it means that people often use different words to describe concepts in their queries than authors use to describe the same concepts in their documents. The severity of the problem tends to decrease as queries get longer, since there is more chance of some important words co-occurring in the query and relevant documents. In many applications, however, the queries are very short. For example, applications that provide searching across the World-Wide Web typically record average query lengths of two words [Croft et al., 1995]. Although this may be one extreme in terms of IR applications, it does indicate that most IR queries are not long and that techniques for dealing with word mismatch are needed.

An obvious approach to solving this problem is query expansion. The query is expanded using words or phrases with similar meaning to those in the query and the chances of matching words in relevant documents are therefore increased. This is the basic idea behind the use of a thesaurus
in query formulation. There is, however, little evidence that a general thesaurus is of any use in improving the effectiveness of the search, even if words are selected by the searchers [Voorhees, 1994]. Instead, it has been proposed that by automatically analyzing the text of the corpus being searched, a more effective thesaurus or query expansion technique could be produced.

One of the earliest studies of this type was carried out by Sparck Jones [Sparck Jones, 1971], who clustered words based on co-occurrence in documents and used those clusters to expand the queries. A number of similar studies followed, but it was not until recently that consistently positive results have been obtained. The techniques that have been used recently can be described as being based on either global or local analysis of the documents in the corpus being searched. The global techniques examine word occurrences and relationships in the corpus as a whole, and use this information to expand any particular query. Given their focus on analyzing the corpus, these techniques are extensions of Sparck Jones' original approach.

Local analysis, on the other hand, involves only the top ranked documents retrieved by the original query. We have called it local because the techniques are variations of the original work on local feedback [Attar & Fraenkel, 1977, Croft & Harper, 1979]. This work treated local feedback as a special case of relevance feedback where the top ranked documents were assumed to be relevant. Queries were both reweighted and expanded based on this information.

Both global and local analysis have the advantage of expanding the query based on all the words in the query. This is in contrast to a thesaurus-based approach where individual words and phrases in the query are expanded and word ambiguity is a problem. Global analysis is inherently more expensive than local analysis. On the other hand, global analysis provides a thesaurus-like resource that can be used for browsing without searching, and retrieval results with local feedback on small test collections were not promising.

More recent results with the TREC collection, however, indicate that local feedback approaches can be effective and, in some cases, outperform global analysis techniques. In this paper, we compare these approaches using different query sets and corpora. In addition, we propose and evaluate a new technique which borrows ideas from global analysis, such as the use of context and phrase structure, but applies them to the local document set. We call the new technique local context analysis to distinguish it from local feedback.

In the next section, we describe the global analysis procedure used in these experiments, which is the Phrasefinder component of the INQUERY retrieval system [Jing & Croft,
1994]. Section 3 covers the local analysis procedures. The local feedback technique is based on the most successful approaches from the recent TREC conference [Harman, 1996]. Local context analysis is described in detail.

The experiments and results are presented in section 4. Both the TREC [Harman, 1995] and WEST [Turtle, 1994] collections are used in order to compare results in different domains. A number of experiments with local context analysis are reported to show the effect of parameter variations on this new technique. The other techniques are run using established parameter settings. In the comparison of global and local techniques, both recall/precision averages and query-by-query results are used. The latter evaluation is particularly useful to determine the robustness of the techniques, in terms of how many queries perform substantially worse after expansion. In the final section, we summarize the results and suggest future work.

2 Global Analysis

The global analysis technique we describe here has been used in the INQUERY system in TREC evaluations and other applications [Jing & Croft, 1994, Callan et al., 1995], and was one of the first techniques to produce consistent effectiveness improvements through automatic expansion. Other researchers have developed similar approaches [Qiu & Frei, 1993, Schütze & Pedersen, 1994] and have also reported good results.

The basic idea in global analysis is that the global context of a concept can be used to determine similarities between concepts. Context can be defined in a number of ways, as can concepts. The simplest definitions are that all words are concepts (except perhaps stop words) and that the context for a word is all the words that co-occur in documents with that word. This is the approach used by [Qiu & Frei, 1993], and the analysis produced is related to the representations generated by other dimensionality-reduction techniques [Deerwester et al., 1990, Caid et al., 1993]. The essential difference is that global analysis is only used for query expansion and does not replace the original word-based document representations. Reducing dimensions in the document representation leads to problems with precision. Another related approach uses clustering to determine the context for document analysis [Crouch & Yang, 1992].

In the Phrasefinder technique used with INQUERY, the basic definition for a concept is a noun group, and the context is defined as the collection of fixed length windows surrounding the concepts. A noun group (phrase) is either a single noun, two adjacent nouns or three adjacent nouns. Typical effective window sizes are from 1 to 3 sentences. One way of visualizing the technique, although not the most efficient way of implementing it, is to consider every concept (noun group) to be associated with a pseudo-document. The contents of the pseudo-document for a concept are the words that occur in every window for that concept in the corpus. For example, the concept airline pilot might have the words pay, strike, safety, air, traffic and FAA occurring frequently in the corresponding pseudo-document, depending on the corpus being analyzed. An INQUERY database is built from these pseudo-documents, creating a concept database. A filtering step is used to remove words that are too frequent or too rare, in order to control the size of the database.
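
To make the pseudo-document idea concrete, the following is a minimal sketch of how such a concept database could be assembled. It is not the INQUERY/Phrasefinder implementation: noun-group extraction is assumed to have been done elsewhere (e.g. with a part-of-speech tagger), a fixed word window stands in for the 1-3 sentence windows, and the filtering thresholds (min_df, max_df_ratio) are invented for illustration.

```python
from collections import Counter, defaultdict

def build_concept_database(documents, concepts, window=30,
                           min_df=5, max_df_ratio=0.2):
    """Associate each concept with a pseudo-document: the words that
    occur in windows around the concept's occurrences in the corpus.

    documents : list of token lists (already stopped and stemmed)
    concepts  : set of noun-group strings (single-token groups here;
                multi-word noun groups would be matched analogously)
    window    : tokens kept on each side of an occurrence (a stand-in
                for the 1-3 sentence windows described above)
    """
    pseudo_docs = defaultdict(Counter)
    doc_freq = Counter()                  # corpus document frequencies
    for tokens in documents:
        for i, tok in enumerate(tokens):
            if tok in concepts:
                ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                pseudo_docs[tok].update(ctx)
        doc_freq.update(set(tokens))

    # Filtering step: drop context words that are too frequent or too
    # rare in the corpus, to control the size of the concept database
    # (the thresholds here are illustrative, not from the paper).
    n_docs = len(documents)
    for ctx_counts in pseudo_docs.values():
        for w in list(ctx_counts):
            if doc_freq[w] < min_df or doc_freq[w] > max_df_ratio * n_docs:
                del ctx_counts[w]
    return pseudo_docs
```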

To expand a query, it is run against the concept database using INQUERY, which will generate a ranked list of phrasal concepts as output, instead of the usual list of document names. Document and collection-based weighting of matching words is used to determine the concept ranking, in a similar way to document ranking. Some of the top-ranking phrases from the list are then added to the query and weighted appropriately. In the Phrasefinder queries used in this paper, 30 phrases are added into each query and are downweighted in proportion to their rank position. Phrases containing only terms in the original query are weighted more heavily than those containing terms not in the original query.

Figure 1 shows the top 30 concepts retrieved by Phrasefinder for the TREC4 query 214, "What are the different techniques used to create self induced hypnosis". While some of the concepts are reasonable, others are difficult to understand. This is due to a number of spurious matches with noncontent words in the query.
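
As a rough illustration of the expansion step, the sketch below scores each concept's pseudo-document against the query and keeps the top 30, downweighted by rank. A plain tf.idf score stands in for INQUERY's belief function, and the boost for phrases made up only of original query terms is an assumed value; the paper only states that such phrases are weighted more heavily.

```python
import math
from collections import Counter

def expand_with_concept_db(query_terms, pseudo_docs, k=30, boost=1.5):
    """Rank concepts by scoring their pseudo-documents against the
    query and return the top k with weights that decay with rank."""
    # document frequency of each word across the pseudo-documents
    df = Counter()
    for ctx in pseudo_docs.values():
        df.update(ctx.keys())
    n = len(pseudo_docs)

    scored = []
    for concept, ctx in pseudo_docs.items():
        score = 0.0
        for t in query_terms:
            if ctx[t] > 0:
                score += (1 + math.log(ctx[t])) * math.log(1 + n / (1 + df[t]))
        if score > 0:
            if all(w in query_terms for w in concept.split()):
                score *= boost            # assumed boost for query-only phrases
            scored.append((score, concept))
    scored.sort(reverse=True)

    # downweight expansion concepts in proportion to their rank position
    return [(concept, 1.0 - rank / (k + 1))
            for rank, (_, concept) in enumerate(scored[:k], start=1)]
```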

The main advantages of a global analysis approach like the one used in INQUERY are that it is relatively robust, in that the average performance of queries tends to improve using this type of expansion, and that it provides a thesaurus-like resource that can be used for browsing or other types of concept search. The disadvantages of this approach are that it can be expensive in terms of disk space and computer time to do the global context analysis and build the searchable database, and that individual queries can be significantly degraded by expansion.

3 Local Analysis

3.1 Local Feedback

The general concept of local feedback dates back at least to a 1977 paper by Attar and Fraenkel [Attar & Fraenkel, 1977]. In this paper, the top ranked documents for a query were proposed as a source of information for building an automatic thesaurus. Terms in these documents were clustered and treated as quasi-synonyms. In [Croft & Harper, 1979], information from the top ranked documents is used to re-estimate the probabilities of term occurrence in the relevant set for a query. In other words, the weights of query terms would be modified but new terms were not added. This experiment produced effectiveness improvements, but was only carried out on a small test collection.

Experiments carried out with other standard small collections did not give promising results. Since the simple version of this technique consists of adding common words from the top-ranked documents to the original query, the effectiveness of the technique is obviously highly influenced by the proportion of relevant documents in the high ranks. Queries that perform poorly and retrieve few relevant documents would seem likely to perform even worse after local feedback, since most words added to the query would come from non-relevant documents.

In recent TREC conferences, however, simple local feedback techniques appear to have performed quite well. In this paper, we expand using a procedure similar to that used by the Cornell group in TREC 4 & 3 [Buckley et al., 1996]. The most frequent 50 terms and 10 phrases (pairs of adjacent non-stop words) from the top ranked documents are added to the query. The terms in the query are reweighted using the Rocchio formula with α:β:γ = 1:1:0.
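
A minimal sketch of this procedure is given below, assuming the documents have already been tokenized, stopped and stemmed. It is not the Cornell implementation: the top ranked documents are simply treated as relevant, their centroid over terms and adjacent-word pairs is computed, and the query is reweighted with Rocchio at alpha:beta:gamma = 1:1:0 (the gamma term vanishes because no non-relevant documents are used).

```python
from collections import Counter

def local_feedback(query_vec, top_docs, n_terms=50, n_phrases=10,
                   alpha=1.0, beta=1.0):
    """query_vec : dict term -> weight of the original (stemmed) query
       top_docs  : token lists of the top ranked documents"""
    term_freq, phrase_freq = Counter(), Counter()
    for tokens in top_docs:
        term_freq.update(tokens)
        phrase_freq.update(" ".join(p) for p in zip(tokens, tokens[1:]))

    # the 50 most frequent terms and 10 most frequent adjacent pairs
    expansion = [t for t, _ in term_freq.most_common(n_terms)]
    expansion += [p for p, _ in phrase_freq.most_common(n_phrases)]

    # centroid of the assumed-relevant documents, over terms and phrases
    n = float(len(top_docs))
    centroid = Counter()
    for unit, freq in list(term_freq.items()) + list(phrase_freq.items()):
        centroid[unit] = freq / n

    # Rocchio reweighting with alpha:beta:gamma = 1:1:0
    new_query = {}
    for unit in set(query_vec) | set(expansion):
        new_query[unit] = alpha * query_vec.get(unit, 0.0) + beta * centroid[unit]
    return new_query
```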

Figure 2 shows the terms and phrases added by local feedback to the same query used in the previous section. In this case, the terms in the query are stemmed.

One advantage of local feedback is that it can be relatively efficient to do expansion based on high ranking documents. It may be slightly slower at run-time than, for example, Phrasefinder, but it needs no thesaurus construction phase.

Figure 1: Phrasefinder concepts for TREC4 query 214 (hypnosis, practitioners, meditation, dentists, disorders, antibodies, psychiatry, anesthesia, immunodeficiency-virus, susceptibility, dearth, therapists, atoms, self, van-dyke, confession, proteins, stare, katie, growing-acceptance, johns-hopkins-university, reflexes, ad-hoc, voltage, correlation, dynamics, conde-nast, ike, illnesses, hoffman).

Figure 2: Local feedback terms and phrases for TREC4 query 214 (terms: 19960500, hypnotiz, hypnot, immun, psychiatr, psychosomat, franz, mesmer, suscept, dyck, austrian, psychiatrist, tranc, shesaid, professor, centur, 18th, hallucin, unaccept, 11th, hilgard, exper, syndrom, 19820902, told, physician, patient, cortic, strang, hemophiliac, defic, ol, muncie, diseas, spiegel, imagin, februar, dyke, suggest, fresco, reseach, immunoglobulin, katie, numb, person, medicin, treatment, psorias, ms, 17150000; phrases: franz-mesmer, psychosomat-medicin, austrian-physician, intern-congress, fight-immun, hypnot-state, hypnotiz-peopl, diseas-fight, late-18th, ms-ol).

Local feedback requires an extra search and access to document information. If document information is stored only for this purpose, then this should be counted as a space overhead for the technique, but it is likely to be significantly less than a concept database. A disadvantage currently is that it is not clear how well this technique will work with queries that retrieve few relevant documents.

3.2 Local Context Analysis

Local context analysis is a new technique which combines global analysis and local feedback. Like Phrasefinder, noun groups are used as concepts and concepts are selected based on co-occurrence with query terms. Concepts are chosen from the top ranked documents, similar to local feedback, but the best passages are used instead of whole documents. The standard INQUERY ranking is not used in this technique.

Below are the steps to use local context analysis to expand a query Q on a collection.

1. Use a standard IR system (INQUERY) to retrieve the top n ranked passages. A passage is a text window of fixed size (300 words in these experiments [Callan, 1994]). There are two reasons that we use passages rather than documents. Since documents can be very long and about multiple topics, a co-occurrence of a concept at the beginning and a term at the end of a long document may mean nothing. It is also more efficient to use passages because we can eliminate the cost of processing the unnecessary parts of the documents.

2. Concepts (noun phrases) in the top n passages are ranked according to the formula

bel(Q, c) = \prod_{t_i \in Q} (\delta + \log(af(c, t_i)) \, idf_c / \log(n))^{idf_{t_i}}

where

af(c, t_i) = \sum_{j=1}^{n} f_{t_i j} f_{c j}
idf_{t_i} = \max(1.0, \log_{10}(N / N_{t_i}) / 5.0)
idf_c = \max(1.0, \log_{10}(N / N_c) / 5.0)

c is a concept, f_{t_i j} is the number of occurrences of t_i in passage p_j, f_{c j} is the number of occurrences of c in p_j, N is the number of passages in the collection, N_{t_i} is the number of passages containing t_i, N_c is the number of passages containing c, and \delta is 0.1 in this paper to avoid a zero bel value.

The above formula is a variant of the tf.idf measure used by most IR systems. In the formula, the af part rewards concepts co-occurring frequently with query terms, the idf_c part penalizes concepts occurring frequently in the collection, and the idf_{t_i} part emphasizes infrequent query terms. Multiplication is used to emphasize co-occurrence with all query terms.

3. Add the m top ranked concepts to Q using the following formula:

Q_new = #WSUM(1.0  1.0 Q  w Q1)
Q1 = #WSUM(1.0  w_1 c_1  w_2 c_2 ... w_m c_m)

In our experiments, m is set to 70 and w_i is set to 1.0 - 0.9 * i/70. Unless specified otherwise, w is set to 2.0. We call Q1 the auxiliary query. #WSUM is an INQUERY query operator which computes a weighted average of its components.
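
Steps 2 and 3 translate fairly directly into code. The sketch below assumes passage retrieval and noun-phrase detection happen elsewhere, represents each top ranked passage as a bag of its terms plus the concepts detected in it, and takes the collection statistics N and N_x as precomputed inputs; the final #WSUM combination itself is left to the retrieval engine.

```python
import math

def lca_expansion(query_terms, top_passages, candidate_concepts,
                  N, n_containing, delta=0.1, m=70):
    """Rank candidate concepts with the bel(Q, c) formula above and
    return the m best with the rank-decayed weights w_i = 1 - 0.9*i/m.

    top_passages : list of Counters, one per top ranked passage, over
                   its terms and the noun-phrase concepts found in it
    N            : number of passages in the whole collection
    n_containing : dict unit -> number of collection passages that
                   contain that term or concept
    """
    n = len(top_passages)
    log_n = math.log(max(n, 2))           # guard against n == 1

    def idf(unit):
        return max(1.0, math.log10(N / n_containing[unit]) / 5.0)

    scored = []
    for c in candidate_concepts:
        bel = 1.0
        for t in query_terms:
            af = sum(p[t] * p[c] for p in top_passages)   # af(c, t_i)
            inner = delta
            if af > 0:
                inner += math.log(af) * idf(c) / log_n
            bel *= inner ** idf(t)
        scored.append((bel, c))
    scored.sort(reverse=True)

    return [(c, 1.0 - 0.9 * (i + 1) / m)
            for i, (_, c) in enumerate(scored[:m])]
```

The returned list corresponds to the auxiliary query Q1; combining it with the original query at weight w (2.0 in most of the experiments below) is then a matter of issuing the #WSUM query shown above.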
`

Figure 3 shows the top 30 concepts added by local context analysis to TREC4 query 214.

Local context analysis has several advantages. It is computationally practical. For each collection, we only need a single pass to collect the collection frequencies for the terms and noun phrases. This pass takes about 3 hours on an Alpha workstation for the TREC4 collection. The major overhead to expand a query is an extra search to retrieve the top ranked passages. On a modern computer system, this overhead is reasonably small. Once the top ranked passages are available, query expansion is fast: when 100 passages are used, our current implementation requires only several seconds of CPU time to expand a TREC4 query. So local context analysis is practical even for interactive applications. For queries containing proximity constraints (e.g. phrases), Phrasefinder may add concepts which co-occur with all query terms but do not satisfy the proximity constraints. Local context analysis does not have such a problem because the top ranked passages are retrieved using the original query. Because it does not filter out frequent concepts, local context analysis also has the advantage of using frequent but potentially good expansion concepts. A disadvantage of local context analysis is that it may require more time to expand a query than Phrasefinder.

4 Experiments

4.1 Collections and Query Sets

Experiments are carried out on 3 collections: TREC3, which comprises the Tipster 1 and 2 datasets with 50 queries (topics 151-200); TREC4, which comprises the Tipster 2 and 3 datasets with 49 queries (topics 202-250); and WEST, with 34 queries. TREC3 and TREC4 (about 2 GBs each) are much larger and more heterogeneous than WEST. The average document length of the TREC documents is only 1/7 of that of the WEST documents. The average number of relevant documents per query with the TREC collections is much larger than that of WEST. Table 1 lists some statistics about the collections and the query sets. Stop words are not included.

4.2 Local Context Analysis

Table 2 shows the performance of local context analysis on the three collections. 70 concepts are added into each query using the expansion formula in section 3.2.

Local context analysis performs very well on TREC3 and TREC4. All runs produce significant improvements over the baseline on the TREC collections. The best run on TREC4 (100 passages) is 23.5% better than the baseline. The best run on TREC3 (200 passages) is 24.4% better than the baseline. On WEST, the improvements over the baseline are not as good as on TREC3 and TREC4. With too many passages, the performance is even worse than the baseline. The high baseline of the WEST collection (53.8% average precision) suggests that the original queries are of very good quality and we should give them more emphasis. So we downweight the expansion concepts by 50% by reducing the weight of the auxiliary query Q1 from 2.0 to 1.0. Table 3 shows that downweighting the expansion concepts does improve performance.

It is interesting to see how the number of passages used affects retrieval performance. To see it more clearly, we plot the performance curve on TREC4 in figure 4. Initially, increasing the number of passages quickly improves performance. The performance peaks at a certain point. After staying relatively flat for a period, the performance curves drop slowly when more passages are used. For TREC3 and TREC4, the optimal number of passages is around 100, while on WEST, the optimal number of passages is around 20. This is not surprising because the first two collections are an order of magnitude larger than WEST. Currently we do not know how to automatically determine the optimal number of passages to use. Fortunately, local context analysis is relatively insensitive to the number of passages used, especially for large collections like the TREC collections. On the TREC collections, between 30 and 300 passages produce very good retrieval performance.

5 Local Context Analysis vs Global Analysis

In this section we compare Phrasefinder and local context analysis in terms of retrieval performance. Tables 4-5 compare the retrieval performance of the two techniques on the TREC collections. On both collections, local context analysis is much better than Phrasefinder. On TREC3, Phrasefinder is 7.8% better than the baseline while local context analysis using the top ranked 100 passages is 23.3% better than the baseline. On TREC4, Phrasefinder is only 3.4% better than the baseline while local context analysis using the top ranked 100 passages is 23.5% better than the baseline. In fact, all local context analysis runs in table 2 are better than Phrasefinder on TREC3 and TREC4. On both collections, Phrasefinder hurts the high-precision end while local context analysis helps improve precision. The results show that local context analysis is a better query expansion technique than Phrasefinder.

We examine two TREC4 queries to show why Phrasefinder is not as good as local context analysis. For one example, "China" and "Iraq" are very good concepts for TREC4 query "Status of nuclear proliferation treaties -- violations and monitoring". They are added into the query by local context analysis but not by Phrasefinder. It appears that they are filtered out by Phrasefinder because they are frequent concepts. For the other example, Phrasefinder added the concept "oil spill" to TREC4 query "As a result of DNA testing, are more defendants being absolved or convicted of crimes". This seems to be strange. It appears that Phrasefinder did this because "oil spill" co-occurs with many of the terms in the query, e.g., "result", "test", "defendant", "absolve" and "crime". But "oil spill" does not co-occur with "DNA", which is a key element of the query. While it is very hard to automatically determine which terms are key elements of a query, the product function used by local context analysis for selecting expansion concepts should be better than the sum function used by Phrasefinder because with the product function it is harder for some query terms to dominate other query terms.
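
A tiny numerical illustration of this point, using invented per-term co-occurrence scores (the competing concept "dna evidence" and all the numbers here are hypothetical):

```python
import math

query_terms = ["dna", "test", "defendant", "absolve", "crime"]

# hypothetical per-term co-occurrence strengths for two candidate concepts
scores = {
    "oil spill":    {"dna": 0.0, "test": 0.9, "defendant": 0.8,
                     "absolve": 0.7, "crime": 0.9},
    "dna evidence": {"dna": 0.8, "test": 0.6, "defendant": 0.5,
                     "absolve": 0.4, "crime": 0.5},
}

delta = 0.1   # plays the same role as delta in the bel formula
for concept, s in scores.items():
    total = sum(s[t] for t in query_terms)                   # sum-style score
    product = math.prod(delta + s[t] for t in query_terms)   # product-style score
    print(f"{concept:12s} sum={total:.2f} product={product:.3f}")

# Under the sum, "oil spill" (3.3) outranks "dna evidence" (2.8) even though
# it never co-occurs with "dna"; under the product the missing term drags it
# down (0.072 vs. 0.113), so a concept must co-occur with all query terms.
```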
`

Figure 3: Local context analysis concepts for TREC4 query 214 (hypnosis, brain-wave, ms.-burns, technique, pulse, reed, ms.-olness, brain, trance, hallucination, process, circuit, van-dyck, behavior, suggestion, case, spiegel, finding, hypnotizables, subject, van-dyke, patient, memory, application, katie, muncie, approach, study, point, contrast).
`
Table 1: Statistics of the collections and query sets (number of queries, raw text size in gigabytes, number of documents, mean words per document, mean relevant documents per query, number of words in the collection).

Table 2: Performance of local context analysis using 11 point average precision, for varying numbers of top ranked passages, on TREC4, TREC3 and WEST.
`
Table 3: Downweighting the expansion concepts of local context analysis on WEST, with the weight of the auxiliary query reduced to 1.0.
`

6 Local Context Analysis vs Local Feedback

In this section we compare the retrieval performance of local feedback and local context analysis. Table 7 shows the retrieval performance of local feedback. Table 8 shows the result of downweighting the expansion concepts by 50% on WEST. The reason for this is to make a fair comparison with local context analysis. Remember that we also downweighted the expansion concepts of local context analysis by 50% on WEST.

Local feedback does very well on TREC3. The best run produces a 20.5% improvement over the baseline, close to the 24.4% of the best run of local context analysis. It is also relatively insensitive to the number of documents used for feedback on TREC3. Increasing the number of documents from 10 to 50 does not affect performance much.

It also does well on TREC4. The best run produces a 14.0% improvement over the baseline, very significant, but lower than the 23.5% of the best run of local context analysis. It is very sensitive to the number of documents used for feedback on TREC4. Increasing the number of documents from 5 to 20 results in a big performance loss. In contrast, local context analysis is relatively insensitive to the number of passages on all three collections.

On WEST, local feedback does not work at all. Without downweighting the expansion concepts, it results in a significant performance loss over all runs. Downweighting the expansion concepts only reduces the amount of loss. It is also sensitive to the number of documents used for feedback. Increasing the number of feedback documents results in significantly more performance loss.

It seems that the performance of local feedback and its sensitivity to the number of documents used for feedback depend on the number of relevant documents in the collection for the query. From table 1 we know that the average number of relevant documents per query on TREC3 is 196, larger than the 133 of TREC4, which is in turn larger than the 29 of WEST. This corresponds to the relative performance of local feedback on the collections.

Tables 4-6 show a side by side comparison between local feedback and local context analysis at different recall levels on the three collections. Top 10 documents are used for local feedback and top 100 passages are used for local context analysis in these tables.

Figure 4: Performance curve of local context analysis on TREC4, plotted against the number of passages.

`
Table 4: A comparison of baseline, Phrasefinder, local feedback and local context analysis on TREC4. 10 documents for local feedback (lf-10doc); 100 passages for local context analysis (lca-100p).

`
In table 6 for WEST, the expansion concepts are downweighted by 50% for both local feedback and local context analysis.

We also made a query-by-query comparison of the best run of local feedback and the best run of local context analysis on TREC4. Of 49 queries, local feedback hurts 21 and improves 28, while local context analysis hurts 11 and improves 38. Of the queries hurt by local feedback, 5 queries have a more than 5% loss in average precision. The worst case is query 232, whose average precision is reduced from 24.8% to 4.3%. Of those hurt by local context analysis, only one has a more than 5% loss in average precision. Local feedback also tends to hurt queries with poor performance. Of 9 queries with baseline average precision less than 5%, local feedback hurts 8 and improves 1. In contrast, local context analysis hurts 4 and improves 5. Its tendency to hurt "bad" queries and queries with few relevant documents (such as the WEST queries) suggests that local feedback is very sensitive to the number of relevant documents in the top ranked documents. In comparison, local context analysis is not so sensitive.

It is interesting to note that although both local context analysis and local feedback find concepts from top ranked passages/documents, the overlap of the concepts chosen by them is very small. On TREC4, the average number of unique terms in the expansion concepts per query is 58 by local feedback and 78 by local context analysis. The average overlap per query is only 17.6 terms. This means local context analysis and local feedback are two quite different query expansion techniques. Some queries expanded quite differently are improved by both methods. For example, the expansion overlap for query 214 of TREC4 ("What are the different techniques used to create self-induced hypnosis") is 19 terms, yet both methods improve the query significantly.

7 Conclusion and Future Work

This paper compares the retrieval effectiveness of three automatic query expansion techniques: global analysis, local feedback and local context analysis. Experimental results on three collections show that local document analysis (local feedback and local context analysis) is more effective than global document analysis. The results also show that local context analysis, which uses some global analysis techniques on the local document set, outperforms simple local feedback in terms of retrieval effectiveness and predictability.

We will continue our work in these aspects:

1. Local context analysis: automatically determine how many passages to use, how many concepts to add to the query and how to assign the weights to them on a query by query basis. Currently the parameter values are decided experimentally and fixed for all queries.

2. Phrasefinder: a new metric for selecting concepts. Currently Phrasefinder uses INQUERY's belief function, which is not designed to select concepts. We hope a better metric will improve the performance of Phrasefinder.

Table 5: A comparison of baseline, Phrasefinder, local feedback and local context analysis on TREC3. 10 documents for local feedback (lf-10doc); 100 passages for local context analysis (lca-100p).

Table 6: A comparison of baseline, local feedback and local context analysis on WEST. 10 documents for local feedback with weights for expansion units downweighted by 50% (lf-10doc-dw0.5); 100 passages for local context analysis with weight for auxiliary query set to 1.0 (lca-100p-w1.0).

`

8 Acknowledgements

We thank Dan Nachbar and James Allan for their help during this research. This research is supported in part by the NSF Center for Intelligent Information Retrieval at the University of Massachusetts, Amherst.

This material is based on work supported in part by NRaD Contract Number N66001-94-D-6054. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect those of the sponsor.

References

[Attar & Fraenkel, 1977] Attar, R., & Fraenkel, A. S. (1977). Local Feedback in Full-Text Retrieval Systems. Journal of the Association for Computing Machinery, 24(3), 397-417.

[Buckley et al., 1996] Buckley, C., Singhal, A., Mitra, M., & Salton, G. (1996). New Retrieval Approaches Using SMART: TREC 4. In Harman, D., editor, Proceedings of the TREC 4 Conference. National Institute of Standards and Technology Special Publication, to appear.

[Caid et al., 1993] Caid, B., Gallant, S., Carleton, J., & Sudbeck, D. (1993). HNC Tipster Phase I Final Report. In Proceedings of Tipster Text Program (Phase I), pp. 69-92.

[Callan et al., 1995] Callan, J., Croft, W. B., & Broglio, J. (1995). TREC and TIPSTER experiments with INQUERY. Information Processing and Management, pp. 327-343.

[Callan, 1994] Callan, J. P. (1994). Passage-level evidence in document retrieval. In Proceedings of ACM SIGIR International Conference on Research and Development in Information Retrieval, pp. 302-310.

[Croft et al., 1995] Croft, W. B., Cook, R., & Wilder, D. (1995). Providing Government Information on The Internet: Experiences with THOMAS. In Digital Libraries Conference DL'95, pp. 19-24.

[Croft & Harper, 1979] Croft, W. B., & Harper, D. J. (1979). Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35, 285-295.

[Crouch & Yang, 1992] Crouch, C. J., & Yang, B. (1992). Experiments in automatic statistical thesaurus construction. In Proceedings of ACM SIGIR International Conference on Research and Development in Information Retrieval, pp. 77-88.

[Deerwester et al., 1990] Deerwester, S., Dumais, S., Furnas, G., Landauer, T., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407.

`
Table 7: Retrieval performance of local feedback for varying numbers of feedback documents on the three collections.

Table 8: Downweighting the expansion concepts of local feedback by 50% on WEST.
`
