Modern Information Retrieval
Ricardo Baeza-Yates
Berthier Ribeiro-Neto
Unified EX1014 Page 1
ACM Press
New York
Addison-Wesley
Harlow, England • Reading, Massachusetts
Menlo Park, California • New York
Don Mills, Ontario • Amsterdam • Bonn
Sydney • Singapore • Tokyo • Madrid
San Juan • Milan • Mexico City • Seoul • Taipei


Copyright © 1999 by the ACM press, A Division of the Association for Computing Machinery, Inc. (ACM).
Addison Wesley Longman Limited
Edinburgh Gate
Essex CM20 2JE
and Associated Companies throughout the World.
The rights of the authors of this Work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE.
While the publisher has made every attempt to trace all copyright owners and obtain permission to reproduce material, in a few cases this has proved impossible. Copyright holders of material which has not been acknowledged are encouraged to contact the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Addison Wesley Longman Limited has made every attempt to supply trade mark information about manufacturers and their products mentioned in this book. A list of the trademark designations and their owners appears on page viii.
Typeset in Computer Modern by 56
Printed and bound in the United States of America
First printed 1999
ISBN 0-201-39829-X
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloguing-in-Publication Data
Baeza-Yates, R. (Ricardo)
Modern information retrieval / Ricardo Baeza-Yates, Berthier Ribeiro-Neto.
Includes bibliographical references and index.
ISBN 0-201-39829-X
1. Information storage and retrieval systems. I. Ribeiro, Berthier de Araújo Neto, 1960- . II. Title.


t-dimensional vectorial space and standard linear algebra operations on vectors. For the classic probabilistic model, the framework is composed of sets, standard probability operations, and the Bayes' theorem.
In the remainder of this chapter, we discuss the various IR models shown in Figure 2.1. Throughout the discussion, we do not explicitly instantiate the components $D$, $Q$, $F$, and $R(q_i, d_j)$ of each model. Such components should be quite clear from the discussion and can be easily inferred.

2.5 Classic Information Retrieval
In this section we briefly present the three classic models in information retrieval, namely the Boolean, the vector, and the probabilistic models.

2.5.1 Basic Concepts
The classic models in information retrieval consider that each document is described by a set of representative keywords called index terms. An index term is simply a (document) word whose semantics helps in remembering the document's main themes. Thus, index terms are used to index and summarize the document contents. In general, index terms are mainly nouns because nouns have meaning by themselves and thus their semantics is easier to identify and to grasp. Adjectives, adverbs, and connectives are less useful as index terms because they work mainly as complements. However, it might be interesting to consider all the distinct words in a document collection as index terms. For instance, this approach is adopted by some Web search engines as discussed in Chapter 13 (in which case, the document logical view is full text). We postpone a discussion on the problem of how to generate index terms until Chapter 7, where the issue is covered in detail.

Given a set of index terms for a document, we notice that not all terms are equally useful for describing the document contents. In fact, there are index terms which are simply vaguer than others. Deciding on the importance of a term for summarizing the contents of a document is not a trivial issue. Despite this difficulty, there are properties of an index term which are easily measured and which are useful for evaluating the potential of a term as such. For instance, consider a collection with a hundred thousand documents. A word which appears in each of the one hundred thousand documents is completely useless as an index term because it does not tell us anything about which documents the user might be interested in. On the other hand, a word which appears in just five documents is quite useful because it narrows down considerably the space of documents which might be of interest to the user. Thus, it should be clear that distinct index terms have varying relevance when used to describe document contents. This effect is captured through the assignment of numerical weights to each index term of a document.


Let $k_i$ be an index term, $d_j$ be a document, and $w_{i,j} \geq 0$ be a weight associated with the pair $(k_i, d_j)$. This weight quantifies the importance of the index term for describing the document semantic contents.

Definition Let $t$ be the number of index terms in the system and $k_i$ be a generic index term. $K = \{k_1, \ldots, k_t\}$ is the set of all index terms. A weight $w_{i,j} > 0$ is associated with each index term $k_i$ of a document $d_j$. For an index term which does not appear in the document text, $w_{i,j} = 0$. With the document $d_j$ is associated an index term vector $\vec{d}_j$ represented by $\vec{d}_j = (w_{1,j}, w_{2,j}, \ldots, w_{t,j})$. Further, let $g_i$ be a function that returns the weight associated with the index term $k_i$ in any $t$-dimensional vector (i.e., $g_i(\vec{d}_j) = w_{i,j}$).
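
As an illustration of this definition, a document can be stored as a $t$-dimensional weight vector over a fixed vocabulary. The sketch below is only one possible representation: the vocabulary, the weights, and the helper names `index_terms`, `doc_vector` and `g` are assumptions made for the example and do not come from the text.

```python
from typing import Dict, List, Sequence

# Hypothetical vocabulary K = {k_1, ..., k_t}; here t = 4.
index_terms: List[str] = ["computer", "network", "retrieval", "weight"]


def doc_vector(weights: Dict[str, float], terms: Sequence[str]) -> List[float]:
    """Build (w_1j, ..., w_tj); terms absent from the document get weight 0."""
    return [weights.get(term, 0.0) for term in terms]


def g(i: int, vector: Sequence[float]) -> float:
    """g_i: weight of index term k_i (1-based index) in a t-dimensional vector."""
    return vector[i - 1]


# Made-up weights for a document d_j that mentions two of the four terms.
d_j = doc_vector({"computer": 0.8, "network": 0.5}, index_terms)
print(d_j)        # [0.8, 0.5, 0.0, 0.0]
print(g(2, d_j))  # 0.5
```

The same accessor `g` works for query vectors, since the definition applies to any $t$-dimensional vector.
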
As we later discuss, the index term weights are usually assumed to be mutually independent. This means that knowing the weight $w_{i,j}$ associated with the pair $(k_i, d_j)$ tells us nothing about the weight $w_{i+1,j}$ associated with the pair $(k_{i+1}, d_j)$. This is clearly a simplification because occurrences of index terms in a document are not uncorrelated. Consider, for instance, that the terms computer and network are used to index a given document which covers the area of computer networks. Frequently, in this document, the appearance of one of these two words attracts the appearance of the other. Thus, these two words are correlated and their weights could reflect this correlation. While mutual independence seems to be a strong simplification, it does simplify the task of computing index term weights and allows for fast ranking computation. Furthermore, taking advantage of index term correlations for improving the final document ranking is not a simple task. In fact, none of the many approaches proposed in the past has clearly demonstrated that index term correlations are advantageous (for ranking purposes) with general collections. Therefore, unless clearly stated otherwise, we assume mutual independence among index terms. In Chapter 5 we discuss modern retrieval techniques which are based on term correlations and which have been tested successfully with particular collections. These successes seem to be slowly shifting the current understanding towards a more favorable view of the usefulness of term correlations for information retrieval systems.

The above definitions provide support for discussing the three classic information retrieval models, namely, the Boolean, the vector, and the probabilistic models, as we now do.

2.5.2 Boolean Model
The Boolean model is a simple retrieval model based on set theory and Boolean algebra. Since the concept of a set is quite intuitive, the Boolean model provides a framework which is easy to grasp by a common user of an IR system. Furthermore, the queries are specified as Boolean expressions which have precise semantics. Given its inherent simplicity and neat formalism, the Boolean model received great attention in past years and was adopted by many of the early commercial bibliographic systems.


Figure 2.3 The three conjunctive components for the query $[q = k_a \wedge (k_b \vee \neg k_c)]$.

The Boolean model suffers from major drawbacks. First, its retrieval strategy is based on a binary decision criterion (i.e., a document is predicted to be either relevant or non-relevant) without any notion of a grading scale, which prevents good retrieval performance. Thus, the Boolean model is in reality much more a data (instead of information) retrieval model. Second, while Boolean expressions have precise semantics, frequently it is not simple to translate an information need into a Boolean expression. In fact, most users find it difficult and awkward to express their query requests in terms of Boolean expressions. The Boolean expressions actually formulated by users often are quite simple (see Chapter 10 for a more thorough discussion on this issue). Despite these drawbacks, the Boolean model is still the dominant model with commercial document database systems and provides a good starting point for those new to the field.

The Boolean model considers that index terms are present or absent in a document. As a result, the index term weights are assumed to be all binary, i.e., $w_{i,j} \in \{0, 1\}$. A query $q$ is composed of index terms linked by three connectives: not, and, or. Thus, a query is essentially a conventional Boolean expression which can be represented as a disjunction of conjunctive vectors (i.e., in disjunctive normal form, DNF). For instance, the query $[q = k_a \wedge (k_b \vee \neg k_c)]$ can be written in disjunctive normal form as $[\vec{q}_{dnf} = (1, 1, 1) \vee (1, 1, 0) \vee (1, 0, 0)]$, where each of the components is a binary weighted vector associated with the tuple $(k_a, k_b, k_c)$. These binary weighted vectors are called the conjunctive components of $\vec{q}_{dnf}$. Figure 2.3 illustrates the three conjunctive components for the query $q$.
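
The conjunctive components of a small query can be found by brute force: enumerate every binary assignment to the query terms and keep those that satisfy the expression. The following is a minimal sketch under the assumption that the query is written as a Python predicate; the names `conjunctive_components` and `q` are hypothetical, and the text does not prescribe any particular query representation.

```python
from itertools import product
from typing import Callable, List, Tuple


def conjunctive_components(
    query: Callable[..., bool], num_terms: int
) -> List[Tuple[int, ...]]:
    """Return every binary vector over the query terms that satisfies the query."""
    return [
        bits
        for bits in product((1, 0), repeat=num_terms)
        if query(*(bool(b) for b in bits))
    ]


# q = k_a AND (k_b OR NOT k_c), expressed as a predicate over (k_a, k_b, k_c).
def q(ka: bool, kb: bool, kc: bool) -> bool:
    return ka and (kb or not kc)


q_dnf = conjunctive_components(q, 3)
print(q_dnf)  # [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
```

Enumeration costs $2^n$ evaluations for $n$ query terms, which is acceptable only because user queries tend to contain few terms.
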
Definition For the Boolean model, the index term weight variables are all binary, i.e., $w_{i,j} \in \{0, 1\}$. A query $q$ is a conventional Boolean expression. Let $\vec{q}_{dnf}$ be the disjunctive normal form for the query $q$. Further, let $\vec{q}_{cc}$ be any of the conjunctive components of $\vec{q}_{dnf}$. The similarity of a document $d_j$ to the query $q$ is defined as

$$sim(d_j, q) = \begin{cases} 1 & \text{if } \exists\, \vec{q}_{cc} \mid (\vec{q}_{cc} \in \vec{q}_{dnf}) \wedge (\forall k_i,\; g_i(\vec{d}_j) = g_i(\vec{q}_{cc})) \\ 0 & \text{otherwise} \end{cases}$$


If $sim(d_j, q) = 1$ then the Boolean model predicts that the document $d_j$ is relevant to the query $q$ (it might not be). Otherwise, the prediction is that the document is not relevant.

The Boolean model predicts that each document is either relevant or non-relevant. There is no notion of a partial match to the query conditions. For instance, let $d_j$ be a document for which $\vec{d}_j = (0, 1, 0)$. Document $d_j$ includes the index term $k_b$ but is considered non-relevant to the query $[q = k_a \wedge (k_b \vee \neg k_c)]$.
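
The all-or-nothing behaviour described here can be sketched in a few lines. The example assumes, for simplicity, that document vectors are restricted to the three query terms $(k_a, k_b, k_c)$; the names `Q_DNF` and `sim` are chosen for illustration only.

```python
from typing import Sequence, Tuple

# Conjunctive components of q = k_a AND (k_b OR NOT k_c) over (k_a, k_b, k_c),
# as listed in the text.
Q_DNF: Tuple[Tuple[int, int, int], ...] = ((1, 1, 1), (1, 1, 0), (1, 0, 0))


def sim(d_j: Sequence[int], q_dnf: Sequence[Sequence[int]]) -> int:
    """Return 1 if d_j agrees with some conjunctive component on every term, else 0."""
    return int(any(tuple(d_j) == tuple(q_cc) for q_cc in q_dnf))


print(sim((0, 1, 0), Q_DNF))  # 0: contains k_b but not k_a, so no partial credit
print(sim((1, 1, 0), Q_DNF))  # 1: equals the second conjunctive component
```

Because the result is either 0 or 1, the retrieved documents cannot be ordered among themselves.
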

The main advantages of the Boolean model are the clean formalism behind the model and its simplicity. The main disadvantages are that exact matching may lead to retrieval of too few or too many documents (see Chapter 10).
It is well known that index term weighting can lead to a substantial improvement in retrieval performance. Index term weighting brings us to the vector model.

2.5.3 Vector Model
The vector model [697, 695] recognizes that the use of binary weights is too limiting and proposes a framework in which partial matching is possible. This is accomplished by assigning non-binary weights to index terms in queries and in documents. These term weights are ultimately used to compute the degree of similarity between each document stored in the system and the user query. By sorting the retrieved documents in decreasing order of this degree of similarity, the vector model takes into consideration documents which match the query terms only partially. The main resultant effect is that the ranked document answer set is a lot more precise (in the sense that it better matches the user information need) than the document answer set retrieved by the Boolean model.

Definition For the vector model, the weight $w_{i,j}$ associated with a pair $(k_i, d_j)$ is positive and non-binary. Further, the index terms in the query are also weighted. Let $w_{i,q}$ be the weight associated with the pair $[k_i, q]$, where $w_{i,q} \geq 0$. Then, the query vector $\vec{q}$ is defined as $\vec{q} = (w_{1,q}, w_{2,q}, \ldots, w_{t,q})$ where $t$ is the total number of index terms in the system. As before, the vector for a document $d_j$ is represented by $\vec{d}_j = (w_{1,j}, w_{2,j}, \ldots, w_{t,j})$.

Therefore, a document $d_j$ and a user query $q$ are represented as $t$-dimensional vectors as shown in Figure 2.4. The vector model proposes to evaluate the degree of similarity of the document $d_j$ with regard to the query $q$ as the correlation between the vectors $\vec{d}_j$ and $\vec{q}$. This correlation can be quantified, for instance, by the cosine of the angle between these two vectors. That is,

$$sim(d_j, q) = \frac{\vec{d}_j \bullet \vec{q}}{|\vec{d}_j| \times |\vec{q}|} = \frac{\sum_{i=1}^{t} w_{i,j} \times w_{i,q}}{\sqrt{\sum_{i=1}^{t} w_{i,j}^2} \times \sqrt{\sum_{i=1}^{t} w_{i,q}^2}}$$


Figure 2.4 The cosine of $\theta$ is adopted as $sim(d_j, q)$.

where $|\vec{d}_j|$ and $|\vec{q}|$ are the norms of the document and query vectors. The factor $|\vec{q}|$ does not affect the ranking (i.e., the ordering of the documents) because it is the same for all documents. The factor $|\vec{d}_j|$ provides a normalization in the space of the documents.

Since $w_{i,j} \geq 0$ and $w_{i,q} \geq 0$, $sim(q, d_j)$ varies from 0 to +1. Thus, instead of attempting to predict whether a document is relevant or not, the vector model ranks the documents according to their degree of similarity to the query. A document might be retrieved even if it matches the query only partially. For instance, one can establish a threshold on $sim(d_j, q)$ and retrieve the documents with a degree of similarity above that threshold. But to compute rankings we need first to specify how index term weights are obtained.
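
The cosine ranking with a similarity threshold can be sketched as follows. The document vectors, the query vector, the zero threshold and the function names `cosine` and `rank` are all made up for illustration; returning 0 for an all-zero vector is a convention of this sketch, not something stated in the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(d: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two t-dimensional weight vectors."""
    dot = sum(w_d * w_q for w_d, w_q in zip(d, q))
    norm_d = math.sqrt(sum(w * w for w in d))
    norm_q = math.sqrt(sum(w * w for w in q))
    if norm_d == 0.0 or norm_q == 0.0:
        return 0.0  # sketch convention for an all-zero vector
    return dot / (norm_d * norm_q)


def rank(
    docs: Dict[str, Sequence[float]], q: Sequence[float], threshold: float = 0.0
) -> List[Tuple[str, float]]:
    """Keep documents whose similarity exceeds the threshold, best first."""
    scored = [(name, cosine(vec, q)) for name, vec in docs.items()]
    kept = [(name, s) for name, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


docs = {"d1": [1.0, 0.0, 0.0], "d2": [1.0, 1.0, 0.0], "d3": [0.0, 0.0, 2.0]}
query = [1.0, 1.0, 0.0]
ranking = rank(docs, query)
print([name for name, _ in ranking])  # ['d2', 'd1']
```

Here `d1` matches the query only partially yet is still retrieved, ranked below the full match `d2`, while `d3` shares no term with the query and falls below the threshold.
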

Index term weights can be calculated in many different ways. The work by Salton and McGill [698] reviews various term-weighting techniques. Here, we do not discuss them in detail. Instead, we concentrate on elucidating the main idea behind the most effective term-weighting techniques. This idea is related to the basic principles which support clustering techniques, as follows.

Given a collection $C$ of objects and a vague description of a set $A$, the goal of a simple clustering algorithm might be to separate the collection $C$ of objects into two sets: a first one which is composed of objects related to the set $A$ and a second one which is composed of objects not related to the set $A$. Vague description here means that we do not have complete information for deciding precisely which objects are and which are not in the set $A$. For instance, one might be looking for a set $A$ of cars which have a price comparable to that of a Lexus 400. Since it is not clear what the term comparable means exactly, there is not a precise (and unique) description of the set $A$. More sophisticated clustering algorithms might attempt to separate the objects of a collection into various clusters (or classes) according to their properties. For instance, patients of a doctor specializing in cancer could be classified into five classes: terminal, advanced, metastasis, diagnosed, and healthy. Again, the possible class descriptions might be imprecise (and not unique) and the problem is one of deciding to which of these classes a new patient should be assigned. In what follows, however, we only discuss the simpler version of the clustering problem (i.e., the one which considers only two classes) because all that is required is a decision on which documents are predicted to be relevant and which ones are predicted to be not relevant (with regard to a given user query).


To view the IR problem as one of clustering, we refer to the early work of Salton. We think of the documents as a collection $C$ of objects and think of the user query as a (vague) specification of a set $A$ of objects. In this scenario, the IR problem can be reduced to the problem of determining which documents are in the set $A$ and which ones are not (i.e., the IR problem can be viewed as a clustering problem). In a clustering problem, two main issues have to be resolved. First, one needs to determine what are the features which better describe the objects in the set $A$. Second, one needs to determine what are the features which better distinguish the objects in the set $A$ from the remaining objects in the collection $C$. The first set of features provides for quantification of intra-cluster similarity, while the second set of features provides for quantification of inter-cluster dissimilarity. The most successful clustering algorithms try to balance these two effects.

In the vector model, intra-clustering similarity is quantified by measuring the raw frequency of a term $k_i$ inside a document $d_j$. Such term frequency is usually referred to as the tf factor and provides one measure of how well that term describes the document contents (i.e., intra-document characterization). Furthermore, inter-cluster dissimilarity is quantified by measuring the inverse of the frequency of a term $k_i$ among the documents in the collection. This factor is usually referred to as the inverse document frequency or the idf factor. The motivation for usage of an idf factor is that terms which appear in many documents are not very useful for distinguishing a relevant document from a non-relevant one. As with good clustering algorithms, the most effective term-weighting schemes for IR try to balance these two effects.

Definition Let $N$ be the total number of documents in the system and $n_i$ be the number of documents in which the index term $k_i$ appears. Let $freq_{i,j}$ be the raw frequency of term $k_i$ in the document $d_j$ (i.e., the number of times the term $k_i$ is mentioned in the text of the document $d_j$). Then, the normalized frequency $f_{i,j}$ of term $k_i$ in document $d_j$ is given by

$$f_{i,j} = \frac{freq_{i,j}}{\max_l freq_{l,j}}$$

where the maximum is computed over all terms which are mentioned in the text of the document $d_j$. If the term $k_i$ does not appear in the document $d_j$ then $f_{i,j} = 0$. Further, let $idf_i$, inverse document frequency for $k_i$, be given by

$$idf_i = \log \frac{N}{n_i}$$

The best known term-weighting schemes use weights which are given by

$$w_{i,j} = f_{i,j} \times \log \frac{N}{n_i} \tag{2.3}$$


or by a variation of this formula. Such term-weighting strategies are called tf-idf schemes.
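
A tf-idf weight of this form can be computed directly from tokenized documents. The sketch below assumes the natural logarithm (the text writes $\log$ without stating a base) and a toy three-document collection; the function name `tf_idf` and the sample terms are illustrative only. Terms that do not occur in a document are simply left out of its dictionary, which corresponds to a weight of 0.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Per document, w_ij = f_ij * log(N / n_i) for every term it contains."""
    n_docs = len(docs)                                        # N
    doc_freq = Counter(t for doc in docs for t in set(doc))   # n_i
    weights: List[Dict[str, float]] = []
    for doc in docs:
        freq = Counter(doc)                                   # freq_ij
        max_freq = max(freq.values(), default=0)              # max_l freq_lj
        weights.append({
            term: (count / max_freq) * math.log(n_docs / doc_freq[term])
            for term, count in freq.items()
        })
    return weights


docs = [
    ["computer", "network", "network"],
    ["computer", "retrieval"],
    ["computer", "index", "index", "retrieval"],
]
w = tf_idf(docs)
print(w[0]["computer"])           # 0.0, the term appears in every document
print(round(w[0]["network"], 4))  # 1.0986
```

The term "computer" occurs in all three documents, so its idf is $\log 1 = 0$ and it contributes nothing to the ranking, in line with the earlier remark that a word present in every document is useless as an index term.
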

Several variations of the above expression for the weight $w_{i,j}$ are described in an interesting paper by Salton and Buckley which appeared in 1988 [696]. However, in general, the above expression should provide a good weighting scheme for many collections.
For the query term weights, Salton and Buckley suggest

$$w_{i,q} = \left(0.5 + \frac{0.5\, freq_{i,q}}{\max_l freq_{l,q}}\right) \times \log \frac{N}{n_i}$$

where $freq_{i,q}$ is the raw frequency of the term $k_i$ in the text of the information request $q$.
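
The query-side weighting can be sketched in the same style. The collection statistics below ($N$ and the document frequencies) are invented for the example, the natural logarithm is an assumption, and skipping query terms that never occur in the collection is a convention of this sketch rather than part of the formula.

```python
import math
from collections import Counter
from typing import Dict, List


def query_weights(
    query_terms: List[str], n_docs: int, doc_freq: Dict[str, int]
) -> Dict[str, float]:
    """w_iq = (0.5 + 0.5 * freq_iq / max_l freq_lq) * log(N / n_i)."""
    freq = Counter(query_terms)                    # freq_iq
    max_freq = max(freq.values(), default=0)       # max_l freq_lq
    return {
        term: (0.5 + 0.5 * count / max_freq) * math.log(n_docs / doc_freq[term])
        for term, count in freq.items()
        if doc_freq.get(term, 0) > 0               # skip terms unseen in the collection
    }


N = 1000                                   # hypothetical collection size
n = {"network": 10, "computer": 100}       # hypothetical document frequencies
w_q = query_weights(["network", "network", "computer"], N, n)
print({term: round(weight, 4) for term, weight in w_q.items()})
```

The constant 0.5 keeps every query term at no less than half of its idf value, so a term mentioned once in a query is not swamped by a term the user happened to repeat.
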

The main advantages of the vector model are: (1) its term-weighting scheme improves retrieval performance; (2) its partial matching strategy allows retrieval of documents that approximate the query conditions; and (3) its cosine ranking formula sorts the documents according to their degree of similarity to the query. Theoretically, the vector model has the disadvantage that index terms are assumed to be mutually independent (equation 2.3 does not account for index term dependencies). However, in practice, consideration of term dependencies might be a disadvantage. Due to the locality of many term dependencies, their indiscriminate application to all the documents in the collection might in fact hurt the overall performance.

Despite its simplicity, the vector model is a resilient ranking strategy with general collections. It yields ranked answer sets
