`
`AOL Ex. 1026
`Page 1 of 5
`
`
`
[…] is an example of multi-sense combination: […] (ice), […] (box) and […] (refrigerator) are all words. The character string […] is an example of intersection combination: […] (paddle) is a word and […] (sell-at-sale-price) is also a word, whereas […] is the intersection character. The string […] illustrates the typical segmentation ambiguity caused by word chains. The segmentation of this string can be either

(The ping-pong balls were sold out at sale price.)

or

(The paddles for table tennis were sold out.)

Some ambiguities can be solved by word structure knowledge. Others can be disambiguated by syntactic and/or semantic knowledge. The most difficult disambiguation is that requiring contextual or pragmatic knowledge to arrive at an appropriate interpretation, as in the string […], which can be segmented into:

(Students will write a paper.)

or

(Student-association writes a paper.)

Both are syntactically and semantically correct. In this case, contextual information would allow the reader to trace the information claimed in the previous statements to solve ambiguity problems.

3 Reasoning theory for Chinese segmentation disambiguation

A model of evidential strength in inexact reasoning studied by Buchanan and Shortliffe (1984) has been successfully implemented in the MYCIN system. The theory is that, if a hypothesis can be derived from various types of mutually exclusive evidence, then the strength of truth of the hypothesis can be increased to reach a plausible conclusion.

Two concepts, MB[h, e] and MD[h, e], have been introduced as the measures of belief and disbelief. MB[h, e] means the measure of increased belief in the hypothesis h, based on the evidence e. MD[h, e] means the measure of increased disbelief in the hypothesis h, based on the evidence e.
To facilitate comparison of the evidential strength of competing hypotheses, the certainty factor CF is introduced to combine degrees of belief and disbelief as follows:

$$CF[h, e] = MB[h, e] - MD[h, e]$$

In the case that a hypothesis is derived from a number of mutually exclusive observations, the combining functions are defined as:

$$MB[h, e_1 \& e_2] = \begin{cases} 0 & \text{if } MD[h, e_1 \& e_2] = 1 \\ MB[h, e_1] + MB[h, e_2] \cdot (1 - MB[h, e_1]) & \text{otherwise} \end{cases}$$

$$MD[h, e_1 \& e_2] = \begin{cases} 0 & \text{if } MB[h, e_1 \& e_2] = 1 \\ MD[h, e_1] + MD[h, e_2] \cdot (1 - MD[h, e_1]) & \text{otherwise} \end{cases}$$

In the case that two hypotheses are established with positive evidence from syntactic and semantic knowledge to the same degree, no discrimination between the strengths of truth of the hypotheses can be drawn. If world knowledge provides positive evidence for the first hypothesis and negative evidence for the second, then the first hypothesis is stronger than the second. Therefore, the first hypothesis would be the most likely correct segmentation.

A weighted certainty factor is proposed here to represent the importance of various linguistic aspects. The weight is a vector of four elements representing the importance of morphology, syntax, semantics and pragmatics, respectively, which total 1, i.e.

$$CF_i[h, e] = W_i * CF[h, e]$$

where $W_i$ is the weight of the certainty factor $CF_i$ in hypothesis h supported by the evidence e with respect to one of the linguistic
aspects. Suppose the weight vector (0.1, 0.2, 0.3, 0.4) is assigned for morphology, syntax, semantics and pragmatics, respectively; then the following example illustrates the function of the weighted certainty factor $CF_i[h, e]$.

[…]
(The third leader in our company does not have much power)

The word […] produces two segmentations:

(the third leader in our company does not have much power)

or:

(the third piece-of hand in our company does not have much power)

To estimate the strength of truth of the first hypothesis, suppose:

• the word structure rule gives the evidential strength (0.5) for the hypothesis because the word chain […] can be either […] (piece-of hand) or […] (leader). Therefore, $CF_1[h, e_1] = W_1 * CF[h, e_1] = 0.05$ and $CF[h, e_1] = MB[h, e_1] - MD[h, e_1] = 0.05$
• the syntactic rule gives the evidential strength (1) because it definitely is a grammatical sentence. Therefore, $CF_2[h, e_2] = W_2 * CF[h, e_2] = 0.2$ and $CF[h, e_1 \& e_2] = MB[h, e_1 \& e_2] - MD[h, e_1 \& e_2] = 0.24$
• the semantic rule gives the evidential strength (1) since […] (the leader) can have power. Therefore, $CF_3[h, e_3] = W_3 * CF[h, e_3] = 0.3$ and $CF[h, e_1 \& e_2 \& e_3] = MB[h, e_1 \& e_2 \& e_3] - MD[h, e_1 \& e_2 \& e_3] = 0.46$
• the world knowledge rule gives the evidential strength (0.8) because it is quite true that the leader has less power than that of the first or second leader. Therefore, $CF_4[h, e_4] = W_4 * CF[h, e_4] = 0.32$ and $CF[h, e_1 \& e_2 \& e_3 \& e_4] = MB[h, e_1 \& e_2 \& e_3 \& e_4] - MD[h, e_1 \& e_2 \& e_3 \& e_4] = 0.63$

The certainty factor CF of the hypothesis […] is 0.63. Therefore, this segmentation is likely to be a coherent string.
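The accumulation used in this worked example can be sketched in a few lines of Python. This is a minimal sketch, not the author's implementation: the function name `combined_cf` and the constants are hypothetical, and it assumes (as the reported figures suggest) that each weighted factor $CF_i = W_i * CF$ is fed into the "otherwise" branch of the combining functions, with positive values accumulating in MB and negative values in MD. The special cases where MB or MD equals 1 are not handled.

```python
from typing import Sequence


def combined_cf(weights: Sequence[float], strengths: Sequence[float]) -> float:
    """Accumulate weighted evidence into MB and MD, then return CF = MB - MD.

    Assumption: positive weighted factors update MB and negative ones update
    MD, each through x_new = x + y * (1 - x). The MB = 1 / MD = 1 special
    cases of the combining functions are omitted in this sketch.
    """
    mb = 0.0  # measure of belief accumulated so far
    md = 0.0  # measure of disbelief accumulated so far
    for w, s in zip(weights, strengths):
        weighted = w * s  # weighted certainty factor CF_i = W_i * CF
        if weighted >= 0:
            mb = mb + weighted * (1.0 - mb)
        else:
            md = md + (-weighted) * (1.0 - md)
    return mb - md


# Weight vector from the text: morphology, syntax, semantics, pragmatics.
WEIGHTS = (0.1, 0.2, 0.3, 0.4)
# Evidential strengths given for the first hypothesis.
FIRST_HYPOTHESIS = (0.5, 1.0, 1.0, 0.8)

print(f"{combined_cf(WEIGHTS, FIRST_HYPOTHESIS):.4f}")  # prints 0.6382
```

Without intermediate rounding the result is about 0.638; the values 0.46 and 0.63 reported in the text are consistent with truncating each intermediate result to two decimal places, although the text does not state this.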
To estimate the evidential strength of the second hypothesis, suppose:

• the word structure rule gives the evidential strength (0.5) for this hypothesis since […] can be either […] (piece-of hand) or […] (leader). Therefore, $CF_1[h, e_1] = W_1 * CF[h, e_1] = 0.05$ and $CF[h, e_1] = MB[h, e_1] - MD[h, e_1] = 0.05$
• the syntactic rule gives the evidential strength (1) because it is a grammatical sentence. Therefore, $CF_2[h, e_2] = W_2 * CF[h, e_2] = 0.2$ and $CF[h, e_1 \& e_2] = MB[h, e_1 \& e_2] - MD[h, e_1 \& e_2] = 0.24$
• the semantic rule gives the negative evidential strength (-1) because the phrase "the hand of a company" violates the semantic constraint. Therefore, $CF_3[h, e_3] = W_3 * CF[h, e_3] = -0.3$ and $CF[h, e_1 \& e_2 \& e_3] = MB[h, e_1 \& e_2 \& e_3] - MD[h, e_1 \& e_2 \& e_3] = -0.06$
• the world knowledge rule gives a negative evidential strength (-1) because a company does not have a hand as one of its components. Therefore, $CF_4[h, e_4] = -0.4$ and $CF[h, e_1 \& e_2 \& e_3 \& e_4] = -0.34$

The certainty factor CF of the hypothesis […] is -0.34.
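As a check on these figures, the second hypothesis can be worked through step by step. This derivation assumes, as the reported values suggest, that the positive weighted factors (0.05 and 0.2) accumulate in MB and the magnitudes of the negative ones (0.3 and 0.4) accumulate in MD, each through the combining functions of Section 3:

```latex
\begin{align*}
MB[h, e_1 \& e_2] &= 0.05 + 0.2 \cdot (1 - 0.05) = 0.24 \\
MD[h, e_3 \& e_4] &= 0.3 + 0.4 \cdot (1 - 0.3) = 0.58 \\
CF[h, e_1 \& e_2 \& e_3 \& e_4] &= 0.24 - 0.58 = -0.34
\end{align*}
```

The intermediate value after the semantic rule alone follows in the same way: 0.24 - 0.3 = -0.06.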
Therefore, this segmentation is unlikely to be a coherent string.

4 Discussion

The assignment of the weight vector is empirical. It is based on the following analysis, in which 1's represent the truth of each piece of evidence or hypothesis and 0's represent falsity. Since the segmentation algorithm always produces a segmented string, it is assumed that the evidence from morphology is true in varying degrees depending on the complexity of the word chain. The justification of a hypothesis is based on the evidence presented by the pragmatic, semantic and syntactic aspects shown in the following table.

Case   pragmatic   semantic   syntactic   hypothesis
(1)    0           0          0           0
(2)    0           0          1           0
(3)    0           1          0           0
(4)    0           1          1           0
(5)    1           0          0           1
(6)    1           0          1           1
(7)    1           1          0           1
(8)    1           1          1           1

• Case(1) indicates that if no evidence can prove the truth of the hypothesis, then the hypothesis is false.
• Case(2) indicates that if the evidence supports a grammatical but incoherent sentence inconsistent with the context/circumstance, then the hypothesis is false, as in the case of […] (a banana ate a monkey).
• Case(3) indicates that if the evidence supports a meaningful but ungrammatical string inconsistent with the context/circumstance, then the hypothesis is false, e.g. […] (he wretch) against the real fact that he is a nice guy.
• Case(4) indicates that even if the evidence supports a grammatical, meaningful sentence, if that sentence is inconsistent with the context/circumstance, then the hypothesis is false, e.g. […] (the president's forced resignation makes people angry) violates the circumstance that people hate the president.
• Case(5) indicates the case of an idiomatic expression where the string is literally ungrammatical and incoherent, but as a whole can be interpreted figuratively to make perfect sense. Therefore, we assume that the hypothesis is true, as in the case of […], which literally means "car-water-horse-dragon" but figuratively means "very crowded".
• Case(6) indicates the case of a metaphor or metonymy, which superficially is a grammatical but incoherent string, but which by reasoning with the support of world knowledge can be interpreted as a meaningful string. It is then assumed that the hypothesis is true, e.g. […] (I drink North-West wind) means "I have nothing to eat".
• Case(7) indicates that if the evidence supports a meaningful but ungrammatical string consistent with the context/circumstance, then the hypothesis is true, as in […] (he wretch), which is consistent with the real fact that he is a bad guy.
• Case(8) indicates that if all evidence gives positive support to the hypothesis, then the hypothesis is true.

From the analysis, it seems that pragmatic knowledge provides the strongest evidence for the hypothesis. Therefore, the highest weight is assigned to the pragmatic aspect of the certainty factor. In the absence of pragmatic information, a default assumption is made that semantic evidence is more important than syntactic evidence. This can be observed in daily life: people communicate through many ungrammatical expressions without any problem in conveying the message, as in this brief e-mail message from Dr. Yorick Wilks to the researchers in the Computing Research Laboratory at New Mexico State University:

DRAFT-comments hard copy best-asap to yw pls.

It means "To
write the comment for the DRAFT on the hard copy would be the best. Please return it to Yorick Wilks as soon as possible."

The certainty factor CF is used under the premise that all of the evidence is rendered by mutually exclusive observations. Since language is an expression integrating syntactic, semantic and pragmatic information, is the syntactic, semantic and pragmatic evidence mutually exclusive? This is not so clear. All knowledge is culturally dependent, i.e. one particular instance may be acceptable in one culture but not in another. In this research a default assumption is made that the observations from various language aspects are independent. The question is left open for further discussion.

5 References

Buchanan, B. and E. Shortliffe. (1984). Uncertainty and Evidential Support. In B. G. Buchanan and E. H. Shortliffe Ed., Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project, Addison-Wesley Publishing Company, pp. 209-232.

Chang, J. S., et al. (1991). Chinese word segmentation through constraint satisfaction and statistical optimization. Proc. of the 4th R.O.C. Computational Linguistics Conference, pp. 147-165.

Chen, K. J. and S. H. Liu. (1992). Word Identification for Mandarin Chinese Sentences. Proc. of the 5th International Conference on Computational Linguistics, Vol. 1, pp. 101-107.

Chiang, T. H., et al. (1992). Statistical models for segmentation and unknown word resolution. Proc. of the 5th R.O.C. Computational Linguistics Conference, pp. 123-146.

He, K. K., et al. (1991). The Design Principle for a Written Chinese Automatic Segmentation Expert System. Journal of Chinese Information Processing, Vol. 5, No. 2, pp. 1-14.

Huang, X. X. and D. Y. Liu. (1988). The Phenomenon of Word Chain and the Automatic Segmentation in Written Chinese. Journal of the Development of Knowledge Engineering, pp. 287-291.

Jin, W. and J. Y. Nie. (1993). Segmentation du Chinois -- une Etape Cruciale vers la Traduction Automatique du Chinois. In P. Bouillon and A. Clas Ed., La Traductique, Les presses de l'Universite de Montreal, pp. 349-363.

Jin, W. (1992). A Case Study: Chinese Segmentation and its Disambiguation. MCCS-92-227, Computing Research Laboratory, New Mexico State University.

Liang, N. Y. and Y. B. Zhen. (1991). A Chinese Word Segmentation Model and a Chinese Word Segmentation System PC-CWSS. Proc. of COLIPS, Vol. 1, No. 1, pp. 51-55.

Liu, Y. Q. (1987). Difficulties in Chinese Language Processing and Method to their Solution. Proc. of 1987 International Conference on Chinese Information Processing, Vol. 2, pp. 125-126.

Nie, J. Y. and W. Jin. (1994). A Hybrid Approach to Unknown Word Detection and Segmentation of Chinese. To appear in Proc. of International Conference on Chinese Computing '94 (ICCC94).

Sproat, R. and C. Shih. (1991). A statistical method for finding word boundaries in Chinese text. Computer Processing of Chinese and Oriental Languages, Vol. 4, No. 4, pp. 336-351.

Wang, L. J., et al. (1991). A Parsing Method for Identifying Words in Mandarin Chinese Sentences. Proc. of the 12th International Joint Conference on Artificial Intelligence, Vol. 2, pp. 1018-1023.