IPR2020-00755, No. 1016 Exhibit - Google Exhibit 1016 Choi, A Method for Improving Recall Precision on Information Retrieval Systems Using Multiple Terms (P.T.A.B. Mar. 27, 2020)

Collected Papers for 1998 Autumn Academic Conference by Korean Institute of Information Scientists and Engineers, Vol. 25, No. 2

Summary

Research on information retrieval systems that use multiple terms instead of single terms to express queries precisely has been active. However, few retrieval systems actually use multiple terms. One example is an information retrieval system based on keyfacts. A keyfact is a kind of multiple term that contains not only a keyword but also its related information. Keyfact-based information retrieval systems generate keyfacts with equal weights both when indexing documents and when extracting keyfacts from the query. However, a single noun phrase generates a variety of keyfacts depending on its meaning, so applying existing retrieval methods to these keyfacts is problematic. This paper therefore proposes a method that makes retrieval more precise by assigning an appropriate weight to each keyfact generated during indexing.

1. Overview

Most information retrieval systems retrieve information using keywords, which are single terms. When information is retrieved with keywords, the information being sought tends to be ambiguous. Ambiguity arises when a word has several meanings or when its meaning is too broad. One way to resolve this ambiguity is to use multiple terms. Unlike the single-term keywords used by existing information retrieval systems, a multiple term contains the keyword together with information related to it. Because this related information is included, the meaning of the keyword can be identified precisely. Related information is information that describes the characteristics of a keyword so that its meaning can be understood precisely. In a compound noun, the accompanying noun acts as related information; where there is an adnominal modifier, the modifier does; and in a sentence, the verb or adjective does [1-3]. The keyfact is a multiple-term concept that arose from this idea.

This paper proposes a method for assigning a weight to each keyfact in order to retrieve more precise information in a keyfact-based information retrieval system, and compares the precision obtained with this method against the precision obtained when equal weights are assigned. Chapter 2 explains keyfacts, and Chapter 3 covers the method of extracting them. Chapter 4 explains how weights are assigned during indexing, and Chapter 5 compares, through an experiment, the precision of the retrieved results when each keyfact is assigned a different weight with the precision when all keyfacts are assigned equal weights. Finally, Chapter 6 presents the conclusion and directions for future research.
`
2. Keyfacts

Search precision increases when information is retrieved with keywords that carry related information rather than with keywords alone, because the scope of retrieval is narrowed. Users compose queries from several words rather than a single word precisely in order to express an exact meaning.

The term keyfact originates from the idea that what represents the content of a document should be a fact, not a word; a keyfact therefore holds a keyword together with its related information. A keyfact consists of a central word and a subordinate word: the keyword is the central word and the related information is the subordinate word. A sentence can express something in many different ways, but expressions with the same meaning yield the same keyfact. A keyfact can thus be the same in meaning yet differ grammatically, because one keyfact can be expressed in several forms. In an experiment that extracted keywords from a document with the existing method and inferred the original document from the keywords alone, and likewise extracted noun phrases and inferred the original document from the noun phrases alone, the latter was shown to represent the original document better [4].

An index term must satisfy two conditions: first, it must represent the document, and second, it must be likely to appear again. Ordinary noun phrases represent the document to some degree, but they are very unlikely to appear again. In a keyfact-based information retrieval system, therefore, one noun phrase should be turned into several keyfacts.

3. How to extract keyfacts

Extracting keyfacts takes three main steps. First, the given sentence or phrase is analysed into morphemes. Second, ambiguities in the analysed morphemes are resolved. Ambiguity is resolved by comparing how closely a morpheme is related to the other morphemes in the same sentence or phrase; an easy way to do this is to use related nouns. However, resolving ambiguity completely is difficult, so at this step ambiguity resolution is applied only to very simple patterns, and the remaining cases are decided by their frequency in a corpus. Finally, keyfacts are extracted from the disambiguated morphemes according to the keyfact generation rules.

Figure 1 Keyfact Extractor Diagram (components: text, Dictionary, Morpheme Analyzer, Ambiguity Resolver, Keyfact Creator, Keyfact List)

After passing through the morpheme analyser and the ambiguity resolver, each morpheme has one part of speech and one meaning. Keyfacts are extracted as follows [5].

• A keyfact may consist of a central word alone. That is, a single noun (an existing keyword) is also used as a keyfact.
• When two keywords are connected by ‘의(of)’, the two keywords can form one central word, and the keyword after ‘의(of)’ can be a subordinate word.
• When two keywords are connected by ‘와/과(and)’, the two keywords can form one central word, and the keyword after ‘와/과(and)’ can be a subordinate word. In this case the positions of the central word and the subordinate word may be swapped.
• Derivative determiners, descriptive verbs, and non-descriptive verbs act as related verbs and can be used only as subordinate words.
• When two keywords connected without a particle form one keyword, their order is significant.

A keyfact is not created by turning one noun phrase into one keyfact; several keyfacts are created from one noun phrase. Although a noun phrase can be expressed as a single keyfact, doing so causes problems in partial matching against keyfacts generated by other patterns. Among the keyfacts generated in this way, some have both a central word and a subordinate word, while others have only a central word. If equal weights are assigned to both kinds when searching with these keyfacts, precise searches are not possible.

4. Indexing process

The method proposed in this paper differs from existing information retrieval systems in that each keyfact has its own weight during the indexing process. Because several keyfacts are generated from one noun phrase, giving all keyfacts the same weight is problematic. For example, consider the noun phrase “Essence of Appreciation.” The following keyfacts are created from this phrase.

[Appreciation, NIL], [Essence, NIL], [Appreciation, Essence], [Appreciation Essence, NIL]

Here [Appreciation, Essence] has a more precise meaning than [Appreciation, NIL] and [Essence, NIL]. Thus [Appreciation, Essence] should have a greater weight than [Appreciation, NIL] and [Essence, NIL].

Another example is the noun phrase “Order of God and Nature.” The following keyfacts are extracted.

[God, NIL], [Nature, NIL], [Order, NIL], [God, Order], [Nature, Order], [God Nature, ], [Nature God, ], [God Nature, Order], [Nature God, Order]

In this case as well, keyfacts with both a central word and a subordinate word should have a higher weight than those with only a central word.

Also, as the two examples above show, the keywords “Appreciation,” “Essence,” “God,” “Nature,” and “Order” each appear once in the body of the document, so the sum of the weights for each noun across the keyfacts generated from it should be the same. Among the keyfacts extracted from the noun phrase “Essence of Appreciation,” “Appreciation” appears in three of the four keyfacts. Among the keyfacts extracted from the noun phrase “Order of God and Nature,” “God” appears in six of the nine keyfacts. Each keyfact containing “Appreciation” should therefore be assigned a weight of 1/3, and each keyfact containing “God” a weight of 1/6. However, further experiments and study are needed to calculate more precise values.
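
The counting argument above can be checked with a short sketch. The Python below is illustrative only: the tuple representation of a keyfact, the helper names count_keyfacts_with and per_keyfact_share, and the use of None for NIL (and for the empty subordinate slot in [God Nature, ]) are assumptions made for this example, not part of the paper's system.

```python
from fractions import Fraction

# A keyfact is modelled as (central_words, subordinate_word); None stands for NIL.
# Both lists are copied from the two noun-phrase examples in the text.
ESSENCE_OF_APPRECIATION = [
    (("Appreciation",), None),
    (("Essence",), None),
    (("Appreciation",), "Essence"),
    (("Appreciation", "Essence"), None),
]

ORDER_OF_GOD_AND_NATURE = [
    (("God",), None),
    (("Nature",), None),
    (("Order",), None),
    (("God",), "Order"),
    (("Nature",), "Order"),
    (("God", "Nature"), None),
    (("Nature", "God"), None),
    (("God", "Nature"), "Order"),
    (("Nature", "God"), "Order"),
]


def count_keyfacts_with(word, keyfacts):
    """Number of keyfacts in which `word` occurs as a central or subordinate word."""
    return sum(1 for central, sub in keyfacts if word in central or word == sub)


def per_keyfact_share(word, keyfacts):
    """Equal share per keyfact so that the shares for `word` sum to 1."""
    k = count_keyfacts_with(word, keyfacts)
    if k == 0:
        raise ValueError(f"{word!r} does not occur in any keyfact")
    return Fraction(1, k)


print(count_keyfacts_with("Appreciation", ESSENCE_OF_APPRECIATION),
      len(ESSENCE_OF_APPRECIATION))                              # 3 4
print(per_keyfact_share("Appreciation", ESSENCE_OF_APPRECIATION))  # 1/3
print(per_keyfact_share("God", ORDER_OF_GOD_AND_NATURE))           # 1/6
```

Exact fractions are used so that the shares for one keyword add up to exactly 1, which is the property the text asks for.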
`
5. Search and experiment

Every keyfact has the form [central word, subordinate word] or [central word, NIL]. A [central word, subordinate word] keyfact is narrower and more precise in meaning than a [central word, NIL] keyfact, so it is more helpful for finding appropriate documents and information. The experiment used 23,112 documents from an encyclopaedia published by Kyemong Co., with a total data size of approximately 12 Mbytes. It was conducted for two cases: one in which all keyfacts have equal weights and one in which each keyfact has a different weight. The formula below was used in the latter case.

\[ \mathrm{Weight} = \frac{k}{N} \times p \]

Formula 1

In Formula 1, N is the total number of keyfacts generated from one noun phrase, k is the number of keyfacts that contain the particular word, and p is a correlation coefficient. In this paper, p was set to 1 when both a central word and a subordinate word exist and to 0.5 when only a central word exists.
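
As a rough illustration of how these quantities could be combined, the sketch below assumes Formula 1 has the form Weight = (k / N) × p. That form, the function names, and the sample inputs are assumptions for illustration; only the values of p (1 and 0.5) are taken from the text.

```python
def correlation_coefficient(has_subordinate):
    """p from the text: 1 if the keyfact has a subordinate word, 0.5 if central word only."""
    return 1.0 if has_subordinate else 0.5


def formula1_weight(n_keyfacts, k_containing, has_subordinate):
    """Assumed form of Formula 1: (k / N) * p."""
    if n_keyfacts <= 0 or not 0 <= k_containing <= n_keyfacts:
        raise ValueError("require N > 0 and 0 <= k <= N")
    return (k_containing / n_keyfacts) * correlation_coefficient(has_subordinate)


# Illustrative inputs: N = 4 keyfacts from one noun phrase, k = 3 of them contain the word.
w_pair = formula1_weight(4, 3, has_subordinate=True)
w_central_only = formula1_weight(4, 3, has_subordinate=False)
print(w_pair, w_central_only)  # 0.75 0.375
```

Whatever the exact form of the formula, the effect of p is that, for the same N and k, a keyfact with a subordinate word receives twice the weight of a central-word-only keyfact.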
`
In both cases the set of retrieved documents is the same: for the same query the same documents are retrieved, but their rankings differ between the two cases. A vector space model was used as the ranking algorithm.
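
To illustrate how weighting can change the ranking without changing which documents are retrieved, here is a minimal vector space sketch using cosine similarity. The toy documents, the raw counts, the (central, subordinate) tuple representation, and the choice to scale central-word-only keyfacts by 0.5 are all assumptions for this example; the paper does not describe its vector space implementation beyond naming the model.

```python
import math


def apply_weights(counts, central_only_factor):
    """Scale raw keyfact counts; keyfacts with a subordinate word keep factor 1.0."""
    return {
        keyfact: count * (1.0 if keyfact[1] is not None else central_only_factor)
        for keyfact, count in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(keyfact, 0.0) for keyfact, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, central_only_factor):
    """Return the names of documents with a non-zero score, best first."""
    q = apply_weights(query, central_only_factor)
    scores = {
        name: cosine(q, apply_weights(counts, central_only_factor))
        for name, counts in docs.items()
    }
    retrieved = [name for name, score in scores.items() if score > 0]
    return sorted(retrieved, key=lambda name: scores[name], reverse=True)


# Toy data: keyfacts are (central, subordinate) tuples; None stands for NIL.
query = {("Chuseok", None): 1, ("origin", None): 1, ("Chuseok", "origin"): 1}
docs = {
    "d1": {("Chuseok", None): 3, ("origin", None): 1},
    "d2": {("Chuseok", None): 1, ("Chuseok", "origin"): 1, ("festival", None): 2},
    "d3": {("kimchi", None): 2},
}

equal_ranking = rank(query, docs, central_only_factor=1.0)
weighted_ranking = rank(query, docs, central_only_factor=0.5)
print(equal_ranking)     # ['d1', 'd2']
print(weighted_ranking)  # ['d2', 'd1']
```

In this toy case both weightings retrieve the same two documents, but the document containing the [Chuseok, origin] keyfact moves to the top once central-word-only keyfacts are down-weighted.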

To measure precision, the answers to the queries were defined in advance. Precision indicates how well the retrieved results match these predefined answers. However, the conventional notion of precision was slightly extended: precision was determined with emphasis on the ranking of the retrieved results, by counting how many appropriate documents appear in the top 15 retrieved documents. For instance, suppose the predefined answer to a query consists of 10 documents. If all 10 documents appear in the top 15 of the retrieved documents, the precision is 100%; if 5 appear, it is 50%; and if none appears, it is 0%.
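
The extended precision measure described above can be written down directly. This is a minimal sketch; the function name precision_at_cutoff and the document identifiers are hypothetical. Following the worked example in the text, the denominator is the number of predefined answer documents, not the cutoff of 15.

```python
def precision_at_cutoff(ranked_ids, answer_ids, cutoff=15):
    """Fraction of predefined answer documents found in the top `cutoff` results."""
    answers = set(answer_ids)
    if not answers:
        raise ValueError("at least one predefined answer document is required")
    found = answers.intersection(ranked_ids[:cutoff])
    return len(found) / len(answers)


answers = [f"doc{i}" for i in range(10)]    # 10 predefined answer documents
fillers = [f"other{i}" for i in range(20)]  # documents that are not answers

all_in_top15 = answers + fillers                     # all 10 answers ranked first
half_in_top15 = answers[:5] + fillers + answers[5:]  # only 5 answers in the top 15
none_in_top15 = fillers + answers                    # no answers in the top 15

print(precision_at_cutoff(all_in_top15, answers))   # 1.0
print(precision_at_cutoff(half_in_top15, answers))  # 0.5
print(precision_at_cutoff(none_in_top15, answers))  # 0.0
```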
`
`
The answers to several queries are compared below.

1) “What is the origin of Chuseok?”

                      Keyword   Keyfacts with equal weights   Keyfacts with different weights
Retrieved documents   466       314                           314
Precision             75%       75%                           75%

2) “What are Jang Yeong-sil’s achievements?”

                      Keyword   Keyfacts with equal weights   Keyfacts with different weights
Retrieved documents   352       258                           258
Precision             77%       88%                           88%

3) “What is the difference between ultraviolet rays and infrared rays?”

                      Keyword   Keyfacts with equal weights   Keyfacts with different weights
Retrieved documents   1075      738                           738
Precision             30%       40%                           60%

In the first example, the same precision was obtained in all three cases. In the second example, retrieval with equally weighted keyfacts and retrieval with differently weighted keyfacts produced the same precision. However, this precision only reflects whether a document is included in the top 15. Looking at the rankings of the retrieved results one by one, the ranking obtained with differently weighted keyfacts was the closest to the ranking of the predefined answers. The average precision over the 40 predefined query answers is as follows:

                      Keyword   Keyfacts with equal weights   Keyfacts with different weights
Precision             69%       74%                           78%

6. Conclusion and future research direction

When information is retrieved using keyfacts, which are multiple terms, assigning a different weight to each keyfact gave better precision than assigning the same weight to all keyfacts. The difference in precision between the two cases was not large, but in terms of the ranking of the retrieved results, more appropriate documents were retrieved when differently weighted keyfacts were used. Future research should develop a method or algorithm that assigns weights suited to each case, rather than treating the weight as only two cases.

References

[1] C. J. van Rijsbergen, Information Retrieval, Butterworths, London, Second Edition, 1979.
[2] Kyungtak Jung, Dongsi Choi, Miseon Jeon, Raewon Seo and Seyoung Park, “ETRI-NLPS natural language process form tag set for meaning based information retrieval,” Natural Language Processing Section, The Electronics and Telecommunications Research Institute, 1997.
[3] G. Salton and M. G. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, New York, 1983.
[4] “Contents based multimedia information retrieval technology development,” The Electronics and Telecommunications Research Institute.
[5] Daisuk Jang, “Keyfact concept based information retrieval system,” master’s thesis, Department of Computer Science, Hanyang University, 1997.

Copyright © 2005 NuriMedia Co., Ltd.


I, Yoon Hee Choi, am a professional Korean to English translator based in Texas. I am competent to translate from Korean to English, and I have 11 years of experience doing so. I hereby certify that the attached English document A Method for Improving Recall Precision on Information Retrieval Systems Using Multiple Terms is an accurate translation of the attached Korean document 다중단어를 사용한 정보검색 시스템에서의 재현정확도 향상방법 to the best of my knowledge and belief.

I declare that all statements made herein of my knowledge are true, and that all statements made on information and belief are believed to be true, and that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code.

Signed:

Name: Yoon Hee Choi

Date: 5 August 2019
`
`

다중단어를 사용한 정보검색 시스템에서의 재현정확도 향상방법

최종희*, 최동시**, 박세영**, 오희국*
한양대학교 전자계산학과*, 한국전자통신연구원**

A Method for Improving Recall Precision on Information Retrieval Systems Using Multiple Terms

Jonghee Choi*, Dongsi Choi**, Seyoung Park**, Heekuck Oh*
Dept. of Computer Science, Hanyang Univ.*, ETRI**
`
