`Ho et al.
`
`(10) Patent No.: US 6,498,921 B1
`(45) Date of Patent: Dec. 24, 2002
`
`(54) METHOD AND SYSTEM TO ANSWER A
`NATURAL-LANGUAGE QUESTION
`
`(76) Inventors: Chi Fai Ho, 965 Astoria Dr.,
`Sunnyvale, CA (US) 94087; Peter P.
`Tong, 1807 Limetree La., Mountain
`View, CA (US) 94040
`
`( * ) Notice: Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/387,932
`
`(22) Filed: Sep. 1, 1999
`
`(51) Int. Cl.7: G09B 7/00
`(52) U.S. Cl.: 434/362; 434/118; 434/169;
`434/325; 704/257; 706/927
`(58) Field of Search: 434/118, 156,
`434/169, 185, 219, 307 R, 308, 322, 323,
`327, 325, 350, 362, 365; 704/1, 207, 257,
`258, 273, 276, 260; 706/927, 11, 55, 59,
`916; 707/5, 35; 709/201, 212, 227, 229,
`242, 244; 382/161, 229, 311; 340/7.23,
`7.29, 825.27; 345/326, 538
`
`
`Primary Examiner: Joe H. Cheng
`(74) Attorney, Agent, or Firm: Beyer Weaver & Thomas,
`LLP
`
`(57) ABSTRACT
`
`Methods and systems are provided to quickly and accurately
`respond to a natural-language question. The responses to the
`question can depend on additional information about the
`user asking the question, and on the subject matter of the
`question. For example, if the system knows that
`the user understands French, it can supply French answers
`to the user. Such additional information can improve the
`relevancy of the responses to the question. More than one
`response can be provided to the user to allow the user to
`pick the more appropriate one. One embodiment uses a
`computer with a database having many phrases and question
`formats. The computer identifies phrases in the question
`based on at least one grammatical rule and on phrases in the
`database. Then the computer links the phrases to categories
`based on at least one semantic rule, the subject matter of the
`question, and information about the user, such as previous
`questions asked by the user. The computer then selects at
`least two question formats based at least on scores provided
`to the categorized phrases. After
`the question formats are selected, the system allows the user
`to pick at least one of the question formats so as to have an
`answer to the question generated.
`
`34 Claims, 11 Drawing Sheets
`
`[Representative drawing: answer generator 54 with Question Regularizer (80), Phrase Identifier (82), Question Structure Identifier (84), Question Format Identifier (86) and Answer Identifier (88); factors affecting the Score: Question Subject Matter (350), User Previous Question (352), User Profile (354), From System Inquiry (362), Language Skill (356), Interest (358), Not Directly Entered by User (357), Ethnic Background (360), IP Address (359).]
`
`IPR2020-00686
`Apple EX1013 Page 1
`
`
`
`
`
`
`U.S. Patent    Dec. 24, 2002    Sheet 1 of 11    US 6,498,921 B1
`
`Figure 1: Input Device (52), Answer Generator (54), Output Device (56).
`
`
`
`Figure 2: Database (90), Question Regularizer (80), Phrase Identifier (82), Question Structure Identifier (84), Question Format Identifier (86), Answer Identifier (88).
`
`
`
`Figure 3: Set of steps 120: Regularize Question (122), Identify Phrases in Question (124), Generate Question Structures (126), Identify Question Formats (128), Allow User Picking Question Format (130), Identify Answer (132).
`
`
`
`Figure 4A: Physical embodiment (150) with server computer (152), client computer (154) and network (156).
`
`
`
`Figure 4B: Client computer with bus (159), processing unit (160) containing ALU, control and registers, main memory (162), I/O controller (164), peripheral controller (166), graphics adapter (168), network interface adapter (170), hard disk drive (172), floppy disk drive (174), keyboard (176), monitor (178), circuit board (180), audio signals (181), mouse (182) and network (120).
`
`
`
`Figure 5: Regularize Question: Change Verbs to Present Tense (202), Change Nouns to Singular Form (204).
`
`
`
`Figure 6: Set of steps 124: Identify Phrases from First Word (252), Identify Phrases from Second Word (254), Continue Identify from Remaining Words (256), Remove Words not in Identified Phrases (258), Generate Phrased Question (260).
`
`
`
`Figure 7: Set of steps 126: Link Phrases to Categories (302), Provide Scores to Categorized Phrase (304), Generate Question Structures (306), Provide Scores to Question Structures (308), Select Question Structures (310).
`
`
`
`Figure 8: Factors affecting the Score: Question Subject Matter (350), User Previous Question (352), User Profile (354), From System Inquiry (362), Language Skill (356), Interest (358), Not Directly Entered by User (357), Ethnic Background (360), IP Address (359).
`
`
`
`Figure 9: Set of steps 128: Identify Question Formats for Question Structures (402), Select Question Formats (404), Set Default Values in Question Formats (406).
`
`
`
`Figure 10: Retrieve Answer Formats (452), Access Answers (454).
`
`
`
`METHOD AND SYSTEM TO ANSWER A
`NATURAL-LANGUAGE QUESTION
`CROSS REFERENCE TO RELATED
`APPLICATION
`The present invention is a continuation-in-part of
`co-pending U.S. application entitled, “Learning Method and
`System Based on Questioning III”, filed on Jul. 2, 1999,
`invented by Chi Fai Ho and Peter Tong, and having a Ser.
`No. of 09/347,184, which is hereby incorporated by reference
`into this application.
`BACKGROUND OF THE INVENTION
`The present invention relates generally to methods and
`systems to answer a question, and more particularly to
`methods and systems to accurately answer a natural-language
`question.
`Numerous search engines in the market have provided us
`with an unprecedented amount of freely-available information.
`All we have to do is to type in our questions, and we
`will be inundated by information. For example, there is a
`search engine that regularly gives us tens of thousands of
`Web sites in response to a single question. It would take practically days
`to go through every single site to find our answer, especially
`if our network connections are through relatively low-speed
`modems. We do not want thousands of answers to our
`questions. All we want is a handful of meaningful ones.
`Another challenge faced by users of many search engines
`is to search by key words. We have to extract key words
`from our questions, and then use them to ask our questions.
`We might also use enhanced features provided by search
`engines, such as + or - delimiters before the key words, to
`indicate our preferences. Unfortunately, this is unnatural.
`How often do we ask questions using key words? The better
`way is to ask with a natural language.
`There are natural-language search engines. Some of them
`also provide a limited number of responses. However, their
`responses are inaccurate, and typically do not provide
`satisfactory answers to our questions. Their answers are not
`tailored to our needs.
`Providing accurate responses to natural-language questions
`is a very difficult problem, especially when our questions
`are not definite. For example, if you ask the question,
`“Do you like Turkey?”, it is not clear if your question is
`about the country Turkey or the animal turkey. Added to this
`challenge is the need to get answers quickly. Time is very
`valuable and we prefer not to wait for a long time to get our
`answers.
`Further complicating the problem is the need to get
`information from documents written in different languages.
`For example, if we want to learn about climbing Mount Fuji
`in Japan, probably most of the information is in Japanese.
`Many search engines in the United States only search for
`information in English, and ignore information in all other
`languages. The reason may be that translation errors
`would lead to even less accurate answers.
`It should be apparent from the foregoing that there is still
`a need for a natural-language question-answering system
`that can accurately and quickly answer our questions, without
`providing us with thousands of irrelevant choices.
`Furthermore, it is desirable for the system to provide us with
`information from different languages.
`SUMMARY OF THE INVENTION
`The present invention provides methods and systems that
`can quickly provide a handful of accurate responses to a
`natural-language question. The responses can depend on
`additional information about the user and about the subject
`matter of the question so as to significantly improve the
`relevancy of the responses. The user is allowed to pick one
`or more of the responses to have an answer generated.
`Furthermore, the answer to the question can be in a language
`different from the language of the question to provide more
`relevant answers.
`One embodiment of the present invention includes a
`system with an input device, an answer generator and an
`output device. The answer generator, having access to a
`database of phrases and question formats, identifies at least
`one phrase in the question to generate phrased questions.
`This identification process uses phrases in the database and
`at least one grammatical rule.
`The identified phrase can then be linked to at least one
`category based on, for example, one semantic rule. Then the
`system provides a score to the categorized phrase. This score
`can depend on a piece of information about the user and/or
`about the subject matter of the question. In one embodiment,
`this piece of information is different from the fact that the
`user has asked the question.
`The piece of information can be related to the user's
`response to an inquiry from the system. For example, the
`system can ask the user to specify the subject matter of the
`question. Assume that the user asks the following question:
`“In the eighteenth century, what did Indians typically eat?”
`The system can ask the user if the subject matter of the
`question is related to India or the aboriginal peoples of North
`America. Based on the user's response, the system can
`provide a more relevant response to the user.
`In another example, the piece of information is related to
`an interest of the user. Again, if the user is interested in
`traveling, and not food, certain ambiguities in his question
`can be resolved. Based on the user's response to certain
`inquiries from the system, the accuracy of the answer can be
`enhanced.
`In another embodiment, the piece of information about
`the user is related to a question previously asked by the user.
`For example, if the user has been asking questions on sports,
`probably the word, ball, in his question is not related to ball
`bearings, which are mechanical parts.
`Typically, the more information the system has on the user
`and the subject matter of the question, the more accurate is
`the answer to the user's question. The reason is similar to the
`situation of our responding to a friend's question before he
`even asks it. Sometimes we understand what a friend wants to
`know through non-verbal communication or our previous
`interactions.
`Based on information on the user, the score of the
`categorized phrase can change. In another embodiment,
`based on information on the subject matter the question is in,
`the score of the categorized phrase can change.
`After providing the score to the categorized phrase, the
`system can identify at least two question formats in the
`database based on the score. These question formats can
`again help the system resolve ambiguities in the question.
`For example, the question is, “How to play bridge?” Assume
`that the question is in the general subject area of card games.
`It is not clear if the user wants to find out basic rules on the
`card game bridge or to learn some more advanced techniques.
`Then, one question format can be on basic rules on
`bridge, and the other format can be on bridge techniques.
`The user is allowed to pick at least one of the question
`formats to have the corresponding answer generated.
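As a rough illustration of the selection step just described, the Python sketch below ranks candidate question formats by score and keeps two for the user to pick from. The `QuestionFormat` class, its fields, the example scores, the third candidate and the function `select_formats` are hypothetical names and values introduced only for this sketch; the text does not say how scores are computed or stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionFormat:
    """Hypothetical record for one candidate question format."""
    text: str
    score: float  # assumed to be derived from the categorized-phrase scores


def select_formats(candidates: list[QuestionFormat], count: int = 2) -> list[QuestionFormat]:
    """Return the `count` highest-scoring formats (fewer if there are
    not enough candidates)."""
    ranked = sorted(candidates, key=lambda f: f.score, reverse=True)
    return ranked[:count]


# Made-up candidates for the question "How to play bridge?"
candidates = [
    QuestionFormat("How is a bridge built?", 0.1),
    QuestionFormat("What are the basic rules of the card game bridge?", 0.8),
    QuestionFormat("What are some advanced bridge techniques?", 0.7),
]
for fmt in select_formats(candidates):
    print(fmt.text)
```

The user would then pick one of the printed formats, and only the picked format would be used to look up an answer.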
`In another embodiment, the answer can be in a language
`different from the language of the question. This improves
`the accuracy of the answers to the question. For example,
`if the user is interested in Japan, and if the user understands
`Japanese, based on the question format picked, a Japanese
`answer is identified to his English question. Such answers
`can provide more relevant information to the user.
`Other aspects and advantages of the present invention will
`become apparent from the following detailed description,
`which, when taken in conjunction with the accompanying
`drawings, illustrates by way of example the principles of the
`invention.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 shows one embodiment of the invention.
`FIG. 2 shows one embodiment of an answer generator of
`the invention.
`FIG. 3 shows one set of steps implemented by one
`embodiment of an answer generator of the invention.
`FIGS. 4A-B show embodiments implementing the invention.
`FIG. 5 shows examples of ways to regularize the question
`in the invention.
`FIG. 6 shows one set of steps related to identifying
`phrases in the question of the invention.
`FIG. 7 shows one set of steps related to identifying
`question structures in the invention.
`FIG. 8 shows examples of factors affecting scores in the
`invention.
`FIG. 9 shows one set of steps related to identifying
`question formats in the invention.
`FIG. 10 shows one set of steps related to identifying
`an answer in the invention.
`The same numerals in FIGS. 1-10 are assigned to similar
`elements in all the figures. Embodiments of the invention are
`discussed below with reference to FIGS. 1-10. However,
`those skilled in the art will readily appreciate that the
`detailed description given herein with respect to these figures
`is for explanatory purposes as the invention extends
`beyond these limited embodiments.
`DETAILED DESCRIPTION OF THE
`INVENTION
`FIG. 1 shows one embodiment of a system 50 of the
`present invention. It includes an input device 52 coupled to
`an answer generator 54, which is coupled to an output device
`56. FIG. 2 shows one embodiment of the answer generator
`54 implementing a set 120 of steps shown in FIG. 3.
`A user enters a question into the input device 52, such as
`a keyboard, a mouse or a voice recognition system. The
`question or a representation of the question can be transmitted
`by the input device to the answer generator 54.
`In one embodiment, the answer generator 54 includes a
`number of elements. The answer generator 54 can include a
`question regularizer 80, a phrase identifier 82, a question
`structure identifier 84, a question format identifier 86 and an
`answer identifier 88. In general terms, the question regularizer
`80 regularizes (step 122) words in the question, such as
`by replacing words with their roots; the phrase identifier 82
`identifies (step 124) phrases in the regularized question to
`generate phrased questions; the question structure identifier
`84 generates (step 126) question structures from the phrased
`question; based on the question structures, the question
`format identifier 86 identifies (step 128) and retrieves one or
`more question formats, which the user is allowed to pick
`from; and then the answer identifier 88 identifies (step 132)
`and retrieves one or more answers for the question. Note that
`the answer identifier 88 can access the Internet or the Web
`for answers.
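The chain of elements 80 to 88 can be pictured as a sequence of stages, each consuming the previous stage's output. The Python sketch below is only a structural outline under stated assumptions: `run_stages`, `regularize`, `identify_phrases` and `KNOWN_PHRASES` are hypothetical names, the two stages shown are toy stand-ins for steps 122 and 124, and the later stages (question structures, question formats, answers) are left out because they would be chained in the same way.

```python
from typing import Any, Callable

Stage = Callable[[Any], Any]  # hypothetical stage type: one input, one output


def run_stages(question: str, stages: list[tuple[str, Stage]]) -> dict[str, Any]:
    """Feed the question through the stages in order, keeping every
    intermediate result under the stage's name."""
    results: dict[str, Any] = {"question": question}
    value: Any = question
    for name, stage in stages:
        value = stage(value)
        results[name] = value
    return results


KNOWN_PHRASES = {"president"}  # assumed toy database content


def regularize(question: str) -> list[str]:
    # Toy stand-in for step 122: lower-case and split; no root replacement here.
    return question.lower().rstrip("?").split()


def identify_phrases(words: list[str]) -> list[str]:
    # Toy stand-in for step 124: keep single words found in the toy database.
    return [w for w in words if w in KNOWN_PHRASES]


trace = run_stages(
    "Who is the President?",
    [("regularized", regularize), ("phrases", identify_phrases)],
)
print(trace)
```

Keeping the intermediate results by name makes it easy to inspect what each element contributed before the next one runs.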
`The generator 54 can also include a database 90 of
`relevant information to be accessed by different elements of
`the generator 54. The database 90 can be a relational
`database, an object database or other forms of database.
`After the answer is generated, the output device 56, such
`as a monitor, a printer or a voice synthesizer, can present the
`answer to the user.
`FIG. 4A shows one physical embodiment 150 implementing
`one embodiment of the invention, preferably in software
`and hardware. The embodiment 150 includes a server computer
`152 and a number of client computers, such as 154,
`which can be a personal computer. Each client computer
`communicates with the server computer 152 through a dedicated
`communication link, or a computer network 156. In
`one embodiment, the link can be the Internet, intranet or
`other types of private-public networks.
`FIG. 4B shows one embodiment of a client computer 154.
`It typically includes a bus 159 connecting a number of
`components, such as a processing unit 160, a main memory
`162, an I/O controller 164, a peripheral controller 166, a
`graphics adapter 168, a circuit board 180 and a network
`interface adapter 170. The I/O controller 164 is connected to
`components, such as a hard disk drive 172 and a floppy disk
`drive 174. The peripheral controller 166 can be connected to
`one or more peripheral components, such as a keyboard 176
`and a mouse 182. The graphics adapter 168 can be connected
`to a monitor 178. The circuit board 180 can be coupled to
`audio signals 181; and the network interface adapter 170 can
`be connected to a network 120, which can be the Internet, an
`intranet, the Web or other forms of networks. The processing
`unit 160 can be an application specific chip.
`Different elements in the system 50 may be in different
`physical components. For example, the input device 52 and
`the output device 56 may be in a client computer, and the
`answer generator 54 may reside in a server computer. In
`another embodiment, the input device 52, the output device
`56, and the answer generator 54 other than the database 90 are
`in a client computer; and the database 90 is in a server
`computer. In another situation, the database 90 can reside in
`a storage medium in a client computer, or with part of it in
`the client computer and another part in the server computer.
`In a fourth embodiment, the system 50 is in a client
`computer. In yet another embodiment, the input device 52
`and the output device 56 are in a client computer; the answer
`generator 54 other than the database 90 is in a middleware
`apparatus, such as a Web server; and the database 90 with its
`management system is in a back-end server, which can be
`a database server. Note that different elements of the answer
`generator 54 can also reside in different components.
`In this invention, the question can be on a subject, which
`can be broad or narrow. In one embodiment, the subject can
`cover mathematics or history, or it can cover the JAVA
`programming language. In another embodiment, the subject
`covers information on a car, such as a Toyota Camry, and the
`user wants to understand this merchandise before buying it.
`In yet another embodiment, the subject covers the real estate
`market in a certain geographical area, and again the user
`wants to understand the market before buying a house.
`In one embodiment, a question can be defined as an
`inquiry demanding an answer; and an answer can be defined
`as a statement satisfying the inquiry.
`The question can be a natural-language question, which is
`a question used in our everyday language. A natural-language
`question can be in English or other languages, such
`as French. Examples of natural-language questions are:
`Who is the President?
`Like cream of mushroom soup?
`A statement that is not based on a natural language can be
`a statement that is not commonly used in our everyday
`language. Examples are:
`For Key in Key-Of(Table) do
`Do while X > 2
`In one embodiment, one grammatical rule is that a question
`is made of phrases; another grammatical rule is that
`every phrase is made of one or more words. Such rules can
`define a grammatical structure. A question formed under
`such rules is grammatically context-free, and the question is
`in a context-free grammatical structure.
`FIG. 5 shows examples of ways to regularize the question.
`The question regularizer 80 regularizes words in the
`question, for example, by replacing certain words in the
`question with their roots. One objective of the regularizer is
`to reduce the size of the database 90 and the amount of
`computation required to analyze the question.
`In one embodiment, the regularizer 80 identifies every
`word in the question. Then it replaces words with their roots
`if they are not already in their root forms. For example, the
`regularizer changes verbs (step 202) of different forms in the
`question into their present tense, and nouns (step 204) into
`their singular form.
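A minimal word-level sketch of this regularization, assuming small hand-written lookup tables. The tables `VERB_ROOTS` and `NOUN_ROOTS`, their entries, and the function `regularize_words` are hypothetical and only illustrate steps 202 and 204; a practical system would need far larger tables.

```python
# Hypothetical lookup tables mapping inflected forms to root forms.
VERB_ROOTS = {"ate": "eat", "eaten": "eat", "played": "play", "went": "go"}  # step 202
NOUN_ROOTS = {"indians": "indian", "soups": "soup", "children": "child"}      # step 204


def regularize_words(question: str) -> list[str]:
    """Lower-case the question, split it into words, and replace each word
    with its root form when one is known; other words pass through."""
    words = question.lower().strip("?!. ").split()
    regular = []
    for word in words:
        word = VERB_ROOTS.get(word, word)  # verbs to present tense
        word = NOUN_ROOTS.get(word, word)  # nouns to singular form
        regular.append(word)
    return regular


print(regularize_words("What soups went with the children?"))
```

Because many surface forms collapse onto one root, later stages only need entries for root forms, which is consistent with the stated goal of reducing the size of the database 90.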
`One approach to implement the replacement process is
`based on a hashing function. Every word in the question can
`be hashed into a hash value. In one embodiment, each
`character is represented by eight bits, such as by its corresponding
`eight-bit ASCII code. The hashing function is
`performed by first pairing characters together in every word
`of the question. If a word has an odd number of characters,
`then the last character of the word is paired with zero. Each
`pair of characters becomes a sixteen-bit number. Every word
`could have a number of sixteen-bit numbers. The character
`does not have to be represented by the eight-bit ASCII
`codes. In another embodiment, with each character represented
`by its sixteen-bit Unicode, the characters are not
`paired. Again every word could have a number of sixteen-bit
`numbers.
`For a word, add all of its sixteen-bit numbers, and
`represent the sum by a thirty-two bit number. For the
`thirty-two bit number, add the first two bytes and throw
`away the carry to generate a twenty-four bit number. This
`number is the hash value of the word. In one embodiment,
`each hash value can be used to represent two different words.
`One word can be in one language and the other in another
`language, with both languages represented by Unicode. A
`16 Mbit memory could be used to hold different combinations
`of twenty-four bit hash values to represent different
`words. This approach should be applicable to most natural
`languages.
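The word-hashing steps above can be sketched in Python for the eight-bit ASCII embodiment. Two points are assumptions of this sketch, not statements of the text: the first character of each pair is taken as the high byte, and "add the first two bytes and throw away the carry" is read as adding the two most significant bytes of the thirty-two bit sum modulo 256 and keeping the two low bytes unchanged. The function names are hypothetical.

```python
def pair_characters(word: str) -> list[int]:
    """Pack the word's 8-bit ASCII codes two at a time into 16-bit numbers.
    A word with an odd number of characters has its last character paired
    with zero. Assumption: the first character of a pair is the high byte."""
    codes = word.encode("ascii")
    if len(codes) % 2 == 1:
        codes += b"\x00"
    return [(codes[i] << 8) | codes[i + 1] for i in range(0, len(codes), 2)]


def fold_to_24_bits(total: int) -> int:
    """Reduce a 32-bit sum to 24 bits.
    Assumption: the two most significant bytes are added modulo 256 (the
    carry is discarded) and the result sits above the two low bytes."""
    total &= 0xFFFFFFFF
    top = ((total >> 24) + ((total >> 16) & 0xFF)) & 0xFF
    return (top << 16) | (total & 0xFFFF)


def word_hash(word: str) -> int:
    """Hash a word by summing its 16-bit numbers and folding the sum to 24 bits."""
    return fold_to_24_bits(sum(pair_characters(word)))


# 'c','a' -> 0x6361; 't',0 -> 0x7400; their sum is 0xd761
print(hex(word_hash("cat")))
```

For short words the sum fits in sixteen bits and the folding step leaves it unchanged; the fold only matters for sums that reach the upper byte.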
`In one embodiment, commonly-used words have been
`previously hashed and stored in the database 90. There are
`also tables generated that link the hash values of those words
`with the hash values of their root forms. Then, the hash
`values of words in the question are compared to hash values
`in the tables and may be replaced by root-form hash values.
`For example, the hash values of verbs of different forms in
`the question are mapped to and replaced by the hash values
`of their present tenses, and similarly, the hash values of
`plural nouns are mapped to and replaced by their corresponding
`singular form hash values.
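The table lookup described above can be sketched as a dictionary from hash values to root-form hash values. In this Python sketch the table `ROOT_HASHES`, its contents and the function `regularize_hashes` are hypothetical, and the numbers are invented for illustration; real entries would be the twenty-four bit hash values stored in the database 90.

```python
# Hypothetical table: hash value of an inflected word -> hash value of its root.
# The numbers are made up; they are not real hash values of any word.
ROOT_HASHES: dict[int, int] = {
    0x000101: 0x000100,  # e.g. a past-tense verb -> its present tense
    0x000201: 0x000200,  # e.g. a plural noun -> its singular form
}


def regularize_hashes(word_hashes: list[int], table: dict[int, int]) -> list[int]:
    """Replace each hash value that has a root-form entry; keep the others."""
    return [table.get(h, h) for h in word_hashes]


question_hashes = [0x000101, 0x000300, 0x000201]
print([hex(h) for h in regularize_hashes(question_hashes, ROOT_HASHES)])
```

Working on hash values rather than strings means the replacement is a single integer lookup per word.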
`In one embodiment, after some of the words in the
`question have been regularized, the phrase identifier 82 can
`identify phrases in the question. FIG. 6 shows one set 124 of
`steps related to identifying phrases. Note that the process of
`identifying does not have to include the process of
`understanding, determining its presence in the database, or
`extracting.
`In one embodiment, the identifier identifies phrases from
`the beginning or the first word (step 252) of the question. It
`identifies the first word in the question, and then determines
`if the first word is in the database 90. If it is, it will be
`classified as a phrase of the question. Then, the identifier
`identifies the first two words. If there is a corresponding term
`with such two words in the database 90, then the two words
`are classified as another phrase of the question.
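A minimal sketch of this prefix-growing search, assuming the database is represented by a plain Python set of known terms. The set `KNOWN_TERMS`, its contents, the function `phrases_from_start` and the `max_words` cap are hypothetical choices for this sketch, and it simply tries every prefix up to the cap instead of applying any early-stopping rule.

```python
# Hypothetical stand-in for the terms stored in the database.
KNOWN_TERMS = {"cream", "cream of mushroom soup", "soup"}


def phrases_from_start(words: list[str], known: set[str], max_words: int = 20) -> list[str]:
    """Test the first word, then the first two words, and so on, and
    collect every prefix of the question that is a known term."""
    found = []
    for length in range(1, min(len(words), max_words) + 1):
        candidate = " ".join(words[:length])
        if candidate in known:
            found.append(candidate)
    return found


print(phrases_from_start(["cream", "of", "mushroom", "soup"], KNOWN_TERMS))
```

Repeating the same search starting from the second word, the third word and so on (steps 254 and 256 in FIG. 6) would pick up phrases such as "soup" that do not begin at the first word.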
`The phrase determination process can again be done
`through a hashing function. One approach is to add the hash
`values of each of the words in a phrase. If the sum has more
`than 24 bits, throw away the carry. The remaining 24 bits
`would be the hash value of the phrase. For example, the two
`words in the question can be hashed into a hash value, which
`is compared to hash values in the database 90. If such a hash
`value exists in the database 90, then the two words are
`classified as a phrase. In one embodiment, this process
`continues on up to the first twenty words in the question.
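The phrase hash described above amounts to a sum modulo 2^24. The Python sketch below assumes the word hash values are already available as integers; the name `phrase_hash` and the example values are hypothetical.

```python
MASK_24 = (1 << 24) - 1  # keep the low 24 bits, discarding any carry


def phrase_hash(word_hashes: list[int]) -> int:
    """Add the hash values of the words in a phrase and drop the carry
    beyond 24 bits."""
    return sum(word_hashes) & MASK_24


# Made-up 24-bit word hash values whose sum overflows 24 bits.
print(hex(phrase_hash([0xFFFFFF, 0x000002])))
```

One consequence of a plain sum is that it does not depend on word order, so two phrases made of the same words in a different order share a hash value under this scheme.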
`In one embodiment, when a hash value for a certain
`number of words does not exist, the identifier stops adding
`another word to identify phrases in the question. However,
`a hash value that exists in the database 90 does not mean that
`its corresponding word or words can have independent
`meaning. The existence of a hash value in the database 90
`can imply that the phrase identifier 82 should continue on
`adding words to look for phrases. For example, the identifier
`82 should continue on adding words to identify the longest
`matching phrase, which can be a phrase with six words. For
`example, the term, “with respect,” may not be a phrase, or
`does not have independent meaning. But the hash value of
`such a term can be in the database 90. Then the identifier
`adds the next word in the question to determine if the
`three-word combination exists in the database 90. If the third
`word is the word “to”, then the three-word combination is a
`preposition with independent meaning, and can have