`US007203646B2
`
(12) United States Patent
`Bennett
`
(10) Patent No.: US 7,203,646 B2
(45) Date of Patent: Apr. 10, 2007
`
`(54) DISTRIBUTED INTERNET BASED SPEECH
`RECOGNITION SYSTEM WITH NATURAL
`LANGUAGE SUPPORT
`
(75) Inventor: Ian M. Bennett, Palo Alto, CA (US)
`
`(73) Assignee: Phoenix Solutions, Inc., Palo Alto, CA
`(US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 11/419,736
`
(22) Filed: May 22, 2006
`
(65) Prior Publication Data

US 2006/0200353 A1    Sep. 7, 2006
`
`Related U.S. Application Data
`
`(63) Continuation of application No. 09/439,174, filed on
`Nov. 12, 1999, now Pat. No. 7,050,977.
`
(51) Int. Cl.
G10L 15/18 (2006.01)
G06F 17/20 (2006.01)
(52) U.S. Cl. ........................ 704/257; 704/270.1; 707/5
(58) Field of Classification Search ................ 704/251,
704/252, 255, 257, 270, 270.1, 275; 707/3,
707/4, 5
`See application file for complete search history.
`
(56) References Cited

U.S. PATENT DOCUMENTS

4,473,904 A     9/1984  Suehiro et al.
4,587,670 A     5/1986  Levinson et al.
4,783,803 A    11/1988  Baker et al.
4,785,408 A    11/1988  Britton et al.
4,852,170 A     7/1989  Bordeaux
4,914,590 A     4/1990  Loatman et al.
4,991,094 A     2/1991  Fagan et al.
4,991,217 A     2/1991  Garrett et al.
5,068,789 A    11/1991  van Vliembergen
5,146,405 A     9/1992  Church
5,157,727 A    10/1992  Schloss
5,231,670 A     7/1993  Goldhor et al.
5,278,980 A *   1/1994  Pedersen et al. ............... 707/4
5,293,584 A     3/1994  Brown et al.
`
`(Continued)
`
`FOREIGN PATENT DOCUMENTS
`
EP    1094388    4/2001
`
`(Continued)
`
`OTHER PUBLICATIONS
`
Coffman, Daniel et al., Provisional Application for Patent, U.S.
`Appl. No. 60/117,595, filed Jan. 27, 1999, 111 pages.
`
`(Continued)
`
`Primary Examiner-Martin Lerner
`(74) Attorney, Agent, or Firm-J. Nicholas Gross
`
(57) ABSTRACT

A speech-enabled internet based computing system includes
a configurable speech recognition engine used for interacting
with content on a web accessible page. The speech
recognition engine is distributed across a client and server
architecture, and is adaptive so that speech processing
operations can be allocated as needed between the two. This
allows for support for client devices having differing computing
capabilities. Natural language operations can also be
supported as desired. A user can thus interact with a web
page and select items of interest using speech as a mode of
input. Dynamic grammars can assist in the recognition
operations to improve speed and comprehension.
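
The adaptive allocation described in the abstract can be pictured with a small sketch. It is illustrative only: the stage names, the relative costs, and the function `allocate_stages` are assumptions introduced here and do not appear in the patent. The sketch assigns a leading portion of a speech pipeline to the client, as far as a notional client budget allows, and leaves the remainder to the server.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical pipeline stages in processing order, with made-up relative costs.
STAGES = [
    ("capture", 1),
    ("feature_extraction", 3),
    ("acoustic_decoding", 20),
    ("natural_language", 10),
]


@dataclass
class Allocation:
    client: List[str]
    server: List[str]


def allocate_stages(client_budget: int) -> Allocation:
    """Give the client a prefix of the pipeline that fits its budget.

    Once one stage is assigned to the server, every later stage is too,
    so data crosses the network only once.
    """
    client, server = [], []
    remaining = client_budget
    for name, cost in STAGES:
        if not server and cost <= remaining:
            client.append(name)
            remaining -= cost
        else:
            server.append(name)
    return Allocation(client, server)
```

Under these assumed costs, a device with a budget of 4 would keep capture and feature extraction locally and hand decoding and natural language processing to the server; a more capable device would keep more.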
`
`10 Claims, 31 Drawing Sheets
`
Step 1: Speech Recognition of User's Query
`
`IPR2020-00686
`Apple EX1009 Page 1
`
`
`
`
`U.S. PATENT DOCUMENTS
`
5,371,901 A    12/1994  Reed et al.
5,384,892 A     1/1995  Strong
5,454,106 A *   9/1995  Burns et al. ................... 707/4
5,475,792 A    12/1995  Stanford et al.
5,509,104 A     4/1996  Lee et al.
5,513,298 A     4/1996  Stanford et al.
5,553,119 A     9/1996  McAlliser et al.
5,602,963 A     2/1997  Bissonnette et al.
5,625,814 A *   4/1997  Luciw ........................... 707/5
5,652,897 A     7/1997  Linebarger et al.
5,668,854 A     9/1997  Minakami et al.
5,675,707 A    10/1997  Gorin et al.
5,680,511 A    10/1997  Baker et al.
5,680,628 A    10/1997  Cams et al.
5,694,592 A *  12/1997  Driscoll ......................... 707/3
5,727,950 A     3/1998  Cook et al.
5,758,322 A     5/1998  Rongley
5,802,526 A     9/1998  Fawcett et al.
5,819,220 A    10/1998  Sarukkai et al.
5,836,771 A    11/1998  Ho et al.
5,860,063 A     1/1999  Gorin et al.
5,867,817 A     2/1999  Catallo et al.
5,873,062 A     2/1999  Hansen et al.
5,884,302 A     3/1999  Ho
5,915,236 A     6/1999  Gould et al.
5,934,910 A     8/1999  Ho et al.
5,956,683 A     9/1999  Jacobs et al.
5,960,394 A     9/1999  Gould et al.
5,960,399 A     9/1999  Barclay et al.
5,978,756 A    11/1999  Walker et al.
5,987,410 A    11/1999  Kellner et al.
5,995,918 A    11/1999  Kendall et al.
5,995,928 A    11/1999  Nguyen et al.
6,009,387 A    12/1999  Ramaswamy et al.
6,023,697 A *   2/2000  Bates et al. .................... 707/4
6,029,124 A     2/2000  Gillick et al.
6,032,111 A     2/2000  Mohri
6,035,275 A     3/2000  Brode et al.
6,044,266 A     3/2000  Kato
6,044,337 A     3/2000  Gorin et al.
6,078,914 A *   6/2000  Redfern ......................... 707/3
6,081,774 A *   6/2000  de Hita et al. .................. 704/9
6,088,692 A *   7/2000  Driscoll ......................... 707/5
6,101,472 A     8/2000  Giangarra et al.
6,105,023 A *   8/2000  Callan ........................... 707/5
6,112,176 A     8/2000  Goldenthal et al.
6,119,087 A     9/2000  Kuhn et al.
6,125,284 A     9/2000  Moore et al.
6,125,341 A     9/2000  Raud et al.
6,138,089 A    10/2000  Guberman
6,141,640 A    10/2000  Moo
6,144,848 A    11/2000  Walsh et al.
6,144,938 A    11/2000  Surace et al.
6,173,279 B1 *  1/2001  Levin et al. ................... 707/5
6,178,404 B1    1/2001  Hambleton et al.
6,182,038 B1    1/2001  Balakrishnan et al.
6,182,068 B1    1/2001  Culliss
6,185,535 B1    2/2001  Hedin et al.
6,192,110 B1    2/2001  Abella et al.
6,195,636 B1    2/2001  Crupi et al.
6,226,610 B1    5/2001  Keiller et al.
6,233,559 B1    5/2001  Balakrishnan
6,243,679 B1    6/2001  Mohri et al.
6,246,986 B1    6/2001  Ammicht et al.
6,246,989 B1    6/2001  Polcyn
6,256,607 B1    7/2001  Digalakis et al.
6,269,336 B1    7/2001  Ladd et al.
6,278,973 B1    8/2001  Chung et al.
6,292,767 B1    9/2001  Jackson et al.
6,292,781 B1    9/2001  Urs et al.
6,327,561 B1   12/2001  Smith et al.
`
6,327,568 B1   12/2001  Joost
6,330,530 B1   12/2001  Horiguchi et al.
6,363,349 B1    3/2002  Urs et al.
6,374,219 B1    4/2002  Jiang
6,374,226 B1    4/2002  Hunt et al.
6,381,594 B1    4/2002  Eichstaedt et al.
6,389,389 B1    5/2002  Meunier et al.
6,408,272 B1    6/2002  White et al.
6,411,926 B1    6/2002  Chang
6,418,199 B1    7/2002  Perrone
6,427,063 B1    7/2002  Cook et al.
6,434,524 B1 *  8/2002  Weber ........................ 704/257
6,434,529 B1    8/2002  Walker et al.
6,446,064 B1 *  9/2002  Livowsky ...................... 707/5
6,453,020 B1    9/2002  Hughes et al.
6,499,011 B1   12/2002  Souvignier et al.
6,499,013 B1   12/2002  Weber
6,510,411 B1    1/2003  Norton et al.
6,513,037 B1    1/2003  Ruber et al.
6,522,725 B2    2/2003  Kato
6,532,444 B1    3/2003  Weber
6,539,359 B1    3/2003  Ladd et al.
6,567,778 B1    5/2003  Chao Chang et al.
6,574,597 B1    6/2003  Mohri et al.
6,584,464 B1 *  6/2003  Warthen ........................ 707/4
6,594,269 B1    7/2003  Polcyn
6,594,348 B1    7/2003  Bjurstrom et al.
6,601,026 B2 *  7/2003  Appelt et al. .................. 704/9
6,614,885 B2    9/2003  Polcyn
6,618,726 B1    9/2003  Colbath et al.
6,633,846 B1   10/2003  Bennett et al.
6,681,206 B1    1/2004  Gorin et al.
6,697,780 B1    2/2004  Beutnagel et al.
6,742,021 B1    5/2004  Halverson et al.
6,823,308 B2   11/2004  Keiller et al.
6,862,713 B1 *  3/2005  Kraft et al. ................. 715/728
6,871,179 B1    3/2005  Kist et al.
6,901,366 B1    5/2005  Kuhn et al.
6,922,733 B1    7/2005  Kuiken et al.
6,940,953 B1    9/2005  Eberle et al.
6,941,273 B1    9/2005  Loghmani et al.
6,961,954 B1   11/2005  Maybury et al.
6,964,012 B1   11/2005  Zirngibl et al.
6,965,864 B1   11/2005  Thrift et al.
6,965,890 B1   11/2005  Dey et al.
7,058,573 B1    6/2006  Murveit et al.
2001/0016813 A1    8/2001  Brown et al.
2001/0032083 A1   10/2001  Van Cleven
2001/0056346 A1   12/2001  Ueyama et al.
2002/0032566 A1    3/2002  Tzirkel-Hancock et al.
2002/0046023 A1    4/2002  Fujii et al.
2002/0059068 A1    5/2002  Rose et al.
2002/0059069 A1    5/2002  Hsu et al.
2002/0086269 A1    7/2002  Shpiro
2002/0087325 A1    7/2002  Lee et al.
2002/0087655 A1    7/2002  Bridgman et al.
2002/0091527 A1    7/2002  Shiau
2003/0191625 A1   10/2003  Gorin et al.
2005/0091056 A1    4/2005  Surface et al.
2005/0131704 A1    6/2005  Dragosh et al.
`
`FOREIGN PATENT DOCUMENTS
`
EP    1096471     5/2001
WO    9811534     3/1998
WO    9948011     9/1999
WO    9950830    10/1999
WO    0014727     3/2000
WO    0017854     3/2000
WO    0020962     4/2000
WO    0021075     4/2000
WO    0021232     4/2000
WO    0022610     4/2000
`
`
WO    0030072     5/2000
WO    0030287     5/2000
WO    0068823    11/2000
WO    0116936     3/2001
WO    0118693     3/2001
WO    0126093     4/2001
WO    0178065    10/2001
WO    0195312    12/2001
WO    0203380     1/2002
`
`OTHER PUBLICATIONS
`
`Fourney, G.D., "The Viterbi Algorithm," Proc. IEEE, vol. 73, pp.
`268-278, Mar. 1973.
`Baker, J.H., "The dragon system-An Overview," IEEE Trans. on
`ASSP Proc., ASSP-23(1):Feb. 24-29, 1975.
`Bennett, I., "A Study of Speech Compression Using Analog Time
`Domain Sampling techniques," A Dissertation Submitted to the
`Dept. Of Electrical Engineering and the Committee on Graduate
`Studies of Stanford University, May 1975, pp. 16-32; 76-111.
`Rabiner, L.R., "Digital Processing of Speech Signals," Prentice
`Hall, 1978, pp. 116-171; 355-395.
Jelinek, F. et al., "Continuous Speech Recognition: Statistical methods"
in Handbook of Statistics, II, P.R. Krishtnaiad, Ed. Amsterdam,
The Netherlands, North-Holland, 1982.
`Bahl, L.R. et al., "A maximum likelihood approach to continuous
`speech recognition," IEEE Trans. Pattern Anal. Mach. Intell.,
`PAMI-5: 179-190, 1983.
`Hudson, R.A., "Word Grammar," Blackman Inc., Cambridge, MA,
`1984, pp. 1-14; 41-42; 76-90; 94-98; 106-109; 211-220.
`Quirk, R. et al., "A Comprehensive Grammar of English Language",
`Longman, London and New York, 1985, pp. 245-331.
Makhoul, J. et al., "Vector Quantization in Speech Coding," Proceedings
of the IEEE, vol. 73, No. 11, Nov. 1985, pp. 1551-1588.
Rabiner, L.R., "A Tutorial on Hidden Markov Models and Selected
Applications in Speech Recognition," Proc. IEEE, vol. 77, No. 2,
Feb. 1989, pp. 257-286.
Gersho, A. et al., "Vector Quantization and Compression," Kluwer
Academic Publishers, 1991, pp. 309-340.
`Rabiner, L.R. et al., "Fundamentals of Speech Recognition,"
`Prentice Hall, 1993, pp. 11-68.
`Morgan, N. et al., "Hybrid Neural Network/Hidden Markov Model
`Systems for Continuous Speech Recognition," Journal of Pattern
`Recognition and Artificial Intelligence, vol. 7, No. 4 pp. 899-916.
`(1993).
`Lieberman, P., "Intonation, Perception and Language," Research
`Monograph No. 38, MIT Press, Cambridge, Mass., 1967, pp. 5-37.
Unisys Corp., "Natural Language Speech Assistant (NLSA) Capabilities
Overview," NLR 3.0, Aug. 1998, Malvern, PA, 27 pages.
Baum, L.E. et al., "A Maximum Technique Occurring in the Statistical
Analysis of Probabilistic Functions of Markov Chains," The
Annals of Mathematical Statistics, 1970, vol. 41, No. 1, pp. 164-171.
Arons, B., "The Design of Audio Servers and Toolkits for Supporting
Speech in the User Interface," believed to be published in:
Journal of the American Voice I/O Society, pp. 27-41, Mar. 1991.
Hazen, T. et al., "Recent Improvements in an Approach to Segment-Based
Automatic Language Identification," believed to be published
in: Proceedings of the 1994 International Conference on Spoken
Language Processing, Yokohama, Japan, pp. 1883-1886, Sep. 1994.
`House, D., "Spoken-Language Access to Multimedia (SLAM): A
`Multimodal Interface to the World-Wide Web," Masters Thesis,
`Oregon Graduate Institute, Department of Computer Science &
`Engineering, 59 pages, Apr. 1995.
Julia, L. et al., "http://www.speech.sri.com/demos/atis.html,"
`believed to be published in: Proceedings AAAI'97: Stanford, pp.
`72-76, Jul. 1997.
Lau, R. et al., "Webgalaxy-Integrating Spoken Language and
`Hypertext Navigation," believed to be published in: in Kokkinakis,
`G. et al., (Eds.) Eurospeech '97, Proceedings of the 5th European
`Conference on Speech Communication and Technology, Rhodes
`(Greece), Sep. 22-25, 1997: pp. 883-886, 1997.
`
`Digalakis, V. et al., "Product-Code Vector Quantization of Cepstral
Parameters for Speech Recognition over the WWW," believed to be
`published in: Proc. ICSLP '98, 4 pages. 1998.
`Melin, H., "On Word Boundary Detection in Digit-Based Speaker
`Verification," believed to be published in: Workshop on Speaker
`Recognition and Its Commercial and Forensic Applications
`(RLA2C), Avignon, France, Apr. 20-23, pp. 46-49, 1998.
`Ramaswamy, G. et al., "Compression of Acoustic Features for
`Speech Recognition in Network Environments," believed to be
`published in: IEEE International Conference on Acoustics, Speech
`and Signal Processing, pp. 977-980, Jun. 1998.
`Lu, B. et al., "Scalability Issues in the Real Time Protocol (RTP),"
`Project Report for CPSC 663 (Real Time Systems), Dept. of
`Computer Science, Texas A & M University, 19 pages, 1999.
Giuliani, D. et al., "Training of HMM with Filtered Speech Material
for Hands-Free Recognition," believed to be published in: Proceedings
of ICASSP '99, Phoenix, USA, 4 pages, 1999.
Digalakis, V. et al., "Quantization of Cepstral Parameters for Speech
`Recognition over the World Wide Web," believed to be published in:
`IEEE Journal on Selected Areas of Communications, 22 pages,
`1999.
Tsakalides, S. et al., "Efficient Speech Recognition Using Subvector
Quantization and Discrete-Mixture HMMs," believed to be published
in: Proc. ICASSP '99, 4 pages, 1999.
`Lin, B. et al., "A Distributed Architecture for Cooperative Spoken
`Dialogue Agents with Coherent Dialogue State and History,"
`believed to be published in: IEEE Automatic Speech Recognition
`and Understanding Workshop, Keystone, Colorado, USA, 4 pages,
`Dec. 1999.
Meunier, J., "RTP Payload Format for Distributed Speech Recognition,"
48th IETF AVT WG-Aug. 3, 2000, 10 pages, 2000.
`Sand Cherry Networks, SoftServer product literature, 2 pages, 2001.
Kim, H. et al., "A Bitstream-Based Front-End for Wireless Speech
Recognition on IS-136 Communications System," IEEE Transactions
on Speech and Audio Processing, vol. 9, No. 5, pp. 558-568,
Jul. 2001. (11 pages).
Agarwal, R., Towards a PURE Spoken Dialogue System for Information
Access, believed to be published in Proceedings of the
ACL/EACL Workshop on Interactive Spoken Dialog Systems:
Bringing Speech and NLP Together in Real Applications, Madrid,
Spain, 1997, 9 pages.
Ammicht, Egbert et al., "Knowledge Collection for Natural Language
Spoken Dialog Systems," believed to be published in Proc.
Eurospeech, vol. 3, p. 1375-1378, Budapest, Hungary, Sep. 1999, 4
pages.
AT&T Corp., "AT&T Watson Advanced Speech Applications Platform,"
1996, 3 pages.
AT&T Corp., "AT&T Watson Advanced Speech Application Platform
Version 2.0," 1996, 8 pages.
AT&T Corp., "AT&T Watson Advanced Speech Applications Platform
Version 2.0," 1996, 3 pages.
`Gorin, Allen, "Processing of Semantic Information in Fluently
`Spoken Language," believed to be published in Proc. ICSLP,
`Philadelphia, PA, Oct. 1996, 4 pages.
`Gorin, Allen et al., "How May I Help You," believed to be published
`in Proc. IVTTA, Basking Ridge, NJ, Oct. 1996, 32 pages.
`Mohri, Mehryar, "String Matching With Automata," Nordic Journal
`of Computing, 1997, 15 pages.
Prudential News, "Prudential Pilots Revolutionary New Speech-Based
Telephone Customer Service System Developed by AT&T
Labs - Company Business and Marketing," Dec. 6, 1999, 3 pages.
`Riccardi, Giuseppe et al., "A spoken language system for automated
`call routing," believed to be published in Proc. ICASSP '97, 1997,
`4 pages.
`Sharp, Douglas, et al., "The Watson Speech Recognition Engine,"
`accepted by ICASSP, 1997, 9 pages.
`European Patent Office search report for EP Application No.
`00977144, dated Mar. 30, 2005, 5 pages.
Burstein, A. et al. "Using Speech Recognition In A Personal
Communications System," Proceedings of the International Conference
on Communications; Chicago, Illinois, Jun. 14-18, 1992,
pp. 1717-1721.
`
`
`Kuhn, T. et al., "Hybrid In-Car Speech Recognition For Mobile
`Multimedia Applications," Vehicular Technology Conference,
`Houston, Texas, May 1999, pp. 2009-2013.
Travis, L., "Handbook of Speech Pathology", Appleton-Century-Crofts,
Inc., 1957, pp. 91-124.
`Baum, L.E., et al., "Statistical inference for probabilistic functions
`for finite state Markov chains," Ann. Math. Stat., 37: 1554-1563,
`1966.
`Flanagan, J.L., "Speech Analysis Synthesis and Perception", 2nd
`edition, Springer-Verlag Berlin, 1972, pp. 1-53.
Baum, L.E., "An inequality and associated maximization technique
in statistical estimation for probabilistic functions of Markov
processes," Inequalities 3: 1-8, 1972.
`
`Cox, Richard V. et al., "Speech and Language Processing for
`Next-Millennium Communications Services," Proceedings of the
`IEEE, vol. 88, No. 8, Aug. 2000, pp. 1314-1337.
`Kuhn, Roland, et al, "The Application of Semantic Classification
`Trees to Natural Language Understanding," IEEE Transactions on
`Pattern Analysis and Machine Intelligence, vol. 17, No. 5, May
`1995, pp. 449-460.
`AT&T Corp., "Network Watson 1.0 System Overview," 1998, 4
`pages.
`
`* cited by examiner
`
`
`
`
Fig. 1 (system 100)

CLIENT-SIDE 150
  Speech Input
  SRE: Client-side 155
  Animated Character to Guide User 157
  Text-to-Speech Engine 159
  Speech Output

SERVER-SIDE 180
  SRE: Server-side 182 (output: Recognized Speech - Text)
  Text-to-Query Converter
  Database Processor & Interface 186 (Customized SQL Query)
  Natural-Language Engine
  Database 188
`
`
`
Figure 2 (Page 1/3)
CLIENT-SIDE SYSTEM LOGIC (200A)

SRE
1. SR Initialize: initialize recognizer (input: configuration file)
2. Calibrate speech & calibrate silence: calibrate audio, calibrate speech

MS Agent
1. Initialize COM library
2. Create instance of Agent Server
3. Load MS Agent: load the character by specifying path of the
   character file, character ID, and request ID (input: .ACS file)
4. Get character interface
5. Add commands to Agent character option
6. Show the Agent character
7. AgentNotifySink to handle the events: create Agent Notify sink
   object; register sink object; get agent properties interface;
   assign property sheet to agent
8. Do character animations after displaying it

Communication
1. Open Internet Connection
2. Set callback status to connection handle
3. Start new HTTP Internet session

Other reference numerals shown: 202, 203
`
`
`
Figure 2 (Page 2/3)
CLIENT-SIDE SYSTEM LOGIC (200B)

This is an iterative process. It is initiated as and when the user speaks by pressing the control.

RECEIVE USER SPEECH 208

SRE (recognize the speech)
1. Prepare coder
2. Start source
3. Convert speech into MFCC vectors until silence is found

COMMUNICATION: OpenHttprequest ()
1. Encode the stream of bytes so that it is compatible to send to
   server via Internet using HTTP
2. Send the data (user's question) to server
3. Wait for the server response

RECEIVE ANSWER FROM SERVER 207

COMMUNICATION: InternetRead ()
1. Receive the best answer from the server in a compressed form
2. Uncompress the answer
3. Pass the decompressed data to MS AGENT

MS AGENT: Speak ()
1. Receive the decompressed answer
2. Speak the answer received from server

UK English voice data file 210
TTS 211

Other reference numerals shown: 204, 206, 209
`
`
`
Figure 2 (Page 3/3)
CLIENT-SIDE SYSTEM LOGIC

UN-INITIALIZATION (performed as and when the user quits, i.e. closes the web page)

SRE
a. Delete the objects created during the initialization process
b. Deallocate the memory assigned to the structure holding the
   parameters for speech

COMMUNICATION 213
1. Close the Internet handle, i.e. the connection established with
   the server
2. Close the Internet session created at the time of initialization

MS Agent 214
1. Release commands Interface
2. Release character Interface
3. Unload agent
4. Release AgentNotifysink Interface
5. Release prop. sheet Interface
6. Unregister Agent Notifysink
7. Release Agent Interface
`
`
`
U.S. Patent    Apr. 10, 2007    Sheet 5 of 31    US 7,203,646 B2
`
Fig. 2-2 Client-side Initialization

SRE
1. SR Initialize
   Allocate memory 220
   Create source & coder objects 221
   Load Configuration file 221A (input: configuration file 221B)
2. Calibrate speech & Calibrate Silence until silence is detected 222

MS Agent 220B
1. Initialize COM library 223
2. Create instance of Agent Server 224
3. Load MS Agent: load the character by specifying path of the
   character file, character ID, and request ID 225 (input: ACS File 225A)
4. Get character interface 226
5. Add commands to Agent character option 227
6. Show the agent character 228
7. AgentNotifySink to handle the events
   Create Agent notify sink object 229
   Register sink object 230
   Get Agent property interface 231
   Assign property sheet to Agent 232
8. Display Character & execute specified Animations 233

Communication 220C
1. Open Internet Connection 234
2. Set callback status to the connection 235
3. Start new HTTP internet session with the server 236
`
`
`
Fig. 3
Client-side Iterative Process

Speech from User (when User speaks through microphone by clicking
1. Prepare Coder 248
2. Start Source
3. Convert speech into MFCC vectors 250

Communication (input: Encoded MFCC vectors)
1. Encode MFCC vectors to make it compatible to send at server
   using HTTP
2. Send encoded data to server
3. Wait for response from server

COMMUNICATION (input: Best Answer from Server)
1. Receive the "Best" Answer from server (compressed) 258
2. Uncompress the Answer 259
3. Pass Answer to MS Agent 260

Receive Answer (from Server side)
1. Receive uncompressed Answer 254
2. Articulate the Received Answer 255

TEXT-TO-SPEECH ENGINE
  Natural Language Voice Data File 256
  Text-to-Speech Engine 257
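
The communication steps of Fig. 3 (encode the MFCC vectors so they can travel over HTTP, then uncompress the answer that comes back) can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoding used in the patent: base64 over packed 32-bit floats and zlib compression are choices made here for the example, and the function names are hypothetical. The vector list is assumed to be non-empty with equal-length vectors.

```python
import base64
import struct
import zlib


def encode_vectors(vectors):
    """Pack equal-length float vectors as little-endian float32 and
    base64-encode them so they can be sent as an HTTP request body."""
    dim = len(vectors[0])
    flat = [x for v in vectors for x in v]
    header = struct.pack("<II", len(vectors), dim)  # vector count, dimension
    body = struct.pack(f"<{len(flat)}f", *flat)
    return base64.b64encode(header + body)


def decode_vectors(payload):
    """Server-side inverse of encode_vectors."""
    raw = base64.b64decode(payload)
    count, dim = struct.unpack_from("<II", raw, 0)
    flat = struct.unpack_from(f"<{count * dim}f", raw, 8)
    return [list(flat[i * dim:(i + 1) * dim]) for i in range(count)]


def uncompress_answer(blob):
    """Client-side step 2: turn the compressed answer back into text
    before passing it on for speech output."""
    return zlib.decompress(blob).decode("utf-8")
```

Sending feature vectors rather than raw audio keeps the request small; the answer text is compressed for the same reason on the return path.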
`
`
`
Fig. 4
Client-side Un-Initialization

SRE
1. Delete Objects and De-allocate Memory
   a. Deallocate the memory assigned to the object holding the
      parameters for speech 273
   b. Delete all the objects created in the initialization process 274

COMMUNICATION 271
1. Close the Internet connection previously established with server 275
2. Close the Internet Session created at the time of initialization 276

MS AGENT 272
1. Release Commands Interface 277
2. Release Character Interface 278
3. Unload the Agent 279
4. Release sink object Interface 280
5. Release Property Sheet Interface 281
6. Unregister Agent Notify sink 282
7. Release Agent Interface 283
`
`
`
Fig. 4A (600)
Distributed Component of SRE at Server-Side

1. Decode encoded MFCC vectors received from client 601

2. INITIALIZE SRE 602
   Load SRE Library 602A
   Create External Source 602B
   Allocate memory to hold SRE objects 602C
   Create SRE objects 602D
   Load Dictionary 602E (input: dictionary file name)
   Load HMM 602F
   Load Grammar 602G (input: grammar file name)

Prepare grammar and loader file names using course, chapter and
section names 605 (input: Course, Chapter & Section name)
Note: course - DB name; chapter - table name; section - table name

3. RECOGNIZE SPEECH 603
   Read MFCC vectors from external source 603A
   Process MFCC vectors to recognize words from MFCC Vectors 603B
   Output: recognized speech in the form of text

4. UN-INITIALIZE SRE 604
   Delete SRE objects 604A
   Deallocate memory assigned to SRE objects 604A
`
`
`
Fig. 4B
Build of SQL Query

Customize/Build SQL query (inputs: Table Name, Noun Phrase (NP))

1. Construct SELECT SQL statement using the CONTAINS predicate 950
2. Concatenate table name to the constructed SELECT statement 951
3. Perform this process iteratively for the number of NP present in the NP list
   3.1. Get the number of words present in the NP 952
   3.2. Allocate the memory as much as required for all the words
        present in the NP 953
   3.3. Get the word list present in the NP 954
   3.4. Concatenate these words to the SQL Query separated with
        NEAR () keyword 955
   3.5. Concatenate AND keyword to the SQL Query after each NP 956
   3.6. Free the memory allocated to store the words received from NP 957
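
The query-building steps of Fig. 4B can be sketched in a few lines. The sketch below is an illustration only: the function name, the column name `PairedQuestion`, and the exact quoting and parenthesization are assumptions made here, and the search-condition syntax is modeled loosely on SQL Server full-text `CONTAINS` with `NEAR` between the words of one noun phrase and `AND` between noun phrases. Memory allocation steps 3.2 and 3.6 have no counterpart in Python and are omitted.

```python
import re


def build_contains_query(table_name, np_list, column="PairedQuestion"):
    """Build a SELECT ... WHERE CONTAINS(...) statement from noun phrases.

    Words inside one noun phrase are joined with NEAR; the noun phrases
    themselves are joined with AND.
    """
    # The table name is interpolated into the SQL text, so accept only
    # plain identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")

    clauses = []
    for phrase in np_list:
        # Drop double quotes so each word can be safely quoted below.
        words = [w.replace('"', "") for w in phrase.split()]
        words = [w for w in words if w]
        if words:
            clauses.append("(" + " NEAR ".join(f'"{w}"' for w in words) + ")")
    if not clauses:
        raise ValueError("at least one non-empty noun phrase is required")

    # Double any single quotes so the condition stays inside its SQL literal.
    condition = " AND ".join(clauses).replace("'", "''")
    return f"SELECT * FROM {table_name} WHERE CONTAINS({column}, '{condition}')"
```

For example, the noun phrases "speech recognition" and "grammar" against a table named Section1 would produce a condition of the form ("speech" NEAR "recognition") AND ("grammar").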
`
`
`
Fig. 4C
Server-side DBProcess DLL

Construct SQL Query (input: NP List from NLE; output: SQL Query)

711 CONNECT TO SQL SERVER DATABASE (inputs: Database Name, Table Name)
   Get server name, database name 711A
   Build Connection string 711B
   Connect to the SQL Server Database 711C

712 EXECUTE SQL QUERY (input: SQL Query; output: Record Set)
   Receive SQL Query 712A
   Execute the SQL Query 712B

Process Record Set
• Extract total records from recordset
• Allocate memory to store paired questions
• Store paired question in array

FETCH ANSWERPATH USING BEST ANSWER NUMBER (input: Best Answer ID; output: Answer)
   Receive best record number
   Fetch the path of Answer file using the given record number 716B
   • Open file using the path fetched from recordset
   • Read contents of file containing the answer
   Compress answer and transmit to client 716D

NLQS Database 717
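
The final stage of Fig. 4C (fetch the answer path for the best record, read the file, compress the contents for transmission) can be sketched as below. This is a sketch under assumptions: `sqlite3` stands in for the SQL Server database named in the figure, zlib stands in for whatever compression the system uses, and the function name and the column names `AnswerID` and `AnswerPath` are illustrative.

```python
import sqlite3
import zlib


def fetch_compressed_answer(conn, table_name, answer_id):
    """Look up the answer file path for answer_id, read the file, and
    return its contents compressed and ready to send to the client."""
    # table_name is assumed to come from trusted server configuration,
    # not from user input, because it is interpolated into the SQL text.
    row = conn.execute(
        f"SELECT AnswerPath FROM {table_name} WHERE AnswerID = ?",
        (answer_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no answer with id {answer_id}")
    with open(row[0], "rb") as answer_file:
        return zlib.compress(answer_file.read())
```

Storing only a path in the table and the answer body in a file, as the figure indicates, keeps the searchable table small; the client reverses the compression before the answer is spoken.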
`
`
`
Fig. 4D
Interface Logic between NLE and DBProcess.DLL

Note: PQ - Paired Question
NP - Noun Phrase
Red Line - I/O

880 GET NP LIST FOR THE USER'S QUESTION (input: Question; output: NP List of User's Question)
   Receive the question from client
   Get the NP list using NLE 880B

813 GET NP LIST FOR PAIRED QUESTION (input: Paired Questions from DB; output: NP list of PQ)
   Receive the PQs from DBProcess.dll 813A
   Get NP List using NLE 813B

815 GET BEST ANSWER ID (input: NP List from Question and PQ; output: Best Answer Number)
   815A Compare NP list: compare NP of user's question with PQ from DB
   to find out the best suitable question present in the DB

NLE
900 INITIALIZE GROUPER RESOURCES
   Initialize Token Resources 900A
   Initialize Tagger Resources 900B
   Initialize Grouper resources 900C
   Create Grouper
9b. Tokenize the words from the given text 909B
9c. Tag all the tokens 909C
9d. Group all tagged tokens to form the NP 909D
9E. UN-INITIALIZE GROUPER RESOURCES OBJECT AND FREE THE RESOURCES
   Free token resources 909EA
   Free tagger resources 909EB
   Free grouper resources 909EC
`
`
Fig. 5 (500)

Components: Communications Server ISAPI; DB Process; NLE/DB Interface; NLE
Reference numerals shown: 500A, 500B, 500C, 501

Inputs: Encoded MFCC Vectors; Course, Chapter, Section
Data exchanged between components: Best Answer; Answer ID; User
Question; Query Text; Query NP; Paired Question Text; Paired
Question NP; Construct Query; NP
`
`
`
Fig. 6

COURSE 701
  Chapter 1 702: Section11 705, Section12 706, ..., Section1n 707
  Chapter 2 703: Section21, Section22, ..., Section2n
  ...
  Chapter n 704: Sectionn1, Sectionn2, ..., Sectionnn

Q-A pairs 708
  Q-A Pair111, Q-A Pair112, Q-A Pair113, ...
  Q-A Pair121, Q-A Pair122, Q-A Pair123
  Q-A Pair1n1, Q-A Pair1n2, Q-A Pair1n3
  ...
  Q-A Pairnn1, Q-A Pairnn2, Q-A Pairnnn
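
The hierarchy of Fig. 6 (a course made of chapters, chapters made of sections, sections holding question-answer pairs) maps naturally onto nested records. The sketch below is one hypothetical in-memory rendering; the class and method names are introduced here for illustration, and the drawings elsewhere indicate that the actual system keeps this content in database tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QAPair:
    question: str
    answer_path: str  # path of the file holding the answer text


@dataclass
class Section:
    name: str
    pairs: List[QAPair] = field(default_factory=list)


@dataclass
class Chapter:
    name: str
    sections: Dict[str, Section] = field(default_factory=dict)


@dataclass
class Course:
    name: str
    chapters: Dict[str, Chapter] = field(default_factory=dict)

    def add_pair(self, chapter: str, section: str, pair: QAPair) -> None:
        """Insert a Q-A pair, creating the chapter and section on demand."""
        ch = self.chapters.setdefault(chapter, Chapter(chapter))
        sec = ch.sections.setdefault(section, Section(section))
        sec.pairs.append(pair)

    def pairs_for(self, chapter: str, section: str) -> List[QAPair]:
        """Return only the pairs for one section, narrowing the search."""
        return self.chapters[chapter].sections[section].pairs
```

Selecting a chapter and section first limits the set of paired questions that has to be searched for any one query.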
`
`
`
Fig. 7A

Table columns: FIELD NAME; DATA TYPE; SIZE; NULL; PRIMARY KEY; INDEXED?
Rows: Chapter Name (Varchar, 255); Section Name (Varchar, 255)
`
`
Fig. 7B

FIELD NAME 720            DATA TYPE 721  SIZE 722  NULL 723  PRIMARY KEY 724  INDEXED? 725
Chapter_ID 726            Integer                  No        Yes              Yes
Answer ID 727             Char           5         No        UNIQUE           Yes
Section Name 728          Varchar        255       No        UNIQUE           Yes
Answer Title 729          Varchar        255       Yes       No               Yes
Paired Question 730       Text           16        No        No               Yes (Full-Text)
AnswerPath 731            Varchar        255       No        No               Yes
Creator 732               Varchar        50        No        No               Yes
Date_of_Creation 733      Date           -         No        No               Yes
Date_of_Modification 734  Date           -         No        No               Yes
`
`
`
Fig. 7C

Field 720 / Description 735

AnswerID 727: An integer - automatically incremented for user convenience
Section Name 728: Name of section to which the particular record belongs. This field along with AnswerID has to be made primary key
Answer Title 729: A short description of the answer
Paired Question 730: Contains one or more combinations of questions for the related answer whose path is stored in the next column AnswerPath
AnswerPath 731: Contains the path of text file, which contains the answer to the related questions stored in the previous column
Creator 732: Name of content creator
Date of Creation 733: Date on which content has been added
Date_of_Modification 734: Date on which content has been changed or modified
`
`
Fig. 7D
`
`FIELD
`
`Answer ID
`
`740
`
`