(12) United States Patent
Kennewick et al.

(10) Patent No.: US 7,693,720 B2
(45) Date of Patent: *Apr. 6, 2010

(54) MOBILE SYSTEMS AND METHODS FOR RESPONDING TO NATURAL LANGUAGE SPEECH UTTERANCE
(75) Inventors: Robert A. Kennewick, Seattle, WA (US); David Locke, Redmond, WA (US); Michael R. Kennewick, Sr., Bellevue, WA (US); Michael R. Kennewick, Jr., Bellevue, WA (US); Richard Kennewick, Woodinville, WA (US); Tom Freeman, Mercer Island, WA (US); Stephen F. Elston, Seattle, WA (US)
`
(73) Assignee: VoiceBox Technologies, Inc., Kirkland, WA (US)
`
`
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 950 days. This patent is subject to a terminal disclaimer.

Primary Examiner—James S. Wozniak
(74) Attorney, Agent, or Firm—Pillsbury Winthrop Shaw Pittman LLP
`
(21) Appl. No.: 10/518,633

(22) Filed: Jul. 15, 2003

(65) Prior Publication Data

    US 2004/0193420 A1    Sep. 30, 2004

Related U.S. Application Data

(60) Provisional application No. 60/395,615, filed on Jul. 15, 2002.

(51) Int. Cl.
    G10L 21/00 (2006.01)
    G10L 15/18 (2006.01)
    G10L 13/00 (2006.01)

(52) U.S. Cl. ................ 704/275; 704/270; 704/270.1
(58) Field of Classification Search ................ 704/270, 704/270.1, 275, 257; 709/202
    See application file for complete search history.

(57) ABSTRACT
`
Mobile systems and methods that overcome the deficiencies of prior art speech-based interfaces for telematics applications through the use of a complete speech-based information query, retrieval, presentation and local or remote command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The invention creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain specific behavior and information into agents that are distributable or updateable over a wide area network. The invention can be used in dynamic environments such as those of mobile vehicles to control and communicate with both vehicle systems and remote systems and devices.
`
55 Claims, 6 Drawing Sheets
`
[Front-page figure: Speech Processing System Block Diagram]
`
`
`
`
`
`
`
[Drawing Sheet 1 of 6: Figure 1, First Embodiment System Block Diagram]
`
`
`
[Drawing Sheet 2 of 6: Figure 2, Second Embodiment System Block Diagram]
`
`
`
[Drawing Sheet 3 of 6: Figure 3, Handheld Computer Block Diagram]
`
`
`
[Drawing Sheet 4 of 6: Figure 4, Fixed Computer Block Diagram]
`
`
`
[Drawing Sheet 5 of 6: Figure 5]
`
`
`
[Drawing Sheet 6 of 6: Figure 6, Agent Architecture]
`
`
`
MOBILE SYSTEMS AND METHODS FOR RESPONDING TO NATURAL LANGUAGE SPEECH UTTERANCE
`
This application claims priority from U.S. Provisional Patent Application Ser. No. 60/395,615, filed Jul. 15, 2002, the disclosure of which is hereby incorporated by reference in its entirety.
`
FIELD OF THE INVENTION
`
The present invention relates to the retrieval of online information and processing of commands through a speech interface in a vehicle environment. More specifically, the invention is a fully integrated environment allowing mobile users to ask natural language speech questions or give natural language commands in a wide range of domains, supporting local or remote commands, making local and network queries to obtain information, and presenting results in a natural manner even in cases where the question asked or the responses received are incomplete, ambiguous or subjective.
BACKGROUND OF THE INVENTION
`
Telematics systems are systems that bring human-computer interfaces to vehicular environments. Conventional computer interfaces use some combination of keyboards, keypads, point and click techniques and touch screen displays. These conventional interface techniques are generally not suitable for a vehicular environment, owing to the speed of interaction and the inherent danger and distraction. Therefore, speech interfaces are being adopted in many telematics applications.

However, creating a natural language speech interface that is suitable for use in the vehicular environment has proved difficult. A general-purpose telematics system must accommodate commands and queries from a wide range of domains and from many users with diverse preferences and needs. Further, multiple vehicle occupants may want to use such systems, often simultaneously. Finally, most vehicle environments are relatively noisy, making accurate speech recognition inherently difficult.
Human retrieval of both local and network hosted online information and processing of commands in a natural manner remains a difficult problem in any environment, especially onboard vehicles. Cognitive research on human interaction shows that a person asking a question or giving a command typically relies heavily on context and the domain knowledge of the person answering. On the other hand, machine-based queries of documents and databases and execution of commands must be highly structured and are not inherently natural to the human user. Thus, human questions and commands and machine processing of queries are fundamentally incompatible. Yet the ability to allow a person to make natural language speech-based queries remains a desirable goal.

Much work covering multiple methods has been done in the fields of natural language processing and speech recognition. Speech recognition has steadily improved in accuracy and today is successfully used in a wide range of applications. Natural language processing has previously been applied to the parsing of speech queries. Yet, no system developed provides a complete environment for users to make natural language speech queries or commands and receive natural sounding responses in a vehicular environment. There remain a number of significant barriers to creation of a complete natural language speech-based query and response environment.
`
The fact that most natural language queries and commands are incomplete in their definition is a significant barrier to natural human query-response interaction. Further, some questions can only be interpreted in the context of previous questions, knowledge of the domain, or the user's history of interests and preferences. Thus, some natural language questions and commands may not be easily transformed to machine processable form. Compounding this problem, many natural language questions may be ambiguous or subjective. In these cases, the formation of a machine processable query and returning of a natural language response is difficult at best.
`
`
Even once a question is asked, parsed and interpreted, machine processable queries and commands must be formulated. Depending on the nature of the question, there may not be a simple set of queries returning an adequate response. Several queries may need to be initiated, and even these queries may need to be chained or concatenated to achieve a complete result. Further, no single available source may include the entire set of results required. Thus multiple queries, perhaps with several parts, need to be made to multiple data sources, which can be both local or on a network. Not all of these sources and queries will return useful results or any results at all. In a mobile or vehicular environment, the use of wireless communications compounds the chances that queries will not complete or return useful results. Useful results that are returned are often embedded in other information, from which they may need to be extracted. For example, a few key words or numbers often need to be "scraped" from a larger amount of other information in a text string, table, list, page or other information. At the same time, other extraneous information such as graphics or pictures needs to be removed to process the response in speech. In any case, the multiple results must be evaluated and combined to form the best possible answer, even in the case where some queries do not return useful results or fail entirely. In cases where the question is ambiguous or the result inherently subjective, determining the best result to present is a complex process. Finally, to maintain a natural interaction, responses need to be returned rapidly to the user. Managing and evaluating complex and uncertain queries while maintaining real-time performance is a significant challenge.
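The multi-source pattern just described can be made concrete with a minimal Python sketch (illustrative only; `Source.fetch`, the regular expression, and the voting rule are assumptions, not part of the patent): queries are issued to several local or network sources, failures are tolerated, key values are "scraped" from the surrounding text, and the fragments are combined into a best answer.

```python
import re
from collections import Counter

def query_all(sources, query):
    """Issue the query to every source; a failed source is simply skipped."""
    results = []
    for source in sources:
        try:
            results.append(source.fetch(query))  # local database or network call
        except Exception:
            continue  # tolerate failure; keep whatever did come back
    return results

def scrape_numbers(raw_text):
    """'Scrape' the few key numbers out of a larger block of returned text."""
    return re.findall(r"\d+(?:\.\d+)?", raw_text)

def best_answer(raw_results):
    """Evaluate and combine partial results; agreement across sources wins."""
    votes = Counter(value for raw in raw_results for value in scrape_numbers(raw))
    return votes.most_common(1)[0][0] if votes else None

# e.g. best_answer(query_all([local_db, traffic_service], "drive time to SEA"))
```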
`
`These and other drawbacks exist in existing systems.
`
`SUMMARY OF THE INVENTION
`
An object of the invention is to overcome these and other drawbacks of prior speech-based telematic systems.

According to one aspect of the invention, systems and methods are provided that may overcome deficiencies of prior systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a speech-based natural language query, response and command environment is created. Further, at each step in the process, accommodation may be made for full or partial failure and graceful recovery. The robustness to partial failure is achieved through the use of probabilistic and fuzzy reasoning at several stages of the process. This robustness to partial failure promotes the feeling of a natural response to questions and commands.
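A minimal sketch of this graceful-recovery idea, assuming each stage or source reports a confidence score in [0, 1] (the names, scoring rule, and threshold below are illustrative, not prescribed by the disclosure):

```python
def combine(hypotheses):
    """hypotheses: (answer, confidence) pairs from different stages or sources."""
    scores = {}
    for answer, confidence in hypotheses:
        scores[answer] = scores.get(answer, 0.0) + confidence  # evidence accumulates
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best]

def respond(hypotheses, threshold=0.6):
    answer, score = combine(hypotheses)
    if answer is None or score < threshold:
        # full or partial failure: recover gracefully rather than erroring out
        return "I did not find a confident answer; could you rephrase?"
    return answer

# respond([("I-405 North", 0.5), ("I-405 North", 0.4), ("SR-520 East", 0.2)])
# -> "I-405 North"
```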
According to another aspect of the invention, a mobile interactive natural language speech system (herein "the system") is provided that includes a speech unit. The speech unit may be incorporated into a vehicle computer device or system, or may be a separate device. If a separate device, the speech unit may be connected to the vehicle computer device via a wired or wireless connection. In some embodiments, the interactive natural language speech device can be handheld. The handheld device may interface with vehicle computers or other electronic control systems through wired or wireless links. The handheld device can also operate independently of the vehicle. The handheld device can be used to remotely control the vehicle through a wireless local area connection, a wide area wireless connection or through other communication links.
According to another aspect of the invention, the system may include a stand alone or networked PC attached to a vehicle, a standalone or networked fixed computer in a home or office, a PDA, wireless phone, or other portable computer device, or other computer device or system. For convenience, these and other computer alternatives shall be simply referred to as a computer. One aspect of the invention includes software that is installed onto the computer, where the software includes one or more of the following modules: a speech recognition module for capturing the user input; a parser for parsing the input; a text to speech engine module for converting text to speech; a network interface for enabling the computer to interface with one or more networks; a graphical user interface module; an event manager for managing events; and other modules. In some embodiments the event manager is in communication with a dictionary and phrases module, a user profile module that enables user profiles to be created, modified and accessed, a personality module that enables various personalities to be created and used, an agent module, an update manager and one or more databases. It will be understood that this software can be distributed in any way between a handheld device, a computer attached to a vehicle, a desktop computer or a server without altering the function, features, scope, or intent of the invention.
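One possible composition of these modules, sketched in Python purely for illustration (every class and method name is invented; the disclosure does not prescribe an implementation):

```python
class EventManager:
    """Coordinates the dictionary/phrases, user profile, personality,
    agent, and update manager modules named above."""
    def __init__(self, dictionary, profiles, personalities, agents, updater):
        self.dictionary = dictionary
        self.profiles = profiles
        self.personalities = personalities
        self.agents = agents          # domain agents, keyed by domain name
        self.updater = updater        # update manager for agents

    def dispatch(self, request):
        agent = self.agents[request["domain"]]   # route to a domain agent
        return agent.handle(request, system=self)

class SpeechSystem:
    """One possible composition of the listed modules on a single computer."""
    def __init__(self, recognizer, parser, tts, network, gui, events):
        self.recognizer = recognizer  # speech recognition: captures user input
        self.parser = parser          # parses the recognized text
        self.tts = tts                # text to speech engine for responses
        self.network = network       # interface to one or more networks
        self.gui = gui               # graphical user interface module
        self.events = events         # event manager coordinating the rest

    def handle_utterance(self, audio):
        text = self.recognizer.transcribe(audio)
        request = self.parser.parse(text)
        reply = self.events.dispatch(request)
        return self.tts.speak(reply)
```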
According to one aspect of the invention, and regardless of the distribution of the functionality, the system may include a speech interface device that receives spoken natural language queries, commands and/or other utterances from a user, and a computer device or system that receives input from the speech unit and processes the input (e.g., retrieves information responsive to the query, takes action consistent with the command and performs other functions as detailed herein), and responds to the user with a natural language speech response.
According to another aspect of the invention, the system can be interfaced by wired or wireless connections to one or more vehicle-related systems. These vehicle-related systems can themselves be distributed between electronic controls or computers attached to the vehicle or external to the vehicle. Vehicle systems employed can include electronic control systems, entertainment devices, navigation equipment, and measurement equipment or sensors. External systems employed include those used during vehicle operation, such as weight sensors, payment systems, emergency assistance networks, remote ordering systems, and automated or attended customer service functions. Systems on the vehicle typically communicate with external systems via wireless communications networks.
According to another aspect of the invention, the system can be deployed in a network of devices using a common base of agents, data, information, user profiles and histories. Each user can then interact with, and receive, the same services and applications at any location equipped with the required device on the network. For example, multiple devices on which the invention is deployed, and connected to a network, can be placed at different locations throughout a home, place of business, vehicle or other location. In such a case, the system can use the location of the particular device addressed by the user as part of the context for the questions asked.
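For illustration, a small sketch of folding the addressed device's location into the question context (the device registry and profile fields are assumptions):

```python
# Hypothetical registry mapping each deployed device to its location.
DEVICE_LOCATIONS = {"kitchen-unit": "home/kitchen",
                    "garage-unit": "home/garage",
                    "dash-unit": "vehicle"}

def build_context(device_id, profile):
    """Use the location of the device the user spoke to as question context."""
    return {
        "location": DEVICE_LOCATIONS.get(device_id, "unknown"),
        "history": profile.get("history", []),
        "preferences": profile.get("preferences", {}),
    }

# "Turn on the lights", asked at kitchen-unit, resolves to the kitchen lights.
```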
According to some aspects of the invention, domain specific behavior and information are organized into agents. Agents are executables that receive, process and respond to user questions, queries and commands. The agents provide convenient and re-distributable packages or modules of functionality, typically for a specific domain. Agents can be packages of executable code, scripts, links to information, data, and other data forms, required to provide a specific package of functionality, usually in a specific domain. In other words, an agent may include everything that is needed to extend the functionality of the invention to a new domain. Further, agents and their associated data can be updated remotely over a network as new behavior is added or new information becomes available. Agents can use system resources and the services of other, typically more specialized, agents. Agents can be distributed and redistributed in a number of ways including on removable storage media, transfer over networks or attached to emails and other messages. An update manager is used to add new agents to the system or update existing agents.
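A hedged sketch of the agent-as-package idea and of an update manager that adds or replaces agents at run time, assuming a simple in-process registry (the disclosure leaves packaging and transport open):

```python
class Agent:
    """Bundles the code, data, and links needed for one domain."""
    domain = "base"

    def handle(self, request, system):
        """May use system resources and other, more specialized agents."""
        raise NotImplementedError

class UpdateManager:
    """Adds new agents or updates existing ones, e.g. after fetching a
    newer version over the network or from removable media."""
    def __init__(self, registry):
        self.registry = registry              # domain name -> Agent instance

    def install(self, agent):
        self.registry[agent.domain] = agent   # new or updated behavior
```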
The software behavior and data in an agent can either be of a general-purpose nature or specific to a domain or area of functionality. One or more system agents include general-purpose behaviors and data, which provide core or foundation services for more specialized domain or system agents. Examples of general-purpose functionality include transmitting and receiving information over data networks, parsing text strings, general commands to the interactive natural language telematics speech interface, and other functions. For example, a specific system agent may be used to transmit and receive information over a particular type of network, and may use the services of a more general network agent.

Domain specific agents include the behavior and data required for a specific area of functionality. More specialized domain agents can use the functionality of more generalized domain agents. Areas of functionality or specific domains are broadly divided into two categories: query and response, and control. Examples of query and response domains include driving directions, travel services, entertainment scheduling, and other information. Agents may, in turn, query other agents. For example, a fast food ordering agent may use the services of a restaurant ordering agent and payment agent, which may in turn use the services of a location agent and a travel services agent. Control domains include control of specific devices on a vehicle. In each case, the agent includes or has access to the data and functionality required to control the device through the appropriate interfaces. For example, a specific domain agent may be used to control the windshield wipers on a vehicle. In another example, a domain agent for controlling the vehicle's headlights may use the services of a lighting control agent, which may use the services of an electrical device control agent. Some domains, and therefore agents, may combine aspects of control with query and response functionality. For example, a user may wish to listen to a particular piece of music. In this case, the domain agent will make one or more queries, possibly using the services of other agents, to locate a source for the music and retrieve it. Next, the domain agent will activate a suitable player for the format of the music, again possibly using the services of other agents.
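Continuing the sketch above, the fast-food and headlight examples become chains of delegation between agents (all agent names here are hypothetical stand-ins for the services described):

```python
class FastFoodAgent(Agent):
    domain = "fast_food"

    def handle(self, request, system):
        restaurant = system.agents["restaurant_ordering"]
        payment = system.agents["payment"]
        # the restaurant agent may itself consult location and travel agents
        order = restaurant.handle(request, system)
        return payment.handle(order, system)

class HeadlightAgent(Agent):
    domain = "headlights"

    def handle(self, request, system):
        # control domains delegate to progressively more general agents:
        # headlights -> lighting control -> electrical device control
        lighting = system.agents["lighting_control"]
        return lighting.handle({**request, "device": "headlights"}, system)
```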
The invention may provide license management capability allowing the sale of agents by third parties to one or more users on a one time or subscription basis. In addition, users with particular expertise can create agents, update existing agents by adding new behaviors and information, and make these agents available to other users.
Given the desire for domain specific behavior, user specific behavior and domain specific information, the invention may allow both users and content providers to extend the system capabilities, add data to local data sources, and add references to network data sources. To allow coverage of the widest possible range of topics and support for the widest range of devices, the system may allow third party content developers to develop, distribute and sell specialized or domain specific system programs and information. Content is created through creation of new agents, scripting existing agents, adding new data to agents or databases, and adding or modifying links to information sources. Distribution of this information is sensitive to the user's interests and use history and to their willingness to pay for it.
According to another aspect of the invention, the system may include mechanisms to allow users themselves to post and distribute agents and information in their particular areas of expertise, to improve system capability. Further, users can extend the system a