`
(12) United States Patent
Kurganov et al.

(10) Patent No.: US 7,386,455 B2
(45) Date of Patent: Jun. 10, 2008

(54) ROBUST VOICE BROWSER SYSTEM AND
VOICE ACTIVATED DEVICE CONTROLLER

(75) Inventors: Alexander Kurganov, Buffalo Grove,
IL (US); Valery Zhukoff, Deerfield, IL
(US)

(73) Assignee: Parus Holdings, Inc., Bannockburn, IL
(US)

(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.

(21) Appl. No.: 11/409,703

(22) Filed: Apr. 24, 2006

(65) Prior Publication Data
US 2006/0190265 A1 Aug. 24, 2006

Related U.S. Application Data
(63) Continuation of application No. 10/821,690, filed on
Apr. 9, 2004, now Pat. No. 7,076,431, which is a
continuation of application No. 09/776,996, filed on
Feb. 5, 2001, now Pat. No. 6,721,705.
(60) Provisional application No. 60/233,068, filed on Sep.
15, 2000, provisional application No. 60/180,344,
filed on Feb. 4, 2000.

(51) Int. Cl.
G10L 21/06 (2006.01)
(52) U.S. Cl. ........ 704/270.1; 704/275; 379/88.01
(58) Field of Classification Search ........ 704/270.1,
704/275; 379/88.01
See application file for complete search history.
Primary Examiner: Susan McFadden
(74) Attorney, Agent, or Firm: Foley & Lardner LLP

(57) ABSTRACT
`
The present invention relates to a system for controlling at
least one remote system operatively connected to the
Internet. The system includes a computer operatively connected
to the Internet and a database operatively connected to the
computer, the database storing an instruction set used to
identify the remote system. In response to a speech
command received from a user, the computer is configured to
access the remote system to prompt the remote system to
execute at least one pre-selected function.
`
`16 Claims, 4 Drawing Sheets
`
`
`
`
`Petitioner’s Ex. 1001, Page 1
`
`
`
`
`
`
Sheet 1 of 4: FIG. 1 (voice browsing system; drawing not reproduced)
`
`
`
Sheet 2 of 4: FIG. 2 (database record 200: rank number 202, URL 204,
extraction agent command 206, timestamp 208) and FIG. 3 (media server:
speech recognition engine, speech synthesis engine, IVR application,
call processing system, telephony and voice hardware; reference
numerals 300 to 308); drawings not reproduced
`
`
`
Sheet 3 of 4: FIG. 4 (web browsing server 102: content extraction
agent, content fetcher, polling and ranking agent, content descriptor
files; reference numerals 400 to 406); drawing not reproduced
`
`
`
Sheet 4 of 4: FIG. 5 (device browsing system; drawing not reproduced)
`
`
`
`
`
ROBUST VOICE BROWSER SYSTEM AND
VOICE ACTIVATED DEVICE CONTROLLER

CROSS-REFERENCE TO RELATED
APPLICATIONS

This application is a continuation of U.S. patent
application Ser. No. 10/821,690, filed Apr. 9, 2004, now U.S.
Pat. No. 7,076,431, now allowed, which is a continuation of
U.S. patent application Ser. No. 09/776,996, filed Feb. 5, 2001
and issued as U.S. Pat. No. 6,721,705 on Apr. 13, 2004,
which claims the benefit of priority to U.S. Provisional
Application No. 60/180,344, filed Feb. 4, 2000, entitled
"Voice-Activated Information Retrieval System" and U.S.
Provisional Application No. 60/233,068, filed Sep. 15, 2000,
entitled "Robust Voice Browser System and Voice Activated
Device Controller," all of which are herein incorporated by
reference in their entirety.

FIELD OF THE INVENTION

The present invention relates to a robust and highly
reliable system that allows users to browse web sites and
retrieve information by using conversational voice
commands. Additionally, the present invention allows users to
control and monitor other systems and devices that are
connected to the Internet or any other network by using voice
commands.

BACKGROUND OF THE INVENTION

Currently, three options exist for a user who wishes to
gather information from a web site accessible over the
Internet. The first option is to use a desktop or a laptop
computer connected to a telephone line via a modem or
connected to a network with Internet access. The second
option is to use a Personal Digital Assistant (PDA) that has
the capability of connecting to the Internet either through a
modem or a wireless connection. The third option is to use
one of the newly designed web-phones or web-pagers that
are now being offered on the market. Although each of these
options provides a user with access to the Internet to browse
web sites, each has its own drawbacks.
Desktop computers are very large and bulky and are
difficult to transport. Laptop computers solve this
inconvenience, but many are still quite heavy and are
inconvenient to carry. Further, laptop computers cannot be
carried and used everywhere a user travels. For instance, if a
user wishes to obtain information from a remote location
where no electricity or communication lines are installed, it
would be nearly impossible to use a laptop computer.
Oftentimes, information is needed on an immediate basis where
a computer is not accessible. Furthermore, the use of laptop or
desktop computers to access the Internet requires either a
direct or a dial-up connection to an Internet Service
Provider (ISP). Oftentimes, such connections are not available
when a user desires to connect to the Internet to acquire
information.
The second option for remotely accessing web sites is the
use of PDAs. These devices also have their own set of
drawbacks. First, PDAs with the ability to connect to the
Internet and access web sites are not readily available. As a
result, these PDAs tend to be very expensive. Furthermore,
users are usually required to pay a special service fee to
enable the web browsing feature of the PDA. A further
disadvantage of these PDAs is that web sites must be
specifically designed to allow these devices to access
information on the web site. Therefore, a limited number of web
sites are available that are accessible by these web-enabled
PDAs. Finally, it is very common today for users to carry
cell phones; however, users must also carry a separate PDA
if they require the ability to gather information from various
web sites. Users are therefore subjected to added expenses
since they must pay for both cellular telephone service and
also for the web-enabling service for the PDA. This results
in a very expensive alternative for the consumer.
The third alternative mentioned above is the use of
web-phones or web-pagers. These devices suffer many of
the same drawbacks as PDAs. First, these devices are
expensive to purchase. Further, the number of web sites
accessible to these devices is limited since web sites must be
specifically designed to allow access by these devices.
Furthermore, users are often required to pay an additional
fee in order to gain wireless web access. Again, this service
is expensive. Another drawback of these web-phones or
web-pagers is that as technology develops, the methods used
by the various web sites to allow access by these devices
may change. These changes may require users to purchase
new web-phones or web-pagers or have the current device
serviced in order to upgrade the firmware or operating
system stored within the device. At the least, this would be
inconvenient to users and may actually be quite expensive.
Therefore, a need exists for a system that allows users to
easily access and browse the Internet from any location.
Such a system would only require users to have access to
any type of telephone and would not require users to
subscribe to multiple services.
In the rapidly changing area of Internet applications, web
sites change frequently. The design of the web site may
change, the information required by the web site in order to
perform searches may change, and the method of reporting
search results may change. Web browsing applications that
submit search requests and interpret responses from these
web sites based upon expected formats will produce errors
and useless responses when such changes occur. Therefore,
a need exists for a system that can detect modifications to
web sites and adapt to such changes in order to quickly and
accurately provide the information requested by a user
through a voice enabled device, such as a telephone.
When users access web sites using devices such as
personal computers, delays in receiving responses are
tolerated and are even expected; however, such delays are not
expected when a user communicates with a telephone. Users
expect communications over a telephone to occur
immediately with a minimal amount of delay time. A user
attempting to find information using a telephone expects
immediate responses to his search requests. A system that
introduces too much delay between the time a user makes a
request and the time of response will not be tolerated by users
and will lose its usefulness. Therefore, it is important that a
voice browsing system that uses telephonic communications
selects web sites that provide rapid responses since speed is
an important factor for maintaining the system's desirability
and usability. Therefore, a need exists for a system that
accesses web sites based upon their speed of operation.

SUMMARY OF THE INVENTION

It is an object of an embodiment of the present invention
to allow users to gather information from web sites by using
voice enabled devices, such as wireline or wireless
telephones.
An additional object of an embodiment of the present
invention is to provide a system and method that allows the
searching and retrieving of publicly available information by
controlling a web browsing server using naturally spoken
voice commands.
`
It is an object of another embodiment of the present
invention to provide a robust voice browsing system that can
obtain the same information from several web sites based
upon a ranking order. The ranking order is automatically
adjusted if the system detects that a given web site is not
functioning, is too slow, or has been modified in such a way
that the requested information cannot be retrieved any
longer.
A still further object of an embodiment of the present
invention is to allow users to gather information from web
sites from any location where a telephonic connection can be
made.
Another object of an embodiment of the present invention
is to allow users to browse web sites on the Internet using
conversational voice commands spoken into wireless or
wireline telephones or other voice enabled devices.
An additional object of an embodiment of the present
invention is to provide a system and method for using voice
commands to control and monitor devices connected to a
network.
It is an object of an embodiment of the present invention
to provide a system and method which allows devices
connected to a network to be controlled by conversational
voice commands spoken into any voice enabled device
interconnected with the same network.
`
`The present invention relates to a system for acquiring
`information from sources on a network, such as the Internet.
`A voice browsing system maintains a database containing a
`list of information sources, such as web sites, connected to
`a network. Each of the information sources is assigned a
`rank number which is listed in the database along with the
`record for the information source. In response to a speech
`command received from a user, a network interface system
`accesses the information source with the highest rank num-
`ber in order to retrieve information requested by the user.
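
The rank-based selection described above can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the class name, field names, and example URLs are hypothetical, and the sketch assumes that a larger rank number marks the preferred source, which is one literal reading of "highest rank number".

```python
from dataclasses import dataclass


@dataclass
class InformationSource:
    url: str   # hypothetical example URL
    rank: int  # rank number stored alongside the source's record


def pick_source(sources):
    """Return the source with the highest rank number."""
    if not sources:
        raise ValueError("no information sources registered")
    # Assumption: a larger rank number means a better-ranked source.
    # If rank 1 is meant to be the best, replace max() with min().
    return max(sources, key=lambda s: s.rank)


sources = [
    InformationSource("http://weather.example/a", rank=1),
    InformationSource("http://weather.example/b", rank=3),
    InformationSource("http://weather.example/c", rank=2),
]
best = pick_source(sources)
```

Keeping the comparison in one function means the ordering convention can be flipped in a single place if the rank numbers are defined the other way round.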
A preferred embodiment of the present invention
allows users to access and browse web sites when they do
not have access to computers with Internet access. This is
accomplished by providing a voice browsing system and
method that allows users to browse web sites using
conversational voice commands spoken into any type of voice
enabled device (i.e., any type of wireline or wireless
telephone, IP phone, wireless PDA, or other wireless device).
These spoken commands are then converted into data
messages by a speech recognition software engine running on a
user interface system. These data messages are then sent to
and processed by a network interface system. This network
interface system then generates the proper requests that are
transmitted to the desired web site over the Internet.
Responses sent from the web site are received and processed
by the network interface system and then converted into an
audio message via a speech synthesis engine or a
pre-recorded audio concatenation application and finally
transmitted to the user's voice enabled device.
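
The flow just described (speech to data message, data message to web request, response back to audio) can be sketched as three pluggable stages. This is a hedged illustration: the function names and the stand-in stages are hypothetical, and no real speech recognition or synthesis engine is involved.

```python
from typing import Callable


def handle_utterance(
    audio: bytes,
    recognize: Callable[[bytes], dict],
    fetch: Callable[[dict], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one spoken request through the three stages in order."""
    message = recognize(audio)        # user interface system: speech -> data message
    response_text = fetch(message)    # network interface system: request -> site response
    return synthesize(response_text)  # speech synthesis or pre-recorded audio


# Stand-in stages so the sketch runs without any speech engine.
def fake_recognize(audio: bytes) -> dict:
    return {"category": "weather", "query": audio.decode()}


def fake_fetch(message: dict) -> str:
    return f"{message['category']} report for {message['query']}"


def fake_synthesize(text: str) -> bytes:
    return text.encode()


reply = handle_utterance(b"Chicago", fake_recognize, fake_fetch, fake_synthesize)
```

Passing the stages in as callables mirrors the separation between the user interface system and the network interface system: either side can be replaced without touching the other.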
`
A preferred embodiment of the voice browser system and
method uses a web site polling and ranking methodology
that allows the system to detect changes in web sites and
adapt to those changes in real-time. This enables the voice
browser system of a preferred embodiment to deliver highly
reliable information to users over any voice enabled device.
This ranking system also enables the present invention to
provide rapid responses to user requests. Long delays before
receiving responses to requests are not tolerated by users of
voice-based systems, such as telephones. When a user
speaks into a telephone, an almost immediate response is
expected. This expectation does not exist for non-voice
communications, such as email transmissions or accessing a
web site using a personal computer. In such situations, a
reasonable amount of transmission delay is acceptable. The
ranking system implemented by a preferred embodiment
of the present invention ensures users will always receive
the fastest possible response to their request.
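
One way to picture the polling and ranking idea is the Python sketch below. It is an assumption-laden illustration rather than the method the patent claims: the `PollResult` fields, the 3-second threshold, and the example URLs are all hypothetical. Sites that respond correctly and quickly are ordered first; sites that are down, too slow, or no longer yield the expected data fall to the end.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PollResult:
    url: str
    ok: bool                  # False if the site is down or the data could not be extracted
    seconds: Optional[float]  # response time; None when the poll failed


def rerank(results, max_seconds=3.0):
    """Return URLs best-first: usable, fast sites ahead of slow or failing ones."""
    def sort_key(r):
        usable = r.ok and r.seconds is not None and r.seconds <= max_seconds
        # Usable sites sort first (False < True), then by response time.
        return (not usable, r.seconds if r.seconds is not None else float("inf"))
    return [r.url for r in sorted(results, key=sort_key)]


results = [
    PollResult("http://a.example", True, 2.5),
    PollResult("http://b.example", False, None),
    PollResult("http://c.example", True, 0.4),
    PollResult("http://d.example", True, 9.0),
]
order = rerank(results)
```

Re-running `rerank` after each polling cycle would let the ordering follow changes in site availability and speed.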
An alternative embodiment of the present invention
allows users to control and monitor the operation of a variety
of household devices connected to a network using speech
commands spoken into a voice enabled device.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a depiction of the voice browsing system of the
first embodiment of the present invention;
FIG. 2 is a block diagram of a database record used by the
first preferred embodiment of the present invention;
FIG. 3 is a block diagram of a media server used by the
preferred embodiment;
FIG. 4 is a block diagram of a web browsing server used
by the preferred embodiment; and
FIG. 5 is a depiction of the device browsing system of the
second embodiment of the present invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
`
A first embodiment of the present invention is a system
and method for allowing users to browse information
sources, such as web sites, by using naturally spoken,
conversational voice commands spoken into a voice enabled
device. Users are not required to learn a special language or
command set in order to communicate with the voice
browsing system of the present invention. Common and
ordinary commands and phrases are all that is required for
a user to operate the voice browsing system. The voice
browsing system recognizes naturally spoken voice
commands and is speaker-independent; it does not have to be
trained to recognize the voice patterns of each individual
user. Such speech recognition systems use phonemes to
recognize spoken words and not predefined voice patterns.
The first embodiment allows users to select from various
categories of information and to search those categories for
desired data by using conversational voice commands. The
voice browsing system of the first preferred embodiment
includes a user interface system referred to as a media
server. The media server contains a speech recognition
software engine. This speech recognition engine is used to
recognize natural, conversational voice commands spoken
by the user and converts them into data messages based on
the available recognition grammar. These data messages are
then sent to a network interface system. In the first preferred
embodiment, the network interface system is referred to as
a web browsing server. The web browsing server then
accesses the appropriate information source, such as a web
site, to gather information requested by the user.
Responses received from the information sources are then
transferred to the media server where a speech synthesis
engine converts the responses into audio messages that are
transmitted to the user. A more detailed description of this
embodiment will now be provided.
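
As a rough illustration of how a recognition grammar might turn a recognized command into a data message, consider the Python sketch below. It is a simplification under stated assumptions: it works on already-transcribed text rather than on audio or phonemes, and the two grammar patterns and the message layout are hypothetical, not taken from the patent.

```python
import re

# Hypothetical grammar: each pattern maps a spoken phrase to a category.
GRAMMAR = [
    (re.compile(r"^(?:what is|what's) the weather in (?P<place>.+)$"), "weather"),
    (re.compile(r"^(?:get me )?(?:a )?stock quote for (?P<symbol>.+)$"), "stock"),
]


def to_data_message(transcript: str):
    """Convert recognized text into a data message, or None if nothing matches."""
    text = transcript.strip().lower()
    for pattern, category in GRAMMAR:
        match = pattern.match(text)
        if match:
            # The message carries the category plus any named slots.
            return {"category": category, **match.groupdict()}
    return None


weather_message = to_data_message("What is the weather in Chicago")
stock_message = to_data_message("stock quote for IBM")
unknown_message = to_data_message("play some music")
```

A message of this shape is what a network interface system, such as the web browsing server, would need in order to choose a category of web sites and build a request.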
Referring to FIG. 1, a database 100 designed by Webley
Systems Incorporated is connected to one or more web
browsing servers 102 as well as to one or more media
servers 106. The database may store information on
magnetic media, such as a hard disk drive, or it may store
information via other widely acceptable methods for storing
data, such as optical disks. The database 100 contains a
separate set of records for each web site accessible by the
system. An example of a web site record is shown in FIG.
2. Each web site record 200 contains the rank number of the
web site 202, the associated Uniform Resource Locator
(URL) 204, and a command that enables the appropriate
"extraction agent" 206 that is required in order to generate
proper requests sent to and to format data received from the
web site. The database record 200 also contains the
timestamp 208 indicating the last time the web site was accessed.
The extraction agent is described in more detail below. The
database 100 categorizes each database record 200
according to the type of information provided by each web site. For
instance, a first category of database records 200 may
correspond to web sites that provide "weather" information.
The database 100 may also contain a second category of
records 200 for web sites that provide "stock" information.
These categories may be further divided into subcategories.
For instance, the "weather" category may contain
subcategories depending upon type of weather information
available to a user, such as "current weather" or "extended
forecast". Within the "extended forecast" subcategory, a list
of web site records may be stored that provide weather
information for multiple days. The use of subcategories may
allow the web browsing feature to provide more accurate,
relevant, and up-to-date information to the user by accessing
the most relevant web site. The number of records contained
in each category or subcategory is not limited. In the
preferred embodiment, three web site records are provided
for each category.
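
The record layout and category grouping described above could be modeled along the following lines. This Python sketch is illustrative only: the field names, the nested-dictionary storage, the example URLs, and the agent names are hypothetical; only the four fields (rank number 202, URL 204, extraction agent command 206, timestamp 208) and the category and subcategory grouping come from the description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WebSiteRecord:
    rank: int                # rank number 202
    url: str                 # URL 204
    extraction_agent: str    # command enabling the extraction agent 206
    last_accessed: datetime  # timestamp 208


# Records grouped by category, then by subcategory (hypothetical contents).
database = {
    "weather": {
        "extended forecast": [
            WebSiteRecord(1, "http://forecast-one.example", "agent_one", datetime(2000, 9, 15, 8, 0)),
            WebSiteRecord(2, "http://forecast-two.example", "agent_two", datetime(2000, 9, 15, 8, 5)),
            WebSiteRecord(3, "http://forecast-three.example", "agent_three", datetime(2000, 9, 15, 8, 10)),
        ],
    },
}


def records_for(category: str, subcategory: str):
    """Return the web site records stored under a category and subcategory."""
    return list(database.get(category, {}).get(subcategory, []))


forecast_records = records_for("weather", "extended forecast")
```

Three records per category matches the preferred embodiment; the lookup returns an empty list for a category that has no records.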
Table 1 below depicts two database records 200 that are
used with the preferred embodiment. These records also
contain a field indicating the "category" of the record, which
is "weather" in each of these examples.
`
work (PSTN) 116. In the preferred embodiment, each media
server is based upon Intel's Dual Pentium III 730 MHz
microprocessor system.
The speech recognition function is performed by a speech
recognition engine 300 that converts voice commands
received from the user's voice enabled device 112 (i.e., any
type of wireline or wireless telephone, Internet Protocol (IP)
phone, or other special wireless unit) into data messages.
In the preferred embodiment, voice commands and audio
messages are transmitted using the PSTN 116 and data is
transmitted using the TCP/IP communications protocol.
However, one skilled in the art would recognize that other
transmission protocols may be used for either voice or data.
Other possible transmission protocols would include SIP