US007386455B2

(12) United States Patent
Kurganov et al.

(10) Patent No.: US 7,386,455 B2
(45) Date of Patent: Jun. 10, 2008

`(54) ROBUST VOICE BROWSER SYSTEM AND
`VOICE ACTIVATED DEVICE CONTROLLER
`
(75) Inventors: Alexander Kurganov, Buffalo Grove,
IL (US); Valery Zhukoff, Deerfield, IL
(US)
`
`(73) Assignee: Parus Holdings, Inc., Bannockburn, IL
`(US)
`
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 11/409,703
`
(22) Filed: Apr. 24, 2006

(65) Prior Publication Data

US 2006/0190265 A1, Aug. 24, 2006
`
`Related U.S. Application Data
`
`(63) Continuation of application No. 10/821,690, filed on
`Apr. 9, 2004, now Pat. No. 7,076,431, which is a
`continuation of application No. 09/776,996, filed on
`Feb. 5, 2001, now Pat. No. 6,721,705.
`
`(60) Provisional application No. 60/233,068, filed on Sep.
`15, 2000, provisional application No. 60/180,344,
`filed on Feb. 4, 2000.
`
(51) Int. Cl.
G10L 21/06 (2006.01)
(52) U.S. Cl. ................. 704/270.1; 704/275; 379/88.01
(58) Field of Classification Search ............. 704/270.1,
704/275; 379/88.01
`See application file for complete search history.
`
`
`Primary Examiner-Susan McFadden
`(74) Attorney, Agent, or Firm-Foley & Lardner LLP
`
(57) ABSTRACT

The present invention relates to a system for controlling at
least one remote system operatively connected to the Internet.
The system includes a computer operatively connected
to the Internet and a database operatively connected to the
computer, the database storing an instruction set used to
identify the remote system. In response to a speech command
received from a user, the computer is configured to
access the remote system to prompt the remote system to
execute at least one pre-selected function.
`
`16 Claims, 4 Drawing Sheets
`
`
`Petitioner Google Ex-1001, 0001
`
`
`
`
`
`
[FIG. 1: depiction of the voice browsing system of the first embodiment. Recoverable labels: MEDIA SERVER (two shown), WEB BROWSING SERVER (two shown), FIREWALL, T1 LINE, TCP/IP, WEB SITE 1, WEB SITE 2, ..., WEB SITE N. Reference numerals appearing in the drawing: 100, 102, 106, 108, 112, 114, 116, 118.]
`
`
`
[Sheet 2 of 4]

[FIG. 2: block diagram of a database record (200). Fields shown: RANK NUMBER, URL, EXTRACTION AGENT COMMAND, TIMESTAMP. Reference numerals: 202, 204, 206, 208.]

[FIG. 3: block diagram of a MEDIA SERVER. Components shown: SPEECH RECOGNITION ENGINE, SPEECH SYNTHESIS ENGINE, IVR APPLICATION, CALL PROCESSING SYSTEM, TELEPHONY AND VOICE HARDWARE. Reference numerals: 300, 302, 304, 306, 308.]
`
`
`
[Sheet 3 of 4]

[FIG. 4: block diagram of a WEB BROWSING SERVER (102). Components shown: CONTENT EXTRACTION AGENT, CONTENT FETCHER, POLLING AND RANKING AGENT, CONTENT DESCRIPTOR FILES. Reference numerals: 400, 402, 404, 406.]
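The record fields labeled in FIG. 2 and the polling and ranking agent labeled in FIG. 4 suggest how ranks might be refreshed. The Summary states that the ranking order is adjusted when a web site is not functioning, is too slow, or can no longer yield the requested information. The Python sketch below is one possible reading, not the patented method: the field types, the `probe` callback, the `max_seconds` threshold, the `.example` URLs, and the convention that rank 1 is tried first are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SiteRecord:
    """Fields mirror the labels in FIG. 2; the types are assumptions."""
    rank_number: int            # assumption: rank 1 is tried first
    url: str
    extraction_command: str     # assumption: opaque command string
    timestamp: float = 0.0      # assumption: time of the last poll


def poll_and_rank(records: List[SiteRecord],
                  probe: Callable[[SiteRecord], Optional[float]],
                  max_seconds: float = 3.0) -> List[SiteRecord]:
    """Re-rank records so that responsive sites come first.

    `probe` is a hypothetical callback returning the response time in
    seconds, or None when the site is down or extraction failed.
    """
    timed = []
    for record in records:
        elapsed = probe(record)
        record.timestamp = time.time()
        usable = elapsed is not None and elapsed <= max_seconds
        delay = elapsed if elapsed is not None else float("inf")
        # Unusable sites sort after every usable site.
        timed.append((not usable, delay, record))
    timed.sort(key=lambda item: (item[0], item[1]))
    for position, (_, _, record) in enumerate(timed, start=1):
        record.rank_number = position
    return [record for _, _, record in timed]


# Illustrative data only: three hypothetical sources for one request.
sites = [
    SiteRecord(1, "http://site-a.example", "get_forecast"),
    SiteRecord(2, "http://site-b.example", "get_forecast"),
    SiteRecord(3, "http://site-c.example", "get_forecast"),
]
timings = {
    "http://site-a.example": None,   # simulated failure
    "http://site-b.example": 0.8,
    "http://site-c.example": 0.2,
}
ranked = poll_and_rank(sites, lambda record: timings[record.url])
```

In this sketch a failed or slow site is demoted rather than deleted, so it can regain its position on a later poll.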
`
`
`
[FIG. 5: depiction of the device browsing system of the second embodiment. Recoverable labels: MEDIA SERVER (two shown), DEVICE BROWSING SERVER (two shown), T1 LINE, TCP/IP, SECURITY SYSTEM, TV, VCR. Reference numerals appearing in the drawing: 500, 504, 506, 508, 510, 512, 514.]
`
`
`
ROBUST VOICE BROWSER SYSTEM AND
VOICE ACTIVATED DEVICE CONTROLLER

CROSS-REFERENCE TO RELATED
APPLICATIONS

This application is a continuation of U.S. patent application
Ser. No. 10/821,690, filed Apr. 9, 2004, now U.S. Pat.
No. 7,076,431, which is a continuation of U.S.
patent application Ser. No. 09/776,996, filed Feb. 5, 2001
and issued as U.S. Pat. No. 6,721,705 on Apr. 13, 2004,
which claims the benefit of priority to U.S. Provisional
Application No. 60/180,344, filed Feb. 4, 2000, entitled
"Voice-Activated Information Retrieval System" and U.S.
Provisional Application No. 60/233,068, filed Sep. 15, 2000,
entitled "Robust Voice Browser System and Voice Activated
Device Controller," all of which are herein incorporated by
reference in their entirety.

FIELD OF THE INVENTION

The present invention relates to a robust and highly
reliable system that allows users to browse web sites and
retrieve information by using conversational voice commands.
Additionally, the present invention allows users to
control and monitor other systems and devices that are
connected to the Internet or any other network by using voice
commands.

BACKGROUND OF THE INVENTION

Currently, three options exist for a user who wishes to
gather information from a web site accessible over the
Internet. The first option is to use a desktop or a laptop
computer connected to a telephone line via a modem or
connected to a network with Internet access. The second
option is to use a Personal Digital Assistant (PDA) that has
the capability of connecting to the Internet either through a
modem or a wireless connection. The third option is to use
one of the newly designed web-phones or web-pagers that
are now being offered on the market. Although each of these
options provides a user with access to the Internet to browse
web sites, each has its own drawbacks.
Desktop computers are very large and bulky and are
difficult to transport. Laptop computers solve this inconvenience,
but many are still quite heavy and are inconvenient
to carry. Further, laptop computers cannot be carried and
used everywhere a user travels. For instance, if a user wishes
to obtain information from a remote location where no
electricity or communication lines are installed, it would be
nearly impossible to use a laptop computer. Oftentimes,
information is needed on an immediate basis where a
computer is not accessible. Furthermore, the use of laptop or
desktop computers to access the Internet requires either a
direct or a dial-up connection to an Internet Service Provider
(ISP). Oftentimes, such connections are not available
when a user desires to connect to the Internet to acquire
information.
The second option for remotely accessing web sites is the
use of PDAs. These devices also have their own set of
drawbacks. First, PDAs with the ability to connect to the
Internet and access web sites are not readily available. As a
result, these PDAs tend to be very expensive. Furthermore,
users are usually required to pay a special service fee to
enable the web browsing feature of the PDA. A further
disadvantage of these PDAs is that web sites must be
specifically designed to allow these devices to access information
on the web site. Therefore, a limited number of web
sites are available that are accessible by these web-enabled
PDAs. Finally, it is very common today for users to carry
cell phones; however, users must also carry a separate PDA
if they require the ability to gather information from various
web sites. Users are therefore subjected to added expenses
since they must pay for both cellular telephone service and
also for the web-enabling service for the PDA. This results
in a very expensive alternative for the consumer.
The third alternative mentioned above is the use of
web-phones or web-pagers. These devices suffer many of
the same drawbacks as PDAs. First, these devices are
expensive to purchase. Further, the number of web sites
accessible to these devices is limited since web sites must be
specifically designed to allow access by these devices.
Furthermore, users are often required to pay an additional
fee in order to gain wireless web access. Again, this service
is expensive. Another drawback of these web-phones or
web-pagers is that as technology develops, the methods used
by the various web sites to allow access by these devices
may change. These changes may require users to purchase
new web-phones or web-pagers or have the current device
serviced in order to upgrade the firmware or operating
system stored within the device. At the least, this would be
inconvenient to users and may actually be quite expensive.
Therefore, a need exists for a system that allows users to
easily access and browse the Internet from any location.
Such a system would only require users to have access to
any type of telephone and would not require users to
subscribe to multiple services.
In the rapidly changing area of Internet applications, web
sites change frequently. The design of the web site may
change, the information required by the web site in order to
perform searches may change, and the method of reporting
search results may change. Web browsing applications that
submit search requests and interpret responses from these
web sites based upon expected formats will produce errors
and useless responses when such changes occur. Therefore,
a need exists for a system that can detect modifications to
web sites and adapt to such changes in order to quickly and
accurately provide the information requested by a user
through a voice enabled device, such as a telephone.
When users access web sites using devices such as
personal computers, delays in receiving responses are tolerated
and are even expected; however, such delays are not
expected when a user communicates with a telephone. Users
expect communications over a telephone to occur immediately
with a minimal amount of delay time. A user attempting
to find information using a telephone expects immediate
responses to his search requests. A system that introduces
too much delay between the time a user makes a request and
the time of response will not be tolerated by users and will
lose its usefulness. Therefore, it is important that a voice
browsing system that uses telephonic communications
selects web sites that provide rapid responses since speed is
an important factor for maintaining the system's desirability
and usability. Therefore, a need exists for a system that
accesses web sites based upon their speed of operation.
`
`SUMMARY OF THE INVENTION
`
`It is an object of an embodiment of the present invention
`to allow users to gather information from web sites by using
voice enabled devices, such as wireline or wireless telephones.
`An additional object of an embodiment of the present
`invention is to provide a system and method that allows the
`
searching and retrieving of publicly available information by
controlling a web browsing server using naturally spoken
voice commands.
It is an object of another embodiment of the present
invention to provide a robust voice browsing system that can
obtain the same information from several web sites based
upon a ranking order. The ranking order is automatically
adjusted if the system detects that a given web site is not
functioning, is too slow, or has been modified in such a way
that the requested information cannot be retrieved any
longer.
A still further object of an embodiment of the present
invention is to allow users to gather information from web
sites from any location where a telephonic connection can be
made.
Another object of an embodiment of the present invention
is to allow users to browse web sites on the Internet using
conversational voice commands spoken into wireless or
wireline telephones or other voice enabled devices.
An additional object of an embodiment of the present invention
is to provide a system and method for using voice
commands to control and monitor devices connected to a
network.
It is an object of an embodiment of the present invention
to provide a system and method which allows devices
connected to a network to be controlled by conversational
voice commands spoken into any voice enabled device
interconnected with the same network.
The present invention relates to a system for acquiring
information from sources on a network, such as the Internet.
A voice browsing system maintains a database containing a
list of information sources, such as web sites, connected to
a network. Each of the information sources is assigned a
rank number which is listed in the database along with the
record for the information source. In response to a speech
command received from a user, a network interface system
accesses the information source with the highest rank number
in order to retrieve information requested by the user.
A preferred embodiment of the present invention
allows users to access and browse web sites when they do
not have access to computers with Internet access. This is
accomplished by providing a voice browsing system and
method that allows users to browse web sites using conversational
voice commands spoken into any type of voice
enabled device (i.e., any type of wireline or wireless telephone,
IP phone, wireless PDA, or other wireless device).
These spoken commands are then converted into data messages
by a speech recognition software engine running on a
user interface system. These data messages are then sent to
and processed by a network interface system. This network
interface system then generates the proper requests that are
transmitted to the desired web site over the Internet.
Responses sent from the web site are received and processed
by the network interface system and then converted into an
audio message via a speech synthesis engine or a pre-recorded
audio concatenation application and finally transmitted
to the user's voice enabled device.
A preferred embodiment of the voice browser system and
method uses a web site polling and ranking methodology
that allows the system to detect changes in web sites and
adapt to those changes in real-time. This enables the voice
browser system of a preferred embodiment to deliver highly
reliable information to users over any voice enabled device.
This ranking system also enables the present invention to
provide rapid responses to user requests. Long delays before
receiving responses to requests are not tolerated by users of
voice-based systems, such as telephones. When a user
speaks into a telephone, an almost immediate response is
expected. This expectation does not exist for non-voice
communications, such as email transmissions or accessing a
web site using a personal computer. In such situations, a
reasonable amount of transmission delay is acceptable. The
ranking system implemented by a preferred embodiment
of the present invention ensures users will always receive
the fastest possible response to their request.
An alternative embodiment of the present invention
allows users to control and monitor the operation of a variety
of household devices connected to a network using speech
commands spoken into a voice enabled device.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a depiction of the voice browsing system of the
first embodiment of the present invention;
FIG. 2 is a block diagram of a database record used by the
first preferred embodiment of the present invention;
FIG. 3 is a block diagram of a media server used by the
preferred embodiment;
FIG. 4 is a block diagram of a web browsing server used
by the preferred embodiment; and
FIG. 5 is a depiction of the device browsing system of the
second embodiment of the present invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
`
`A first embodiment of the present invention is a system
`and method for allowing users to browse information
`sources, such as web sites, by using naturally spoken,
`conversational voice commands spoken into a voice enabled
`device. Users are not required to learn a special language or
`command set in order to communicate with the voice
`browsing system of the present invention. Common and
`ordinary commands and phrases are all that is required for
`a user to operate the voice browsing system. The voice
browsing system recognizes naturally spoken voice commands
and is speaker-independent; it does not have to be
`trained to recognize the voice patterns of each individual
`user. Such speech recognition systems use phonemes to
`recognize spoken words and not predefined voice patterns.
`The first embodiment allows users to select from various
`categories of information and to search those categories for
`desired data by using conversational voice commands. The
`voice browsing system of the first preferred embodiment
`includes a user interface system referred to as a media
`server. The media server contains a speech recognition
`software engine. This speech recognition engine is used to
`recognize natural, conversational voice commands spoken
`by the user and converts them into data messages based on
`the available recognition grammar. These data messages are
`then sent to a network interface system. In the first preferred
embodiment, the network interface system is referred to as
`a web browsing server. The web browsing server then
`accesses the appropriate information source, such as a web
`site, to gather information requested by the user.
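As a rough illustration of how a recognized phrase might be turned into a data message under a recognition grammar, the Python sketch below matches already-transcribed text against a single weather pattern. The pattern, the `to_data_message` helper, and the message fields are hypothetical; they are not taken from the patent or from any vendor's grammar format, and audio recognition itself is out of scope here.

```python
import re
from typing import Optional

# Hypothetical one-rule grammar for weather requests. Real recognition
# grammars are vendor-specific and operate on audio, not on text.
WEATHER_RULE = re.compile(
    r"(?:(?:what is|what's|tell me) )?(?:the )?weather (?:in|for) (?P<city>[a-z ]+)"
)


def to_data_message(utterance: str) -> Optional[dict]:
    """Map transcribed text to a data message, or None if out of grammar."""
    text = utterance.lower().strip().rstrip("?.!")
    match = WEATHER_RULE.fullmatch(text)
    if match is None:
        return None
    # The named group becomes a parameter of the request.
    return {"category": "weather", "params": {"city": match.group("city").strip()}}


message = to_data_message("What is the weather in Chicago?")
```

Phrases outside the grammar return `None`, which a caller could use to re-prompt the user.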
`Responses received from the information sources are then
transferred to the media server, where a speech synthesis
engine converts the responses into audio messages that are
`transmitted to the user. A more detailed description of this
`embodiment will now be provided.
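The flow described above (spoken command, data message, web browsing server, audio response) can be sketched end to end. This is a minimal sketch under stated assumptions: speech recognition and synthesis are replaced by text stand-ins, web access is a caller-supplied function, and all names (`MediaServer`, `WebBrowsingServer`, `fake_weather`) are hypothetical rather than taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WebBrowsingServer:
    """Stand-in for the network interface system."""
    # Maps a category to a function that retrieves text for a request;
    # a real system would issue requests to a web site instead.
    fetchers: Dict[str, Callable[[dict], str]]

    def handle(self, message: dict) -> str:
        fetch = self.fetchers.get(message["category"])
        if fetch is None:
            return "Sorry, that category is not available."
        return fetch(message["params"])


@dataclass
class MediaServer:
    """Stand-in for the user interface system."""
    browser: WebBrowsingServer

    def recognize(self, utterance: str) -> dict:
        # Placeholder for the speech recognition engine: expects text of
        # the form "<category> <argument>" instead of audio.
        category, _, argument = utterance.strip().lower().partition(" ")
        return {"category": category, "params": {"query": argument}}

    def synthesize(self, text: str) -> str:
        # Placeholder for the speech synthesis engine: tags the text
        # rather than producing audio.
        return f"[audio] {text}"

    def handle_call(self, utterance: str) -> str:
        message = self.recognize(utterance)       # voice -> data message
        response = self.browser.handle(message)   # data message -> response
        return self.synthesize(response)          # response -> audio message


def fake_weather(params: dict) -> str:
    # Canned response used only for illustration; no network access.
    return f"Forecast for {params['query']}: sunny."


server = MediaServer(WebBrowsingServer({"weather": fake_weather}))
reply = server.handle_call("weather chicago")
```

Keeping recognition and synthesis on one class and retrieval on another mirrors the split between the media server and the web browsing server, so either side could be swapped out independently in this sketch.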
work (PSTN) 116. In the preferred embodiment, each media
server is based upon Intel's Dual Pentium III 730 MHz
microprocessor system.
The speech recognition function is performed by a speech
recognition engine 300 that converts voice commands
received from the user's voice enabled device 112 (i.e., any
type of wireline or wireless telephone, Internet Protocol (IP)
phones, or other special wireless units) into data messages.
In the preferred embodiment, voice commands and audio
messages are transmitted using the PSTN 116 and data is
transmitted using the TCP/IP communications protocol.
However, one skilled in the art would recognize that other
transmission protocols may be used for either voice or data.
Other possible transmission protocols would include SIP/VoIP
(Session Initiation Protocol/Voice over IP), Asynchronous
Transfer Mode (ATM) and Frame Relay. A preferred
speech recognition engine is developed by Nuance Communications
of 1380 Willow Road, Menlo Park, Calif.
94025 (www.nuance.com). The Nuance engine capacity is
measured in recognition units based on CPU type as defined
in the vendor specification. The natural speech recognition
grammars (i.e., what a user can say that will be recognized
by the speech recognition engine) were developed by Webley
Systems.
Table 2 below provides a partial source code listing of the
recognition grammars used by the speech recognition engine
of the preferred embodiment for obtaining weather information.

Referring to FIG. 1, a database 100 designed by Webley
Systems Incorporated is connected to one or more web
browsing servers 102 as well as to one or more media
servers 106. The database may store information on magnetic
media, such as a hard disk drive, or it may store
information via other widely acceptable methods for storing
data, such as optical disks. The database 100 contains a
separate set of records for each web site accessible