Volkswagen Group of America, Inc., Petitioner - Ex. 1006

[FIG. 1 (Sheet 1 of 10): overview of the possible input dialogues for speech input of a destination address into a navigation system]
[FIG. 2 (Sheet 2 of 10): flowchart of a first embodiment of the input dialogue "destination location input"]
[FIG. 3 (Sheet 3 of 10): flowchart of a second embodiment of the input dialogue "destination location input"]
[FIG. 4 (Sheet 4 of 10): flowchart for the input dialogue "choose from list"]
[FIG. 5 (Sheet 5 of 10): flowchart for the input dialogue "resolve ambiguity"]
[FIG. 6 (Sheet 6 of 10): flowchart for the input dialogue "spell destination location"]
[FIG. 7 (Sheet 7 of 10): flowchart for the input dialogue "coarse destination input"]
[FIG. 8 (Sheet 8 of 10): flowchart for the input dialogue "store address"]
[FIG. 9 (Sheet 9 of 10): flowchart for the input dialogue "street input"]
[FIG. 10 (Sheet 10 of 10): block diagram of a device for performing the method]

US 6,230,132 B1

PROCESS AND APPARATUS FOR REAL-TIME VERBAL INPUT OF A TARGET ADDRESS OF A TARGET GUIDANCE SYSTEM

BACKGROUND AND SUMMARY OF THE INVENTION

This application claims the priority of German patent document 197 09 518.6, filed Mar. 10, 1997, the disclosure of which is expressly incorporated by reference herein.

The invention relates to a method and apparatus for real-time speech input of a destination address into a navigation system.

German patent document DE 196 00 700 describes a target guidance system for a motor vehicle in which a fixedly mounted circuit, a contact field circuit or a voice recognition apparatus can be used as an input device. The document, however, does not deal with the vocal input of a target address in a target guidance system.

Published European patent application EP 0 736 853 A1 likewise describes a target guidance system for a motor vehicle. The speech input of a target address in a target guidance system is, however, not the subject of this document.
Published German patent application DE 36 08 497 A1 describes a process for speech-controlled operation of a long-distance communication apparatus, especially an auto telephone. It is considered a disadvantage of the process that it does not deal with the special problems of speech input of a target address in a target guidance system.

Not yet prepublished German patent application P 195 33 541.4-52 discloses a method and apparatus of this type for automatic control of one or more devices, by speech commands or by speech dialogue in real time. Input speech commands are recognized by a speech recognition device comprising a speaker-independent speech-recognition engine and a speaker-dependent additional speech-recognition engine, which identifies the speech command with the highest recognition probability as the input speech command and initiates the functions of the device or devices associated with this speech command. The speech command or speech dialogue is formed on the basis of at least one syntax structure, at least one basic command vocabulary, and if necessary at least one speaker-specific additional command vocabulary. The syntax structures and basic command vocabularies are presented in speaker-independent form and are established in real time. The speaker-specific additional vocabulary is input by the respective speaker and/or modified by him/her, with an additional speech recognition engine that operates according to a speaker-dependent recognition method being trained, in training phases during and outside real-time operation, to the speaker-specific features of the respective speaker by at least one-time input of the additional command. The speech dialogue and/or control of the devices is developed in real time as follows:

Speech commands input by the user are fed to a speaker-independent speech recognition engine operating on the basis of phonemes, and to the speaker-dependent additional speech recognition engine, where they are subjected to feature extraction, checked for the presence of additional commands from the additional command vocabulary, and classified in the speaker-dependent additional speech recognition engine on the basis of the features extracted therein.
Then the classified commands and syntax structures of the two speech recognition engines, recognized with a certain probability, are assembled into hypothetical speech commands, and the latter are checked and classified for their reliability and recognition probability in accordance with the syntax structure provided.

Thereafter, the additional hypothetical speech commands are checked for their plausibility in accordance with specified criteria and, of the hypothetical speech commands recognized as plausible, the one with the highest recognition probability is selected and identified as the speech command input by the user.

Finally, the functions of the device to be controlled that are associated with the identified speech command are initiated and/or answers are generated in accordance with a predetermined speech dialogue structure to continue the speech dialogue. According to this document, the method described can also be used to operate a navigation system, with a destination address being input by entering letters or groups of letters in a spelling mode, and with it being possible for the user to supply a list for storage of destination addresses for the navigation system using names and abbreviations that can be determined in advance.
The disadvantage of this method is that the special properties of the navigation system are not discussed, and only the speech input of a destination location by means of a spelling mode is described.

The object of the invention is to provide an improved method and apparatus of the type described above, in which the special properties of a navigation system are taken into account and operation is simplified.

Another object of the invention is to provide such an arrangement which enables faster speech input of a destination address in a navigation system, improving operator comfort.
These and other objects and advantages are achieved by the method and apparatus according to the invention for speech input of destination addresses in a navigation system, which uses a known speech recognition device, such as described for example in the document referred to above, comprising at least one speaker-independent speech-recognition engine and at least one speaker-dependent additional speech-recognition engine. The method according to the invention makes possible various input dialogues for speech input of destination addresses. In a first input dialogue (hereinafter referred to as the "destination location input"), the speaker-independent speech recognition device is used to detect destination locations spoken in isolation, and if such destination location is not recognized, to recognize continuously spoken letters and/or groups of letters. In a second input dialogue (hereinafter referred to as "spell destination location"), the speaker-independent speech recognition engine is used to recognize continuously spoken letters and/or groups of letters. In a third input dialogue (hereinafter referred to as "coarse destination input"), the speaker-independent speech-recognition engine is used to recognize destination locations spoken in isolation, and if such destination location is not recognized, to recognize continuously spoken letters and/or groups of letters. In a fourth input dialogue (hereinafter referred to as "indirect input"), the speaker-independent speech recognition engine is used to recognize continuously spoken numbers and/or groups of numbers. In a fifth input dialogue (hereinafter referred to as "street input"), the speaker-independent speech-recognition device is used to recognize street names spoken in isolation, and if the street name spoken in isolation is not recognized, to recognize continuously spoken letters and/or groups of letters.
By means of the input dialogues described above, the navigation system is supplied with verified destination addresses, each comprising a destination location and a street. In a sixth input dialogue (hereinafter referred to as "call up address"), in addition to the speaker-independent speech-recognition engine, the speaker-dependent additional speech-recognition engine is used to recognize keywords spoken in isolation. In a seventh input dialogue (hereinafter referred to as "store address"), a keyword spoken in isolation by the user is assigned a destination address entered by the user, so that during the input dialogue "call up address" a destination address associated with the corresponding recognized keyword is transferred to the navigation system.

The method according to the invention is based primarily on the fact that the entire admissible vocabulary for a speech-recognition device is not loaded into the speech-recognition device at the moment it is activated; rather, at least a required lexicon is generated from the entire possible vocabulary during real-time operation and is loaded into the speech-recognition device as a function of the input dialogue required for executing an operating function. There are more than 100,000 locations in the Federal Republic of Germany that can serve as vocabulary for the navigation system. If this vocabulary were to be loaded into the speech-recognition device, the recognition process would be extremely slow and prone to error. A lexicon generated from this vocabulary comprises only about 1500 words, so the recognition process is much faster and the recognition rate higher.
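The passage above is the heart of the method: rather than loading all place names at once, a small lexicon is generated from the destination file during operation and only that lexicon is handed to the recognizer. A minimal sketch of that selection step in Python, assuming a hypothetical entry record and recognizer interface (the patent specifies neither at code level):

```python
# Sketch of dynamic lexicon generation as described above: instead of
# loading all ~100,000 place names, build a small lexicon on demand and
# load only that. DestinationEntry and SpeechRecognizer are illustrative
# stand-ins, not the patent's API.
from dataclasses import dataclass

@dataclass
class DestinationEntry:
    name: str
    population: int
    phonetic: str            # phonetic description, cf. Table 1 below

class SpeechRecognizer:
    def load_lexicon(self, entries):
        # replace the active vocabulary with the generated lexicon
        self.active = {e.name: e.phonetic for e in entries}

def basic_lexicon(destination_file, p=1000):
    # the "p" largest cities (p = 1000 in the described embodiment)
    ranked = sorted(destination_file, key=lambda e: e.population, reverse=True)
    return ranked[:p]

# Usage: generate the lexicon in real time, then load it.
# recognizer = SpeechRecognizer()
# recognizer.load_lexicon(basic_lexicon(entries))
```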
At least one destination file, which contains all possible destination addresses and certain additional information for the possible destination addresses of a guidance system and is stored in at least one database, is used as the database for the method according to the invention. From this destination file, lexica are generated that comprise at least parts of the destination file, with at least one lexicon being generated in real time as a function of at least one activated input dialogue. It is especially advantageous for the destination file to contain, for each stored destination location, additional information, for example political affiliation or an additional naming component, postal code or postal code range, telephone area code, state, population, geographic code, phonetic description, or lexicon membership. This additional information can then be used to resolve ambiguities or to accelerate the search for the desired destination location.

Instead of the phonetic description, a transcription of the phonetic description in the form of a chain of indices, depending on the implementation of the transcription, can be used for the speech-recognition device. In addition, a so-called automatic phonetic transcription can be provided that performs a rule-based conversion of orthographically present names, using a table of exceptions, into a phonetic description; a sketch of such a conversion follows this paragraph. Entry of lexicon membership is only possible if the corresponding lexica are generated in an "off-line editing mode," separately from the actual operation of the navigation system, from the destination file, and have been stored in the (at least one) database, for example a CD-ROM or a remote database at a central location that can be accessed by corresponding communications devices such as a mobile radio network. Generation of the lexica in the "off-line editing mode" makes sense only if sufficient storage space is available in the (at least one) database, and is especially suitable for lexica that are required very frequently. In particular, a CD-ROM or an external database can be used as the database for the destination file, since in this way the destination file can always be kept up to date.
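The automatic phonetic transcription above is characterized only as a rule-based conversion with an exception table; the rules themselves are not given in the patent. A minimal sketch under that assumption, with illustrative rules and exception entries (neither table is from the patent):

```python
# Sketch of rule-based automatic phonetic transcription as described
# above: the exception table is consulted first, then simple
# orthography-to-phoneme rules are applied longest-match-first. The
# rules and exceptions below are illustrative placeholders.
EXCEPTIONS = {
    # orthographic name -> phonetic description
    "Flensburg": "|fl'Ens|bUrk|",   # cf. Table 1
}

RULES = [
    # (grapheme, phoneme), ordered longest-match-first
    ("sch", "S"),
    ("ch", "x"),
    ("ei", "aI"),
    ("e", "E"),
]

def transcribe(name: str) -> str:
    if name in EXCEPTIONS:                 # table of exceptions wins
        return EXCEPTIONS[name]
    out, i, low = [], 0, name.lower()
    while i < len(low):
        for grapheme, phoneme in RULES:    # rule-based conversion
            if low.startswith(grapheme, i):
                out.append(phoneme)
                i += len(grapheme)
                break
        else:
            out.append(low[i])             # default: letter passes through
            i += 1
    return "".join(out)
```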
At the moment, not all possible place names in the Federal Republic of Germany have been digitized and stored in a database. Similarly, a corresponding street list is not available for all locations. Therefore it is important to be able to update the database at any time. An internal nonvolatile storage area of the navigation system can also be used as the database for the (at least one) lexicon generated in the "off-line editing mode."

To facilitate more rapid speech entry of a desired destination address into the navigation system, a basic vocabulary is loaded following the initialization phase of the navigation system or, with sufficiently large nonvolatile internal storage, each time the database is changed; this vocabulary comprises at least one basic lexicon generated from the destination file. This basic lexicon can be generated in the "off-line editing mode." The basic lexicon can be stored in the database in addition to the destination file, or can be stored in a nonvolatile internal memory area of the navigation system. As an alternative, generation of the basic lexicon can wait until after the initialization phase. Dynamic generation of lexica during real-time operation of the navigation system, in other words during operation, offers two important advantages: firstly, it creates the possibility of putting together any desired lexica from the data stored in the (at least one) database; secondly, considerable storage space is saved in the (at least one) database, since not all of the lexica required for the various input dialogues need to be stored there prior to activation of the speech-recognition engine.

In the embodiment described below, the basic vocabulary comprises two lexica generated in the "off-line editing mode" and stored in the (at least one) database, and two lexica generated following the initialization phase. If the speech-recognition device has sufficient working memory, the basic vocabulary is loaded into it after the initialization phase, in addition to the admissible speech commands for the speech dialogue system, as described in the above-mentioned German patent application P 195 33 541.4-52. Following the initialization phase and pressing of the PTT (push-to-talk) button, the speech dialogue system then allows the input of various information to control the devices connected to the speech dialogue system, as well as to perform the basic functions of a navigation system and to enter a destination location and/or a street as the destination address for the navigation system. If the speech-recognition device has insufficient RAM, the basic vocabulary is not loaded into it until a suitable operating function that accesses the basic vocabulary has been activated.

The basic lexicon, stored in at least one database, comprises the "p" largest cities in the Federal Republic of Germany, with the parameter "p" in the design described being set at 1000. This directly covers approximately 53 million citizens of the FRG, or 65% of the population. The basic lexicon comprises all locations with more than 15,000 inhabitants. A regional lexicon, also stored in the database, includes "z" names of regions and areas such as Bodensee, Schwäbische Alb, etc., with the regional lexicon in the version described comprising about 100 names, for example. The regional lexicon is used to find known areas and conventional regional names; these names cover combinations of place names that can be generated and loaded as a new regional lexicon after the local or regional name is spoken. An area lexicon, generated only after initialization, comprises "a" dynamically loaded place names in the vicinity of the actual vehicle location, so that even smaller places in the immediate vicinity can be addressed directly, with the parameter "a" in the embodiment described being set at 400.
This area lexicon is constantly updated at certain intervals while driving, so that it is always possible to address locations in the immediate vicinity directly; a sketch of such an update follows below. The current vehicle location is reported to the navigation system by a positioning system known from the prior art, for example by means of a global positioning system (GPS). The previously described lexica are assigned to the speaker-independent speech-recognition engine. A name lexicon that is not generated from the destination file, and is assigned to the speaker-dependent speech-recognition engine, comprises approximately 150 keywords from the personal address list of the user, spoken by the user. Each keyword is then given a certain destination address from the destination file by the input dialogue "store address." These specific destination addresses are transferred to the navigation system by speech input of the associated keywords using the input dialogue "call up address." This results in a basic vocabulary of about 1650 words that are recognized by the speech-recognition device and can be entered as words spoken in isolation (place names, street names, keywords).
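The patent does not give the distance computation used to pick the "a" = 400 places around the vehicle. A plausible sketch, assuming each entry carries the geographic code (longitude, latitude) of Table 1 and using a great-circle distance (the haversine choice is my assumption; the text only says places "in the vicinity" are loaded):

```python
# Sketch of the periodic area-lexicon update described above: keep the
# "a" places nearest the current GPS position.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lon1, lat1, lon2, lat2):
    # great-circle distance between two (lon, lat) points in kilometers
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))      # Earth radius ~6371 km

def area_lexicon(entries, vehicle_lon, vehicle_lat, a=400):
    # entries: iterable of objects with .geo = (lon, lat), cf. Table 1
    return sorted(
        entries,
        key=lambda e: haversine_km(e.geo[0], e.geo[1], vehicle_lon, vehicle_lat),
    )[:a]

# Called at certain intervals while driving, with the position supplied
# by the GPS receiver, and the result loaded as the new area lexicon.
```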
Provision can also be made for transferring addresses from an external data source, for example a PDA (personal digital assistant) or a portable laptop computer, by means of data transfer to the speech dialogue system or to the navigation system, and for integrating them as an address lexicon in the basic vocabulary. Normally, no phonetic descriptions for the address data (name, destination location, street) are stored in the external data sources. Nevertheless, in order to be able to transfer these data into the vocabulary for a speech-recognition device, an automatic phonetic transcription of these address data, especially the names, must be performed. Assignment to the correct destination location is then performed using a table.

For the sample dialogues described below, a destination file that contains a data set according to Table 1 for each place found in the navigation system must be stored in the (at least one) database of the navigation system. Depending on the storage location and availability, parts of the information entered can also be missing; however, this only relates to data used to resolve ambiguities, for example the additional naming component, county, telephone area code, etc. If address data from an outside data source are used, the address data must be supplemented accordingly. The word subunits for the speech-recognition device, which operates as a hidden Markov model speech-recognition engine (HMM recognition engine), are especially important.
TABLE 1

Description of Entry                      Example
Place Name                                Flensburg
Political Affiliation or
  additional naming component             —
Postal Code or Postal Code Range          24900-24999
Telephone Area Code                       0461
County                                    Flensburg, county
State                                     Schleswig-Holstein
Population                                87,526
Geographic Code                           9.43677, 54.78204
Phonetic Description                      |fl'Ens|bUrk|
Word Subunits for HMM Speech-
  Recognition Device                      f[LN] l E[LN] n[C] s b[Vb] U[Vb] r k,
                                          or 101 79 124 117 12 39 35 82 68
Lexicon Membership                        3, 4, 78 ...
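Table 1 maps naturally onto a record type. A sketch of one destination-file data set as a Python dataclass, populated with the Flensburg example from the table (the field names are mine; the patent only lists the entry descriptions):

```python
# One destination-file data set as in Table 1. Field names are
# illustrative; the patent does not prescribe a schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DestinationDataSet:
    place_name: str
    naming_component: Optional[str]   # political affiliation / addition
    postal_code_range: str
    area_code: str
    county: str
    state: str
    population: int
    geo_code: tuple                   # (longitude, latitude)
    phonetic: str                     # phonetic description
    word_subunits: str                # HMM word subunits or index chain
    lexicon_membership: list = field(default_factory=list)

flensburg = DestinationDataSet(
    place_name="Flensburg",
    naming_component=None,
    postal_code_range="24900-24999",
    area_code="0461",
    county="Flensburg, county",
    state="Schleswig-Holstein",
    population=87_526,
    geo_code=(9.43677, 54.78204),
    phonetic="|fl'Ens|bUrk|",
    word_subunits="101 79 124 117 12 39 35 82 68",
    lexicon_membership=[3, 4, 78],
)
```

Fields such as the postal code range or county exist precisely so that later dialogues ("resolve ambiguity") can distinguish identically named places.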
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram providing an overview of the possible input dialogues for speech input of a destination address for a navigation system according to the invention;

FIG. 2 is a schematic representation of a flowchart of a first embodiment of the input dialogue "destination location input";

FIG. 3 is a schematic view of a flowchart of a second embodiment of the input dialogue "destination location input";

FIG. 4 is a schematic view of a flowchart for the input dialogue "choose from list";

FIG. 5 is a schematic view of a flowchart for the input dialogue "resolve ambiguity";

FIG. 6 is a schematic diagram of a flowchart for the input dialogue "spell destination location";

FIG. 7 is a schematic view of a flowchart for the input dialogue "coarse destination input";

FIG. 8 is a schematic view of a flowchart for the input dialogue "store address";

FIG. 9 is a schematic view of a flowchart for the input dialogue "street input"; and

FIG. 10 is a schematic view of a block diagram of a device for performing the method according to the invention.
DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an overview of the possible input dialogues for speech input of a destination address for a navigation system. A speech dialogue between a user and a speech dialogue system according to FIG. 1 begins, following the initialization phase, with a wait state 0, in which the speech dialogue system idles until the PTT (push-to-talk) button is actuated, and to which the speech dialogue system returns after the speech dialogue is terminated. The user activates the speech dialogue system by actuating the PTT button in step 100. The speech dialogue system replies in step 200 with an acoustic output, for example a signal tone or a speech output, indicating to the user that the speech dialogue system is ready to receive a speech command. In step 300, the speech dialogue system waits for an admissible speech command in order, by means of dialogue and process control, to control the various devices connected to the speech dialogue system or to launch a corresponding input dialogue. No details will be given at this point of the admissible speech commands that do not relate to the navigation system. The following speech commands relating to the various input dialogues of the navigation system can now be entered (a dispatch sketch follows the list):

"Destination location input" E1: This speech command activates the input dialogue "destination location input."

"Spell destination location" E2: This speech command activates the input dialogue "spell destination location."

"Coarse destination input" E3: This speech command activates the input dialogue "coarse destination input."

"Postal code" E4 or "telephone area code" E5: The input dialogue "indirect input" is activated by these two speech commands.

"Street input" E6: This speech command activates the input dialogue "street input."

"Store address" E7: This speech command activates the input dialogue "store address."
"Call up address" E8: This speech command activates the input dialogue "call up address."

Instead of the above, of course, other terms can be used to activate the various input dialogues. In addition to the above speech commands, general speech commands can also be used to control the navigation system, for example "navigation information," "start/stop navigation," etc.
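Taken together, wait state 0, PTT activation, and the speech commands E1-E8 describe a dispatch loop. A compact sketch of that control flow (the dialogue names map to the steps 1000-7000 cited below; the recognizer and callbacks are hypothetical placeholders, since the patent gives no API):

```python
# Sketch of the FIG. 1 dispatch: idle in wait state 0 until the PTT
# button is pressed, then map a recognized speech command (E1-E8) to
# its input dialogue. recognize() stands in for the speech recognition
# device, which the patent does not specify at code level.
DIALOGUES = {
    "destination location input": "step 1000",
    "spell destination location": "step 2000",
    "coarse destination input":   "step 3000",
    "postal code":                "step 4000 (indirect input)",
    "telephone area code":        "step 4000 (indirect input)",
    "street input":               "step 5000",
    "call up address":            "step 6000",
    "store address":              "step 7000",
}

def dialogue_loop(ptt_pressed, acknowledge, recognize, run):
    while True:
        if not ptt_pressed():            # wait state 0
            continue
        acknowledge()                    # step 200: signal tone / speech output
        command = recognize()            # step 300: admissible speech command
        if command in DIALOGUES:
            run(DIALOGUES[command])      # launch the corresponding dialogue
        # on completion, control falls back to wait state 0
```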
After starting an input dialogue by speaking the corresponding speech command, the corresponding lexica are loaded as the vocabulary into the speech recognition device. With a successfully performed speech input of the destination location as part of the destination address input, by means of one of the input dialogues "destination location input" in step 1000, "spell destination location" in step 2000, "coarse destination input" in step 3000, or "indirect input" in step 4000, a check is then made in step 350 whether or not a corresponding street list is available for the recognized destination location. If the check yields a negative result, a branch is made to step 450. If the check yields a positive result, a check is made in step 400 to determine whether or not the user wants to enter a street name. If the user responds to question 400 with "yes," the input dialogue "street input" is called up. If the user answers question 400 with "no," a branch is made to step 450. Question 400 is therefore implemented only if the street names for the corresponding destination location are included in the navigation system. In step 450, the recognized desired destination location is automatically updated by entering "center" or "downtown" as the street input, since only a complete destination address can be transferred to the navigation system, the destination address comprising, in addition to the destination location, a street or a special destination, for example the railroad station, airport, downtown, etc. In step 500, the destination address is passed to the navigation system. Then the speech dialogue is concluded and the speech dialogue system returns to wait state 0.
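Steps 350-500 just described form a small decision procedure. A sketch under the assumption that the street dialogue and prompts are supplied as callbacks (the function and parameter names are illustrative, not the patent's):

```python
# Sketch of steps 350-500: complete the destination address and hand it
# to the navigation system. ask_yes_no, street_input and the navigation
# interface are illustrative placeholders.
def complete_destination(location, street_lists, ask_yes_no, street_input, navigation):
    street = None
    if location in street_lists:                  # step 350: street list available?
        if ask_yes_no("Do you want to enter a street name?"):  # step 400
            street = street_input(location)       # input dialogue "street input"
    if street is None:
        street = "center"                         # step 450: default completion
    navigation.set_destination(location, street)  # step 500: pass full address
```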
If the speech command "street input" E6 was spoken by the user at the beginning of the speech dialogue in step 300 and recognized by the speech recognition device, the input dialogue "street input" is activated in step 5000. Then, following the successful input of the desired destination location and the street, the destination address is transferred to the navigation system in step 500. If the speech command "call up address" E8 was spoken by the user at the beginning of the speech dialogue in step 300 and recognized by the speech recognition device, the input dialogue "call up address" is activated in step 6000. In the input dialogue "call up address," a keyword is spoken by the user, and the address associated with the spoken keyword is transferred in step 500 as the destination address to the navigation system. If the speech command "store address" E7 was spoken by the user at the beginning of the speech dialogue in step 300 and recognized by the speech recognition device, the input dialogue "store address" is activated in step 7000. By means of the input dialogue "store address," a destination address that has been entered is stored in the personal address list under a keyword spoken by the user. Then the input dialogue "store address" is ended and the system returns to wait state 0.
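The "store address"/"call up address" pair amounts to a keyword-to-address mapping held in the personal address list (about 150 keywords, per the description above). A minimal sketch, with class and method names of my own choosing:

```python
# Sketch of the personal address list behind "store address" (step 7000)
# and "call up address" (step 6000): keywords spoken by the user name
# stored destination addresses. The API is illustrative.
class PersonalAddressList:
    def __init__(self, capacity=150):      # ~150 keywords per the description
        self.capacity = capacity
        self.entries = {}                   # keyword -> (location, street)

    def store_address(self, keyword, location, street):
        if len(self.entries) >= self.capacity and keyword not in self.entries:
            raise ValueError("address list full")
        self.entries[keyword] = (location, street)

    def call_up_address(self, keyword):
        # returns the destination address to hand to the navigation system
        return self.entries.get(keyword)
```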
FIG. 2 shows in schematic form a first embodiment of the input dialogue "destination location input." Following activation of the input dialogue "destination location input" in step 1000, by virtue of the speech command "destination location input" E1 spoken in step 300 by the user and recognized by the speech recognition device, the basic vocabulary is loaded into the speech recognition device in step 1010, as can be seen from FIG. 2. The loading of the basic vocabulary into the speech recognition device can in principle also be performed at another point, for example after the initialization phase or following the actuation of the PTT button; this depends on the speed of the loading process and on the type of speech recognition device used. Then in step 1020 the user is requested to enter a destination location. In step 1030 the user enters the desired destination location by speech input. This speech input is transferred in step 1040 as an acoustic value <destination location_1> to the speech recognition device and compared there with the basic vocabulary that was loaded; sampling values in the time or frequency domain, or feature vectors, can be transmitted to the speech recognition device as the acoustic value. The nature of the acoustic value thus transferred likewise depends on the type of speech recognition engine employed. As a result, the speech recognition engine supplies a first hypothesis list hypo.1 with place names sorted by recognition probability. If the hypothesis list hypo.1 contains homophonic place names, i.e. place names that are pronounced identically but written differently, for example Ahlen and Aalen, both place names receive the same recognition probability and both are taken into account in the continuation of the input dialogue.
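The homophone handling just described (identically pronounced names share one recognition score and both stay in play) can be sketched as follows; the scoring interface is a placeholder, since the patent does not expose the recognizer's internals:

```python
# Sketch of hypothesis-list construction with homophone handling, as in
# the passage above: names with identical phonetic descriptions (e.g.
# Ahlen / Aalen) receive the same recognition probability and are all
# kept. score() is a stand-in for the recognition engine.
def hypothesis_list(acoustic_value, lexicon, score):
    # lexicon: iterable of (place_name, phonetic) pairs
    by_phonetic = {}
    for name, phonetic in lexicon:
        by_phonetic.setdefault(phonetic, []).append(name)
    hypos = []
    for phonetic, names in by_phonetic.items():
        p = score(acoustic_value, phonetic)   # one score per pronunciation
        for name in names:                    # homophones share that score
            hypos.append((name, p))
    hypos.sort(key=lambda item: item[1], reverse=True)
    return hypos                              # hypo.1, sorted by probability
```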
Then in step 1050 the place name with the greatest recognition probability is output as speech output <hypo.1.1> to the user, with the question whether <hypo.1.1> corresponds to the desired input destination location <destination location_1>. (At this point it still makes no difference whether several entries are present at the first position of the hypothesis list, since those place names are pronounced identically.) If the answer to question 1050 is "yes," a jump is made to step 1150. If the user answers the question with "no," the acoustic value <destination location_1> of the destination location entered is stored in step 1060 for a possible later recognition process using another lexicon. Then the user is requested in step 1070 to pronounce the destination location again. In step 1080 the user enters the destination location once again by speech input. This speech input is transferred in step 1090 as the acoustic value <destination location_2> to the speech recognition device and compared there with the basic vocabulary that has been loaded. As a result, the speech recognition device offers a second hypothesis list hypo.2.
