(12) United States Patent
Young et al.

(10) Patent No.: US 7,120,582 B1
(45) Date of Patent: Oct. 10, 2006

(54) EXPANDING AN EFFECTIVE VOCABULARY OF A SPEECH RECOGNITION SYSTEM
`
(75) Inventors: Jonathan H. Young, Newton, MA (US); Haakon L. Chevalier, Cambridge, MA (US); Laurence S. Gillick, Newton, MA (US); Toffee A. Albina, Cambridge, MA (US); Marlboro B. Moore, III, Jamaica Plain, MA (US); Paul E. Rensing, W. Newton, MA (US); Jonathan P. Yamron, Sudbury, MA (US)
(73) Assignee: Dragon Systems, Inc., Newtonville, MA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 09/390,370
(22) Filed: Sep. 7, 1999
(51) Int. Cl.
G10L 15/00 (2006.01)
G10L 15/06 (2006.01)
(52) U.S. Cl.: 704/255; 704/243
(58) Field of Classification Search: 704/243-245, 704/255-257, 4, 9-10
See application file for complete search history.

(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
4,181,821 A 1/1980 Pirz et al.
4,227,176 A 10/1980 Moshier
4,481,593 A 11/1984 Bahler
4,489,435 A 12/1984 Moshier
4,783,803 A 11/1988 Baker et al.
4,805,218 A 2/1989 Bamberg et al.
4,805,219 A 2/1989 Baker et al.
4,829,576 A 5/1989 Porter
4,837,831 A 6/1989 Gillick et al.
`
`(Continued)
`FOREIGN PATENT DOCUMENTS
`
DE 195 10 083 A 9/1996
`
`(Continued)
`OTHER PUBLICATIONS
`
SYSTRAN® Personal for Windows 95 or NT (Version 1.0.2); http://www.systransoft.com/personal.html, pp. 1-2, May 6, 1998.
`(Continued)
Primary Examiner: Angela Armstrong
(74) Attorney, Agent, or Firm: Fish & Richardson P.C.
(57) ABSTRACT

The invention provides techniques for creating and using fragmented word models to increase the effective size of an active vocabulary of a speech recognition system. The active vocabulary represents all words and word fragments that the speech recognition system is able to recognize. Each word may be represented by a combination of acoustic models. As such, the active vocabulary represents the combinations of acoustic models that the speech recognition system may compare to a user's speech to identify acoustic models that best match the user's speech. The effective size of the active vocabulary may be increased by dividing words into constituent components or fragments (for example, prefixes, suffixes, separators, infixes, and roots) and including each component as a separate entry in the active vocabulary. Thus, for example, a list of words and their plural forms (for example, "book, books, cook, cooks, hook, hooks, look and looks") may be represented in the active vocabulary using the words (for example, "book, cook, hook and look") and an entry representing the suffix that makes the words plural (for example, "+s", where the "+" preceding the "s" indicates that "+s" is a suffix). For a large list of words, and ignoring the entry associated with the suffix, this technique may reduce the number of vocabulary entries needed to represent the list of words considerably.
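
By way of illustration only, the reduction in vocabulary entries described above may be sketched as follows. The function name `compress_plurals` and the rule that a plural is formed by appending a bare "s" are assumptions made for this sketch; they are not taken from the specification, which contemplates a broader set of fragments and spelling rules.

```python
def compress_plurals(words):
    """Return vocabulary entries after folding simple plurals into a shared "+s" suffix.

    Hypothetical helper: a word is treated as a plural only when it ends in "s"
    and the remaining letters are themselves in the word list.
    """
    word_set = set(words)
    entries = set()
    uses_suffix = False
    for word in word_set:
        if word.endswith("s") and word[:-1] in word_set:
            # "books" is covered by the entries "book" and "+s".
            uses_suffix = True
        else:
            entries.add(word)
    if uses_suffix:
        entries.add("+s")  # a single suffix entry shared by all plural forms
    return sorted(entries)


words = ["book", "books", "cook", "cooks", "hook", "hooks", "look", "looks"]
vocab = compress_plurals(words)
print(len(words), "words are covered by", len(vocab), "entries:", vocab)
```

For the eight-word list of the example, the sketch yields four word entries plus the single "+s" entry, so that the saving approaches one half as the list grows.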
`
`43 Claims, 31 Drawing Sheets
`
[Representative drawing: FIG. 17, flowchart of procedure 1605 with steps 1700 through 1750 (postulate fragments, split words, identify and keep best fragments, keep fragments up to threshold, split backup dictionary words, create stop list). The step labels are listed with Sheet 16 of the drawings.]
`
`IPR2023-00037
`Apple EX1015 Page 1
`
`

`

`US 7,120,582 B1
`Page 2
`
`U.S. PATENT DOCUMENTS
`
4,903,305 A 2/1990 Gillick et al.
5,027,406 A 6/1991 Roberts et al.
5,202,952 A 4/1993 Gillick et al.
5,233,681 A 8/1993 Bahl et al.
5,267,345 A 11/1993 Brown et al.
5,428,707 A 6/1995 Gould et al.
5,526,463 A 6/1996 Gillick et al.
5,680,511 A 10/1997 Baker et al.
5,754,972 A 5/1998 Baker et al.
5,765,132 A * 6/1998 Roberts, 704/254
5,797,122 A 8/1998 Spies
5,835,888 A * 11/1998 Kanevsky et al., 704/9
6,092,044 A * 7/2000 Baker et al., 704/254
6,212,498 B1 * 4/2001 Sherwood et al., 704/244
`FOREIGN PATENT DOCUMENTS
`
`
`
EP 0 982 712 A2 3/2000
EP 0 992 979 A2 4/2000
`
`OTHER PUBLICATIONS
SYSTRAN® PROfessional for Windows (Version 2.0); http://www.systransoft.com/pro.html, pp. 1-4, Nov. 13, 1997.
SYSTRAN® Classic for Windows (Version 1.6.2); http://www.systransoft.com/clas.html, pp. 1-3, Nov. 12, 1997.
SYSTRAN® PROfessional Client/Server (Version 1.6.2); http://www.systransoft.com/cliser.html, pp. 1-3, Nov. 12, 1997.
SYSTRAN's MT Architecture, http://www.systransoft.com/howworks.html, pp. 1-3, Nov. 19, 1997.
Langenscheidt's T1, "The translator for translators", http://www.gmsmuc.de/english/tl.html, pp. 1-2, 1997.
Langenscheidt's T1 Plus, http://www.gmsmuc.de/english/tlplus.html, p. 1, 1997.
Langenscheidt's T1, "Professional - Setting New Standards in Machine Translation", http://www.gmsmuc.de/english/tlprofi.html, pp. 1-2, 1997.
Langenscheidt's T1 Translation Memory, http://www.gmsmuc.de/english/memory.html, p. 1, 1997.
Langenscheidt's T1 Hotline Support, http://www.gmsmuc.de/english/hotline.html, pp. 1-2, 1997.
GLOBALINK® Power Translator 6.0, http://www.globalink.com/pages/product-pwtrans6.html, p. 1.
Comprende - Real Time Internet Translation Services, http://www.globalink.com/pages/product-comprende.html, p. 1.
Intranet Translator RealTime Intranet Translation Services, http://www.globalink.com/pages/product-intranet-translator.html, p. 1.
Web Translator, http://www.globalink.com/pages/product-web-translator.html, pp. 1-2.
Talk to Me, http://www.globalink.com/pages/product-talktome.html, p. 1.
Language Assistance Series, http://www.globalink.com/pages/product-language-assistant.html, p. 1.
Subject Dictionaries, http://www.globalink.com/pages/product-subject-dictionaries.html, p. 1.
E-mail Translator Plug-In for Eudora, http://www.globalink.com/pages/product-plugins.html, p. 1.
Frisch et al., "Spelling Assistance for Compound Words", IBM Journal of Research & Development, vol. 32, No. 2, Mar. 1, 1988, pp. 197-198.
Marcus Spies, "A Language Model for Compound Words in Speech Recognition", European Conference on Speech Communication and Technology, Sep. 1995, pp. 1767-1770.
Bandara et al., "Handling German Compound Words in an Isolated-Word Speech Recognizer", IEEE Workshop on Speech Recognition, Harriman, NY, Dec. 15-18, 1992, pp. 1-3.
Steeneken et al., "Multi-Lingual Assessment of Independent Large Vocabulary Speech-Recognition Systems: The Sqale-Project", European Conference on Speech Communication and Technology, Sep. 1995, pp. 1271-1274.
Dugast et al., "The Philips Large-Vocabulary Recognition System for American English, French and German", European Conference on Speech Communication and Technology, Sep. 1995, pp. 197-200.
Pye et al., "Large Vocabulary Multilingual Speech Recognition Using HTK", European Conference on Speech Communication and Technology, Sep. 1995, pp. 181-184.
Lamel et al., "Issues in Large Vocabulary, Multilingual Speech Recognition", European Conference on Speech Communication and Technology, Sep. 1995, pp. 185-188.
Barnett et al., "Comparative Performance in Large-Vocabulary Isolated-Word Recognition in Five European Languages", European Conference on Speech Communication and Technology, Sep. 1995, pp. 189-192.
Geutner, P., "Using Morphology Towards Better Large-Vocabulary Speech Recognition Systems", Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 445-448, May 9, 1995; XP 000658026.
Hwang, "Vocabulary Optimization Based on Perplexity", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1419-1422, Apr. 21, 1997; XP000822723.
Berton et al., "Compound Words in Large-Vocabulary German Speech Recognition Systems", Proceedings of the International Conference on Spoken Language Processing, vol. 2, pp. 1165-1168, Oct. 3, 1996; XP002142831.
Wothke, K., "Morphologically based automatic phonetic transcription", IBM Systems Journal, vol. 32(3), pp. 486-511, 1993.
* cited by examiner
`
`
`

`

[Sheet 1 of 31. FIG. 1 (Prior Art): block diagram of system 100. Labels: Display, Computer, Memory, Operating System, Speech Recognition Software. Reference numerals shown: 100, 115, 120, 125, 150.]
`
`

`

[Sheet 2 of 31. Drawing text rotated and garbled in extraction. Labels that appear to be recoverable: Active Vocabulary, Acoustic Models, Constraint Grammars, Backup Dictionary, Recognition Candidates, Control/Interface Module, Requests, Pre-filtering Procedure, Recognizer, Front-End Processing Module, Digital Samples, Frames, Parameters.]
`
`
`
`
`
`
`
`
`

`

[Sheet 3 of 31. FIG. 3 (Prior Art): flowchart of procedure 300. Steps: 305 Produce X(f); 310 Determine log (X(f))^2; 315 Frequency Warping; 320 Filter Bank Analysis; 325 Cepstral Analysis; 330 Channel Normalization; 335 Produce Cepstral Differences; 340 Produce Cepstral Second Differences; 345 IMELDA.]
`
`

`

[Sheet 4 of 31. Drawing not recoverable from extraction; the only legible label is "select" (405).]
`
`

`

[Sheet 5 of 31. FIG. 5 (Prior Art): drawing content not recovered.]
`
`

`

[Sheet 6 of 31. FIG. 6 (Prior Art): diagram 600 with word labels Hear; Heal, Heel; Hum; Hug; Heals, Heels; Healing; Humming. Reference numerals shown: 505, 512, 610, 615, 620. Node labels are garbled.]
`
`

`

[Sheet 7 of 31. FIG. 7 (Prior Art): flowchart of procedure 700. Labels in order of appearance: Get Next Frame; Find an Active Node with no Active, Unprocessed Subnodes; At Highest Node?; 730 Go to Next Node with no Unprocessed Subnodes; 735 More Words Possible?; Return List. Other reference numerals shown: 710, 715, 725, 740.]
`
`

`

[Sheet 8 of 31. FIGS. 8A, 8B and 8C (Prior Art): drawings not recoverable from extraction. Legible reference numerals include 505, 512, 800, 815, 825, 830, 845, 860 and 865.]
`
`

`

[Sheet 9 of 31. Drawing rotated and not recoverable from extraction.]
`
`

`

[Sheet 10 of 31. FIG. 11 (Prior Art): flowchart of procedure 1100. Labels in order of appearance: Update Scores/Times; Score(s) compared with Threshold; Deactivate States/Node; Word to be Added?; Add Words to List; 1130 Save Score for Reseeding. Other reference numerals shown: 1105, 1110, 1115, 1125.]
`
`

`

[Sheet 11 of 31. FIG. 12 (Prior Art): flowchart of procedure 1200. Labels in order of appearance: Initialize Lexical Tree; Retrieve Frame; Hypothesis to Consider?; Go to First Hypothesis; Compare Frame to Hypothesis; Update Score; Delete Hypothesis; Set Word Ending Flag; Additional Hypothesis?; Next Hypothesis; Request Pre-filtering List; Create/Expand Hypotheses; Return Recognition Candidates. Reference numerals shown: 1205, 1210, 1215, 1235, 1245, 1255, 1260, 1265, 1270, 1275, 1280, 1285.]
`
`

`

[Sheet 12 of 31. Drawing rotated and not recoverable from extraction.]
`
`

`

[Sheet 13 of 31. Charts rotated and garbled in extraction. Labels that appear to be recoverable: "French-Before" and "French-After" complete dictation vocabulary; axis "Words" with ticks from 0 to 250,000; legend entries Active Words, Word Fragments, Word Roots, True Roots, Prefixes, Suffixes, Splittable, Non-splittable, Backup Dictionary.]
`
`
`
`
`

`

[Sheet 14 of 31. Charts rotated and garbled in extraction. Labels that appear to be recoverable: "U.S. English-Before" and "U.S. English-After" complete dictation vocabulary; axis "Words" with ticks from 0 to 250,000; legend entries Active Words, Word Fragments, Word Roots, True Roots, Prefixes, Suffixes, Splittable, Non-splittable, Backup Dictionary.]
`
`
`
`
`
`
`

`

[Sheet 15 of 31. FIG. 16B: flowchart. Steps: Generate New Active Vocabulary and Backup Dictionary; Perform Speech Recognition on Utterance; Perform Post-Recognition Processing on Utterance.]
`
`

`

[Sheet 16 of 31. FIG. 17: flowchart of procedure 1605. Steps: 1700 Postulate Fragments; 1705 Split Words; 1707 Identify and Keep Best Fragments; 1710 Postulate Fragments for Unsplit Words; 1715 Split Words; 1720 Keep Fragments up to Threshold; 1725 and 1730 (a decision whose label is garbled, and Make Threshold More Difficult to Satisfy); 1735 Excluding Short Roots, Split Backup Dictionary Words; 1740 Using Short Roots, Split Unsplit Backup Dictionary Words; 1745 Create Stop List; 1750 Split Backup Dictionary Words.]
`
`

`

[Sheet 17 of 31. FIG. 18: flowchart. Steps: 1800 Postulate Affixes; 1805 Find Word-Roots and True-Roots; 1810 Postulate Affixes with Active Vocabulary Supplemented by Roots. FIG. 19: flowchart. Steps: Find Useful Spelling Rules; Group Spelling Rules to Form Affixes; 1910 Keep Most Useful Spelling Rules for Each Affix; Keep Most Useful Affixes.]
`
`

`

[Sheet 18 of 31. FIG. 20: flowchart of procedure 1900. Labels in order of appearance: 2000 Read Backup Dictionary into Data Structure; 2005 Index Words in Active Vocabulary Using j and Set j=0; Retrieve Active Word j; 2015 For Active Word j, Search for Similar Words in Backup Dictionary; 2025 Store All Spelling Rules; More Active Words?; Increment j; Output Spelling Rules; End. Other reference numerals shown: 2010, 2030, 2040.]
`
`

`

[Sheet 19 of 31. Flowchart (figure caption not recovered). Labels in order of appearance: 2100 Find Possible Roots Based on Junction with Affixes to Form Backup Dictionary Words; 2105 Retrieve Root; Root Close to Existing Word?; Call Root Word Root; Call Root True Root; More Roots?; Retrieve Next Root; Keep Most Useful New Roots; End. Other reference numerals shown: 2110, 2115, 2125, 2135.]
`
`

`

[Sheet 20 of 31. Flowchart (figure caption not recovered). Steps: 2200 Read Backup Dictionary and Affixes into Data Structure; 2205 Retrieve Word i in Backup Dictionary; 2210 Retrieve Affix j; 2215 Root ij = Word i - Affix j; 2220 Store Root ij; Retrieve Next Affix j; 2240 Retrieve Next Word.]
`
`

`

[Sheet 21 of 31. FIG. 23: flowchart. Steps: Find Word-Roots and True-Roots; 2305 Postulate Affixes for Unsplit Words for All Words in Active Vocabulary + Latest Set of Roots. FIG. 24: flowchart. Steps: Do Splitting; 2410 Count Uses of Each Fragment; 2415 Keep Most Useful Fragments.]
`
`

`

[Sheet 22 of 31. FIG. 25: flowchart. Labels in order of appearance: 2500 Determine Possible Splits; 2505 Load All Words and Word Fragments into Data Structure; 2510 Set Word Index k=0; 2515 For Word k, Find Partial Pron Matches; Split Cover Entire Word?; Split Valid?; 2535 Increment k; Matching Complete?; Output Splits.]
`
`

`

[Sheet 23 of 31. Bar chart not recoverable from extraction.]
`
`

`

[Sheet 24 of 31. FIG. 27: flowchart. Labels in order of appearance: Begin; 2705 and 2710 (labels partly garbled: "Limit N..." and "...to Build Active Vocabulary"); 2715 Split Words in Active Vocabulary; 2720 Build New Active Vocabulary; 2725 Not Using Short Roots, Split Words in Backup Vocabulary; 2730 Using Short Roots, Split Words in Backup Dictionary; 2735 Make List of Unused Fragments; End.]
`
`

`

[Sheet 25 of 31. FIG. 28A: Generate Language Model Scores for Words and Word Fragments in Active Vocabulary. FIG. 28B: flowchart. Steps: Retrieve Training Collection of Text; 2830 Build N-gram Language Model; 2835 For Each N-gram Sequence, Replace Splittable Backup Words with Corresponding Words and Word Fragments; 2840 Generate N-gram Language Model for Words and Word Fragments.]
`
`

`

`
`
[Sheet 26 of 31. Drawing rotated and garbled in extraction. It appears to show a unigram language model derived from a collection of text and from a modified collection of text in which words such as "quickly" and "urgently" are split into a word followed by the suffix "+ly". Counts are not recoverable.]
`
`
`
`

`

[Sheet 27 of 31. Drawing rotated and garbled in extraction. It appears to show a bigram language model derived from a collection of text and from a modified collection of text, using the same example words ("quickly", "urgently", "+ly"). Counts are not recoverable.]
`
`
`
`
`
`
`

`

[Sheet 28 of 31. Drawing content not recovered.]
`
`

`

[Sheet 29 of 31. FIG. 28F: drawing content not recovered; reference numeral 2880 is shown.]
`
`

`

[Sheet 30 of 31. FIG. 29: flowchart of procedure 1660. Labels in order of appearance: Receive Recognition Candidates; 2900 Select Next Candidate; Includes Fragments?; 2910 Add Candidate to Revised List; 2915 Process Word Fragments; New Candidate(s)?; 2925 Rescore Candidate(s); Add Rescored Candidate(s) to Revised List; More Candidates?.]
`
`

`

[Sheet 31 of 31. FIG. 30: flowchart of procedure 2915. Labels in order of appearance: 3000 Retrieve Next Sequence of Fragments; 3010 Get First Spelling Rule Set; 3015 Form Prospective Word; Invalid Sequence?; Backup Dictionary?; Generate Candidate Using Prospective Word; Active Vocabulary?; Process Word Fragments; More Fragments in Generated Candidate?; Spelling Rules?; 3040 Get Next Spelling Rule Set; 3045 More Sequences?; End.]
`
`

`

EXPANDING AN EFFECTIVE VOCABULARY OF A SPEECH RECOGNITION SYSTEM

BACKGROUND

The invention relates to expanding an effective vocabulary of a speech recognition system.

A speech recognition system analyzes a user's speech to determine what the user said. Most speech recognition systems are frame-based. In a frame-based system, a processor divides a signal descriptive of the speech to be recognized into a series of digital frames, each of which corresponds to a small time increment of the speech.

A speech recognition system may be a "discrete" system that recognizes discrete words or phrases but which requires the user to pause briefly between each discrete word or phrase. Alternatively, a speech recognition system may be a "continuous" system that can recognize spoken words or phrases regardless of whether the user pauses between them. Continuous speech recognition systems typically have a higher incidence of recognition errors in comparison to discrete recognition systems due to complexities of recognizing continuous speech. A detailed description of continuous speech recognition is provided in U.S. Pat. No. 5,202,952, entitled "LARGE-VOCABULARY CONTINUOUS SPEECH PREFILTERING AND PROCESSING SYSTEM," which is incorporated by reference.

In general, the processor of a continuous speech recognition system analyzes "utterances" of speech. An utterance includes a variable number of frames and corresponds, for example, to a period of speech followed by a pause of at least a predetermined duration.

The processor determines what the user said by finding acoustic models that best match the digital frames of an utterance, and by identifying text that corresponds to those acoustic models. An acoustic model may correspond to a word, phrase or command from a vocabulary. An acoustic model also may represent a sound, or phoneme, that corresponds to a portion of a word. Collectively, the constituent phonemes for a word represent the phonetic spelling of the word. Acoustic models also may represent silence and various types of environmental noise. In general, the processor may identify text that corresponds to the best-matching acoustic models by reference to phonetic word models in an active vocabulary of words and phrases.

The words or phrases corresponding to the best matching acoustic models are referred to as recognition candidates. The processor may produce a single recognition candidate for an utterance, or may produce a list of recognition candidates.

SUMMARY

The invention provides techniques for creating and using fragmented word models to increase the effective size of an active vocabulary of a speech recognition system. The active vocabulary represents all words and word fragments that the speech recognition system is able to recognize. Each word may be represented by a combination of acoustic models. As such, the active vocabulary represents the combinations of acoustic models that the speech recognition system may compare to a user's speech to identify acoustic models that best match the user's speech.

Memory and processing speed requirements tend to increase with the number of entries in the active vocabulary. As such, the size of the active vocabulary that may be processed in an allotted time by a particular processor using an available amount of memory is limited. Since the recognizer does not recognize words that are not included in the active vocabulary, the ability of the recognizer to recognize less-frequently-used words may be improved by increasing the size of the active vocabulary.

The effective size of the active vocabulary may be increased by dividing words into constituent components or fragments (for example, prefixes, suffixes, separators, infixes, and roots) and including each component as a separate entry in the active vocabulary. Thus, for example, a list of words and their plural forms (for example, "book, books, cook, cooks, hook, hooks, look and looks") may be represented in the active vocabulary using the words (for example, "book, cook, hook and look") and an entry representing the suffix that makes the words plural (for example, "+s", where the "+" preceding the "s" indicates that "+s" is a suffix). For a large list of words, and ignoring the entry associated with the suffix, this technique may reduce the number of vocabulary entries needed to represent the list of words considerably.

The invention provides a method of expanding an effective active vocabulary of a speech recognition system that uses a speech recognizer. The speech recognizer performs speech recognition on a user utterance to produce one or more recognition candidates. Speech recognition includes comparing digital values representative of the user utterance to a set of acoustic models representative of an active vocabulary of the system. The set of acoustic models includes models of words and models of word fragments.

The method further includes receiving the recognition candidates from the speech recognizer. When a received recognition candidate includes a word fragment, the method includes determining whether the word fragment may be combined with one or more adjacent word fragments or words to form a proposed word included in a backup dictionary of the speech recognition system.

Furthermore, if the word fragment may be combined with one or more adjacent word fragments or words to form a proposed word included in a backup dictionary of the speech recognition system, the method includes modifying the recognition candidate to substitute the proposed word for the word fragment and the one or more adjacent word fragments or words used to form the proposed word.

Moreover, if the word fragment may not be combined with one or more adjacent word fragments or words to form a proposed word included in a backup dictionary of the speech recognition system, the method includes discarding the recognition candidate.

Embodiments may include one or more of the following features. For example, the expanded effective vocabulary may include words from the backup dictionary that are formed from a combination of words and word fragments or word fragments and word fragments from an active vocabulary that includes words and word fragments, and words from the active vocabulary.

The word fragments may include suffixes, prefixes, and roots that are not words. Additionally, one or more spelling rules may be associated with each prefix and each suffix. Determining whether the word fragment may be combined with one or more adjacent word fragments or words to form a proposed word may include using a prefix or suffix as the particular word fragment and using an associated spelling rule in forming the proposed word. As a result of using the associated spelling rule, a spelling of the proposed word may differ from a spelling that would result from merely concatenating the particular word fragment with the one or more adjacent word fragments or words.
`
`
`

`

Determining whether the word fragment may be combined with one or more adjacent word fragments or words may include retrieving from the received recognition candidate a sequence that includes the particular word fragment and adjacent word fragments or words. Determining may further include determining if the sequence is a valid sequence.

A valid sequence may include only one or more allowed adjacent combinations of word fragments and words. Moreover, allowed adjacent combinations may include one or more prefixes, followed by a root or a word, followed by one or more suffixes. Other allowed adjacent combinations may include a root or a word followed by one or more suffixes, and one or more prefixes followed by a root or a word.
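
The allowed adjacent combinations described above may, purely as an illustrative sketch, be checked with a regular expression over the kinds of the entries in a sequence. The kind names ("prefix", "root", "word", "suffix"), the single-letter codes, and the function `is_valid_sequence` are assumptions introduced for this example and do not appear in the specification.

```python
import re

# Hypothetical one-letter codes for the kind of each entry in a sequence.
KIND_CODE = {"prefix": "P", "root": "R", "word": "W", "suffix": "S"}

# Allowed combinations, per the text above:
#   one or more prefixes, a root or word, then zero or more suffixes; or
#   a root or word followed by one or more suffixes.
VALID_SEQUENCE = re.compile(r"P+[RW]S*|[RW]S+")


def is_valid_sequence(kinds):
    """Return True if the list of entry kinds is an allowed adjacent combination."""
    code = "".join(KIND_CODE[kind] for kind in kinds)
    return VALID_SEQUENCE.fullmatch(code) is not None


print(is_valid_sequence(["prefix", "root", "suffix"]))  # prefix(es) + root + suffix(es)
print(is_valid_sequence(["suffix", "word"]))            # a suffix cannot come first
```

Encoding each entry as a single letter keeps the three allowed combinations visible in one pattern; a sequence consisting of a lone word or root is rejected by this sketch because it contains no fragment to combine.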
The method may further include combining the particular word fragment with the one or more adjacent word fragments or words to form a second proposed word that differs from the first proposed word by using a second associated spelling rule in forming the proposed word.

One or more spelling rules may be associated with a particular word fragment. Combining the particular word fragment with one or more adjacent word fragments or words to form a proposed word may include using an associated spelling rule in forming the proposed word. As a result of using the associated spelling rule, a spelling of the proposed word may differ from a spelling that would result from merely concatenating the particular word fragment with the one or more adjacent word fragments or words.
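
One way to picture spelling rules associated with a suffix is as pairs of "letters to remove from the end of the stem" and "letters to append". The following is a minimal sketch under that assumption; the rule table `SUFFIX_RULES`, its contents, and the function `propose_spellings` are hypothetical and are not the rule format of the specification.

```python
# Hypothetical rule table: each suffix maps to (remove_from_stem, append) pairs.
# The first rule of each suffix is plain concatenation.
SUFFIX_RULES = {
    "+s": [("", "s"), ("y", "ies")],
    "+ing": [("", "ing"), ("e", "ing")],
}


def propose_spellings(stem, suffix, rules=SUFFIX_RULES):
    """Return one proposed spelling per applicable spelling rule, in rule order."""
    proposals = []
    for remove, append in rules[suffix]:
        if stem.endswith(remove):
            base = stem[: len(stem) - len(remove)]  # drop the letters the rule removes
            proposals.append(base + append)
    return proposals


print(propose_spellings("try", "+s"))     # plain concatenation and the "y" -> "ies" rule
print(propose_spellings("make", "+ing"))  # plain concatenation and the drop-"e" rule
```

Each applicable rule yields its own proposed word, so that a second proposed word may differ from the first; in this sketch nothing decides which proposal is correct, a decision the text above leaves to the backup dictionary search.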
Determining whether the word fragment may be combined with one or more adjacent word fragments or words to form a proposed word included in a backup dictionary of the speech recognition system may include searching the backup dictionary for the proposed word.
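
The search of the backup dictionary, together with the substitution or discarding of a recognition candidate, may be sketched as follows. The sketch assumes a notation in which a suffix begins with "+" and a prefix ends with "+", represents the backup dictionary as a Python set, and forms proposed words by plain concatenation (spelling rules are omitted for brevity). The function `recombine` and these conventions are illustrative assumptions only.

```python
def is_prefix(token):
    return token.endswith("+")      # assumed notation, e.g. "re+"


def is_suffix(token):
    return token.startswith("+")    # assumed notation, e.g. "+s"


def recombine(candidate, backup_dictionary):
    """Replace fragment sequences with backup-dictionary words.

    Returns the modified candidate, or None if any fragment cannot be
    combined into a word found in the backup dictionary (candidate discarded).
    """
    result = []
    i, n = 0, len(candidate)
    while i < n:
        start = i
        while i < n and is_prefix(candidate[i]):
            i += 1                              # leading prefixes
        if i == n or is_suffix(candidate[i]):
            return None                         # no root or word to attach to
        i += 1                                  # the root or word itself
        while i < n and is_suffix(candidate[i]):
            i += 1                              # trailing suffixes
        group = candidate[start:i]
        if len(group) == 1:
            result.append(group[0])             # ordinary word, kept unchanged
            continue
        proposed = "".join(token.strip("+") for token in group)
        if proposed not in backup_dictionary:
            return None                         # proposed word not in backup dictionary
        result.append(proposed)
    return result


backup = {"books", "reopened"}
print(recombine(["the", "book", "+s", "are", "re+", "open", "+ed"], backup))
print(recombine(["cook", "+ed"], backup))
```

In the first call both fragment sequences form words found in the backup dictionary and are substituted; in the second call "cooked" is absent, so the candidate is discarded.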
Modifying the recognition candidate may include forming a prospective recognition candidate, and if the prospective recognition candidate includes an additional word fragment, forming a final recognition candidate. The prospective recognition candidate may be formed by modifying the recognition candidate to substitute the proposed word for the word fragment and the one or more adjacent word fragments or words used to form the proposed word. Moreover, the prospective recognition candidate may be further processed to generate an additional word using the additional word fragment and one or more adjacent words or word fragments. The final recognition candidate may be formed by replacing the additional word fragment and the one or more adjacent words with the additional word.

A score may be associated with the received recognition candidate, such that the method further includes producing a score associated with the modified recognition candidate by rescoring the modified recognition candidate.

The score associated with the received recognition candidate may include an acoustic component and a language model component. Rescoring the modified recognition candidate may therefore include generating a language model score for the modified recognition candidate.

Producing the score associated with the modified recognition candidate may include combining the acoustic component of the score for the received recognition candidate with the language model score generated for the modified recognition candidate.
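
As a hedged illustration of this form of rescoring, the sketch below keeps the acoustic component of the received candidate's score and adds a freshly generated language model score for the modified candidate. It assumes that scores are negative log probabilities (lower is better), that the two components combine by addition, and that the language model is a toy unigram model with a floor probability for unseen words; none of these details, nor the function names, are stated in the text above.

```python
import math


def language_model_score(words, unigram_probs, floor=1e-6):
    """Negative log probability of the word sequence under a toy unigram model."""
    return sum(-math.log(unigram_probs.get(word, floor)) for word in words)


def rescore(acoustic_component, modified_words, unigram_probs):
    """Combine the retained acoustic component with a new language model score."""
    return acoustic_component + language_model_score(modified_words, unigram_probs)


# Hypothetical probabilities for whole words of the modified candidate.
unigram_probs = {"the": 0.5, "books": 0.25}
new_score = rescore(10.0, ["the", "books"], unigram_probs)
print(round(new_score, 4))
```

Reusing the acoustic component avoids a second acoustic comparison, while the language model score reflects the recombined words rather than the fragments from which they were formed.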
Rescoring the modified recognition candidate may include generating an acoustic model score for the modified recognition candidate. Furthermore, producing the score associated with the modified recognition candidate may include combining the acoustic model score generated for the modified recognition candidate with the language model score generated for the modified recognition candidate.

The score associated with the received recognition candidate may include an acoustic component and a language model component. Rescoring the modified recognition candidate may therefore include generating an acoustic score for the modified recognition candidate.

Producing the score associated with the modified recognition candidate may include combining the language model component of the score for the received recognition candidate with the
