throbber
I 1111111111111
`
`111 1111111111 1111111111 11111 11111 1111111111 111111111111111111
`
`US010971140B2
`
`(IO) Patent No.: US 10,971,140 B2
`
`c12) United States Patent
`
`(45)Date of Patent:
`Apr. 6, 2021
`Catchpole
`
`(54)SPEECH RECOGNITION CIRCUIT USING
`
`PARALLEL PROCESSORS
`
`(56)
`
`
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`
`
`
`
`
`
`(71)Applicant: Zentian Limited, Cambridge (GB)
`
`
`
`
`
`
`
`4,977,599 A * 12/1990 Bahl ....................... Gl0L 15/02
`
`704/256.4
`
`4,980,918 A * 12/1990 Bahl ....................... Gl0L 15/08
`704/240
`(Continued)
`
`
`
`
`
`(72)Inventor: Mark Catchpole, Prickwillow (GB)
`
`
`
`
`
`
`
`
`
`
`
`(73)Assignee: Zentian Limited, Cambridge (GB)
`
`the term ofthis ( *) Notice: Subject to any disclaimer,
`
`
`GB
`under 35 patent is extended or adjusted
`
`
`
`GB
`U.S.C. 154(b) by 0 days.
`
`(21)Appl. No.: 16/266,265
`
`
`
`(22) Filed: Feb. 4, 2019
`
`(65)
`
`
`
`Prior Publication Data
`
`
`
`US 2019/0172447 Al Jun. 6, 2019
`
`
`
`FOREIGN PATENT DOCUMENTS
`
`2 112 194 A 7 /1983
`
`2 331 392 A
`5/1999
`(Continued)
`
`OTHER PUBLICATIONS
`
`R.M. Woodward et al., "Tissue Classification Using Terahertz
`
`
`
`
`
`
`
`
`
`Pulsed Imaging", Proceedings of the SPIE-The International Soci­
`
`
`
`
`ety for Optical Engineering SPIE-INT. Soc. Opt. Eng. USA, vol.
`
`
`
`5318, No. 1, 2004, pp. 23-33. Abstract also attached.
`(Continued)
`
`Primary Examiner - Daniel Abebe
`
`
`Related U.S. Application Data
`
`
`
`(74)Attorney, Agent, or Firm - Finnegan Henderson
`
`Farabow Garrett & Dunner LLP
`
`
`
`(63)Continuation of application No. 15/392,396, filed on
`
`Dec. 28, 2016, now Pat. No. 10,217,460, which is a
`(57)
`(Continued)
`
`
`
`ABSTRACT
`
`A speech recognition circuit comprises an input buffer for
`
`
`
`
`
`
`
`
`receiving processed speech parameters. A lexical memory
`
`
`
`
`
`contains lexical data for word recognition. The lexical data
`
`
`
`
`comprises a plurality of lexical tree data structures. Each
`
`Feb. 4, 2002 (GB) ...................................... 0202546
`
`
`
`lexical tree data structure comprises a model of words
`
`
`
`
`having common prefix components. An initial component of
`
`
`
`
`each lexical tree structure is unique. A plurality of lexical
`
`
`
`
`tree processors are connected in parallel to the input buffer
`(2013.01)
`
`
`
`for processing the speech parameters in parallel to perform
`(2013.01)
`GlOL 151187
`
`
`
`
`
`parallel lexical tree processing for word recognition by
`(Continued)
`
`
`
`
`
`accessing the lexical data in the lexical memory. A results
`(52)U.S. Cl.
`
`
`
`
`memory is connected to the lexical tree processors for
`CPC ............ GlOL 151187 (2013.01); GlOL 15105
`
`
`
`
`
`
`
`
`storing processing results from the lexical tree processors
`
`(2013.01); GlOL 15134 (2013.01)
`
`
`
`
`and lexical tree identifiers to identify lexical trees to be
`
`
`
`
`processed by the lexical tree processors. A controller con­
`( 58)Field of Classification Search
`
`
`
`
`
`
`trols the lexical tree processors to process lexical trees
`
`
`CPC ................................ Gl0L 15/32; Gl0L 15/34
`
`
`
`See application file for complete search history.
`(Continued)
`
`
`
`
`
`(30) Foreign Application Priority Data
`
`
`
`(51)Int. Cl.
`GlOL 15132
`
`d-r+ao
`
`LEXICAL TREE 3
`ao-t+sii
`
`HARD
`
`ROCK
`
`LEXICAL TREE 4
`
`LOT
`
`LOCK
`
`..
`
`,.
`
`....
`
`-► ©
`
`r-d+r
`r-d+I
`
`

`

`US 10,971,140 B2
`
`Page 2
`
`identified in the results memory by performing parallel
`
`
`2003/0178584 Al 9/2003 Arnone et al.
`
`
`
`
`2005/0119883 Al* 6/2005 Miyazaki .............. GlOL 15/142
`
`
`
`processing on a plurality of said lexical tree data structures.
`704/231
`
`
`8 Claims, 6 Drawing Sheets
`
`
`
`
`
`2008/0255839 Al 10/2008 Larri et al.
`
`FOREIGN PATENT DOCUMENTS
`
`OTHER PUBLICATIONS
`
`(51)
`Int. Cl.
`GlOL 15134
`
`GlOL 15105
`
`(2013.01)
`(2013.01)
`
`(56)
`
`
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`7,120,582 Bl 10/2006 Young et al.
`
`
`
`7,174,037 B2 2/2007 Arnone et al.
`
`
`7,899,669 B2 3/2011 Gadbois
`
`
`2001/0011218 Al 8/2001 Phillips et al.
`
`
`2002/0143531 Al 10/2002 Kahn
`
`
`
`* cited by examiner
`
`
`
`
`
`Related U.S. Application Data
`
`9/2000
`2 347 835 A
`
`GB
`4/2003
`
`2 380 920 A
`GB
`8/2000
`WO 00/50859
`WO
`
`
`
`continuation of application No. 14/309,476, filed on
`8/2003
`
`WO-03/067572 A2
`WO
`
`
`Jun. 19, 2014, now Pat. No. 9,536,516, which is a
`
`
`
`continuation of application No. 13/253,223, filed on
`
`
`Oct. 5, 2011, now Pat. No. 8,768,696, which is a
`V.P. Wallace et al., "Biomedical Applications of THz Imaging",
`
`
`
`
`
`continuation of application No. 12/554,607, filed on
`
`
`
`Microwave Symposium Digest, 2004, IEEE MTT-S International
`
`Sep. 4, 2009, now Pat. No. 8,036,890, which is a
`
`
`Fort Worth, TX, USA, Jun. 6-11, 2004, Piscataway, NJ, USA, IEEE,
`
`
`
`continuation of application No. 10/503,463, filed as
`
`vol. 3, Jun. 6, 2004 (Jun. 6, 2004), pp. 1579-1581.
`
`
`application No. PCT/GB03/00459 on Feb. 4, 2003,
`
`
`
`V.P. Wallace et al., "Biomedical Applications of Terahertz Pulse
`now Pat. No. 7,587,319.
`
`
`Imaging", Second Joint EMBS-BMES Conference 2002. Confer­
`
`
`
`ence Proceedings. 24th Annual International Conference of the
`
`
`
`
`Engineering in Medicinal and Biology Society. Annual Fall Meeting
`
`
`
`
`
`of the Biomedical Engineering Society, Houston, TX, Oct. 23-26,
`
`
`
`2002, Annual vol. 1 of 3. Conf. 24, Oct. 23, 2002 (Oct. 23, 2002),
`pp. 2333-2334.
`A.J. Fitzgerald et al., "Terahertz Imaging of Breast Cancer, a
`
`
`
`
`
`
`
`Feasibility Study", Conference Digest of the 2004 Joint 29th
`
`
`
`
`International Conference on Infrared and Millimeter Waves and
`
`
`
`
`12th International Conference on Terahertz Electronics IEEE Piscataway,
`Klein et al.
`
`5,293,213 A 3/1994
`
`
`NJ, USA, 2004, pp. 823-824.
`
`5,349,645 A 9/1994 Zhao
`
`B.E. Cole et al., "Terahertz Imaging and Spectroscopy of Human
`
`
`
`
`5,457,768 A 10/1995 Tsuboi
`
`Skin, In-vivo", Proceedings of SPIE, vol. 4276, pp. 1-10.
`
`
`
`5,579,436 A * 11/1996 Chou .................... G lOL 15/063
`
`
`
`
`S.Glinski et al., "Spoken Language Recognition on a DSP Array
`704/236
`
`
`
`
`Processor", IEEE Transactions on Parallel and Distributed Systems,
`5,621,859 A 4/1997 Schwartz
`
`
`Jul. 5, 1994, No. 7, New York, USA, pp. 697-703.
`
`
`5,710,866 A * 1/1998 Alleva .................... Gl0L 15/10
`
`
`
`S.Chatterjee et al., "Connected Speech Recognition on a Multiple
`704/246
`
`
`
`
`Processor Pipeline", ICASSP 89, May 23, 1989, Glasgow, UK, pp.
`5,832,428 A 11/1998 Chow
`
`
`774-777.
`
`5,881,312 A 3/1999
`
`Dulong
`S. H. Chung et al. "A Parallel Phoneme Recognition Algorithm
`
`
`
`
`5,960,395 A 9/1999 Tzirkel-Hanock
`
`
`
`Based on Continuous Hidden Markov Model", Proceedings 13th
`
`
`5,983,180 A * 11/1999 Robinson .............. GlOL 15/142
`
`
`
`
`International Parallel Processing Symposium and 10th Symposium
`704/254
`
`
`
`on Parallel and Distributed Processing, IPPS/SPDP 1999, Proceed­
`
`5,995,930 A 11/1999 Rab-Umbach et al.
`
`
`
`6,047,283 A 4/2000 Braun
`
`
`
`
`ings of 13th International Parallel Processing Symposium and 10th
`
`
`6,374,222 Bl 4/2002 Kao
`
`
`
`
`
`Symposium on Parallel and Distributed Pro, IEEE Comput. Soc. pp.
`6,526,380 Bl 2/2003 Thelen
`
`453-457 (1999).
`
`
`
`7,024,359 B2 * 4/2006 Chang ................... G lOL 15/065
`S.H. Chung et al. "A Parallel Computation Model for Integrated
`
`
`704/251
`
`
`
`Speech and Natural Language Understanding", IEEE Transactions
`7,035,802 Bl 4/2006 Rigazio
`
`
`
`
`on Computers, vol. 42, No. 10, Oct. 1, 1993.
`
`
`7,065,488 B2 * 6/2006 Yajima .................... Gl0L 15/20
`
`
`N. Deshmukn et al. "Hierarchical Search for Large-Vocabul
`ary
`704/226
`
`
`
`
`
`Conversational Speech Recognition: Working Toward a Solution to
`
`
`
`
`
`the Decoding Problem", IEEE Signal Processing Magazine, vol. 16,
`
`
`No. 5, pp. 84-107 (Sep. 1999).
`
`

`

`�
`
`3
`
`ADC
`fs = 48 kHz
`20 bit
`
`4
`
`normalise
`
`9
`
`t
`
`� i .-10
`.-:,( '�,�;��;�
`
`Energy I
`
`..,
`
`Fig 1
`
`14
`
`�
`
`Vt::LlUlti� /
`
`rinn1 ,_
`
`e •
`00
`
`• � � ��
`�
`
`=
`
`> "e
`:-:
`'"Cl's
`N
`0
`N
`
`....
`
`rJJ
`
`=-('D
`('D
`.....
`
`....
`0
`
`....
`Cl's
`
`d
`
`r.,;_ "'""'
`-....l "'""''" "'""' �
`
`'"= \0
`
`=
`=
`N
`
`

`

`e •
`00 •
`
`�
`�
`�
`�
`
`�
`=
`
`I
`
`N bej
`sentem
`
`--
`
`t :-:
`
`'"Cl's
`N
`0
`N
`
`....
`
`rJJ
`
`=­('D
`('D .....
`N
`
`0
`
`....
`Cl's
`
`d
`r.,;_
`"'""'
`
`-..= \0
`-....l
`"'""''"
`"'""'
`�
`
`=
`=
`N
`
`.,--- 28
`
`I
`
`program & data bus
`memory -
`
`/ 27
`
`,,,- 2s
`
`Search
`Controller
`
`-
`
`path scores
`
`1ex1ca1 tree
`idtmtifiers
`
`,,...- 2s
`
`'l
`
`Lexical Tree
`Processor
`
`Control bus
`
`Language
`- Model
`Processor ,,
`To all Lexical
`Tree Processors
`
`-
`
`-
`-
`
`
`
`Results Memory
`
`/ 31
`
`Language
`
`Model
`Memory
`
`word
`constralnt
`K
`I
`r
`
`r
`
`feature
`vector
`
`c)
`
`buffer
`,_ ----,.._
`____... --
`I 21
`/21
`21 1•
`I
`I 1.
`tree lexical tree lexical tree lexical
`
`
`
`I
`I
`processor I
`processor processor
`I
`I 1
`I
`I
`I
`I
`
`I Acoustic Model Memory
`l
`Lexical Tree Processor Cluster 1
`
`
`I
`______________
`I
`
`model bus ,,,- 30
`I language
`bus ,,,- 25
`oath score and hlstorv
`
`
`feature vector bus
`
`,,
`
`I
`
`2
`
`k
`••
`
`,
`
`1
`
`2
`
`
`
`I
`I
`
`I
`I
`I
`______________ J
`
`
`
`Lexical Tree Processor Cluster N
`
`Fig 2
`
`/ 24
`,_ .
`--------_,..._
`I 21
`21
`I 1,,. /21
`r-
`, ' I
`tree lexical tree lextca!
`
`I
`I
`lexical
`tree
`processor processor
`processor I
`I
`k
`••
`I
`I
`I 22
`I
`I Search Engine
`I
`...
`/23
`/2.3
`I
`r
`�22
`I
`I
`I Acoustic Model_ Memory
`
`I
`I
`I
`J
`
`

`

`LEXICAL rREE 1 r-d+r
`
`YOU
`
`�--··
`
`-.e y-uw+k
`
`y-uw+h
`
`Fig 3a
`
`LEXICAL TREE 2
`
`_J
`HART I
`
`ao-t+silLEXICAL TREE 3
`ROT
`
`I
`I
`
`��--
`
`r-d+r
`r-d+I
`
`/I
`
`\ d:_ao
`
`\ l\. I -• ROCK
`
`I
`1-ao+t ao-t+s11LEXICAL TREE 4
`1-ao+k f+\__.
`I
`
`LOT
`
`Fig 3b
`
`I
`
`- �LOCK
`
`I
`
`e •
`00 •
`�
`
`�
`�
`
`�
`=
`
`t :-:
`N ....
`
`'"Cl's
`N
`0
`
`('D
`
`rJJ=-('D
`0 ....
`
`�
`
`Cl's
`
`d
`r.,;_
`"'""'
`
`'"= \0
`-....l
`"'""''"
`"'""'
`�
`
`=
`
`= N
`
`

`

`U.S. Patent Apr. 6, 2021 Sheet 4
`of 6
`
`
`
`US 10,971,140 B2
`
`$1 (INITIALISE)
`
`I
`
`N0�
`
`YES
`READ FEATURE VECTOR FROM
`
`S5
`
`BUFFER
`
`EVALUATE STATE TRANSITIONS FOR
`S6 FIRST LEXICAL TREE NODE USING
`ACOUSTIC MODEL DATA
`
`DETERMINE SCORE FOR FIRST
`LEXICAL TREE NODE
`
`S7
`
`REQUEST LANGUAGE MODELSCORE
`S8 USING PREVIOUS N-1 WORDS iN
`RECEIVED DATA AND LEXICAL TREE
`OATA IN ACOUSTIC MODa MEMORY
`
`RECEIVE LANGUAGE MODa SCORES
`S9 FOR WORDS IN THE LEXICAL TREE
`AND PICK HIGHEST SCORE
`
`GENERATE TEMPORARY LEXICAL TREE
`S10 SCORE USING THE SCORE FOR FIRST
`LEXICAL TREE NODE AND THE
`HIGHEST LANGUAGE Mooa SCORE
`
`S11 SEND SCORE TO RESULTS MEMORY
`'-----1
`Af!, TEMPORARY LEXICAL TREE SCORE
`
`Fig 4
`
`

`

`U.S. Patent
`Apr. 6, 2021 Sheet 5 of 6
`
`
`
`US 10,971,140 B2
`
`S20 STNlT
`
`S23
`
`No�
`
`YES
`
`S24
`
`S25
`
`NO
`
`S26
`
`EVALUATE-STATE TRANSITIONS FOR
`
`EACH PAlli USIHG ACOUSTIC.MODEL
`DA.TA
`DETERMINE SCORES FOR PATHS, SEND
`
`BEST SCORE TO RESULTS MEMORY AND
`STORE PATH HISTORIES LOCALLY
`PRUNING APPLIED TO LEXICAL TREE
`TO DELETE PATHS
`
`S27
`
`s
`
`C
`
`S29
`
`S30
`
`APPLY LANGUAGE MODEL SCORE
`SEND SCORE AND HISTORYTO
`RESULTS MEMORY
`AND
`DELETE PATH(S)
`HISTORY DATA
`
`S31
`
`Fig 5
`
`S33 MESSAGE SENT TO SEARCH COtv'TROUER TO
`
`INDICATE THAT LEXICAL TREE HAS BEEN
`PROCESSED
`
`NO
`
`

`

`U.S. Patent Apr. 6, 2021
`Sheet 6 of 6
`
`
`US 10,971,140 B2
`
`S40 INITIALISATION
`
`READ INITIAL LEXICAL TREE DATA IN
`RESULTS MEMORY
`S41
`
`INITIAL LEXICAL TREE DATA
`DIS1RIBUTED AMONGST LEXICAL TREE
`S42 PROCESSORS FOR TEMPORARY
`LEXICAL TREE SCORE DETERMINATION
`
`TEMPORARY LEXICAL TREE SCORES
`S43 RETURNED TO RESULTS MEMORY
`
`Fig 6
`
`PRUNE THE LEXICAL TREES IN THE
`S44 RESULTS MEMORY ON BASIS OF
`TEMPORARY LEXICAL TREE �RES
`
`LEXICAL TREE PROCESSING
`S45 DISTRIBUTED AMONGST LEXICAL TREE�--------,
`
`PROCESSORS
`
`NO
`
`YES
`
`DETERMINE NEXT POSSIBLE LEXICAL
`
`S47 TREES USING CROSS WORD
`TRIPHONES
`
`LEXICAL TREE DATA DISTRIBUTED
`AMONGST LEXICAL TREE
`PRQCESSORSFORTEMPORARY
`S48
`LEXICAL TREE SCORE DETERMINATION
`
`S49 TEMPORARY LEXICAL TREE SCORES
`RETURNED TO RESULTS MEMORY
`
`PRUNE THE LEXICAL TREES IN THE
`
`S50 RESULTS MEMORY ON BASIS OF
`TEMPORARY LEXICAL SCORES
`
`YES
`
`t
`S521 RESULTS
`OUTPUT
`
`

`

`US 10,971,140 B2
`
`PARALLEL PROCESSORS
`
`1
`
`SPEECH RECOGNITION CIRCUIT USING
`
`
`2
`data structures representing a plurality of lexical trees. Each
`
`
`
`
`
`
`lexical tree data structure comprises a model of words
`
`
`
`having common prefix components and an initial component
`This is a continuation of application Ser. No. 15/392,396,
`
`
`
`
`
`
`which is unique as an initial component for lexical trees. A
`
`filed Dec. 28, 2016, which is a continuation of application
`
`
`
`
`5 plurality of lexical tree processors are connected in parallel
`
`
`Ser. No. 14/309,476, filed Jun. 19, 2014, now U.S. Pat. No.
`
`
`to the input port and perform parallel lexical tree processing
`
`
`
`9,536,516, which is a continuation of application Ser. No.
`
`
`
`for word recognition by accessing the lexical data in the
`
`
`13/253,223, now U.S. Pat. No. 8,768,696, filed Oct. 5, 2011,
`
`
`
`lexical memory arrangement. A results memory arrangement
`
`
`which is a continuation of application Ser. No. 12/554,607,
`
`
`
`
`is connected to the lexical tree processors for storing pro-
`
`now U.S. Pat. No. 8,036,890, filed Sep. 4, 2009, which is
`
`
`
`
`10 cessing results from the lexical tree processors and lexical
`
`
`
`continuation of application Ser. No. 10/503,463, now U.S.
`
`
`
`
`tree identifiers to identify lexical trees to be processed by the
`
`
`Pat. No. 7,587,319, filed May 24, 2005, which is a 371 of
`
`
`
`
`lexical tree processors. A controller controls the lexical tree
`
`
`
`International Application No. PCT/GB2003/000459, filed
`
`
`
`
`processors to process lexical trees identified in the results
`
`
`
`Feb. 4, 2003, which claims foreign priority to United King­
`
`
`
`
`memory arrangement by performing parallel processing of a
`
`
`dom Application No. 0202546.8 filed Feb. 4, 2002, the
`
`15 plurality of lexical tree data structures.
`
`
`
`
`disclosures of which are incorporated herein by reference in
`
`
`
`Thus in accordance with this embodiment of the present
`their entireties.
`
`
`
`invention, the processing in order to perform word recog­
`
`
`
`
`The present invention generally relates to a speech rec­
`
`
`
`
`
`nition is distributed across the processors by controlling the
`
`
`
`
`ognition circuit which uses parallel processors for process­
`
`
`
`
`processors to perform processing on different lexical trees.
`
`
`ing the input speech data in parallel.
`
`
`
`
`20 The controller controls the processor by the processes to
`
`
`Conventional large vocabulary speech recognition can be
`
`
`
`provide for efficient process management by distributing
`
`
`
`divided into two processes: front end processing to generate
`
`
`
`lexical processing to appropriate processors.
`
`
`
`
`processed speech parameters such as feature vectors, fol­
`
`
`
`The lexical tree data structure can comprise a phone
`
`
`lowed by a search process which attempts to find the most
`
`
`
`model of words, wherein the components comprise phones.
`
`likely set of words spoken from a given vocabulary (lexi­
`
`
`
`
`25 For reduced storage, the lexical tree data structure can
`con).
`
`
`comprise a mono phone lexical tree. The mono phone lexical
`
`
`
`The front end processing generally represents no problem
`
`
`
`tree can be used to generate context dependent phone
`
`
`
`
`for current processing systems. However, for large vocabu­
`
`
`
`models dynamically. This enables the use of context depen­
`
`
`
`lary, speaker independent speech recognition, it is the search
`
`
`dent phone models for matching and hence increased accu-
`
`
`
`
`
`process that presents the biggest challenge. An article by
`
`
`30 racy whilst not increasing memory requirements. Alterna­
`
`
`Deshmukh et al entitled "Hierarchical Search for Large­
`
`
`
`
`tively, the lexical tree data structure can comprise context
`
`
`Vocabulary Conversational Speech Recognition" (IEEE Sig­
`
`dependent phone models.
`
`
`
`nal Processing Magazine, September 1999, pages 84 to
`The processing performed by each processor in one
`
`
`
`
`107), the content of which is hereby incorporated by refer­
`
`
`
`embodiment comprises the comparison of the speech param-
`
`
`
`ence, discusses the general concepts of large vocabulary
`
`35 eters with the lexical data, e.g. phone models or data derived
`
`
`
`speech recognition. As discussed in this paper, one algorithm
`
`
`
`from the lexical data ( e.g. dynamically generated context
`
`
`
`
`for performing the search is the Viterbi algorithm. The
`
`
`
`dependent phone models) to identify words as a word
`
`
`
`
`
`
`Viterbi algorithm is a parallel or breadth first search through
`
`
`
`recognition event and to send information identifying the
`
`
`
`
`a transition network of states of Hidden Markov Models. An
`
`
`
`identified words to the results memory as the processing
`
`
`
`acoustic model for words in a lexicon are represented as
`
`
`
`40 results. In this embodiment a language model processor
`
`
`
`states of Hidden Markov Models. These states represent
`
`
`
`arrangement can be provided for providing a language
`phones or n phones in a phone model of the words. The
`
`
`
`
`model output for modifying the processing results at a word
`
`
`
`
`search requires the evaluation of possible word matches. It
`
`
`
`recognition event by a lexical tree processor. The modifica­
`
`
`is known that such a search is computationally intensive.
`
`
`
`tion can either take place at each lexical tree processor, or at
`
`
`In order to speed up the processing,
`
`
`
`45 the performed during language model processing arrangement.
`
`
`
`
`
`such a search in a speech recognition system, parallel In one embodiment each lexical tree processor determines
`
`
`
`
`processing has been explored. In an article by M K Rav­
`
`
`an output score for words in the processing results at word
`
`
`
`
`
`
`
`
`ishankar entitled "Parallel Implementation of Fast Beam recognition events. Thus in this embodiment the language
`
`
`
`
`
`Search for Speaker-Independent Continuous Speech Recog­model processing arrangement can modify the score using a
`
`
`
`
`
`
`
`nition" (Indian Institute of Science, Bangalor, India, Jul. 16, 50 score for a language model for n preceding words, where n
`
`
`
`
`1993) a multi-threaded implementation of a fast beam search is an integer.
`algorithm is disclosed. The multi-threading implementation In one embodiment the controller instructs a lexical tree
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`a lexical tree tree by passing to process a lexical and syn­processor amount of communication requires a sign ificant
`
`
`
`
`chronization among threads. In an MSC project report by R.
`
`
`
`identifier for the lexical tree and history data for a recogni­
`
`
`
`
`
`
`
`Dujari entitled "Parallel Viterbi Search Algorithm for 55 tion path associated with the lexical tree from the results
`
`
`
`
`
`
`
`Speech Recognition" (MIT, February 1992) the parallel memory. The history data preferably includes an accumu­
`
`
`
`
`
`
`processing of input speech parameters is disclosed in which lated score for the recognition path. This enables a score to
`
`
`
`a lexical network is split statically
`
`
`be determined among processors. based on the score for the recogn ition path to
`
`
`
`
`
`accumulate to provide a new score during recognition carried out using
`
`
`It is an object of the present invention an
`
`
`improved circuit
`
`
`
`the lexical which can perform parallel processing of 60 tree data structure. The scores can be output in the
`
`
`
`
`
`processing results to the results memory during the process­
`speech parameters.
`
`
`In accordance with a first embodiment of the
`
`ing of the present speech parameters so that the scores can be used
`
`
`
`invention, a speech recognition circuit comprises for pruning. an input
`
`
`
`
`port such as input buffer for receiving parameterized speech In one embodiment of the present invention, each lexical
`
`
`
`
`
`
`
`
`
`
`
`
`
`data such as feature vectors. A lexical memory arrangement 65 tree processor operates on more than one lexical tree at the
`
`
`
`
`
`
`is provided which contains lexicon data for word recogni­same time, e.g. two lexical trees represented by two different
`
`
`
`lexical tree data structures, or two lexical trees represented
`
`
`
`
`tion. The lexical data comprises a plurality of lexical tree
`
`

`

`3
`
`US 10,971,140 B2
`
`
`4
`
`by the same data structure but displaced in time (which can
`
`
`benefits from a limited segmentation of the lexical data. By
`
`
`
`
`
`be termed to instances of the same lexical tree).
`
`
`
`providing a plurality of processors in a group with a com­
`
`
`
`
`At word recognition events, the controller determines new
`
`
`
`mon memory, flexibility in the processing is provided with­
`lexical tree identifiers for storing in the results memory for
`
`
`
`
`
`
`
`
`
`out being bandwidth limited by the interface to the memory
`
`5 that would occur if only a single memory were used for all
`
`words identified in the results memory for respective word
`
`
`
`
`
`events. In order to reduce the processing, the controller can
`
`
`
`
`processors. The arrangement is more flexible than the par­
`
`
`
`prune the new lexical tree identifiers to reduce the number
`
`
`allel processing arrangement in which each processor only
`
`
`
`of lexical trees which are required to be processed. This
`
`has access to its own local memory and requires fewer
`
`
`
`
`pruning can be achieved using context dependant n phones
`
`
`
`
`memory interfaces (i.e. chip pins). Each processor within a
`10
`
`to reduce the number of possible next phones. The number
`
`group can access the same lexical data as any other proces­
`
`
`
`can be further reduced by using a language model look
`
`
`
`sor in the group. The controller can thus control the parallel
`ahead technique.
`
`
`
`processing of input speech parameters in a more flexible
`
`
`
`In one embodiment of the present invention, the lexical
`
`
`
`manner. For example, it allows more than one processor to
`tree processors are arranged in groups or clusters. The
`
`
`
`
`datausing the same lexical input speech parameters lexical memory arrangement comprises a plurality of partial 15 process
`
`
`
`
`
`
`
`in a lexical memory. This is because the lexical data is
`
`
`
`
`lexical memories. Each partial lexical memory is connected
`
`
`segmented into domains which are accessible by multiple
`
`
`
`to one of the groups of lexical tree processors and contains
`processors.
`part of the lexical data. Thus a group of lexical tree proces­
`
`
`In a preferred embodiment this aspect of the present
`
`
`sors and a partial lexical memory form a cluster. Each lexical
`
`
`
`
`
`invention is used in combination with the first aspect of the
`
`
`
`tree processor is operative to process the speech parameters 20
`
`
`
`present invention. In such an arrangement each processor
`
`
`
`using a partial lexical memory and the controller controls
`
`
`
`each lexical tree processor to process a lexical tree corre­
`
`
`
`performs lexical tree processing and the lexical data stored
`
`
`sponding to partial lexical data in a corresponding partial
`
`
`
`in each lexical memory comprises lexical tree data structures
`lexical memory.
`
`
`which each comprise a model of words having common
`
`
`
`
`
`In another embodiment of the present invention
`
`
`
`
`prefix the lexi- 25 components and an initial component that is unique.
`cal memory arrangement comprises
`
`
`In preferred a plurality of partial embodiments of the second aspect of the
`
`
`
`
`
`
`present invention, the preferred embodiments of the first
`
`
`
`lexical memories. Each partial lexical memory being con­
`
`
`processors
`
`aspect of the present nected to one of the lexical tree and containing invention are incorporated.
`
`part of the lexical data. Each lexical tree
`
`
`
`Embodiments processor processes of the present invention will now be
`
`
`
`the speech parameters using a corresponding
`
`
`
`described partial lexical 30 with reference to the accompanying drawings in
`
`
`
`
`
`memory and the controller is operative which: to control each
`
`
`circuit for of a speech data processing lexical tree processor to process a lexical tree corresponding FIG. 1 is a diagram
`
`
`
`
`
`
`
`
`
`
`to partial lexical data in a corresponding partial
`
`
`
`generating lexical parameterized speech data (feature vectors);
`
`
`
`circuit in FIG. 2 is a diagram of a speech recognition
`memory.
`
`
`
`In one embodiment of the present
`
`35 accordance invention the lexical with an embodiment of the present invention;
`
`memory arrangement stores the lexical
`
`FIGS. 3a and 3b are schematic diagrams tree data structures illustrating
`
`
`
`
`as Hidden Markov Models and the lexical lexical tree processors tree structures;
`
`
`process per­diagram illustrating the are operative to perform the Viterbi search algorithm using FIG. 4 is a flow
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`each respective lexical tree data structure. Thus in this way, formed by a lexical tree processor to determine a temporary
`
`
`
`
`
`
`
`
`this embodiment of the present invention provides a parallel 40 lexical tree score in accordance with an embodiment of the
`
`present invention;
`
`
`
`Viterbi lexical tree search process for speech recognition.
`process per­FIG. 5 is a flow diagram illustrating the
`
`
`
`
`
`
`The first aspect of the present invention is a special
`
`
`
`formed by the lexical tree processor for processing the input
`
`
`purpose circuit built for performing the speech recognition
`
`
`
`feature vectors in accordance with an embodiment of the
`
`
`search process in which there are a plurality of processors
`
`
`present invention; and
`
`
`
`
`
`for performing parallel lexical tree processing on individual 45
`process per­FIG. 6 is a flow diagram illustrating the
`
`
`
`
`
`lexical tree processors.
`
`
`formed by the controller in accordance with an embodiment
`
`
`
`In another aspect of the present invention a speech
`
`of the present invention.
`
`
`
`recognition circuit comprises an input port such as an input
`
`
`FIG. 1 illustrates a typical circuit for the parameterization
`
`
`buffer for receiving parameterized speech data such as
`
`of input speech data. In this embodiment the parameters
`
`
`
`
`
`feature vectors. A plurality oflexical memories are provided 50
`
`generated are speech vectors.
`
`
`
`
`which contain in combination complete lexical data for word
`form and A microphone speech in an analogue 1 records
`
`
`
`
`
`
`recognition. Each lexical memory contains part of the com­
`
`
`this is input through an anti-aliasing filter 2 to an analogue­
`
`
`
`plete lexical data. A plurality of processors are provided
`
`
`
`to-digital converter 3 which samples the speech at 48 kHz at
`
`
`
`
`
`connected in parallel to the input port for processing the
`
`
`20 bits per sample. The digitized output signal is normalized
`
`
`
`
`speech parameters in parallel. The processors are arranged in 55
`
`
`(4)to generated a 10 millisecond data frame every 5
`
`
`groups in which each group is connected to a corresponding
`
`milliseconds with 5 milliseconds overlap (5). A pre-empha­
`
`
`
`
`
`lexical memory to form a cluster. A controller controls each
`
`sis operation to the data followed by a hamming6 is applied
`
`
`
`
`processor to process the speech parameters using partial
`
`
`(FFT)window 7. The data is then fast Fourier transformed
`
`
`
`lexical data read from a corresponding lexical memory. The
`
`using a 512 point fast Fourier transform (8) before being
`
`
`
`
`results of processing the speech parameters are output from 60
`
`
`
`energy infiltered by filter bank 9 into 12 frequencies. The
`
`
`the processors as recognition data.
`
`featurethe data frame 5 is also recorded (13) as an additional
`
`
`
`
`
`Thus this aspect of the present invention provides a circuit
`
`
`
`
`filter bankoutputs of the and together with the 12 frequency
`
`
`in which speech recognition processing is performed in
`
`
`
`
`produced and these are9, 13 feature vectors (10) are thus
`
`
`
`
`parallel by groups of processors operating in parallel in
`
`
`
`vectors output as part of the 39 feature 14. First and second
`
`which each group accesses a common memory of lexical 65
`
`
`
`
`feature vectors (11 and derivatives 12) are taken of the 13 10
`
`
`
`data. This aspect of the present invention provides the
`
`
`
`
`vectors to complete the generation of the 39 feature 14.
`
`
`
`advantage of parallel processing of speech parameters and
`
`

`

`5
`
`US 10,971,140 B2
`
`
`6
`
`The arrangement illustrated in FIG. 1 is purely given for
`
`
`tree processor can modify the score in accordance with the
`
`
`
`
`
`
`
`illustration. The present invention encompasses any means
`
`
`
`language model and output a score to the results memory
`25
`
`
`
`
`
`for a word at the by which speech and data can be parameterized to a suitable end of a branch of the lexical tree
`
`
`form for input to the search process
`
`
`
`processing. as will be described in Thus the results memory stores the results as an
`
`
`
`
`
`
`5 ordered list of scores for words together with their histories.
`
`more detail hereinafter.
`
`
`FIG. 2 is a schematic diagram of a speech recognition
`
`
`
`following data: The results memory 25 stores the
`
`
`
`circuit in accordance with an embodiment of the present
`
`
`
`
`1.Initial lexical tree data. This comprises pointers to an
`
`
`
`
`invention for performing the search process. The parameter­
`
`
`
`
`initial set of lexical trees. No history data is associated with
`
`
`ized speech data, which in this embodiment comprise fea­
`
`
`
`
`
`the initial set of lexical trees. The initial set of lexical trees
`
`
`ture vectors, are input to a feature vector buffer 20. The 10
`
`
`
`is predetermined and stored in the results memory 25 based
`
`
`to buffer the incoming feature vector buffer 20 is provided
`
`
`
`on the most likely initial phones of as utterance. This initial
`
`
`
`
`feature vectors to allow lexical tree processors 21 to read and
`
`
`
`lexical tree data is required to initialize the search process.
`
`
`
`process the feature vectors in the buffer 20 via a feature
`
`
`
`2.History data for search results. This comprises a record
`
`
`
`
`k of lexical tree processors vector bus 24. A plurality 21 are
`
`
`
`of a recognition path through the lexical tree recognition
`
`
`
`
`arranged in a respective lexical tree processor cluster 22. 15
`
`
`
`
`process performed by the lexical tree processors 21. The
`
`
`
`
`model Each lexical tree processor cluster 22 has an acoustic
`
`
`
`
`
`history data includes the current word, the previous N-1
`
`memory 23 in which is stored lexical data for use by the
`
`
`
`words, the current accumulated score, the phone history (for
`
`
`tree processor lexical tree processors 21 within the lexical
`
`
`
`
`
`use in the determination of likely next lexical trees using
`
`
`
`
`tree lexical processor cluster 22. Each lexical tree 21 in the
`
`
`
`
`cross word context dependent tri-phones), and an identifier
`
`
`
`
`
`20 to the acoustic model processor cluster 22 is connected
`
`
`
`or pointer to the lexical tree used for identifying the word.
`
`
`
`tree processor memory 23 within the lexical 22. There are N
`
`3. Best scores for best paths being processed by each
`
`
`
`lexical tree processor clusters and thus there are Nk lexical
`
`
`
`enables the lexical tree processor 21. This information
`
`
`
`
`tree processors by the feature vector 21 connected bus 24 to
`
`
`
`being per-the processing search controller 27 to monitor
`
`
`
`
`tree processor the feature vector buffer 20. Each lexical 21
`
`
`
`
`whether a formed by lexical tree processors 21 to determine
`
`
`
`
`is capable of processing a different lexical tree and thus Nk 25
`
`
`
`global pruning strategy should be applied in order to reas­
`
`
`
`
`lexical trees can be processed in parallel. The acoustic model
`
`
`
`
`sign processing performed by a lexical tree processor if its
`
`
`
`
`set of lexical data, whole a complete memories 23 store as a
`
`
`
`
`best score for its best path is below a threshold or well below
`
`
`
`i.e. lexical tree data structures for use in the lexical tree
`
`
`
`the best scores for the paths being processed by other lexical
`
`
`
`processing by the lexical tree processors 21. Each acoustic
`tree processors
`21.
`
`
`
`30of the lexical model memory 23 contains part or a segment
`4.Temporary lexical tree scores. These comprise tree
`
`
`
`
`
`treetree data. Since lexical tree processors 21 in a lexical
`
`
`
`scores which are determined as temporary scores to prune
`
`
`modelprocessor cluster 22 access the same acoustic
`
`
`
`the next lexical trees to be processed at word ends. The
`
`
`
`
`one lexical treememory 23, it is possible for more than
`
`
`
`
`temporary lexical tree scores include lexical tree identifiers
`
`
`
`
`data. This providesthe same lexical processor 21 to process
`
`
`
`or pointers to identify the next lexical trees to be processed.
`
`
`
`for some degree of flexibility in the controlling of the 35
`
`
`The scores enable the pruning of this list.
`
`
`
`
`theprocessing by the lexical tree processors 21. Further,
`5.Pruning threshold. This can be a global threshold value
`
`
`
`
`
`
`
`only one copyacoustic model memories 23 need not contain
`for use in the pruning of the lexical trees globally, or a local
`
`
`
`of the lexical data. It is-possible to build in a redundancy in
`
`
`
`threshold value for use by a lexical processor for locally
`
`
`the data to further enhance the flexibility. This avoids any
`
`
`
`
`pruning the processing performed by the lexical processor
`
`
`
`
`bottleneck in the processing due to the search processing 40
`21.
`
`
`focusing on a small number of lexical trees.
`Hidden Markov The acoustic model memory 23 stores a
`
`
`
`
`
`processing for storing A results memory 25 is provided
`
`
`Model for acoustically modelling w

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket