throbber
I 1111111111111111 11111 1111111111 1111111111 11111 1111111111111111 IIII IIII IIII
`
`US007587319B2
`
`(10)Patent No.:
`c12) United States Patent
`US 7,587,319 B2
`
`(45)Date of Patent:
`Catchpole
`Sep.8,2009
`
`(54)SPEECH RECOGNITION CIRCUIT USING
`FOREIGN PATENT DOCUMENTS
`PARALLEL PROCESSORS
`
`GB
`
`2 112 194 A
`
`7 /1983
`
`
`
`(75)Inventor: Mark Catchpole, Cambridgeshire (GB)
`
`
`
`
`
`
`
`
`
`(73)Assignee: Zentian Limited, Cambridge (GB)
`
`
`
`
`
`( *) Notice: Subject to any disclaimer, the term ofthis
`
`
`
`
`
`patent is extended or adjusted under 35
`
`
`U.S.C. 154(b) by 823 days.
`
`(Continued)
`
`OTHER PUBLICATIONS
`
`S.Glinski et al., "Spoken Language Recognition on a DSP Array
`
`
`
`
`
`
`
`Processor", IEEE Transactions on Parallel and Distributed Systems,
`
`Jul. 5, 1994, No. 7, New York, USA, pp. 697-703.
`
`(Continued)
`
`Primary Examiner-Daniel D Abebe
`
`
`
`
`(74) Attorney, Agent, or Firm-Dickstein Shapiro LLP
`
`(57)
`
`ABSTRACT
`
`(21)Appl. No.: 10/503,463
`
`(22)PCT Filed:Feb.4, 2003
`
`(86)PCT No.: PCT /GB03/00459
`
`§371 (c)(l),
`
`
`
`(2), ( 4) Date:May 24, 2005
`
`
`
`(87) PCT Pub. No.: WO03/067572
`
`
`
`PCT Pub. Date: Aug. 14, 2003
`
`(65)
`
`Prior Publication Data
`
`
`
`
`
`US 2005/0251390Al Nov. 10, 2005
`
`(30)
`
`
`
`A speech recognition circuit comprises an input buffer for
`
`
`
`
`
`
`
`
`receiving processed speech parameters. A lexical memory
`
`
`
`
`contains lexical data for word recognition. The lexical data
`
`
`
`
`comprises a plurality of lexical tree data structures. Each
`
`
`
`lexical tree data structure comprises a model of words having
`
`
`
`common prefix components. An initial component of each
`
`
`
`
`
`lexical tree structure is unique. A plurality of lexical tree
`
`Feb. 4, 2002 (GB) ................................. 0202546.8
`
`
`
`
`processors are connected in parallel to the input buffer for
`(51)Int. Cl.
`
`
`
`
`processing the speech parameters in parallel to perform par­
`GJ0L 15100 (2006.01)
`
`
`
`
`allel lexical tree processing for word recognition by accessing
`
`
`
`
`(52)U.S. Cl. ....................... 704/235; 704/242; 704/243;
`
`
`
`
`the lexical data in the lexical memory. A results memory is
`704/258
`
`
`
`
`connected to the lexical tree processors for storing processing
`
`
`(58)Field of Classification Search ................. 704/235,
`
`
`
`
`results from the lexical tree processors and lexical tree iden­
`704/242,243,256,258
`
`
`
`
`tifiers to identify lexical trees to be processed by the lexical
`
`See application file for complete search history.
`
`
`
`
`tree processors. A controller controls the lexical tree proces­
`
`References Cited
`
`
`
`
`
`sors to process lexical trees identified in the results memory
`
`
`
`
`by performing parallel processing on a plurality of said lexi­
`cal tree data structures.
`
`
`
`
`
`Foreign Application Priority Data
`
`
`
`(56)
`
`U.S. PATENT DOCUMENTS
`
`Klein et al.
`
`5,293,213 A 3/1994
`
`
`
`
`
`
`5,832,428 A * 11/1998 Chow et al. ................. 704/254
`
`(Continued)
`
`
`
`69 Claims, 6 Drawing Sheets
`
`Results Memory
`
`Search Engine
`
`22
`
`I
`
`LmdealTree PfOoessorClu$ter1 I
`
`L ____________
`'--------------..J
`
`Lexical Tree ProcessorClusterN
`
`.J
`
`I
`
`IPR2023-00033
`Apple EX1001 Page 1
`
`

`

`US 7,587,319 B2
`
`Page 2
`
`S-H. Chung et al., "A Parallel Computational Model for Integrated
`
`
`U.S. PATENT DOCUMENTS
`
`
`
`Speech and Natural Language Understanding", IEEE Transactions
`5,881,312 A 3/1999 Dulong
`
`
`
`
`on Computers, New York, USA, vol. 42, No. 10, Oct. 1, 1993, pp.
`5,995,930 A * 1111999 Rab-Umbach et al. ...... 704/254
`
`
`
`
`1171-1183.
`
`
`
`6,047,283 A * 4/2000 Braun .. ... ... .. ... ... ... ... .. ... 707 /3
`
`
`
`
`N. Deshmukh et al., "Hierarchical Search for Large-Vocabulary Con­
`
`
`
`7,035,802 Bl* 4/2006 Rigazio et al. .............. 704/256
`
`
`
`
`versational Speech Recognition: Working Toward a Solution for the
`
`
`
`7,120,582 Bl* 10/2006 Young et al ................. 704/255
`
`
`
`
`Decoding Problem", IEEE Signal Processing Magazine, vol. 16, No.
`
`
`7,174,037 B2 2/2007 Arnone et al.
`
`5,Sep. 1999, pp. 84-107.
`
`
`
`
`2001/0011218 Al* 8/2001 Phillips et al. .............. 704/256
`R.M. Woodward et al., "Tissue Classification Using Terahertz Pulsed
`
`
`
`
`
`2003/0178584 Al 9/2003 Arnone et al.
`
`
`
`
`Imaging", Proceedings of the SPIE-The International Society for
`
`
`2008/0255839 Al 10/2008 Larri et al.
`
`
`
`Optical Engineering SPIE-INT. Soc. Opt. Eng. USA, vol. 5318, No.
`
`
`1, 2004, pp. 23-33. Abstract also attached.
`
`
`
`V.P. Wallace et al., "Biomedical Applications of THz Imaging",
`FOREIGN PATENT DOCUMENTS
`
`
`
`Microwave Symposium Digest, 2004, IEEE MTT-S International
`
`
`
`Fort Worth, TX, USA, Jun. 6-11, 2004, Piscataway, NJ, USA, IEEE,
`
`vol. 3, Jun. 6, 2004, pp. 1579-1581.
`V.P. Wallace et al., "Biomedical Applications of Terahertz Pulse
`
`
`
`
`
`
`Imaging", Second Joint EMBS-BMES Conference 2002. Confer­
`
`
`
`ence Proceedings. 24th Annual International Conference of the Engi­
`
`
`
`
`
`neering in Medicinal and Biology Society. Annual Fall Meeting of the
`
`
`
`
`
`
`Biomedical Engineering Society, Houston, TX, Oct. 23-26, 2002,
`
`
`Annual vol. 1 of 3. Conf. 24, Oct. 23, 2002, pp. 2333-2334.
`
`
`
`
`
`A.J. Fitzgerald et al., "Terahertz Imaging of Breast Cancer, a Feasi­
`S. Chatterjee et al., "Connected Speech Recognition on a Multiple
`
`
`
`
`
`
`
`bility Study", Conference Digest of the 2004 Joint 29th International
`
`
`
`
`Processor Pipeline", ICASSP 89, May 23, 1989, Glasgow, UK, pp.
`
`
`
`Conference on Infrared and Millimeter Waves and 12th International
`774-777.
`
`
`
`
`Conference on Terahertz Electronics IEEE Piscataway, NJ, USA,
`
`
`S-H. Chung et al., "A Parallel Phoneme Recognition Algorithm
`2004, pp. 823-824.
`
`
`
`Based on Continuous Hidden Markov Model", Proceedings of 13th
`B.E. Cole et al., "Terahertz Imaging and Spectroscopy of Human
`
`
`
`
`
`
`
`
`International Parallel Processing Symposium and 10th Symposium
`
`
`Skin, In-vivo", Proceedings ofSPIE, vol. 4276, pp. 1-10.
`
`
`
`
`on Parallel and Distributed Processing, 1999, Los Alamitos, CA,
`
`
`USA, pp. 453-457.
`
`2 331 392 A
`5/1999
`2 347 835 A
`9/2000
`2 380 920 A
`4/2003
`WO 00/50859
`8/2000
`
`WO 03/067572 A2
`8/2003
`
`GB
`GB
`GB
`WO
`WO
`
`OTHER PUBLICATIONS
`
`* cited by examiner
`
`IPR2023-00033
`Apple EX1001 Page 2
`
`

`

`O'I
`
`....
`
`=-
`
`('D
`('D
`
`rJJ
`
`0
`0
`N
`QO �
`
`('D
`rJJ
`
`Fig 1
`
`13
`
`Window
`
`(Hamming)
`
`9
`
`7
`
`4
`
`20 bit
`
`filter
`
`fs =48 kHz
`
`aliasing
`
`ADC
`
`Anti
`
`3
`
`2
`
`IPR2023-00033
`Apple EX1001 Page 3
`
`

`

`O'I
`....
`0
`
`N
`
`...,._
`('D
`
`=­('D
`
`
`
`rJJ
`
`••
`
`I<
`
`I
`,,
`21
`
`tree
`
`processor
`tree lexical tree lexical
`
`processor processor
`lexlcal
`
`I
`I
`
`I ,. ,, r21 1
`,------,--------,
`
`21
`
`I
`I
`
`k
`
`2
`
`1
`
`I
`I
`
`processor processor
`lexical
`
`I
`I
`
`sontcnces
`
`Nb I
`
`Results Memory
`
`
`
`-�
`
`,,
`
`Control bus
`Processor
`Lexical Tree
`
`.
`
`'
`
`Processor
`
`Language
`
`Model
`
`r2s
`
`feature vector bus r24
`
`bus r26
`
`oath score and
`historv
`
`To all Lexical
`
`/30
`Tree Processors
`
`language model bus
`
`path scores
`
`Controller -leXlc;aJ tree
`Search
`
`identifiers
`
`,_,-25
`
`r21
`
`program & data bus
`'
`,,.-28
`
`memory
`
`-
`
`
`
`--
`
`-
`
`-
`
`Memory
`Model
`
`Language
`
`.,..-31
`
`c=,
`
`I'
`
`constrain �
`wort!
`
`21
`
`',
`
`-,
`
`--
`
`,...,__ ...... -----
`
`buffer
`vector
`feature
`
`c)
`
`,.,.-21 1• 21
`
`,,
`
`I22
`I Search Engine
`I
`
`.,..-23
`
`Fig 2
`
`I Lexical Tree Processor Cluster N
`I
`I Acoustic Model Memory r
`I
`
`I
`
`L ____________ _.
`
`I
`
`1 I
`I
`I
`
`Cluster
`
`Processor
`
`L-------------'
`
`I
`
`r23
`
`I
`
`I Lexical Tree
`I
`I Acoustic Model Memory
`I
`I
`I
`
`v-22
`I
`I ...
`
`I
`I
`
`2
`
`1
`
`tree lexical tree lexical tree
`
`••
`processor
`I
`
`I
`
`IPR2023-00033
`Apple EX1001 Page 4
`
`

`

`\0
`"'""'
`
`--...l
`00
`
`
` UI
`'-"--...l
`
`LOCK
`
`LOT
`
`LEXICAL TREE 4
`
`ROCK
`
`Fig 3b
`
`--•---
`
`HARD
`
`r-d+I
`
`r-d+r
`
`►0
`
`.... O'I
`
`ROT
`
`0
`
`�
`
`('D
`('D
`
`rJJ
`
`0
`0
`N
`
`
`"'CIO
`
`'?
`('D
`rJJ
`
`
`00•
`
`�
`�
`�
`
`�
`
`LEXICAL TREE 3
`
`ao-t+sil
`
`Fig 3a LEXICAL TREE 2
`
`HART
`
`aa-r+t
`
`YOU ...-e---
`
`y-uw+h
`y-uw+k
`►9
`
`LEXICAL TREE 1
`
`r-d+r
`
`IPR2023-00033
`Apple EX1001 Page 5
`
`

`

`U.S. Patent
`US 7,587,319 B2
`Sep.8,2009 Sheet 4 of 6
`
`S1 INITIALISE
`
`i------NO
`
`YES
`
`YES
`
`S4
`
`NO�
`
`S5
`
`READ FEATURE VECTOR FROM
`BUFFER
`
`EVALUATE STATE TRANSITIONS FOR
`
`
`FIRST LEXICAL TREE NODE USING
`S6
`ACOUSTIC MODEL DATA
`
`S7
`
`DETERMINE SCORE FOR FIRST
`
`LEXICAL TREE NODE
`
`REQUEST LANGUAGE MODEL SCORE
`S USING PREVIOUS N-1 WORDS IN
`B
`
`
`RECEIVED DATAAND LEXICAL TREE
`
`DATA IN ACOUSTIC MODEL MEMORY
`
`RECEIVE LANGUAGE MODEL SCORES
`
`
`
`S9 FOR WORDS IN THE LEXICAL TREE
`
`AND PICK HIGHEST SCORE
`
`GENERATE TEMPORARY LEXICAL TREE
`
`
`S1 Q SCORE USING THE SCORE FOR FIRST
`
`
`LEXICAL TREE NODE AND THE
`HIGHEST LANGUAGE MODEL SCORE
`
`S11
`SEND SCORE TO RESULTS MEMORY
`AS TEMPORARY LEXICAL TREE SCORE
`
`Fig 4
`
`IPR2023-00033
`Apple EX1001 Page 6
`
`

`

`U.S. Patent
`US 7,587,319 B2
`Sep.8,2009 Sheet 5 of 6
`
`S20 START
`
`YES
`
`YES
`
`S23
`
`NO�
`
`READ FEATURE VECTOR FROM
`S24
`BUFFER
`
`EVALUATE STATE TRANSITIONS FOR
`
`EACH PATH USING ACOUSTIC MODEL
`S25
`DATA
`
`NO
`
`DETERMINE SCORES FOR PATHS, SEND
`BEST SCORE TO RES UL T
`S MEMORY AND
`S26
`
`STORE PATH HISTORIES LOCALLY
`
`YES
`
`PRUNING APPLIED TO LEXICAL TREE
`
`TO DELETE PATHS
`S27
`
`YES
`
`
`S29
`
`APPLY LANGUAGE MODEL SCORE
`
`SEND SCORE AND HISTORY TO
`S30
`RESULTS MEMORY
`
`S31
`
`DELETE PATH(S) AND
`HISTORY DATA
`
`Fig 5
`
`NO
`
`S33 MESSAGE SENT TO SEARCH CONTROLLER TO
`
`INDICATE THAT LEXICAL TREE HAS BEEN
`
`PROCESSED
`
`IPR2023-00033
`Apple EX1001 Page 7
`
`

`

`Sheet 6 of 6 US 7,587,319 B2
`U.S. Patent Sep.8,2009
`
`S40 INITIALISATION
`
`READ INITIAL LEXICAL TREE DATA IN
`
`S41
`RESULTS MEMORY
`
`INITIAL LEXICAL TREE DATA
`
`
`DISTRIBUTED AMONGST LEXICAL TREE
`S42 PROCESSORS FOR TEMPORARY
`
`LEXICAL TREE SCORE DETERMINATION
`
`TEMPORARY LEXICAL TREE SCORES
`S43
`RETURNED TO RESULTS MEMORY
`
`Fig 6
`
`PRUNE THE LEXICAL TREES IN THE
`S44 RES UL TS MEMORY ON BASIS OF
`TEMPORARY LEXICAL TREE SCORES
`
`LEXICAL TREE PROCESSING
`S45 DISTRIBUTED
`AMONGST LEXICAL TREE 14---------,
`PROCESSORS
`
`NO
`
`YES
`
`DETERMINE NEXT POSSIBLE LEXICAL
`
`
`S47 TREES USING CROSS WORD
`TRI PHONES
`
`LEXICAL TREE DATA DISTRIBUTED
`AMONGST LEXICAL TREE
`S48
`
`PROCESSORS FOR TEMPORARY
`LEXICAL TREE SCORE DETERMINATION
`
`S49 TEMPORARY LEXICAL TREE SCORES
`RETURNED TO RESULTS MEMORY
`
`
`PRUNE THE LEXICAL TREES IN THE
`S50 RES UL TS MEMORY ON BASIS OF
`TEMPORARY LEXICAL SCORES
`
`YES
`
`S52 RESULTS
`OUTPUT
`
`IPR2023-00033
`Apple EX1001 Page 8
`
`

`

`
`
`US 7,587,319 B2
`
`1
`
`PARALLEL PROCESSORS
`
`
`
`2
`SPEECH RECOGNITION CIRCUIT USING Thus in accordance with this embodiment of the present
`
`
`
`
`
`
`invention, the processing in order to perform word recogni­
`
`
`
`
`tion is distributed across the processors by controlling the
`
`
`
`to a speech
`
`processors The present invention generally relates recog­ to perform processing on different lexical trees.
`
`nition circuit which uses parallel processors for processing
`
`
`
`
`
`
`5 The controller controls the processor by the processes to
`
`the input speech data in parallel.
`
`
`
`
`provide for efficient process management by distributing
`
`
`
`Conventional large vocabulary speech recognition can be
`
`
`lexical processing to appropriate processors.
`
`
`
`divided into two processes: front end processing to generate
`
`
`The lexical tree data structure can comprise a phone model
`
`
`
`
`processed speech parameters such as feature vectors, fol­
`
`
`
`of words, wherein the components comprise phones. For
`
`
`
`lowed by a search process which attempts to find the most
`
`
`
`10 reduced storage, the lexical tree data structure can comprise a
`
`
`
`likely set of words spoken from a given vocabulary (lexicon).
`
`
`
`mono phone lexical tree. The mono phone lexical tree can be
`
`
`
`The front end processing generally represents no problem
`
`
`
`
`used to generate context dependent phone models dynami­
`
`
`
`for current processing systems. However,
`
`
`
`
`
`cally. for large vocabu­ This enables the use of context dependent phone models
`
`
`lary, speaker independent
`
`
`for matching speech recognition, it is the search and hence increased accuracy whilst not
`
`
`
`
`
`
`
`
`process that presents the biggest challenge. An article by 15 increasing memory requirements. Alternatively, the lexical
`
`
`
`
`
`
`
`
`
`
`
`
`
`Deshmukh et al entitled "Hierarchical Search for Large-Vo­tree data structure can comprise context dependent phone
`
`
`
`
`cabulary Conversational Speech Recognition" (IEEE Signal models.
`The processing performed by each processor in one
`
`
`
`
`
`Processing Magazine, September 1999, pages 84 to 107), the
`
`
`
`
`
`
`
`
`
`
`content of which is hereby incorporated by reference, dis­embodiment comprises the comparison of the speech param­
`
`
`
`
`
`
`cusses the general concepts of large vocabulary speech rec-20 eters with the lexical data, e.g. phone models or data derived
`
`
`
`
`
`
`
`
`
`ognition. As discussed in this paper, one algorithm for per­from the lexical data ( e.g. dynamically generated context
`
`
`
`
`
`
`
`forming the search is the Viterbi algorithm. The Viterbi dependent phone models) to identify words as a word recog­
`
`
`
`
`
`
`
`
`
`
`algorithm is a parallel or breadth first search through a tran­nition event and to send information identifying the identified
`
`
`
`
`
`
`
`
`
`sition network of states of Hidden Markov Models. An acous-words to the results memory as the processing results. In this
`
`
`
`
`
`
`
`
`tic model for words in a lexicon are represented as states of 25 embodiment a language model processor arrangement can be
`
`
`
`
`
`
`
`Hidden Markov Models. These states represent phones or n provided for providing a language model output for modify­
`
`
`
`
`
`
`
`phones in a phone model of the words. The search requires the ing the processing results at a word recognition event by a
`
`
`
`
`
`
`
`
`evaluation of possible word matches. It is known that such a lexical tree processor. The modification can either take place
`
`
`search is computationally intensive.
`
`
`
`
`at each lexical tree processor, or at the language model pro-
`
`
`
`
`In order to speed up the processing performed during such 30 cessing arrangement.
`
`
`
`
`
`
`
`a search in a speech recognition system, parallel processing In one embodiment each lexical tree processor determines
`
`
`
`
`has been explored. In an article by MK Ravishankar entitled
`
`
`
`an output score for words in the processing results at word
`"Parallel Implementation of Fast Beam Search for Speaker­
`
`
`
`
`
`recognition events. Thus in this embodiment the language
`Independent Continuous Speech Recognition" (Indian Insti­
`
`
`
`
`
`model processing arrangement can modify the score using a
`
`tute of Science, Bangalor, India, Jul. 16, 1993) a multi­
`
`
`
`
`
`
`35 score for a language model for n preceding words, where n is
`an integer.
`
`threaded implementation of a fast beam search algorithm is
`
`
`
`
`In one embodiment the controller instructs a lexical tree
`
`
`
`
`
`
`
`
`disclosed. The multi-threading implementation requires a
`
`
`
`
`processor to process a lexical tree by passing a lexical tree
`
`
`significant amount of communication and synchronization
`
`
`
`identifier for the lexical tree and history data for a recognition
`
`
`
`among threads. In an MSC project report by R Dujari entitled
`
`
`
`path associated with the lexical tree from the results memory.
`
`
`
`
`"Parallel Viterbi Search Algorithm for Speech Recognition"
`40
`
`
`
`
`The history data preferably includes an accumulated score for
`
`
`
`(MIT, February 1992) the parallel processing of input speech
`
`
`the recognition path. This enables a score to be determined
`
`
`
`
`parameters is disclosed in which a lexical network is split
`
`
`based on the score for the recognition path to accumulate a
`
`statically among processors.
`
`
`
`new score during recognition carried out using the lexical tree
`
`
`
`
`It is an object of the present invention to provide an
`
`
`
`data structure. The scores can be output in the processing
`
`
`
`
`
`improved circuit which can perform parallel processing of 45
`
`
`
`
`results to the results memory during the processing of the
`
`speech parameters.
`
`
`
`speech parameters so that the scores can be used for pruning.
`
`
`
`
`In accordance with a first embodiment of the present inven­
`
`
`
`In one embodiment of the present invention, each lexical
`
`
`
`
`
`tion, a speech recognition circuit comprises an input port such
`
`
`
`tree processor operates on more than one lexical tree at the
`
`
`
`
`as input buffer for receiving parameterized speech data such
`
`
`same time, e.g. two lexical trees represented by two different
`
`
`
`
`as feature vectors. A lexical memory arrangement is provided
`50
`
`
`
`
`lexical tree data structures, or two lexical trees represented by
`
`
`
`which contains lexicon data for word recognition. The lexical
`
`
`
`the same data structure but displaced in time (which can be
`
`
`
`data comprises a plurality of lexical tree data structures rep­
`
`termed to instances of the same lexical tree).
`
`
`
`
`
`resenting a plurality of lexical trees. Each lexical tree data
`
`
`
`
`At word recognition events, the controller determines new
`
`
`structure comprises a model of words having common prefix
`
`
`
`
`
`lexical tree identifiers for storing in the results memory for
`
`
`
`components and an initial component which is unique as an 55
`
`
`
`
`words identified in the results memory for respective word
`
`
`
`
`initial component for lexical trees. A plurality of lexical tree
`
`
`
`events. In order to reduce the processing, the controller can
`
`
`
`processors are connected in parallel to the input port and
`
`
`
`prune the new lexical tree identifiers to reduce the number of
`
`
`
`perform parallel lexical tree processing for word recognition
`
`
`lexical trees which are required to be processed. This pruning
`
`
`
`by accessing the lexical data in the lexical memory arrange­
`
`
`can be achieved using context dependant n phones to reduce
`
`
`
`ment. A results memory arrangement is connected to the 60
`
`
`the number of possible next phones. The number can be
`
`
`
`
`
`lexical tree processors for storing processing results from the
`
`
`
`
`further reduced by using a language model look ahead tech­
`
`
`
`
`lexical tree processors and lexical tree identifiers to identify
`nique.
`
`
`
`lexical trees to be processed by the lexical tree processors. A
`In one embodiment of the present invention, the lexical tree
`
`
`
`
`
`
`
`controller controls the lexical tree processors to process lexi-
`
`
`
`processors are arranged in groups or clusters. The lexical
`
`
`
`
`cal trees identified in the results memory arrangement by 65
`
`
`
`
`memory arrangement comprises a plurality of partial lexical
`
`
`
`
`performing parallel processing of a plurality of lexical tree
`
`
`
`
`memories. Each partial lexical memory is connected to one of
`data structures.
`
`IPR2023-00033
`Apple EX1001 Page 9
`
`

`

`
`
`US 7,587,319 B2
`
`4
`3
`the groups of lexical tree processors and contains part of the
`
`
`
`invention. In such an arrangement each processor performs
`
`
`
`
`
`
`lexical data. Thus a group of lexical tree processors and a
`
`
`
`
`lexical tree processing and the lexical data stored in each
`
`
`
`partial lexical memory form a cluster. Each lexical tree pro­
`
`
`
`
`lexical memory comprises lexical tree data structures which
`
`
`
`
`cessor is operative to process the speech parameters using a
`
`
`
`
`each comprise a model of words having common prefix com-
`
`
`
`
`partial lexical memory and the controller controls each lexical
`
`
`
`5 ponents and an initial component that is unique.
`
`
`
`
`tree processor to process a lexical tree corresponding to par­
`
`
`In preferred embodiments of the second aspect of the
`
`
`
`
`tial lexical data in a corresponding partial lexical memory.
`
`
`
`
`present invention, the preferred embodiments of the first
`
`
`In another embodiment of the present invention the lexical
`
`
`
`
`aspect of the present invention are incorporated.
`
`
`
`
`memory arrangement comprises a plurality of partial lexical
`
`
`
`Embodiments of the present invention will now be
`
`
`
`
`memories. Each partial lexical memory being connected to
`
`
`
`
`10 described with reference to the accompanying drawings in
`
`
`
`one of the lexical tree processors and containing part of the
`which:
`
`
`
`
`lexical data. Each lexical tree processor processes the speech
`FIG. 1 is a diagram of a speech data processing circuit for
`
`
`
`
`
`
`
`parameters using a corresponding partial lexical memory and
`
`
`
`
`generating parameterized speech data (feature vectors);
`
`
`
`
`the controller is operative to control each lexical tree proces-
`
`
`
`FIG. 2 is a diagram of a speech recognition circuit in
`
`
`
`
`sor to process a lexical tree corresponding to partial lexical
`
`
`
`accordance with an embodiment of the present invention;
`15
`
`
`
`data in a corresponding partial lexical memory.
`
`
`FIGS. 3a and 3b are schematic diagrams illustrating lexical
`
`
`
`In one embodiment of the present invention tree structures; the lexical
`
`FIG. 4 is a flow diagram illustrating the process performed
`
`
`
`
`
`memory arrangement stores the lexical tree data structures as
`
`
`
`
`
`by a lexical tree processor to determine a temporary lexical
`
`
`Hidden Markov Models and the lexical tree processors are
`
`
`
`
`tree score in accordance with an embodiment of the present
`
`
`
`
`
`operative to perform the Viterbi search algorithm using each 20
`invention;
`
`
`
`
`respective lexical tree data structure. Thus in this way, this
`FIG. 5 is a flow diagram illustrating the process performed
`
`
`
`
`
`
`
`
`embodiment of the present invention provides a parallel Vit­
`
`
`
`by the lexical tree processor for processing the input feature
`
`
`
`
`erbi lexical tree search process for speech recognition.
`
`
`
`vectors in accordance with an embodiment of the present
`
`
`
`
`
`The first aspect of the present invention is a special purpose
`
`invention; and
`
`
`
`circuit built for performing the speech recognition search
`25
`FIG. 6 is a flow diagram illustrating the process performed
`
`
`
`
`
`
`process in which there are a plurality of processors for per­
`
`
`
`by the controller in accordance with an embodiment of the
`
`
`
`
`
`forming parallel lexical tree processing on individual lexical
`
`present invention.
`tree processors.
`FIG. 1 illustrates a typical circuit for the parameterization
`
`
`
`
`
`
`
`
`
`In another aspect of the present invention a speech recog­
`
`
`30 of input speech data. In this embodiment the parameters
`
`
`
`nition circuit comprises an input port such as an input buffer
`
`generated are speech vectors.
`
`
`
`
`for receiving parameterized speech data such as feature vec­
`A microphone 1 records speech in an analogue form and
`
`
`
`
`
`
`
`
`tors. A plurality of lexical memories are provided which
`
`
`
`this is input through an anti-aliasing filter 2 to an analogue­
`
`
`
`
`contain in combination complete lexical data for word recog­
`
`
`
`to-digital converter 3 which samples the speech at 48 kHz at
`
`
`
`
`nition. Each lexical memory contains part of the complete
`
`
`
`
`35 20 bits per sample. The digitized output signal is normalized
`
`
`
`
`lexical data. A plurality of processors are provided connected
`
`
`( 4) to generated a 10 millisecond data frame every 5 millisec­
`
`
`
`in parallel to the input port for processing the speech param­
`
`
`
`onds with 5 milliseconds overlap (5). A pre-emphasis opera­
`
`
`
`
`eters in parallel. The processors are arranged in groups in
`
`
`tion 6 is applied to the data followed by a haniming window
`
`
`
`which each group is connected to a corresponding lexical
`
`
`
`7. The data is then fast Fourier transformed (FFT) using a 512
`
`
`
`memory to form a cluster. A controller controls each proces­
`
`
`
`40 point fast Fourier transform (8) before being filtered by filter
`
`
`
`
`
`sor to process the speech parameters using partial lexical data
`
`
`bank 9 into 12 frequencies. The energy in the data frame 5 is
`
`
`
`
`read from a corresponding lexical memory. The results of
`
`
`
`
`also recorded (13) as an additional feature and together with
`
`
`
`
`processing the speech parameters are output from the proces­
`
`
`
`the 12 frequency outputs of the filter bank 9, 13 feature
`
`sors as recognition data.
`
`
`
`vectors (10) are thus produced and these are output as part of
`
`
`
`
`Thus this aspect of the present invention provides a circuit
`
`
`
`
`
`45 the 39 feature vectors 14. First and second derivatives (11 and
`
`
`
`in which speech recognition processing is performed in par­
`
`
`
`
`12)are taken of the 13 feature vectors 10 to complete the
`
`
`
`allel by groups of processors operating in parallel in which
`
`
`
`
`generation of the 39 feature vectors 14.
`
`
`each group accesses a common memory of lexical data. This
`The arrangement illustrated in FIG. 1 is purely given for
`
`
`
`
`
`
`
`
`aspect of the present invention provides the advantage of
`
`
`
`
`illustration. The present invention encompasses any means by
`
`
`
`
`
`parallel processing of speech parameters and benefits from a
`
`50 which speech and data can be parameterized to a suitable
`
`
`
`
`
`limited segmentation of the lexical data. By providing a plu­
`
`
`
`form for input to the search process as will be described in
`
`
`
`rality of processors in a group with a common memory,
`
`more detail hereinafter.
`
`
`
`
`
`flexibility in the processing is provided without being band­
`
`
`FIG. 2 is a schematic diagram of a speech recognition
`
`
`width limited by the interface to the memory that would occur
`
`
`circuit in accordance with an embodiment of the present
`
`
`if only a single memory were used for all processors. The
`
`
`
`
`invention for performing the search process. The parameter­
`arrangement is more flexible than the parallel processing
`
`
`
`55
`
`
`
`
`ized speech data, which in this embodiment comprise feature
`
`arrangement in which each,processor only has access to its
`
`
`
`
`
`
`vectors are input to a feature vector buffer 20. The feature
`
`own local memory and requires fewer memory interfaces (i.e.
`
`
`
`
`
`
`
`vector buffer 20 is provided to buffer the incoming feature
`chip pins). Each processor within a group can access the same
`
`
`
`
`
`
`
`
`
`vectors to allow lexical tree processors 21 to read and process
`
`lexical data as any other processor in the group. The controller
`
`
`
`
`
`
`
`the feature vectors in the buffer 20 via a feature vector bus 24.
`can thus control the parallel processing of input speech 60
`
`
`
`
`
`
`
`A plurality k of lexical tree processors 21 are arranged in a
`parameters in a more flexible manner. For example, it allows
`
`
`
`
`
`
`
`
`respective lexical tree processor cluster 22. Each lexical tree
`more than one processor to process input speech parameters
`
`
`
`
`
`
`
`processor cluster 22 has an acoustic model memory 23 in
`using the same lexical data in a lexical memory. This is
`
`
`
`
`
`
`which is stored lexical data for use by the lexical tree proces-
`because the lexical data is segmented into domains which are
`
`
`
`
`
`
`
`65 sors 21 within the lexical tree processor cluster 22. Each
`
`
`accessible by multiple processors.
`
`
`
`this aspect
`
`
`
`
`lexical tree In a preferred embodiment of the present inven­ processor 21 in the lexical tree processor cluster
`
`tion is used in combination with the first aspect
`
`
`22 are connected of the present to the acoustic model memory 23 within the
`
`
`
`
`IPR2023-00033
`Apple EX1001 Page 10
`
`

`

`
`
`US 7,587,319 B2
`
`6
`
`5
`lexical tree processor 22. There are N lexical tree processor
`
`
`
`3.Best scores for best paths being processed by each lexical
`
`
`
`
`
`clusters and thus there are Nk lexical tree processors 21
`
`
`
`tree processor 21. This information enables the search con­
`
`
`
`
`connected by the feature vector bus 24 to the feature vector
`
`
`
`
`troller 27 to monitor the processing being performed by
`
`
`
`buffer 20. Each lexical tree processor 21 is capable of pro­
`
`
`
`
`lexical tree processors 21 to determine whether a global
`
`
`
`cessing a different lexical tree and thus Nk lexical trees can be 5
`
`
`pruning strategy should be applied in order to reassi
`gn
`
`
`
`
`
`
`
`
`
`processed in parallel. The acoustic model memories 23 store processing performed by a lexical tree processor if its best
`
`
`score for its best path is below a threshold or well below the
`
`
`
`as a whole a complete set oflexical data, i.e. lexical tree data
`
`
`best scores for the paths being processed by other lexical
`structures for use in the lexical tree processing by the lexical
`
`
`
`
`tree processors 21.
`tree processors 21. Each acoustic model memory 23 contains
`
`
`
`part or a segment of the lexical tree data. Since lexical tree
`
`
`10 4. Temporary lexical tree scores. These comprise tree scores
`
`
`
`
`
`
`
`
`
`processors 21 in a lexical tree processor cluster 22 access the
`
`
`
`which are determined as temporary scores to prune the next
`
`
`
`same acoustic model memory 23, it is possible for more than
`
`
`lexical trees to be processed at word ends. The temporary
`
`
`
`
`one lexical tree processor 21 to process the same lexical data.
`
`
`
`
`
`lexical tree scores include lexical tree identifiers or pointers
`
`
`
`to identify the next lexical trees to be processed. The scores
`
`
`
`This provides for some degree of flexibility in the controlling
`
`
`enable the pruning of this list.
`
`
`
`of the processing by the lexical tree processors 21. Further,
`15
`
`
`5. Pruning threshold. This can be a global threshold value for
`
`
`
`the acoustic model memories 23 need not contain only one
`
`
`use in the pruning of the lexical trees globally, or a local
`
`
`
`copy of the lexical data. It is possible to build in a redundancy
`
`
`
`threshold value for use by a lexical processor for locally
`
`
`
`in the data to further enhance the flexibility. This avoids any
`
`
`
`
`pruning the processing performed by the lexical processor
`
`
`
`
`bottleneck in the processing due to the search processing
`
`focusing on a small number oflexical trees. 20
`21.
`The acoustic model memory 23 stores a Hidden Markov
`
`
`
`
`A results memory 25 is provided for storing processing
`
`
`
`
`
`Model for acoustically modelling words as lexical trees. The
`
`
`
`results from the lexical tree processors 21 which are received
`
`
`
`acoustic model memory 23 stores a plurality of lexical tree
`
`
`over the path score and history bus 26. The results memory 25
`
`
`
`
`data structures. Each lexical tree data structure comprises an
`
`
`
`
`also stores information on lexical trees to identify which
`n phone model of a number of words having common prefix
`
`
`
`
`lexical trees are to be processed. A search controller 27 is 25
`
`
`
`
`phones. The first node of the lexical tree (the root) comprises
`
`
`
`
`provided to control the processing performed by the lexical
`
`
`a common n phone to all words in the lexical tree and uniquely
`
`
`
`tree processors 21 in dependence upon a program and data
`
`
`identifies the lexical tree.
`
`
`
`stored in program and data memory 28. The search controller
`Each lexical tree processor 21 includes on-board memory
`
`
`
`
`
`
`
`reads the path scores and lexical tree identifiers from the
`
`
`
`( or local memory) to be used during the lexical tree process­
`
`
`
`results memory and controls the lexical tree processors
`30
`
`
`ing. This working memory has to store all of the parameters
`
`
`
`accordingly. A language model processor 29 is provided
`
`
`
`
`currently working on including current scores for all paths
`
`
`
`which is connected to each lexical tree processor 21 by a
`
`
`
`being processed within the lexical tree, and previous N-1
`
`
`language model bus 30. The language model processor 29
`
`
`words for the lexical tree. The local storage of the previous
`
`accesses a language model memory 31 to read language
`
`
`N-1 words enables the lexical tree processor 21, when a word
`
`
`
`model data for provision to lexical tree processors 21 in 35
`
`
`end is reached along a branch of the lexical tree, to send a
`
`
`
`
`
`response to language model data requests. External control of
`
`
`
`request for the language model score for an N gram by send­
`
`
`the language model memory 31 is provided by a word con­
`
`
`
`ing the identity of the N-1 previous words together with the
`
`
`
`
`
`strains input. The language model processor 29 determines a
`
`word identified at the end of the branch.
`
`
`
`score for a word occurring following N previous words using
`FIG. 3a and 3b schematically illustrate lexical trees which
`
`
`
`
`
`
`
`N grams. When a lexical tree processor requires a language 40
`
`
`
`
`can be processed by the lexical tree processor 21 during the
`
`
`
`model score a request is sent to the language model processor
`
`
`recognition of the two words HARD ROCK. In FIG. 3a a
`
`
`29 over the language model bus 30 identifying the current
`
`
`
`
`
`previous lexical tree terminated at a branch recognizing the
`
`
`
`word and the N-1 previous words. A language model score
`
`word YOU and terminating with the mono phone uw which is
`
`
`
`for the N gram can be returned to the lexical tree processor 21
`
`
`
`
`associated with two context dependent tri-phones y -uw+k
`
`for the modification of the score at the end of a branch of
`45
`
`
`
`
`
`lexical tree processing. The lexical tree processor
`
`
`associ-and y -uw+h. Thus the context dependent tri-phone can modify
`
`ated with the last phone in the lexical tree word model for
`
`
`
`the score in accordance with the language model and output a
`
`
`
`
`YOU indicates two possible next lexical trees, i.e. the lexical
`score to the results memory 25 for a word at the end of a
`
`
`
`trees beginning with the mono phone k and hand having a left
`branch of the lexical tree proces

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket