ISCA Archive
http://www.isca-speech.org/archive
`
`First European Conference on
`Speech Communication and Technology
`EUROSPEECH '89
Paris, France, September 27-29, 1989
`
`Multi-DSP and VQ-ASIC Based Acoustic Front-End
`for Real-Time Speech Processing Tasks
`
`Abdulmesih Aktas and Harald Höge
`
`Siemens AG, München
`Otto-Hahn-Ring 6, D-8000 München 83, West Germany
`
`Abstract
`
This paper describes the architecture of a multi-DSP based acoustic front-end (AkuFE) developed within the speaker-adaptive continuous speech understanding and dialogue system SPICOS¹. The AkuFE is designed as a configurable high-performance signal processing VMEbus system employing up to five Texas Instruments TMS320C25 signal processors and an ASIC for vector quantization (VQ) developed within the project. The system is realized on three boards and achieves a total computational power of more than 100 MIPS. In this configuration two VQ processors can work in parallel. A 68020-based workstation serves as host computer. The AkuFE is employed for the real-time acoustic-phonetic decoding task in the SPICOS system. Due to its flexibility, it can be used for a wide range of real-time speech processing tasks.
`
`1. Introduction
`
The multi-DSP and ASIC based acoustic front-end (AkuFE) has been developed in the framework of the SPICOS system /1/ and is built from three boards: a so-called master board and two slave boards. The system is designed for the analysis and synthesis of telephone-quality speech (8 kHz sampling frequency and 3.2 kHz bandwidth) and high-quality speech (16 kHz sampling frequency and 6.4 kHz bandwidth). Figure 1 shows a general overview of the AkuFE system.
`
[Diagram labels: Master Board, Slave Boards]
Figure 1: The multi-DSP acoustic front-end system (AkuFE)
`
The master board is the basic control system of an extendable multi-DSP architecture. The complete system is based on Texas Instruments TMS320C25 digital signal processor VLSI chips and built as a VMEbus system. Within the distributed architecture of SPICOS /2/ it is plugged into a SUN workstation and employed for the real-time acoustic-phonetic decoding task. The master board of the AkuFE consists of an analog subsystem (A/D and D/A converters) and a digital subsystem, while each slave board employs two TMS320C25 DSPs and a VLSI chip for VQ /3/. A local data and program memory is dedicated to each signal processor. Two slave boards can be connected to a master board, achieving a computational power of more than 100 MIPS. The slave boards can also be used without a master board.
`
The inter-DSP bus allows fast communication between the master and the slave processors without interfering with the VMEbus. A global memory scheme is realized in order to perform data transfers between the processors. Control is handled by interrupts. In order to test, download and control the system, the host machine has read and write access to the whole memory of the AkuFE system. Because of its configurability and free programmability, the AkuFE system can be used in a wide range of signal processing applications, such as speech analysis and synthesis, speech coding and speech recognition. Used as a fast array processor for algorithms with extensive computational requirements, it can considerably improve the performance of a standard workstation. Codebook generation /4/ in particular can be accelerated by employing the VQ chip on the slave board in connection with a host processor. Further applications are spectral analysis, instrumentation and image processing.
`
A more detailed description of the master board is given in Section 2. Section 3 describes the slave DSP subsystems and the VQP. Finally, an application of the AkuFE system to real-time acoustic-phonetic decoding is presented in Section 4, and performance results for this specific application are given in Section 5.
`
It should be mentioned that the terms master and slave are not used here in the sense of the VMEbus definitions.
`
1 SPICOS (Siemens, Philips, IPO Continuous Speech Recognition and Understanding) is carried out as a joint project and sponsored by the German Federal Ministry for Research and Technology (BMFT) under the grant No. ITM 8801 B9.
`
`EUROSPEECH '89, Paris, France, September 1989
`
`1586
`
`IPR2023-00035
`Apple EX1058 Page 1
`
`

`

[Diagram labels: Sample, P1, VMEbus, Contr.-Reg., Interrupt, LEDs, Ext. Buffer, P2, inter-DSP bus]
Figure 2: Block diagram of the AkuFE master board
`
`2. The AkuFE Master System
`
The master board /5/ performs the initial signal preprocessing steps, like amplification, sampling, A/D and D/A conversion. Computationally intensive operations required for a spectral or time analysis can be executed on the DSP subsystem. A further task of this board is the control of data flow to the host machine over a so-called Mailbox. Figure 2 shows the general block diagram of the AkuFE master board.
`
`2.1. Analog Subsystem
`
Two different audio channels, one for telephone and one for a high-quality microphone input, are provided. The board allows signal acquisition, filtering and sampling of telephone-quality speech (8 kHz sampling frequency and 3.4 kHz bandwidth) and high-quality speech (16 kHz sampling frequency and 6.8 kHz bandwidth). The corner frequencies of the anti-aliasing filters for both sampling rates are software selectable. In order to increase the dynamic range of the 12-bit A/D converter, a programmable-gain preamplifier (0-24 dB) is implemented.
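To put the converter and preamplifier figures in context, the following minimal sketch works out the ideal dynamic range of a 12-bit quantizer and the linear amplitude factor of the maximum 24 dB gain. The function names are illustrative, and the sum of the two dB figures is only a rough upper bound that assumes the gain is always set to suit the input level; the paper does not state how the gain is chosen.

```python
import math

def ideal_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB: 20*log10(2**bits)."""
    return 20.0 * math.log10(2 ** bits)

def gain_db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain given in dB to a linear factor."""
    return 10.0 ** (gain_db / 20.0)

# Figures taken from the text: 12-bit converter, 0-24 dB programmable gain.
adc_range_db = ideal_dynamic_range_db(12)
max_gain_linear = gain_db_to_linear(24.0)

# Rough upper bound on the span of input levels that can be covered when the
# gain is matched to the signal level (an assumption, not a stated result).
covered_range_db = adc_range_db + 24.0

print(f"ADC range: {adc_range_db:.1f} dB, max gain: x{max_gain_linear:.1f}, "
      f"covered span: {covered_range_db:.1f} dB")
```

Under these assumptions the converter alone spans roughly 72 dB, and the preamplifier can raise weak signals by a factor of about 16 in amplitude before quantization.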
`
The D/A subsystem consists of a 12-bit D/A converter and the associated interpolation filter with two different software-selectable corner frequencies.
`
The master board provides further outputs for a telephone and a headset.

2.2. Digital Subsystem

The second-generation TMS320C25 chip from Texas Instruments provides a high-performance 16-bit digital subsystem for the AkuFE. The modified Harvard architecture of the processor allows fast memory access and high data throughput with a 100 ns instruction cycle. The CMOS version of the DSP is capable of executing ten million instructions per second, where most of the instructions require one cycle.

A RAM-based local data and program memory is dedicated to the DSP. The host processor has read and write access to the whole memory. A local data memory of 32 KWords and a program memory of 16 KWords are provided on the master board. The remaining 32 KWords of data memory are reserved for global memory communication areas of four further DSP subsystems located on two slave boards. The memory expansion and the fast communication to the slave board subsystems are performed via the inter-DSP bus. Data exchange can be done via the VMEbus interface as well.

A data memory region of 1 KWord called MAILBOX is provided as dual-ported memory, enabling parallel access from both the DSP and the host processor without holding the DSP during a block data transfer routine. This allows, e.g. in the case of signal acquisition, a real-time transfer of the preprocessed and analyzed data.

For more flexible VMEbus communication an interrupt control register is provided. Hold and reset control is performed by a further register, which can additionally be used to toggle between the sampling rates. A third register allows digital control of the gain amplifier. Several LEDs provide visual control of the communication.

3. The AkuFE Slave System

The AkuFE slave is a powerful optional system that can also be used without the AkuFE master board /6/. The two TMS320C25-based subsystems and the vector quantization processor (VQP) provide a computational power of about 50 MIPS and are appropriate for use as an array processor in many speech applications with high computational requirements. Two slave boards can be used in conjunction with the AkuFE master board.
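The computational-power figures quoted in the paper can be cross-checked with a little arithmetic. The sketch below uses only numbers from the text (10 MIPS per TMS320C25, about 50 MIPS per slave board, two DSPs per slave board). The share attributed to the VQP is an inference from those numbers and should be read as a MIPS-equivalent figure for the ASIC, not as a value stated by the authors. All names are illustrative.

```python
C25_MIPS = 10          # from the text: ten million instructions per second
SLAVE_BOARD_MIPS = 50  # from the text: "about 50 MIPS" per slave board
DSPS_PER_SLAVE = 2     # from the text: two TMS320C25 subsystems per slave

def vqp_equivalent_mips() -> int:
    """Inferred share of one slave board's power that is left for the VQP."""
    return SLAVE_BOARD_MIPS - DSPS_PER_SLAVE * C25_MIPS

def system_mips(num_slaves: int, with_master: bool = True) -> int:
    """Approximate total for a configuration with the given boards."""
    total = num_slaves * SLAVE_BOARD_MIPS
    if with_master:
        total += C25_MIPS  # the master board carries one TMS320C25
    return total

print(vqp_equivalent_mips())               # inferred VQP share per slave board
print(system_mips(2))                      # master board plus two slave boards
print(system_mips(2, with_master=False))   # two slave boards alone
```

With one master and two slave boards the total comes to about 110 MIPS, which agrees with the statement that the full system achieves more than 100 MIPS.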
`
`
`

`

Local memory is attached to each DSP subsystem. In order to test, download and control the system, the host machine has read and write access to the whole memory of the AkuFE slave system. Four memory banks of 1 Mbit each are available, allowing storage of codebooks of up to 8,000 entries in our application. The mapping of the VQP control register into the global memory area permits the control of the codebook search from either one of the DSP subsystems of the AkuFE or from the host computer. Figure 3 shows the basic VQP-DSP interfacing.
`
[Diagram labels: BANK 3, BANK 2, BANK 1, BANK 0; Vector Quantization Processor; 10 MHz; Microprocessor System]
Figure 3: Vector Quantization Processor and DSP interfacing
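The relation between the 1 Mbit codebook banks and the codebook size of up to 8,000 entries can be illustrated with a short capacity calculation. This is a sketch under two assumptions that the paper does not spell out: "1 Mbit" is read as 2**20 bits, and the application's entries are taken to be the 16 components of 8 bits each described in Section 4. The function names are hypothetical.

```python
BANK_BITS = 2 ** 20   # assumption: "1 Mbit" read as 2**20 bits
NUM_BANKS = 4         # from the text: four codebook banks

def entry_bits(dims: int, bits_per_component: int = 8) -> int:
    """Storage needed for one codebook vector."""
    return dims * bits_per_component

def max_entries_per_bank(dims: int, bits_per_component: int = 8) -> int:
    """Largest number of whole codebook vectors that fit into one bank."""
    return BANK_BITS // entry_bits(dims, bits_per_component)

def banks_needed(entries: int, dims: int, bits_per_component: int = 8) -> int:
    """Number of banks a codebook would occupy (ceiling division)."""
    total_bits = entries * entry_bits(dims, bits_per_component)
    return -(-total_bits // BANK_BITS)

print(max_entries_per_bank(16))   # 16 components of 8 bits per entry
print(max_entries_per_bank(64))   # maximum vector dimension of the VQP
print(banks_needed(8000, 16))     # the application's largest codebook
```

Read this way, one bank holds 8,192 vectors of 16 components, which is consistent with the quoted limit of about 8,000 entries; with the maximum dimension of 64, a bank would hold 2,048 vectors.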
`
`3.1. DSP Subsystems
`
Two TMS320C25-based subsystems can be employed for parallel processing. Each subsystem has a dedicated data/program memory of 56 KWords/64 KWords. A data area of 8 KWords is reserved for global communication. Each global RAM area is accessible from all processors, including the VQPs. Priority-controlled access is implemented; the master DSP has the highest priority. The serial input of each DSP can be used to build up a second communication path to other processors.
`
The highly flexible interrupt handling allows the implementation of various software architectures and communication schemes. A bus arbiter controls the access to the internal DSP bus.
`
`3.2. Vector Quantization Processor (VQP)
`
As codebook search is one of the most computationally intensive operations in speech processing, a VLSI chip using high concurrency was developed to perform this task for real-time applications /3/. The VQP implements a full-search algorithm. It supports two software-selectable distance measures for the search: Euclidean and city-block. The VQP is able to handle vectors of variable dimension and codebooks of programmable size. Each codebook vector can be extended to up to 64 dimensions of 8-bit components without noticeable loss of performance. The chip delivers the codebook entry and the distance of the best codebook candidate.

Finally, some important features of this ASIC should be mentioned:
`
• on-chip cache RAM for input vector
• vector dimensions up to 64 with word length of 8 bit
• 4 software-selectable codebook banks, each of 1 Mbit
• throughput rate of 10^7 vector components/s
• on-chip watchdog
• 16-bit parallel interface
• easy communication with the host processor
• needs two addresses in the host addressing space
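The full search that the VQP performs in hardware can be illustrated in software. The sketch below is not the chip's implementation: it is a plain sequential search that returns the index and distance of the best codebook entry for either distance measure. It uses the squared Euclidean distance, which selects the same entry as the Euclidean distance; whether the chip computes the squared form is an assumption. The search-time estimate reads the throughput figure in the list above as 10^7 vector components per second, an exponent inferred from the 10 MHz clock shown in Figure 3. All function names are illustrative.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[int]

def city_block(a: Vector, b: Vector) -> int:
    """City-block (L1) distance between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def squared_euclidean(a: Vector, b: Vector) -> int:
    """Squared Euclidean distance; ranks entries like the Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def full_search(codebook: Sequence[Vector], vector: Vector,
                distance: Callable[[Vector, Vector], int] = squared_euclidean
                ) -> Tuple[int, int]:
    """Compare the input with every entry; return (best index, best distance)."""
    best_index, best_distance = -1, None
    for index, entry in enumerate(codebook):
        d = distance(entry, vector)
        if best_distance is None or d < best_distance:
            best_index, best_distance = index, d
    return best_index, best_distance

def search_time_ms(entries: int, dims: int,
                   components_per_second: int = 10 ** 7) -> float:
    """Time for one full search at the assumed component throughput."""
    return 1000.0 * entries * dims / components_per_second

codebook = [[5, 0], [3, 3], [9, 9]]
print(full_search(codebook, [0, 0]))              # squared Euclidean
print(full_search(codebook, [0, 0], city_block))  # city-block
print(search_time_ms(4000, 16))                   # codebook used in Section 4
```

In this toy codebook the two measures pick different entries, which is why the choice is worth making selectable. For the 4,000 entries of 16 components used in the application, the assumed throughput gives about 6.4 ms per search, in line with the "less than 9 ms" reported in the performance section.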
`
`4. Real-time Acoustic Phonetic Decoding
`
Within the SPICOS system, a man-machine dialogue system allowing data queries about office activities using a vocabulary of about 1,000 words, the AkuFE is used for real-time acoustic-phonetic decoding and diphone synthesis /8/. The acoustic-phonetic decoding is based on articulatory features which describe the place (labial, dental, alveolar, etc.) and the manner (consonant, liquid, nasal, plosive, etc.) of articulation /7/. The feature vector generated from these categories is extracted every 10 ms and used by subsequent Hidden Markov Models for the recognition of subword units. It has been shown /7/ that the mapping of the speech signal onto articulatory categories has several advantages, such as the reduction of the feature set and speaker invariability. For the place of articulation, formants are required. Instead of implementing a computationally expensive root-solving algorithm to calculate the LPC poles, a faster alternative approach using vector quantization was chosen. The spectral peaks in precomputed LPC spectra are stored in a codebook and used as formant candidates.

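The idea of precomputing spectral peaks, instead of solving for the roots of the LPC polynomial at run time, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the LPC model is given directly as predictor coefficients a = [1, a1, ..., ap] of A(z), the spectrum 1/|A| is sampled on a uniform frequency grid, and local maxima are taken as formant candidates. The grid size and the two-pole test model are arbitrary choices made for the example.

```python
import cmath
import math
from typing import List, Sequence

def lpc_magnitude_spectrum(a: Sequence[float], n_points: int) -> List[float]:
    """Sample 1/|A(e^{jw})| at n_points frequencies from 0 up to (excluding) pi.

    a holds the coefficients of A(z) = sum_k a[k] * z**-k with a[0] = 1.
    """
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / n_points
        a_of_w = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        spectrum.append(1.0 / abs(a_of_w))
    return spectrum

def peak_frequencies(spectrum: Sequence[float], sample_rate_hz: float) -> List[float]:
    """Frequencies (Hz) of the local maxima, i.e. the formant candidates."""
    step_hz = (sample_rate_hz / 2.0) / len(spectrum)
    return [i * step_hz
            for i in range(1, len(spectrum) - 1)
            if spectrum[i - 1] < spectrum[i] >= spectrum[i + 1]]

# Hypothetical two-pole model with a resonance near 1 kHz at 16 kHz sampling.
r, theta = 0.95, 2.0 * math.pi * 1000.0 / 16000.0
a = [1.0, -2.0 * r * math.cos(theta), r * r]
candidates = peak_frequencies(lpc_magnitude_spectrum(a, 512), 16000.0)
print(candidates)
```

Because this peak picking would be done once per codebook entry when the codebook is built, the run-time cost reduces to the codebook search itself, which is the trade-off the paragraph above describes.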
The following processing steps are performed on the AkuFE. The master board performs the primary signal preprocessing steps, like amplification, sampling, A/D and D/A conversion with a linear resolution of 12 bits. The analogue speech signal is low-pass filtered at a corner frequency of 6.4 kHz and digitized with a sampling rate of 16 kHz. Further tasks of this board are the control of data flow over a so-called Mailbox and the computation of some basic parameters like signal energy and zero crossings.

The acoustic analysis and synthesis algorithms are implemented on the second board. In order to reach high spectral resolution and detect reliable formants in the 0-6.5 kHz region of the spectrum, a 16th-order LPC analysis is performed. The LeRoux and Gueguen algorithm is implemented in mixed 16/32-bit arithmetic /9/.

The VLSI VQ chip performs a full search over 4,000 LPC codebook entries, each represented by 16 reflection coefficients coded with 8 bits each. The codebook entries deliver formant candidates and corresponding sub-band energies for the succeeding formant tracking algorithm. Both formant tracking and articulatory analysis, with the composition of the articulatory feature vector (AFV) consisting of the occurrence probabilities of articulatory categories /7/, are performed on the host machine.

The homogeneous articulatory feature vector, which is composed of probabilities describing the manner and place of articulation, serves as an input parameter for a phoneme recognition scheme based on HMMs. Finally, the phoneme models are concatenated in order to model 400 subword units of consonant clusters and syllabic nuclei. The Viterbi algorithm to recognize these clusters is currently being implemented on an accelerator board with an ASIC. This board will be built from a high-performance 32-bit floating-point DSP and a RISC processor.

5. Performance

The processing steps, from the sampled speech data through the calculation of the reflection coefficients, require about 8 ms on a TMS320C25, leading to a frame period of 10 ms. Besides the weighting and calculation of the autocorrelation coefficients, which require about 1.5 ms, about 0.5 ms is necessary for the double-buffered processing and real-time signal acquisition. As the VQ processor can perform a full search over the total codebook in less than 9 ms, the processing can be performed in real time.

6. References

/1/ M. Brenner, H. Höge, E. Marschall and J. Romano: "Word Recognition in Continuous Speech Using a Phonological-Based Two Network Matching Parser and a Synthesis Based Prediction", Proc. of the ICASSP '89, Glasgow/U.K. (1989), pp. 457-460
/2/ H. Höge: "Verteilte Prozessarchitektur für ein sprachverstehendes Echtzeitsystem", Proc. of the DAGM-Symposium, Braunschweig/W. Germany (1987), pp. 106-110
/3/ E. Preiss, A. Stölzle, W. Drews and J. Pandel: "Architecture of a CMOS VLSI Vector Quantization Processor", Proc. of the EUSIPCO 88, Grenoble/France (1988), Vol. 3, pp. 1241-1244
/4/ T.M. Liu and H. Höge: "Phonetically Based LPC Vector Quantization of High Quality Speech", Proc. of this Conference
/5/ G. Gohn and A. Peham: "Pflichtenheft, Akustisches-Frontend", Internal Report, Siemens Wien (1986)
/6/ S. Szikora: "AkuFE-Slave, Spezifikationen", Internal Report, Siemens Wien (1988)
/7/ O. Schmidbauer: "Robust Modelling of Systematic Variabilities in Continuous Speech Incorporating Acoustic Articulatory Relations", Proc. of the ICASSP '89, Glasgow/U.K. (1989), pp. 616-619
/8/ A. Aktas and H. Höge: "Real-Time Recognition of Subword Units on a Hybrid Multi-DSP/ASIC Based Acoustic Front-End", Proc. of the ICASSP '89, Glasgow/U.K. (1989), pp. 100-104
/9/ J. Sokat: "Untersuchung der Autokorrelation-Ladder-Algorithmen zur Berechnung von Prädiktionsfilter-Koeffizienten in Integer-arithmetik", Diplom-Thesis, Uni Duisburg/W. Germany (1987)
`
`
`
