IPR2017-01244
`Saint Lawrence Communications
`Exhibit 2014
`
`REAL-TIME IMPLEMENTATION OF HIGH-QUALITY 32 KBPS WIDEBAND LD-CELP CODER
`
`Oded Gottesman
`Yair Shoham
`
`Speech Coding Research Department
`AT&T Bell Laboratories
`Murray Hill, New Jersey 07974
`
`ABSTRACT
The Wideband-Audio Low-Delay CELP (LD-CELP) coder produces speech
with quality as high as that of the CCITT 64 kb/s standard (G.722)
at half the bitrate. The computational load of the initial encoder is
almost 900% of the processor time of a 12.5 MIPS DSP32C, which makes
a real-time implementation impractical. We investigated Gain-Shape
Vector Quantization (GSVQ) in order to reduce the computational load
of the encoder. This paper describes a real-time implementation of
the LD-CELP encoder based on the AT&T SURFboard using two DSP32C
processors operating in parallel. A computational load of 180%
processor time has been achieved. The respective decoder requires
42% processor time. The implementation of a full-duplex coding
system requires three 12.5 MIPS Digital Signal Processors (DSPs) and
has a one-way coding delay of less than 1 ms. The coder also performs
well for non-speech wideband audio signals such as music.
`
`Keywords: Wideband, LD-CELP.
`
`1. INTRODUCTION
The growing pool of ISDN applications intensifies the interest in
new and more advanced coding algorithms for wideband speech [6, 7].
The major requirements expected from such coders are: a) the coded
speech quality should be comparable to that of G.722; b) the bitrate
should be at least halved; and c) the one-way coding delay should be
minimal. The 32 kb/s wideband-speech LD-CELP coder was shown to
potentially satisfy these requirements [10, 11]. However, the high
computational load of the encoder, approximately 900% of the
processor time of a 12.5 MIPS DSP [5], makes a practical
implementation of this algorithm questionable. With present DSP
technology, the use of several DSPs operating in parallel seems
unavoidable even if the algorithm is greatly simplified. We therefore
set out to implement the encoder using only two DSPs.
`
In this paper we present a real-time DSP implementation of this
coder and describe its performance. Section 2 gives a brief overview
of the initial 32 kb/s wideband-speech LD-CELP algorithm and analyzes
its computational load. Section 3 shows how we reduced the
computational load. Section 4 describes the development of the DSP
software for parallel processing, together with its processor-time
and memory usage. Section 5 discusses the results of the subjective
performance test.
`
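[Figure 1 graphic: block diagram. Labels recovered from the encoder
side: input sn, codebook (c1, c2, ..., cN), selected vector cj, gain
adaptor, excitation r̂n, synthesis filter 1/A(z), synthetic output ŝn,
summing node, weighting filter W(z). Labels recovered from the decoder
side: codebook (c1, c2, ..., cN), cj, gain adaptor, r̂n, 1/A(z), ŝn.]

Figure 1. Fully backward adaptive LD-CELP coder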
`
2. OVERVIEW OF THE 32 KB/S WIDEBAND-SPEECH LD-CELP ALGORITHM
The LD-CELP is basically a backward-adaptive version of the
conventional CELP coder [1, 8, 12]. The basic structure of the
LD-CELP [3-5, 10, 11] is illustrated in Fig. 1. The LD-CELP encoder
implements a closed-loop (analysis-by-synthesis) search procedure for
finding the best excitation cj drawn from an excitation codebook.
Each possible excitation vector is passed through the adaptive gain σ
[2, 5] and the LPC filter 1/A(z), resulting in a synthetic output
vector. The encoder selects the excitation whose synthetic output
vector ŝn is the best match to the input speech sn, usually in a
Weighted-Mean-Squared Error
`
`The European Conference on Speech Communication and Technology, EUROSPEECH, 1993 ©
`
`IPR2017-01077
`Saint Lawrence Communications
`Exhibit 2014
`
`

`

Element                         Value
Sampling rate                   16 kHz
Coded data                      7 kHz audio
Bitrate                         32 kb/s (2 bit/sample)
Vector length                   5 samples (0.3125 ms)
LPC analysis                    Backward mode
LPC synthesis filter order      32
Noise-weighting filter order    16
Spectral-tilt filter order      2
Noise-weighting filter          γz = 0.95, γp = 0.8, γt = 0.7
Quantization                    Excitation signal, 10 bit
Pitch-synthesis filter          Not used
Adaptive predictive gain        Backward mode
Table 1. The Wideband LD-CELP configuration
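
The vector duration, bits per sample, and bitrate listed in Table 1 are tied together by simple arithmetic. The short Python sketch below reproduces them from the sampling rate, the vector length, and the 10-bit excitation index; the variable names are illustrative and do not come from the paper.

```python
# Values taken from Table 1; names are illustrative.
SAMPLE_RATE_HZ = 16_000   # sampling rate
VECTOR_LENGTH = 5         # samples per excitation vector
BITS_PER_VECTOR = 10      # bits spent on each excitation vector

# Duration of one vector in milliseconds.
vector_period_ms = 1000.0 * VECTOR_LENGTH / SAMPLE_RATE_HZ
# Average number of bits spent per input sample.
bits_per_sample = BITS_PER_VECTOR / VECTOR_LENGTH
# Resulting bitrate in kb/s.
bitrate_kbps = bits_per_sample * SAMPLE_RATE_HZ / 1000.0

print(vector_period_ms)  # 0.3125
print(bits_per_sample)   # 2.0
print(bitrate_kbps)      # 32.0
```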
`
Process                        Encoder          Decoder
                               Real-Time (%)    Real-Time (%)
Convolution plus energy        429.36           0
VQ search                      341.56           0
LPC of order 32                61.67            61.67
Recursive autocorrelation      19.62            19.62
Impulse response               10.62            0
Zero input response            10.62            0
Update filters states          10.62            5.31
Pre-filter the input           5.57             0
Autocorrelation of order 3     2.37             0
LPC of order 2                 2.25             0
Weight filters                 2.05             2.05
Predictive gain update         1.66             1.66
Compute the VQ's target        0.32             0
Total Real-Time %              898.30           90.32
MIPS                           112.29           11.29

* All real-time % computations are with respect to DSP32C having 12.5 MIPS
Table 2. The computational complexity of the initial LD-CELP
coder [5].
`
Fig. 2 illustrates the complexity of the two investigated LD-CELP
encoders [5] as a function of the LPC update period. The initial
system performs an exhaustive search in a 10-bit codebook for the
best-matching shape vector cj(n), and is hence denoted Shape-VQ
(SVQ). The quantized excitation vector r(n) for the SVQ system is
given by:

\[ r(n) = \sigma\, c_j(n), \qquad 0 \le n \le N-1 \tag{4} \]

where N is the vector length and σ is the adaptive predictive gain
[2, 5] illustrated in Fig. 1. The second system performs a Gain-Shape
VQ (GSVQ) [3-5], where 7 bits are allocated to represent a shape
vector qj(n) and 3 bits are used for the gain factor gk. The
quantized excitation vector r(n) for the GSVQ system is given by:

\[ r(n) = \sigma\, g_k\, q_j(n), \qquad 0 \le n \le N-1 \tag{5} \]
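
To make the gain-shape structure concrete, the following Python sketch shows one way such a search could be organized. It is not the authors' implementation: it assumes the target is already the weighted target vector, that `h` is the truncated impulse response of the cascaded synthesis and weighting filters, and that the backward-adapted gain `sigma` is known. All function and variable names are hypothetical. The point it illustrates is that each shape vector is filtered once and the result is reused for every gain level, so a 7-bit shape codebook with a 3-bit gain codebook needs 128 filtering operations instead of the 1024 needed by a 10-bit shape codebook.

```python
def zero_state_response(h, q):
    """First len(q) samples of the convolution of q with impulse response h."""
    n = len(q)
    return [sum(h[i - m] * q[m] for m in range(i + 1)) for i in range(n)]


def gsvq_search(target, h, shapes, gains, sigma):
    """Return (shape index j, gain index k) minimizing the squared error
    between target and sigma * gains[k] * (h convolved with shapes[j])."""
    best = None
    for j, q in enumerate(shapes):
        y = zero_state_response(h, q)   # filtered once per shape vector
        corr = sum(t * v for t, v in zip(target, y))
        energy = sum(v * v for v in y)
        for k, g in enumerate(gains):
            scale = sigma * g
            # Squared error minus the constant term ||target||^2.
            cost = scale * scale * energy - 2.0 * scale * corr
            if best is None or cost < best[0]:
                best = (cost, j, k)
    return best[1], best[2]


# Toy example with a 3-entry shape codebook and a 3-entry gain codebook.
h = [1.0, 0.5, 0.25, 0.125, 0.0625]
shapes = [[1.0, 0.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0, 0.0],
          [1.0, -1.0, 0.0, 0.0, 0.0]]
gains = [0.5, 1.0, 2.0]
sigma = 2.0
target = [sigma * gains[2] * v for v in zero_state_response(h, shapes[2])]
print(gsvq_search(target, h, shapes, gains, sigma))  # (2, 2)
```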
`
`(WMSE) sense. The WMSE matching is accomplished via the use
`of a noise-weighting filter W(z). The parameter j that describes
`the selected excitation vector is then transmitted to the decoder
`where the synthesis process is duplicated.
`
`The parameters of the filters 1/A(z) and W(z) are determined
`via the LPC analysis applied to the recent past output speech in a
`backward-adaptive mode. The filter W(z) is important for
`achieving a high perceptual quality in CELP systems. The
conventional form of the noise-weighting filter Wc(z) is given by
[1, 3-5, 8, 10, 11]:

\[ W_c(z) = \frac{A(z/\gamma_z)}{A(z/\gamma_p)}
          = \frac{1 + \sum_{k=1}^{P} a_k \gamma_z^{k} z^{-k}}
                 {1 + \sum_{k=1}^{P} a_k \gamma_p^{k} z^{-k}},
   \qquad 0 < \gamma_p < \gamma_z \le 1 \tag{1} \]

`
where A(z) is the LPC polynomial. Such a filter has an inherent
limitation in concurrently modeling the formant structure and the
spectral tilt. Since at high frequencies the data is highly
unstructured and the initial unweighted SNR tends to be highly
negative [11], the noise-weighting filter is more critical in
wideband speech coding. Therefore, an enhanced form of the
noise-weighting filter [5, 10, 11] is used for the wideband-speech
LD-CELP coder, given by:
`
\[ W(z) = W_c(z)\, T(z) \tag{2} \]

where T(z) is a tilt-controlling second-order section given by:

\[ T(z) = \frac{1}{1 + \sum_{k=1}^{2} \delta_k \gamma_t^{k} z^{-k}} \tag{3} \]

where the coefficients {δk}, k = 1, 2, are computed by applying the
standard LPC procedure to the first three correlation coefficients
of the current-frame LPC coefficients {ak}, k = 0, ..., P
[5, 10, 11]. The parameter γt is used to adjust the spectral tilt of
T(z). Table 1 shows the configuration of the wideband LD-CELP coder
investigated and implemented [5].
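
The coefficient computations described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the convention A(z) = 1 + a1 z^-1 + ... + aP z^-P (a0 = 1), takes the weighted polynomials to be A(z/γ) with coefficients ak γ^k, and takes the "standard LPC procedure" to be the Levinson-Durbin recursion applied to autocorrelation lags 0 to 2 of the coefficient sequence {ak}. The function names and the example polynomial are hypothetical.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a_k multiplied by gamma**k."""
    return [ak * gamma ** k for k, ak in enumerate(a)]


def autocorrelation(x, max_lag):
    """Autocorrelation of the sequence x for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Prediction-error filter [1, c_1, ..., c_order] for autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                  # reflection coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)
    return a


# Hypothetical first-order LPC polynomial, for illustration only.
a = [1.0, -0.9]
numerator = bandwidth_expand(a, 0.95)    # gamma_z from Table 1
denominator = bandwidth_expand(a, 0.8)   # gamma_p from Table 1
# Order-2 fit to the first three correlation coefficients of {a_k}.
tilt = levinson_durbin(autocorrelation(a, 2), 2)
tilt_scaled = bandwidth_expand(tilt, 0.7)  # gamma_t from Table 1
print(numerator, denominator, tilt_scaled)
```

The same `bandwidth_expand` helper serves for all three parameters because each of them scales the k-th coefficient by the k-th power of the parameter.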
`
`3. COMPUTATIONAL LOAD REDUCTION
The computational complexity of our initial LD-CELP coder is given
in Table 2 [5]. It is measured as a percentage of the processor time
of a 12.5 MIPS DSP. The overall complexity of the encoder is
approximately 900% real-time. The most intensive task (429.36%) is
the convolution of the synthesis filters with the entire set of
excitation vectors and the computation of the energy of the
resulting vectors [5]. The second most intensive task is the VQ
search (341.56%). We chose to reduce the complexity of these two
tasks by using Gain-Shape VQ [3-5]. An additional reduction of the
algorithm complexity may be obtained by performing the LPC analysis
once every given number of vectors rather than every vector.
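
As a quick check of the figures quoted above, the Python sketch below sums the encoder column of Table 2, converts the total from percent of real time to MIPS, and computes the share of the load taken by the two most intensive tasks. The dictionary keys are the process names from Table 2; the helper name `percent_to_mips` is illustrative.

```python
DSP_MIPS = 12.5  # capacity of one DSP32C, per the note under Table 2


def percent_to_mips(percent):
    """Convert a load given in % of one 12.5 MIPS DSP into MIPS."""
    return percent * DSP_MIPS / 100.0


# Encoder column of Table 2 (% of real time).
initial_encoder = {
    "Convolution plus energy": 429.36,
    "VQ search": 341.56,
    "LPC of order 32": 61.67,
    "Recursive autocorrelation": 19.62,
    "Impulse response": 10.62,
    "Zero input response": 10.62,
    "Update filters states": 10.62,
    "Pre-filter the input": 5.57,
    "Autocorrelation of order 3": 2.37,
    "LPC of order 2": 2.25,
    "Weight filters": 2.05,
    "Predictive gain update": 1.66,
    "Compute the VQ's target": 0.32,
}

total_percent = sum(initial_encoder.values())
top_two = initial_encoder["Convolution plus energy"] + initial_encoder["VQ search"]
share_top_two = top_two / total_percent

print(f"total load: {total_percent:.1f}% of real time")
print(f"equivalent: {percent_to_mips(total_percent):.1f} MIPS")
print(f"share of the two largest tasks: {100.0 * share_top_two:.0f}%")
```

The two tasks targeted by the Gain-Shape VQ account for roughly 86% of the initial encoder load, which is why reducing them dominates the overall saving.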
`
`
`

`

the processing stream performed by each DSP in the real-time
implementation of the selected GS LD-CELP encoder. The master DSP
handles the DMA with the analog interface. As soon as a new vector
of samples is filled, the first (master) DSP starts processing the
VQ processes. In the background, the LPC processes are handled by
the second (slave) DSP. The slave DSP is synchronized to the master
DSP such that they share the VQ search. The arrows denote parameter
transfer between the two DSPs. The illustrated process repeats every
4 vectors (the LPC update period); the 4 vectors in this period are
denoted vector #1 to vector #4. Table 3 summarizes the complexity of
our real-time implemented GSVQ LD-CELP encoder [5]. Table 4
summarizes the memory usage of the implementation.
`
[Figure 4 graphic. Title: Processing Stream of 32 kb/s GSVQ LD-CELP
encoder. Rows: DSP#1 (Master) and DSP#2 (Slave), with DMA events
marked, shown for Vector #1 through Vector #4.]
`
`Figure 2. The computational complexity vs LPC update rate for
`the two LD-CELP encoders [5].
`
Fig. 3 illustrates the output SNR obtained [5] for the respective
systems. The GSVQ encoder with an LPC update once every 4 vectors,
which requires 180% real-time, was selected for real-time
implementation on two-DSP hardware. The computational load of the
respective decoder is 42% processor time; it is therefore
implemented on a third DSP [5].
`
`
`Figure 3. Output SNR vs LPC update rate for the two LD-CELP
`coders [5].
`
[Figure 4 graphic, continued. Horizontal axis: Time (100% = 312.5 µs),
ticks from 200 to 400. Rows: DSP #1 (Master) and DSP #2 (Slave).]
`
`4. SURFBOARD IMPLEMENTATION
The original LD-CELP algorithm was written in the C language. First,
we compiled and simulated the algorithm on a general-purpose
computer. Later, we used the AT&T DSP32 C Language Compiler to
compile the entire C code to DSP32 assembly code [13]. We ran this
code on the AT&T DSP32 SURFboard. The encoding algorithm was then
divided into two parts to distribute its processing over two DSPs
[5]. The first part includes the processes that are directly related
to the VQ search and are performed every vector; we call these the
VQ processes. The second part includes the processes that are
directly related to the LPC analysis and are performed once every
given number of vectors; we call these the LPC processes. We ran the
two parts on two DSPs, each running a different part of the
algorithm, in a master-slave manner. During this phase, the
interface between the two DSPs was developed. We were greatly helped
at this phase by a locally developed program called "dspx", which
handled the downloading and the I/O between the Unix environment and
the SURFboard. At this point we had completed the allocation and
scheduling of the tasks and interfaces performed by each of the
DSPs, but we still processed data files rather than real-time
sampled data. The next step was to take a conservative approach and
convert the C subroutines step by step into DSP32 assembly code
[13]. After all the C subroutines had been converted to
hand-optimized DSP32 subroutines, we wrote DSP32 assembly code to
handle the DMA for the real-time processing. Fig. 4 illustrates
`
`Figure 4. Processing Stream of 32 kb/s Wideband GSVQ LD-
`CELP encoder [5].
`
`5. RESULTS
`The performance of the 32 kb/s wideband LD-CELP was
`evaluated by comparing it to the 64 kb/s G.722 CCITT standard
`wideband coder [9]. The test material included four male and four
`female utterances. Each utterance was coded by the G.722 and by
`the real-time LD-CELP to form a pair of utterances. Twenty-four
`listeners took part in the test. Twelve of the listeners work in
`speech processing and are well acquainted with this kind of test,
`and therefore were denoted "trained" listeners. The other twelve
`listeners, who are not experienced with this kind of test, were
denoted "naive" listeners. Each listener was asked to vote for the
better-sounding utterance of each pair or, if no preference could be
made, to split the vote equally. The final scores were defined as
the percentage of the total number of votes received by each system.
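
The scoring rule described above, with a split vote counting as half a vote for each system, can be sketched in Python as follows. The function name, the vote labels, and the example ballot are hypothetical and are not data from the test.

```python
def ab_scores(votes):
    """Return (percent for A, percent for B) from a list of votes.

    Each vote is 'A', 'B', or 'tie'; a tie is split equally.
    """
    score_a = 0.0
    score_b = 0.0
    for vote in votes:
        if vote == "A":
            score_a += 1.0
        elif vote == "B":
            score_b += 1.0
        elif vote == "tie":
            score_a += 0.5
            score_b += 0.5
        else:
            raise ValueError(f"unknown vote: {vote!r}")
    total = score_a + score_b
    if total == 0:
        raise ValueError("no votes cast")
    return 100.0 * score_a / total, 100.0 * score_b / total


# Made-up ballot: one vote for A, two for B, one split vote.
print(ab_scores(["A", "B", "tie", "B"]))  # (37.5, 62.5)
```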
`
`
computational complexity at a reasonable level. The results of the
subjective A/B comparison tests indicate that the reproduced-speech
quality of the 32 kb/s GS LD-CELP is comparable to that of the
64 kb/s ADPCM standard. The two major advantages of our real-time
GS LD-CELP coder over the ADPCM standard are: a) it operates at half
the bitrate; and b) it has an extremely low one-way delay of less
than 0.94 ms, compared to about 1.5 ms for the ADPCM standard. The
real-time coder presented in this work can be an excellent candidate
for wideband audio coding in high-quality communication networks.
`
`REFERENCES
[1] B. S. Atal, M. R. Schroeder, "Code Excited Linear Predictive
(CELP): High Quality Speech at Very Low Bit Rates", Proc. IEEE Int.
Conf. ASSP, 1985, pp. 937-940.
[2] J. H. Chen and Allen Gersho, "Gain-Adaptive Vector Quantization
with Application to Speech Coding", IEEE Trans. Comm., Vol. 35
No. 9, September 1987, pp. 918-930.
[3] J. H. Chen, "A Robust Low-Delay CELP Speech Coder at 16 kb/s",
Proc. GLOBECOM-89, Vol. 2, November 1989, pp. 1237-1240.
[4] J. H. Chen et al., "Low-delay CELP coder for the CCITT 16 kb/s
speech coding standard", IEEE SAC, vol. 10 no. 5, June 1992,
pp. 830-849.
[5] O. Gottesman, "Algorithm Development and Real-Time
Implementation of High-Quality 32kbps Wideband Speech LD-CELP
Coder", MS Thesis, ECE Dept., Drexel University, January 1993.
[6] N. S. Jayant et al., "Coding of Wideband Speech", Proc. 2nd
Europ. Conf. Speech. Commun. Technol., Sept 1991.
[7] N. S. Jayant, "Signal Compression: Technology Targets and
Research Directions", IEEE SAC, vol. 10 no. 5, June 1992,
pp. 796-818.
[8] Peter Kroon and Ed. F. Deprettere, "A Class of
Analysis-by-Synthesis Predictive Coder for High Quality Speech
Coding at Rates Between 4.8 and 16 kbits/s", IEEE J. SAC, vol. 6
No. 2, February 1988, pp. 353-363.
[9] P. Mermelstein, "G.722, A New CCITT Coding Standard for Digital
Transmission of Wideband Audio Signals", IEEE Communications
Magazine, January 1988, pp. 8-15.
[10] E. Ordentlich, "Low Delay Code Excited Linear Predictive
(LD-CELP) Coding of Wide Band Speech at 32Kbit/sec", MS Thesis,
EE Dept., MIT, March 1990.
[11] E. Ordentlich, Y. Shoham, "Low-delay code-excited
linear-predictive coding of wideband speech at 32 kbps", Proc.
ICASSP-91, pp. 9-12.
[12] Y. Shoham, "Constrained-Stochastic-Excitation Coding of Speech
at 4.8 Kb/s", In B. S. Atal et al., editor, Advances in Speech
Coding, Kluwer Academic Publishers, 1990, pp. 339-348.
[13] "WE® DSP32C Digital Signal Processor - Information Manual",
AT&T, January 1990.
`
Process                        Encoder          Decoder
                               Real-Time (%)    Real-Time (%)
Convolution plus energy        13.42            0
VQ search                      88.70            0
LPC of order 32                15.42            15.42
Recursive autocorrelation      18.83            18.83
Impulse response               2.66             0
Zero input response            10.62            0
Update filters states          10.62            5.31
Pre-filter the input           5.57             0
Autocorrelation of order 3     0.59             0
LPC of order 2                 0.56             0
Weight filters                 1.02             0.51
Predictive gain update         1.66             1.66
Compute the VQ's target        0.32             0
DSP interface                  10.03            0
Total Real-Time %              180.03           41.73
MIPS                           22.50            5.22

* All real-time % computations are with respect to DSP32C having 12.5 MIPS
Table 3. The computational complexity of the real-time GSVQ LD-CELP
coder [5].
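
Comparing Table 3 with Table 2, several LPC-related rows of the encoder column are close to one quarter of their initial values, which is what one would expect if those processes run once per 4-vector LPC update period instead of once per vector. This is our reading of the numbers rather than a statement made in the paper. The Python sketch below checks it for four rows; the dictionary and constant names are illustrative.

```python
UPDATE_PERIOD = 4  # LPC update period in vectors

# Encoder loads in % of real time, copied from Table 2 and Table 3.
initial = {
    "LPC of order 32": 61.67,
    "Impulse response": 10.62,
    "Autocorrelation of order 3": 2.37,
    "LPC of order 2": 2.25,
}
realtime = {
    "LPC of order 32": 15.42,
    "Impulse response": 2.66,
    "Autocorrelation of order 3": 0.59,
    "LPC of order 2": 0.56,
}

# Load expected when the per-update cost is spread over the period.
amortized = {name: cost / UPDATE_PERIOD for name, cost in initial.items()}

for name in initial:
    print(f"{name}: {amortized[name]:.4f} expected, {realtime[name]:.2f} in Table 3")
```

Rows such as the recursive autocorrelation (19.62% against 18.83%) do not follow this pattern, so the division by the update period should not be assumed for every process.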
`
Block \ System    Encoder DSP#1    Encoder DSP#2    Decoder
Program           4476             3460             2402
Data              7556             7172             6940
Total             12032            10632            9342
Table 4. Memory usage of the Wideband LD-CELP (in bytes) [5]
`
Type of input               32 kb/s GSVQ       64 kb/s ADPCM
                            LD-CELP (%)        (G.722) (%)
Total Score                 45.57              54.43
Trained Listeners Score     41.15              58.85
Naive Listeners Score       50.00              50.00
Male's utterances only      46.61              53.39
Female's utterances only    44.53              55.47
Table 5. A/B-test results for 32 kb/s GS LD-CELP vs 64 kb/s
ADPCM (G.722) [5].
`
The experimental results of the subjective test are summarized in
Table 5 [5]. The total results indicate that, on average, our
real-time 32 kb/s coder and the 64 kb/s ADPCM standard, which
operates at twice the bit rate, provide comparable speech quality.
Among naive listeners, the two systems performed alike on average.
We may, therefore, be able to halve the bitrate while preserving the
high quality of the reproduced speech. Another observation is that
LD-CELP does better on male than on female utterances.
`
6. CONCLUSIONS
`The main conclusion of this work is that high-quality coding
`of wideband audio at 32 kb/s is feasible while keeping the
`