`
`RPCELP: A HIGH QUALITY AND LOW COMPLEXITY
`SCHEME FOR NARROW BAND CODING OF SPEECH
`
`PI, LEVER and M. DELPRAT
`
`MAT R A- C om m un i cat i on
`Rue J.P. Timbaud, B.P. 26
`78392 Bois d'Arcy Cedex
`FRANCE
`
`The f i r s t section describes the basic CELP
`scheme and present a new strategy for complexity
`reduction, based on
`the choice of a convenient
`perceptual weighting f i l t e r Section I I compares
`several types of excitation and a regular pulse
`codebook i s shown to be a good choice The RPCELP
` and performances
`algorithm i s detailed i n section I
`l
`l
`Are reported for various b i t rates Finally the paper
`considers the design of a RPCELP coder i n terms of
`quantization issues, real time implementation and
`protect ion against transmissior errors
`
`I. CELP coding and excitation estimation
`
`I .
`
` 1 ,
`
` CELP coding
`In CELP coding, the speech spectruni i s modeled
`by a time-varying linear prediction f i l t e r and the
`residual signal i s vector quantized using a codebook
`of waveforms (figure 1). Assuming the f i l t e r s A(z)
`and B(z) have been computed and quantized,
`the
`coding operation consists of determining
`the
`optimum innovation sequence (codeword ck and gain
`for each block of
`Gk)
`the
`frame
`through an
`analysis-by-synthesis procedure. for each codeword
`ck, the resulting synthetic signal i s compared to the
`original one and the difference signal i s processed
`through the perceptual weighting f i l t e r W(z), which
`i s expressed as W(z)=A(z)/A(z/y), w i t h y around 0.8
`The codeword that minimizes the weighted error
`signal energy is then selected for the current block.
`The basic CELP coder described above can
`easily be transformed i n a more convenient (but
`i s moved
`equivalent) structure [4,5,61. First W(z)
`from toe output of the error subtraction operation to
`both of i t s input branches. Then the contribution of
`OriPiilal
`sholt term 'peech
`predictor
`
`long term
`predictor
`
`~~
`
`Abstract
`
`i s a
`Code-Excited Linear Prediction (CELP)
`powerful technique for low b i t rate speech coding
`but the basic scheme leads t o a huge computational
`load The paper introdukes a related scheme that
`replaces the conventional stocnastic excitation bv
`an efficient Regular Pulse excitation This new
`method, called Recular Pulse CELP (RPCELP). takes
`advantage of the codebook structure together w i t h
`the use of a convenient Derceptual c r i t e r i o n to
`achieve 3 very low comple;gi ty while maintaining
`high quali ty output speech O b J W t i P Derformance:
`3re reported for several configurations
`
`Introduction
`
`Low b i t rate coding techniques are of major
`interest for narrow band speech transmissions such
`as i n mobile radio applicat!ons. For instance, the
`available b i t rate for a digital transmission over a
`12.5 KHm channel
`i s around 8 Kbps. Because of
`transmission errors, a part of this rate i s allocated
`t o an error correcting code, so the speech signal
`must actually be coded w i t h at most 6 Kbps. At such
`a low b i t rate, Some existing coders achieve good
`results [ 11, but the reconstructed speech quality
`must be improved for general public applications.
`Code-Excited Linear Prediction (CELP) i s a very
`attractive approach for low b i t rate coding of speech
`121. For each block of samples, an
`innovation
`sequence IS picked up in a codebook of waveforms
`and processed through a synthesis
`f i l t e r which
`exhibits both a short term predictor (LP f i l t e r ) and a
`long
`term
`(pitch) predictor. The
`innovation
`reiequences are Optimally selected
`through an
`analysis-by-synthesis
`procedure according
`to a
`given perceptual criterion. The reconstructed speech
`quality is quite good, but the basic scheme leads to a
`huge computational load.
`Thr paper presents a new CELP scheme called
`Regular Pulse Code-Excited
`Linear Prediction
`innovation sequence
`!R,PCELFj because each
`i s
`constituted of equidistant pulses separated by zeros.
`A very fast search proceaure in a "binary" codebook
`15< e:xposed. that takes advantage ( i n a similar way as
`irl RPE coders [3]) of the regular pulse structure w i t h
`d suitable choice of the perceptual weighting filter.
`
`Codebook I--..
`
`Figure 1 . The basic CELP analysis scheme
`88CH2607-0 C 1988 IEEE
`
`24
`
`Ex. 1041 / Page 1 of 4
`Apple v. Saint Lawrence
`
`
`
`the memory i n both long term predictor 1/B(z) and
`i s
`1 /A(z/y)
`weighted
`short
`term predictor
`subtracted t o the weighted original signal, yielding
`prior t o the start of a codebook search So
`signal x,,
`during the search procedure, the codeword ck i s only
`the memoryless weighted synthesis
`iirocessed by
`l/&z/y),
`yielding
`signal
`zk(n) This
`r i l t e r
`memoryless filtering can be expressed w i t h a
`convolution of t w o finite sequences and represented
`as a matrix-vector product
`zk = Gk H ck
`where H i s a L K L
`elernents dre from
`1 / A ( z/y,
`
`( 1 )
`1 ower tr i angu I ar ma tr i x whose
`I-eSponse hii) of
`the in.lpulse
`
`h(0)
`
`(1
`
`. . . . . . . . .
`
`i s w r i t t e n as
`
`In the Same way, the vector x=(x,,)
`- zo
`( 5 )
`fi = H r +
`where r i s the residual signal With the effects of
`term Predictor subtracted, xo and zo
`the
`long
`represent the contribution of the f i l t e r memory i n
`the computatiort of x and zk respectively Now the
`weighted error f o r the kth codeword i s expressed as
`E(k) = II k - Zk II‘ = II X - Gk H Ck 112
`(4)
`The search procedure t o determine the optimum
`i s then derived from
`innovation sequence (Ck,Gk)
`equation (4)
`1 ) Find the index k o which maximizes the
`we i gh ted i m e r product P,Jk 1
`= ( X t HCk) / 11 HCk 11
`P,,(k)
`2) Compute the related gain 6 ,
`
`(5)
`
`(6)
`This inner product formulation i s faster than
`the previous eucl i dean di stance formulation, but the
`codebook sear Ch cornplexi t y remains very high
`
`1,. 1.2.
`time invariant. Such a f i l t e r has already been csed i n
`multipulse coding of speech [31 and has probed i t s
`remarkable ability t o provide almost equivalent
`subjective results.
`the comparative objective
`shows
`Table
`1
`results we obtained i n introducing a fixed weighted
`synthesis f i l t e r i n a CELP coding scheme involving
`different codebooks. In spite of a relatively lower
`SNR, the perceived lost i n quality due t o a fixed
`f i l t e r i s quite low and acceptable w i t h regard t o the
`enabled simplifications i n the search pro( edure.
`Indeed,
`this procedure does not
`involvt? any
`repetitive filtering operation anymore sin:e all
`excitation sequences may be pre-filtered and :#tored.
`
`II. Design o f the excitation codebooc
`
`I I 1 Binarv versus 4tochastic e x c i t a t i o n
`In the initial CELP scheme [?I,
`the innodation
`codebook i s populated w i t h 1 1 d Gaussian szmples
`(stochastic coding) Recent works have show I that
`better
`performances
`can
`be
`achieved w i t h
`statistical codebooks [6] However stochas IC
`or
`statistical codebooks are essentially not stru:tured
`and the codebook structure i s one of the kfnys t o
`complexity reduction
`As pointed out i n [71, sequences of + 1 m d - 1
`are just as good as stochastic sequences c o n c m i n g
`vector quantization performances at IOW b i : rate
`( 1 12 b i t per sample or below) Algebraic stru1:tures
`from Code Theory provide efficient binary codebooks
`The related fast algorithms are used t o speed JP the
`search procedure and the codebook does not h3ve to
`be stored anymore I t has been shown i n [41 that such
`codebooks are quite effective even
`for small
`dimensions Table 1 shows that a binary cociebook
`derived
`from
`the Reed-Muller
`code
`and
`complemented
`w i t h
`single-pulse
`sequences
`compares favourably w i t h a stochastic codebiok at
`rate 3/8 b i t per sample i n dimension 16
`
`1::
`
`II 2 Reaular Excitation structure
`The good results achieved w i t h codebooks
`I I 1 and [63) suggest
`including pulse sequences (cf
`Percentual weighting f i l t e r
`the use of excitation sequences of length L having a
`The major drawback of the basic CELP scheme
`regular structure consisting i n q equidistant Iulses
`described above i s the huge amount of computations
`separated by D-1 zeros, the f i r s t pulse ( i n i t i a l phase
`involved i n the filteririg ot all the codewords by the
`p’i being at one o f the locations 0 t o D-1 This
`1 /A(z/y) Recent
`time varying synthesis
`f i l t e r
`approach attempts t o better represent the phase
`research has focused on thls conlplexfty issue and
`information i n the excitation signal
`several strategies have been proposed [4,5,6]
`The Reguiar Pulse (RP-) codebook, populated i n
`The complex1 t y considerably decreases when
`a statistical or stochastic manner, n7Ely he
`the weighted synthests f l l t e r i s flxed This 1s the
`constituted of K independent sequences or of the D
`caqe i f the perceptual weighting f i l t e r has the form
`possible shifts of a basic set of K I D sequence: made
`of RP-squences w i t h initial phase zero. In the latter
`W’(z) = A(z) / C(z/Y)
`( 7 )
`case, each codeword ck i s expressed as
`where 1/C(z) i s an average iow order linear short
`term speech predictor The weighted synthesis f i l t e r
`Ck = Ap d,
`( 8 )
`i s then modified i n l/C(z/y) whose coefficients are
`i s a L x q decimation matrix, function of
`where A,,
`25
`
`Ex. 1041 / Page 2 of 4
`
`
`
`I I I 2 RPCELP algorithm and Derformances
`The error E(k) i n the codebook search (see II 1 )
`i s traditionally minimized over the block (0,l L-1)
`An other approach consists of minimizing E(k) w i t h
`longer sequences x, and zk(n), say L+J samples, such
`that the impulse response of the weighted synthesis
`f l l t e r W(z)/A(z) 1s practically zero after J samples
`Signals r, and ck(n) are set to zero for n 2 L and the
`impulse response matrix H becomes a
`(L+J) x L
`Toeplitz matrix
`h(0)
`0
`0
`hi11 h(0)
`0
`h(2) h(1) h!O)
`
`H =
`
`0
`
`h(J- 1 ) h(J-2)
`
`1
`
`i s a q-dimensional vector
`
`D. 1.2.
`the i n i t i a l phase p and d,
`w i t h I., = p (WD) + m
`The RP-structure can obviously be exploited to
`reduce both cornputationdl
`load and Storing
`F U r t h e r m or e,
`the
`cons i de I- i n g
`r e q u i rem en t s
`efficiency of binary vectors at low b i t rates (see
`/ I I ) a 'binary' RP-codebook i s of great interest It
`i s wilt from the 24 binary words of length q (0
`reports the excellent results
`becoming -
`'1 Table I
`I
`
`provided bv cuch 3 RP-ercitation codebqok, which
`appears to be the best or all tested codebooks
`
`Table I Performances (average SI;IRseg i n dB) of
`various codebooks w i t h variable/f ixed weighted
`svnthesis f i l t e r for t w o sDeakers
`
`I stochastic
`
`I 8 5 7 I 6 8 6 I 10621 8 1 7 I
`
`I I I. Regular Pulse Code-Excited
`Linear Predict ion (RPCELP)
`
`I l l 1 Base Band CELE
`Regular pulse sequences as defined i n the last
`section resemble upsampled versions ( i n a D ratio)
`of vectors of length q=L/D The same observation has
`already been made in RPE speech coding [31, derived
`from Plulti-Pulse Linear Predictive Coding RPE
`coding combines the use of RP excitation and of a
`suitable weighting f i l t e r i n an efficient algorithm
`that niav be considered as a generalization of Base
`Band sfleech codinq w i t h spectral folding as high
`freqiienc-v regeneration technirlue A low compleicity
`1 high qiuality RPE coder at 1 7 t bpc has recently
`been chocm 3s a standard for
`the Pan European
`mobile radio system [8] The optimal evcitation
`seauence 15 here determined by down sampling the
`LP r e f i l u a l w i t h a choice of the best decimation
`grid,
`involving a fixed smoothing f i l t e r The RP
`e\.c]tation may be vector quantized to achieve lower
`b i t rates [91 Thougn, vector quantizing once the
`excitation n3s been determined does not lead t o an
`opt irndl excitation sequence
`in a CELP coding
`lntroduclng a RP codebook
`dlgorithrn i s a different approach yielding an optimal
`choice of
`the excitation sequence This section
`show? that similar transformations as i n RPE can be
`p w f o r m e d In RPCELP analysis, providing a very fast
`SP3rch procedure RPCELP can be considered as a RPE
`teChnlque
`i n Which
`the pulse amplitudes are
`optimallv vector quantized and i n that sense may be
`viewed as a Base Band CELP coding technique
`
`26
`
`h(J-
`0
`h ( l ) 1
`)
`" aut oc or r e 1 at i on" m et hod, the "memory
`In this
`) i n equation (3) i s minimal and w i l l be
`error" xo - i
`approxirnated to zero Thus, substituting equation ( 3 )
`i n equation (5) gives
`Purk) = (rtH'HcL) / II HCk II
`(10)
`The matrix R = H t H i s a L x L syrnetrical
`Toeplitz matrix built on the autocorrelations R(i) of
`the lmpulse response h(n) Then the vector yt = r t Ht H
`i s efficiently computed (once per block) as the
`result of a filtering operation (smoother, [31) So the
`codewords are now filtered only once per frame to
`11, which
`compute the weighting coefficients II Hc,
`results i n a low complexity algorithm
`In the case of a Regular Pulse excitation, the
`codebook structure can be exploited t o speed up even
`more the search procedure As a matter of fact, it
`comes froin equation ( 8 )
`II H Ck 112 = ckt H' H C C = dmt R, d,,
`( 1 1 )
`Where
`RD = Apt Ht HA,,
`12)
`i s d q x q symetrical Toeplitz matrix whose it1'
`diagonal
`term
`i s R ( ( i - I ) D ) Note
`that R,
`15
`independent of
`the phase p as a result of
`the
`sutocorrelation method defined above
`Moreover, RD can be forced to a diagonal matrix,
`using a reasonable approximation on the weighted
`Synthesis f i l t e r i t s lmpulse response IS shortened
`for n 2 D In the case of a
`i n order that h(n) = 0
`fixed weighted synthesis f i l t e r (see I I 31, It can
`merely be designed such that RtiD) = 0 f o r
`1 > 0
`the normalization by R(O), the matrix RD
`With
`becomes the identity matrix and equation ( I 1 ) gives
`112
`11 HCk 112 = 11 d,
`(13)
`Assuming the codewords are normalized, the
`search procedure comes down to maximize the inner
`product P(k) = y t c k , which represents a small
`amount of computations ( a l l the more because the
`codewords are sparse) The RPCELP algorithm i s
`illustrated i n figure 2
`
`Ex. 1041 / Page 3 of 4
`
`
`
`speech -
`
`Original s n
`
`I !
`
`4 SNRseg
`
`D, 1.2.
`
`....... ..........................
`
`. !
`
`: y\. ' ,j
`
`orediction
`..__...._._..........__...........
`. - I - -
`
`A
`
`roducts
`
`-
`
`Codeword
`selection
`t
`Figure 2 Block diagram of the RPCELP coder
`the
`Besides, w i t h a binary RP codebook
`optimum codeword i s eft iciently determined in a
`two-steps procedure
`1 ) Find the phase po which maximizes M(p) w i t h
`M(p) = &ly(p+iD)I, sum over i=O,
`,q-1
`(14)
`2) Choose the vector dmo such that yt cko= M(po).
`d,
`,q-1
`, for 1 =O,
`(1) = sign of y(p,+iD)
`The related gain i s then given by
`Gko= M(Po)/q ,
`since I/ Hck I I ~ = q for any K
`The above procedure involves a very small
`amount of computations around 20 instructlons per
`sample, whtch 1s 2000 tlmes less than In the basic
`scheme [51 Moreover, this very low Complexity 1s
`independent of the block length L
`Figure 3 shows the performances obtained w i t h
`d binary RP codebook, a decimation factor D=4 and
`various block lengths L=8,16,20,40 I t can be seen
`that t o r L=16 ( 3 / 8 bit per sample) these results are
`very close t o those preseinted In table 1 w i t h a RP
`In
`codebook and a fixed weighted synthesis f i l t e r
`fact, the approximations introduced in the design of
`the
`fast RPCELP algorithm do not produce any
`not iceable degradation
`
`I I I 3 D+sian of a RPCEI P i ~ d + l -
`The great advantage of the RPCELP technique i s
`i t s very low complexity The whole coding/decoding
`I nc 1 ud 1 ng protect 1 on again5 t transm is5 ion
`scheme,
`errors, can easily be real time implemented on any
`commercially available DSP chip
`With a binary RP codebook, the b i t r a t e R o f the
`excitation is expressed as
`R = (L/D + log2D)/ L b i t per sample
`( 15)
`The values 0-4 and L=20 provide a suitable b i t rate /
`quality
`trade o f f
`In
`this case R=7/20, or
`equivalently 2800 bps w i t h an 8 KHZ sampling
`frequency, so the global rate may be around 6 Kbps
`or even lower if the LP f i l t e r s are vector quantized
`the binary
`Finally, i t should be noted that
`regular pulse excitation
`i s theoretically robust
`against transmission errors As a matter of fact, the
`codebook can be effIcier\tly indexed so that each
`i s directly deduced from the binary
`codeword ck
`decomposi tlon of 1 ts lndex k [ 1 @I Then one error on a
`
`5
`
`I
`
`,
`
`I
`
`
`
`Bit/sarnple
`f
`-
`+
`
`1 / 2
`7/20 318
`3/10
`Figure 3: performances of the RPCELP coder f o r 2
`speakers, as a function of the excltatlon b l t r a :e.
`given b i t of k (except for the phase information)
`produces only a single wrong pulse in the innodation
`sequence.
`
`Concl us1 on
`In this paper, we have formulated dif"erent
`ways of reducing the computational load of CELP
`coders without
`significantly
`degrading
`the
`reconstructed speech quality.
`Furthermore, a very low complexity Rase Band
`CELP algorithm, called RPCELP, nas been presented.
`A suitable perceptual weighting f i l t e r t q e t h e i - w i t h
`3 binary regular pulse couebook enzlr~~i.:~ 2 v e r j fast
`search procedure, which involves only 20 multiply-
`adds per sample instead of 40000 for t h e basic CELP
`scheme. This algorithm can easily be real time
`implemented and may provide a good iornmunication
`quality a t 6 KbpS, even over noisy t t " m s i o n
`channels.
`
`References
`[ 11 I. Lecomte, M. Lever, L. Lelievre, M. Delprat, A. Tassy,
`"Medium Band Speech Coding ior Mobile Radio Communications", in
`Proc. ICASSP, Apr. 1988,
`[21 P1.R. Schroeder and 6.S Atal, "Code-excited linear prediction
`(CELP): high-quality speech at very low bit rates", ill Pr-oc.
`ICASSP, Mar 1985
`131 P. Kroon, E. F. Deprettere, R. J. Sluyter, "Regular Pulse
`Excitation: a Novel Approach to Effective and Efficient Multi-Pulse
`Coding of Speech", IEEE Trans. on A S P . Vol. ASSP-34, Oct. 986.
`[41 J.P. Adoul, P. Mabilleau, M. Delprat and S Morissettc!, "Fast
`CELP coding based on algebrarc Codes", i n Proc. ICASSP, Apr. 1 987.
`[SI I.M. Trancoso and B S. Atal, "Efficient procedures for finding
`the optimum innovation iri stochastic coders", i n Proc. IcAS:P, Apr.
`1980.
`I61 G. Davldson, M. Yong and A. Gersho,
`"Real-time vector
`excitation coding of speech at 4 8 0 0 bps", i n Proc. ICASS), Apr.
`1987.
`[71 J.P. Adoul, C. Lamblin, "A comparison of some alpbraic
`structures for CELP coding of speech". i n Proc. ICASSP, Apr. 1987.
`[81 P. Vary, R.J. Sluyter, C. Galand, M. Rosso, "RPE-LPC Cokc: the
`candidate for the GSM radio communication system", Int. Conf. on
`Digital Land Mobile Radto Communicatton, Venice, Jul. 1 9 8 7
`[91 P. Kroon and E.F. Deprettere, "Quantization Procedu-es for
`Regular-Pulse Excitation Coders", i n Proc. 3rd European Signal
`Processing Conference (EUSIPCO-86), Sep. 1986.
`[ 101 A. Le Guyader, P. Combescure, C. Lamblin, M. Mouly i n d J.F.
`Zurcher, "A robust 16 Kbitsls vector adaptive predictive coder for
`mobile communications", i n Proc. ICASSP, Apr. 1986.
`
`Ex. 1041 / Page 4 of 4
`
`