`DIGITAL PROCESSING
`
`OF
`
`SPEECH SIGNALS
`
`Lawrence R. Rabiner
`
`Ronald W. Schafer
`
`Acoustics Research Laboratory
`Bell Telephone Laboratories
`Murray Hill, New Jersey
`
School of Electrical Engineering
`Georgia Institute of Technology
`Atlanta, Georgia
`
`PRENTICE-HALL SIGNAL PROCESSING SERIES
`
`Alan V. Oppenheim, Editor
`
`
`ANDREWS and HUNT Digital Image Restoration
`BRIGHAM The Fast Fourier Transform
`BURDIC Underwater Acoustic System Analysis
`CASTLEMAN Digital Image Processing
`CROCHIERE and RABINER Multirate Digital Signal Processing
`DUDGEON and MERSEREAU Multidimensional Digital Signal Processing
`HAMMING Digital Filters, 2e
`HAYKIN, ED. Array Signal Processing
`LEA, ED. Trends in Speech Recognition
LIM, ED. Speech Enhancement
`MCCLELLAN and RADER Number Theory in Digital Signal Processing
`OPPENHEIM, ED. Applications of Digital Signal Processing
`OPPENHEIM, WILLSKY, with YOUNG Signals and Systems
`OPPENHEIM and SCHAFER Digital Signal Processing
`RABINER and GOLD Theory and Applications of Digital Signal Processing
`RABINER and SCHAFER Digital Processing of Speech Signals
`ROBINSON and TREITEL Geophysical Signal Analysis
`TRIBOLET Seismic Applications of Homomorphic Signal Processing
`
`
`
Library of Congress Cataloging in Publication Data

Rabiner, Lawrence R.
Digital processing of speech signals.
(Prentice-Hall signal processing series)
Includes bibliographies and index.
1. Speech processing systems. 2. Digital electronics.
I. Schafer, Ronald W., joint author. II. Title.
ISBN 0-13-213603-1
`
`© 1978 by Bell Laboratories, Incorporated
`
`All rights reserved. No part of this book
`may be reproduced in any form or
`by any means without permission in writing
`from the publisher and Bell Laboratories.
`
`Printed in the United States of America
`
17 16 15 14

PRENTICE-HALL INTERNATIONAL, INC., London
`
To our parents,
Dr. and Mrs. Nathan Rabiner

and
Mr. and Mrs. William Schafer,
`for instilling within us the thirst for knowledge
`and the quest for excellence;
`
`and to our families,
`Suzanne, Sheri, and Wendi Rabiner
`
`and
`Dorothy, Bill, John, and Kate Schafer,
`
`for their love, encouragement, and support.
`
`8
`
`Linear Predictive Coding
`of Speech
`
`One of the most powerful speech analysis techniques is the method of linear
`predictive analysis. This method has become the predominant technique for
`estimating the basic speech parameters, e.g., pitch, formants, spectra, vocal
`tract area functions, and for representing speech for low bit rate transmission or
`storage. The importance of this method lies both in its ability to provide
extremely accurate estimates of the speech parameters, and in its relative speed
of computation. In this chapter, we present a formulation of the ideas behind
`linear prediction, and discuss some of the issues which are involved in using it
`in practical speech applications.
`The basic idea behind linear predictive analysis is that a speech sample can
`be approximated as a linear combination of past speech samples. By minimiz-
`ing the sum of the squared differences (over a finite interval) between the
actual speech samples and the linearly predicted ones, a unique set of predictor
coefficients can be determined.
`Linear predictive techniques have already been discussed in the context of
`the waveform quantization methods of Chapter 5. There it was suggested that
`a linear predictor could be applied in a differential quantization scheme to
`reduce the bit rate of the digital representation of the speech waveform.
In fact, the mathematical basis for an adaptive high order predictor used for
DPCM waveform coding is identical to the analysis that we shall present in this
chapter. In adaptive DPCM coding the emphasis is on finding a predictor that
will reduce the variance of the difference signal so that quantization error can
also be reduced. In this chapter we take a more general viewpoint and show
how the basic linear prediction idea leads to a set of analysis techniques that can
be used to estimate parameters of a speech model. This general set of linear
predictive analysis techniques is often referred to as linear predictive coding or
LPC.
`
`The techniques and methods of linear prediction have been available in
`the engineering literature for a long time. The ideas of linear prediction have
been in use in the areas of control and information theory under the names of
`system estimation and system identification. The term system identification is
`particularly descriptive of LPC methods in that once the predictor coefficients
`have been obtained, the system has been uniquely identified to the extent that
`it can be modelled as an all-pole linear system.
`
As applied to speech processing, the term linear prediction refers to a
`variety of essentially equivalent formulations of the problem of modelling the
`speech waveform [1-18]. The differences among these formulations are often
`those of philosophy or way of viewing the problem.
`In other cases the
`differences concern the details of the computations used to obtain the predictor
`coefficients. Thus as applied to speech, the various (often equivalent) formula-
`tions of linear prediction analysis have been:
`
1. the covariance method [3]
2. the autocorrelation formulation [1,2,9]
3. the lattice method [11,12]
4. the inverse filter formulation [1]
5. the spectral estimation formulation [12]
6. the maximum likelihood formulation [4,6]
7. the inner product formulation [1]
`
`In this chapter we will examine in detail the similarities and differences among
`
`
`
`
`
[Figure: an impulse train generator (controlled by the pitch period) and a random noise generator feed a voiced/unvoiced switch; the selected excitation u(n) drives a time-varying digital filter (controlled by the vocal tract parameters) whose output is the speech signal s(n).]

Fig. 8.1 Block diagram of simplified model for speech production.
`
`8.1 Basic Principles of Linear Predictive Analysis
`
`Throughout this book we have repeatedly referred to the basic discrete-time
`model for speech production that was developed in Chapter 3. The particular
`form of this model that is appropriate for the discussion of linear predictive
`analysis is depicted in Fig. 8.1.
`In this case, the composite spectrum effects of
`radiation, vocal tract, and glottal excitation are represented by a time-varying
`digital filter whose steady-state system function is of the form
`
H(z) = \frac{S(z)}{U(z)} = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}    (8.1)
`
`This system is excited by an impulse train for voiced speech or a random noise
`sequence for unvoiced speech. Thus,
`the parameters of this model are:
`voiced/unvoiced classification, pitch period for voiced speech, gain parameter
`G, and the coefficients {ak} of the digital filter. These parameters, of course, all
`vary slowly with time.
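As a rough illustration of this model (not a procedure from the text), the following Python sketch drives an all-pole filter of the form of Eq. (8.1) with either an impulse train or white noise. The sampling rate, pitch period, gain, and pole locations are arbitrary placeholder values, and scipy.signal.lfilter is assumed to be available to realize the difference equation.

import numpy as np
from scipy.signal import lfilter  # assumed available

fs = 8000            # sampling rate in Hz (illustrative)
pitch_period = 80    # pitch period in samples (voiced case)
G = 0.1              # gain parameter

def excitation(n_samples, voiced):
    # Impulse train generator for voiced speech, random noise generator otherwise
    if voiced:
        u = np.zeros(n_samples)
        u[::pitch_period] = 1.0
        return u
    return np.random.randn(n_samples)

# Place two resonances inside the unit circle so the all-pole filter is stable
poles = [0.97 * np.exp(1j * 2 * np.pi * 500 / fs), 0.95 * np.exp(1j * 2 * np.pi * 1500 / fs)]
poles = poles + [p.conjugate() for p in poles]
den = np.real(np.poly(poles))          # 1 - a_1 z^{-1} - ... - a_p z^{-p}

u = excitation(400, voiced=True)
s = lfilter([G], den, u)               # s(n) produced by H(z) = G / (1 - sum_k a_k z^{-k})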
`
`The pitch period and voiced/ unvoiced classification can be estimated using
`one of the many methods already discussed in this book or by methods based
`on linear predictive analysis to be discussed later in this chapter. As discussed
`in Chapter 3, this simplified all-pole model is a natural representation of non-
`nasal voiced sounds, but for nasals and fricative sounds, the detailed acoustic
`theory calls for both poles and zeros in the vocal tract transfer function. We
`
A linear predictor with prediction coefficients, α_k, is defined as a system whose
output is

\tilde{s}(n) = \sum_{k=1}^{p} \alpha_k s(n-k)    (8.3)
`
`Such systems were used in Chapter 5 to reduce the variance of the difference
signal in differential quantization schemes. The system function of a p-th order
linear predictor is the polynomial

P(z) = \sum_{k=1}^{p} \alpha_k z^{-k}    (8.4)

The prediction error, e(n), is defined as

e(n) = s(n) - \tilde{s}(n) = s(n) - \sum_{k=1}^{p} \alpha_k s(n-k)    (8.5)
`
`From Eq. (8.5) it can be seen that the prediction error sequence is the output
`of a system whose transfer function is
`
A(z) = 1 - \sum_{k=1}^{p} \alpha_k z^{-k}    (8.6)
`
It can be seen by comparing Eqs. (8.2) and (8.5) that if the speech signal obeys
the model of Eq. (8.2) exactly, and if α_k = a_k, then e(n) = Gu(n). Thus, the
prediction error filter, A(z), will be an inverse filter for the system, H(z), of Eq.
(8.1), i.e.,
`
H(z) = \frac{G}{A(z)}    (8.7)
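A minimal sketch of Eqs. (8.5)-(8.7): filtering a signal with the prediction error filter A(z). If the predictor coefficients matched the model exactly, the output would be the scaled excitation Gu(n). The function and variable names are illustrative, and scipy is assumed.

import numpy as np
from scipy.signal import lfilter  # assumed available

def prediction_error(s, alpha):
    # e(n) = s(n) - sum_k alpha_k s(n-k), i.e. s filtered by A(z) = 1 - sum_k alpha_k z^{-k}
    A = np.concatenate(([1.0], -np.asarray(alpha, dtype=float)))
    return lfilter(A, [1.0], s)

# e = prediction_error(s, alpha)   # with s a speech segment and alpha from an LPC analysis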
`
`The basic problem of linear prediction analysis is to determine a set of
predictor coefficients {α_k} directly from the speech signal in such a manner as
`to obtain a good estimate of the spectral properties of the speech signal through
`the use of Eq. (8.7). Because of the time-varying nature of the speech signal
`the predictor coefficients must be estimated from short segments of the speech
`signal. The basic approach is to find a set of predictor coefficients that will
`minimize the mean-squared prediction error over a short segment of the speech
`waveform. The resulting parameters are then assumed to be the parameters of
the system function, H(z), in the model for speech production.
That this approach will lead to useful results may not be immediately
`
`
`
`very pragmatic justification for using the minimum mean-squared prediction
`error as a basis for estimating the model parameters is that this approach leads
`to a set of linear equations that can be efficiently solved to obtain the predictor
`parameters. More importantly the resulting parameters comprise a very useful
`and accurate representation of the speech signal as we shall see in this chapter.
The short-time average prediction error is defined as

E_n = \sum_m e_n^2(m)    (8.8)

    = \sum_m \left( s_n(m) - \tilde{s}_n(m) \right)^2    (8.9)

    = \sum_m \left( s_n(m) - \sum_{k=1}^{p} \alpha_k s_n(m-k) \right)^2    (8.10)
`
where s_n(m) is a segment of speech that has been selected in the vicinity of
sample n, i.e.,

s_n(m) = s(m+n)    (8.11)
`
`The range of summation in Eqs. (8.8)-(8.10) is temporarily left unspecified, but
`since we wish to develop a short-time analysis technique, the sum will always
`be over a finite interval. Also note that to obtain an average we should divide
by the length of the speech segment. However, this constant is irrelevant to
the set of linear equations that we will obtain and therefore is omitted. We can
find the values of α_k that minimize E_n in Eq. (8.10) by setting
\partial E_n / \partial \alpha_i = 0, for i = 1, 2, \ldots, p, thereby obtaining the equations
`
\sum_m s_n(m-i) s_n(m) = \sum_{k=1}^{p} \hat{\alpha}_k \sum_m s_n(m-i) s_n(m-k),    1 \le i \le p    (8.12)

where α̂_k are the values of α_k that minimize E_n. (Since α̂_k is unique, we will
drop the caret and use the notation α_k to denote the values that minimize E_n.)
If we define

\phi_n(i,k) = \sum_m s_n(m-i) s_n(m-k)    (8.13)

then Eq. (8.12) can be written more compactly as

\sum_{k=1}^{p} \alpha_k \phi_n(i,k) = \phi_n(i,0),    i = 1, 2, \ldots, p    (8.14)
`
`
E_n = \sum_m s_n^2(m) - \sum_{k=1}^{p} \alpha_k \sum_m s_n(m) s_n(m-k)    (8.15)

and using Eq. (8.14) we can express E_n as

E_n = \phi_n(0,0) - \sum_{k=1}^{p} \alpha_k \phi_n(0,k)    (8.16)
`
`Thus the total minimum error consists of a fixed component, and a component
`which depends on the predictor coefficients.
`
To solve for the optimum predictor coefficients, we must first compute
the quantities φ_n(i,k) for 1 ≤ i ≤ p and 0 ≤ k ≤ p. Once this is done we
only have to solve Eq. (8.14) to obtain the α_k's. Thus, in principle, linear
prediction analysis is very straightforward. However, the details of the compu-
tation of φ_n(i,k) and the subsequent solution of the equations are somewhat
intricate and further discussion is required.
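As a brute-force illustration of Eqs. (8.13)-(8.14) (leaving the choice of summation limits to the caller, since that is exactly the issue taken up next), one can tabulate the φ_n(i,k) values and solve the p linear equations with a general solver. This is only a sketch; samples requested outside the supplied segment are treated as zero here, and the names are placeholders.

import numpy as np

def lpc_from_phi(seg, p, m_range):
    # phi(i,k) = sum over m in m_range of s_n(m-i) * s_n(m-k)   (Eq. 8.13)
    def s(m):
        return seg[m] if 0 <= m < len(seg) else 0.0
    phi = np.zeros((p + 1, p + 1))
    for i in range(p + 1):
        for k in range(p + 1):
            phi[i, k] = sum(s(m - i) * s(m - k) for m in m_range)
    # Solve sum_k alpha_k phi(i,k) = phi(i,0), i = 1..p   (Eq. 8.14)
    alpha = np.linalg.solve(phi[1:, 1:], phi[1:, 0])
    return alpha, phi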
`So far we have not explicitly indicated the limits on the sums in Eqs.
`(8.8)-(8.10) and in Eq. (8.12); however it should be emphasized that the limits
on the sum in Eq. (8.12) are identical to the limits assumed for the mean-
`squared prediction error in Eqs. (8.8)-(8.10). As we have stated, if we wish to
`develop a short-time analysis procedure, the limits must be over a finite inter-
`val. There are two basic approaches to this question, and we shall see below
`that two methods for linear predictive analysis emerge out of a consideration of
`the limits of summation and the definition of the waveform segment s,,(m).
`
`8.1.1 The autocorrelation method [1,2,5]
`
One approach to determining the limits on the sums in Eqs. (8.8)-(8.10)
and Eq. (8.12) is to assume that the waveform segment, s_n(m), is identically
zero outside the interval 0 ≤ m ≤ N − 1. This can be conveniently expressed
as

s_n(m) = s(m+n) w(m)    (8.17)

where w(m) is a finite length window (e.g. a Hamming window) that is identi-
cally zero outside the interval 0 ≤ m ≤ N − 1.
`
The effect of this assumption on the question of limits of summation for
the expressions for E_n can be seen by considering Eq. (8.5). Clearly, if s_n(m)
is nonzero only for 0 ≤ m ≤ N − 1, then the corresponding prediction error,
e_n(m), for a p-th order predictor will be nonzero over the interval
0 ≤ m ≤ N + p − 1.
`
`
`
`
`
are trying to predict the signal from samples that have arbitrarily been set to
zero. Likewise the error can be large at the end of the interval (specifically
N ≤ m ≤ N + p − 1) because we are trying to predict zero from samples that
are nonzero. For this reason, a window which tapers the segment, s_n(m), to
zero is generally used for w(m) in Eq. (8.17).
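A one-function sketch of Eq. (8.17), selecting an N-point segment starting at sample n and applying a tapering (Hamming) window; x, n, and N are placeholders for the signal and the analysis position.

import numpy as np

def windowed_segment(x, n, N):
    # s_n(m) = s(m + n) w(m), with w(m) a Hamming window, zero outside 0 <= m <= N-1
    return x[n:n + N] * np.hamming(N)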
`
The p × p matrix of autocorrelation values is a Toeplitz matrix; i.e., it is sym-
metric and all the elements along a given diagonal are equal. This special
property will be exploited in Section 8.3 to obtain an efficient algorithm for the
solution of Eq. (8.23).
`
`8.1.2 The covariance method [3]
`
The second basic approach to defining the speech segment s_n(m) and the
limits on the sums is to fix the interval over which the mean-squared error is
computed and then consider the effect on the computation of φ_n(i,k). That is,
if we define

E_n = \sum_{m=0}^{N-1} e_n^2(m)    (8.26)

then φ_n(i,k) becomes

\phi_n(i,k) = \sum_{m=0}^{N-1} s_n(m-i) s_n(m-k),    1 \le i \le p,\ 0 \le k \le p    (8.27)

In this case, if we change the index of summation we can express φ_n(i,k) as
either

\phi_n(i,k) = \sum_{m=-i}^{N-i-1} s_n(m) s_n(m+i-k),    1 \le i \le p,\ 0 \le k \le p    (8.28a)
`
`or
`
\phi_n(i,k) = \sum_{m=-k}^{N-k-1} s_n(m+k-i) s_n(m),    1 \le i \le p,\ 0 \le k \le p    (8.28b)
Although the equations look very similar to Eq. (8.19b), we see that the limits
of summation are not the same. Equations (8.28) call for values of s_n(m) out-
side the interval 0 ≤ m ≤ N − 1. Indeed, to evaluate φ_n(i,k) for all of the
required values of i and k requires that we use values of s_n(m) in the interval
−p ≤ m ≤ N − 1. If we are to be consistent with the limits on E_n in Eq.
(8.26) then we have no choice but to supply the required values. In this case it
does not make sense to taper the segment of speech to zero at the ends as in
the autocorrelation method since the necessary values are made available from
outside the interval 0 ≤ m ≤ N − 1. Clearly, this approach is very similar to
what was called the modified autocorrelation function in Chapter 4. As pointed
`
The limits on the expression for φ_n(i,k) in Eq. (8.13) are identical to those of
Eq. (8.18). However, because s_n(m) is identically zero outside the
interval 0 ≤ m ≤ N − 1, it is simple to show that

\phi_n(i,k) = \sum_{m=0}^{N+p-1} s_n(m-i) s_n(m-k),    1 \le i \le p,\ 0 \le k \le p    (8.19a)

can be expressed as

\phi_n(i,k) = \sum_{m=0}^{N-1-(i-k)} s_n(m) s_n(m+i-k),    1 \le i \le p,\ 0 \le k \le p    (8.19b)
Furthermore it can be seen that in this case φ_n(i,k) is identical to the short-
time autocorrelation function of Eq. (4.30) evaluated for (i−k). That is

\phi_n(i,k) = R_n(i-k)    (8.20)

where

R_n(k) = \sum_{m=0}^{N-1-k} s_n(m) s_n(m+k)    (8.21)

The computation of R_n(k) is covered in detail in Section 4.6 and thus we shall
not consider such details here. Since R_n(k) is an even function, it follows that

\phi_n(i,k) = R_n(|i-k|),    1 \le i \le p,\ 0 \le k \le p    (8.22)
`Therefore Eq. (8.14) can be expressed as
`
\sum_{k=1}^{p} \alpha_k R_n(|i-k|) = R_n(i),    1 \le i \le p    (8.23)

Similarly, the minimum mean-squared prediction error of Eq. (8.16) takes the
form

E_n = R_n(0) - \sum_{k=1}^{p} \alpha_k R_n(k)    (8.24)
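The autocorrelation method as summarized by Eqs. (8.21), (8.23), and (8.24) can be sketched as follows; scipy's general Toeplitz solver is used here only as a stand-in for the efficient Durbin recursion described in Section 8.3.

import numpy as np
from scipy.linalg import solve_toeplitz  # assumed available

def lpc_autocorrelation(seg, p):
    # seg: windowed segment s_n(m), 0 <= m <= N-1
    N = len(seg)
    R = np.array([np.dot(seg[:N - k], seg[k:]) for k in range(p + 1)])  # Eq. (8.21)
    alpha = solve_toeplitz(R[:p], R[1:p + 1])                           # Eq. (8.23)
    E = R[0] - np.dot(alpha, R[1:p + 1])                                # Eq. (8.24)
    return alpha, E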
`
The set of equations given by Eqs. (8.23) can be expressed in matrix form
`
`
`
`
`
`and the properties of the resulting optimum predictor.
`
In matrix form these equations become

\begin{bmatrix}
\phi_n(1,1) & \phi_n(1,2) & \phi_n(1,3) & \cdots & \phi_n(1,p) \\
\phi_n(2,1) & \phi_n(2,2) & \phi_n(2,3) & \cdots & \phi_n(2,p) \\
\phi_n(3,1) & \phi_n(3,2) & \phi_n(3,3) & \cdots & \phi_n(3,p) \\
\vdots      &             &             &        & \vdots      \\
\phi_n(p,1) & \phi_n(p,2) & \phi_n(p,3) & \cdots & \phi_n(p,p)
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \vdots \\ \alpha_p \end{bmatrix}
=
\begin{bmatrix} \phi_n(1,0) \\ \phi_n(2,0) \\ \phi_n(3,0) \\ \vdots \\ \phi_n(p,0) \end{bmatrix}    (8.29)
`
In this case, since φ_n(i,k) = φ_n(k,i) (see Eq. (8.28)), the p × p matrix of
correlation-like values is symmetric but not Toeplitz. Indeed, it can be seen
that the diagonal elements are related by the equation

\phi_n(i+1,k+1) = \phi_n(i,k) + s_n(-i-1) s_n(-k-1) - s_n(N-1-i) s_n(N-1-k)    (8.30)
`
The method of analysis based upon this method of computation of
φ_n(i,k) has come to be known as the covariance method because the matrix of
values {φ_n(i,k)} has the properties of a covariance matrix [5].
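A sketch of the covariance computation of Eqs. (8.26)-(8.29): the frame of N samples starting at index start is analyzed, and the p samples preceding it must exist (so start >= p is assumed); a general linear solver stands in for the Cholesky procedure of Section 8.3.1. Names are placeholders.

import numpy as np

def lpc_covariance(x, start, N, p):
    # x: full signal; frame is x[start:start+N]; requires start >= p so that
    # s_n(m) is available for -p <= m <= N-1
    s = lambda m: x[start + m]
    phi = np.zeros((p + 1, p + 1))
    for i in range(p + 1):
        for k in range(p + 1):
            phi[i, k] = sum(s(m - i) * s(m - k) for m in range(N))  # Eq. (8.27)
    alpha = np.linalg.solve(phi[1:, 1:], phi[1:, 0])                 # Eq. (8.29)
    return alpha, phi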
`
`8.1.3 Summary
`
It has been shown that by using different definitions of the segments of
speech to be analyzed, two distinct sets of analysis equations can be
obtained. For the autocorrelation method, the signal is windowed by an N-
point window, and the quantities φ_n(i,k) are obtained using a short-time auto-
correlation function. The resulting matrix of correlations is Toeplitz, leading to
one type of solution for the predictor coefficients. For the covariance method,
the signal is assumed to be known for the set of values −p ≤ m ≤ N − 1. Out-
side this interval no assumptions need be made about the signal, since these are
the only values needed in the computation. The resulting matrix of correla-
tions in this case is symmetric but not Toeplitz. The result is that the two
methods of computing the correlations lead to different methods of solution of
the analysis equations and to sets of predictor coefficients with somewhat
different properties.
`
`In later sections we will compare and contrast computational details and
`
`energy in the signal with the energy of the linearly predicted samples. This
`indeed is true when appropriate assumptions are made about the excitation sig-
`nal to the LPC system.
`
It is possible to relate the gain constant G to the excitation signal and the
error in prediction by referring back to Eqs. (8.2) and (8.5). The excitation
signal, Gu(n), can be expressed as

Gu(n) = s(n) - \sum_{k=1}^{p} a_k s(n-k)    (8.31a)

whereas the prediction error signal e(n) is expressed as

e(n) = s(n) - \sum_{k=1}^{p} \alpha_k s(n-k)    (8.31b)
`
In the case where α_k = a_k, i.e., the actual predictor coefficients and those of
the model are identical, then

e(n) = Gu(n)    (8.32)
`i.e., the input signal is proportional to the error signal with the constant of pro-
`portionality being the gain constant, G. A detailed discussion of the properties
`of the prediction error signal is given in Section 8.5.
`Since Eq. (8.32) is only approximate (i.e., it is valid to the extent that the
`ideal and the actual linear prediction parameters are identical) it is generally not
`possible to solve for G in a reliable way directly from the error signal
`itself.
`Instead the more reasonable assumption is made that the energy in the error
signal is equal to the energy in the excitation input, i.e.,

G^2 \sum_{m=0}^{N-1} u^2(m) = \sum_{m=0}^{N-1} e^2(m) = E_n    (8.33)
`
`At this point we must make some assumptions about u(n) so as to be
able to relate G to the known quantities, e.g., the α_k's and the correlation
coefficients. There are two cases of interest for the excitation. For voiced
speech it is reasonable to assume u(n) = δ(n), i.e., the excitation is a unit
sample at n = 0. For this assumption to be valid requires that the effects of
the glottal pulse shape used in the actual excitation for voiced speech be
lumped together with the vocal tract transfer function, and therefore both of
these effects are essentially modelled by the time-varying linear predictor. This
`
`
`
`
resulting output for this particular input h(n) (since it is actually the impulse
response of the system with transfer function H(z) as in Eq. (8.1)) we get the
relation

h(n) = \sum_{k=1}^{p} \alpha_k h(n-k) + G\delta(n)    (8.34)

It is readily shown [Problem 8.1] that the autocorrelation function of h(n),

\tilde{R}(m) = \sum_{n=0}^{\infty} h(n) h(m+n)    (8.35)

satisfies the relations

\tilde{R}(m) = \sum_{k=1}^{p} \alpha_k \tilde{R}(|m-k|),    m = 1, 2, \ldots, p    (8.36a)

and

\tilde{R}(0) = \sum_{k=1}^{p} \alpha_k \tilde{R}(k) + G^2    (8.36b)

Since Eqs. (8.36) are identical to Eqs. (8.23) it follows that

\tilde{R}(m) = R_n(m),    1 \le m \le p    (8.37)

Since the total energies in the signal (R_n(0)) and the impulse response (\tilde{R}(0))
must be equal we can use Eqs. (8.24), (8.33) and (8.36b) to obtain

G^2 = R_n(0) - \sum_{k=1}^{p} \alpha_k R_n(k) = E_n    (8.38)

It is interesting to note that Eq. (8.37) and the requirement that the energy of
the impulse response be equal to the energy of the signal together require that
the first p + 1 coefficients of the autocorrelation function of the impulse
response of the model are identical to the first p + 1 coefficients of the auto-
correlation function of the speech signal.

For the case of unvoiced speech, the correlations are defined as statistical
averages. It is assumed that the input is white noise with zero mean and unity
variance, i.e.,

E[u(n) u(n-m)] = \delta(m)    (8.39)

If we excite the system with the random input Gu(n) and call the output g(n)

since E[u(n) g(n-m)] = 0 for m > 0 because u(n) is uncorrelated with any
signal prior to u(n). For m = 0 we get

\tilde{R}(0) = \sum_{k=1}^{p} \alpha_k \tilde{R}(k) + G E[u(n) g(n)] = \sum_{k=1}^{p} \alpha_k \tilde{R}(k) + G^2    (8.42)

since E[u(n) g(n)] = E[u(n)(Gu(n) + terms prior to n)] = G. Since the
energy in the response to Gu(n) must equal the energy in the signal, we get

\tilde{R}(m) = R_n(m),    0 \le m \le p    (8.43)

or

G^2 = R_n(0) - \sum_{k=1}^{p} \alpha_k R_n(k)    (8.44)

as was the case for the impulse excitation for voiced speech.
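Under either excitation assumption the gain then follows from Eqs. (8.38) and (8.44); a short sketch using the autocorrelation values and predictor coefficients computed earlier (variable names are placeholders):

import numpy as np

def lpc_gain(R, alpha):
    # G^2 = R_n(0) - sum_k alpha_k R_n(k)   (Eqs. 8.38 and 8.44)
    p = len(alpha)
    G2 = R[0] - np.dot(alpha, R[1:p + 1])
    return np.sqrt(max(G2, 0.0))  # guard against tiny negative values from round-off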
`
`8.3 Solution of the LPC Equations
`
In order to effectively implement a linear predictive analysis system, it is neces-
sary to solve the linear equations in an efficient manner. Although a variety of
techniques can be applied to solve a system of p linear equations in p unk-
nowns, these techniques are not equally efficient. Because of the special pro-
perties of the coefficient matrices it is possible to solve the equations much
more efficiently than is possible in general. In this section we will discuss in
detail two methods for obtaining the predictor coefficients, and then we will
compare and contrast several properties of these solutions.
`
`8.3.1 Cholesky decomposition solution
`for the covariance method [3]
`
`For the covariance method, the set of equations which must be solved is
`of the form:
`
`
`
`
`
called the square root method) [3]. For this method the matrix Φ is expressed
as

\Phi = V D V^t    (8.47)

where V is a lower triangular matrix (whose main diagonal elements are all
1's), and D is a diagonal matrix. The superscript t denotes matrix transpose.
The elements of the matrices V and D are readily determined from Eq. (8.47)
by solving for the (i,j)-th element of both sides of Eq. (8.47) giving

\phi_n(i,j) = \sum_{k=1}^{j} V_{ik} d_k V_{jk},    1 \le j \le i-1    (8.48)

or

V_{ij} d_j = \phi_n(i,j) - \sum_{k=1}^{j-1} V_{ik} d_k V_{jk},    1 \le j \le i-1    (8.49)

and, for the diagonal elements,

\phi_n(i,i) = \sum_{k=1}^{i} V_{ik} d_k V_{ik}    (8.50)

or

d_i = \phi_n(i,i) - \sum_{k=1}^{i-1} V_{ik}^2 d_k,    i \ge 2    (8.51)

with

d_1 = \phi_n(1,1)    (8.52)

To illustrate the use of Eqs. (8.47)-(8.52) consider an example with p = 4, and
matrix elements φ_n(i,j) = φ_ij. Equation (8.47) is thus of the form

\begin{bmatrix}
\phi_{11} & \phi_{21} & \phi_{31} & \phi_{41} \\
\phi_{21} & \phi_{22} & \phi_{32} & \phi_{42} \\
\phi_{31} & \phi_{32} & \phi_{33} & \phi_{43} \\
\phi_{41} & \phi_{42} & \phi_{43} & \phi_{44}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
V_{21} & 1 & 0 & 0 \\
V_{31} & V_{32} & 1 & 0 \\
V_{41} & V_{42} & V_{43} & 1
\end{bmatrix}
\begin{bmatrix}
d_1 & 0 & 0 & 0 \\
0 & d_2 & 0 & 0 \\
0 & 0 & d_3 & 0 \\
0 & 0 & 0 & d_4
\end{bmatrix}
\begin{bmatrix}
1 & V_{21} & V_{31} & V_{41} \\
0 & 1 & V_{32} & V_{42} \\
0 & 0 & 1 & V_{43} \\
0 & 0 & 0 & 1
\end{bmatrix}

Using Eq. (8.51) for i = 2 gives

d_2 = \phi_{22} - V_{21}^2 d_1

Using Eq. (8.49) for i = 3 and 4 gives

V_{32} d_2 = \phi_{32} - V_{31} d_1 V_{21}

V_{42} d_2 = \phi_{42} - V_{41} d_1 V_{21}

or

V_{32} = (\phi_{32} - V_{31} d_1 V_{21}) / d_2

V_{42} = (\phi_{42} - V_{41} d_1 V_{21}) / d_2

Equation (8.51) is now used for i = 3 to solve for d_3, then Eq. (8.49) is used
for i = 4 to solve for V_{43}, and finally Eq. (8.51) is used for i = 4 to solve for
d_4.
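The decomposition steps of Eqs. (8.47)-(8.52) transcribe almost directly into code. The sketch below assumes Phi is the symmetric, positive-definite p x p matrix of φ_n(i,k) values and performs no pivoting or error checking.

import numpy as np

def ldl_decompose(Phi):
    # Returns unit lower-triangular V and diagonal d with Phi = V diag(d) V^t
    p = Phi.shape[0]
    V = np.eye(p)
    d = np.zeros(p)
    d[0] = Phi[0, 0]                                        # Eq. (8.52)
    for i in range(1, p):
        for j in range(i):
            # Eq. (8.49): V_ij d_j = phi(i,j) - sum_{k<j} V_ik d_k V_jk
            V[i, j] = (Phi[i, j] - np.dot(V[i, :j] * d[:j], V[j, :j])) / d[j]
        d[i] = Phi[i, i] - np.dot(V[i, :i] ** 2, d[:i])     # Eq. (8.51)
    return V, d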
`
Once the matrices V and D have been determined, it is relatively simple
to solve for the column vector α in a two-step procedure. From Eqs. (8.46)
and (8.47) we get

V D V^t \alpha = \psi    (8.53)

which can be written as

V Y = \psi    (8.54)

and

D V^t \alpha = Y    (8.55)

or

V^t \alpha = D^{-1} Y    (8.56)

Thus from the matrix V, Eq. (8.54) can be solved for the column vector Y
using a simple recursion of the form

Y_i = \psi_i - \sum_{j=1}^{i-1} V_{ij} Y_j,    p \ge i \ge 2    (8.57)
`
with initial condition

Y_1 = \psi_1    (8.58)
`
`
`
`
To illustrate the use of Eqs. (8.57)-(8.60) we continue our previous
example and first solve for the Y_i's assuming V and D are now known. In
matrix form we have the equation

\begin{bmatrix} 1 & 0 & 0 & 0 \\ V_{21} & 1 & 0 & 0 \\ V_{31} & V_{32} & 1 & 0 \\ V_{41} & V_{42} & V_{43} & 1 \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \end{bmatrix}
=
\begin{bmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{bmatrix}

From Eqs. (8.57) and (8.58) we get

Y_1 = \psi_1

Y_2 = \psi_2 - V_{21} Y_1

Y_3 = \psi_3 - V_{31} Y_1 - V_{32} Y_2

Y_4 = \psi_4 - V_{41} Y_1 - V_{42} Y_2 - V_{43} Y_3

From the Y_i's we solve Eq. (8.56) which is of the form

\begin{bmatrix} 1 & V_{21} & V_{31} & V_{41} \\ 0 & 1 & V_{32} & V_{42} \\ 0 & 0 & 1 & V_{43} \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{bmatrix}
=
\begin{bmatrix} 1/d_1 & 0 & 0 & 0 \\ 0 & 1/d_2 & 0 & 0 \\ 0 & 0 & 1/d_3 & 0 \\ 0 & 0 & 0 & 1/d_4 \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \end{bmatrix}
=
\begin{bmatrix} Y_1/d_1 \\ Y_2/d_2 \\ Y_3/d_3 \\ Y_4/d_4 \end{bmatrix}

From Eqs. (8.59) and (8.60) we get

\alpha_4 = Y_4/d_4

\alpha_3 = Y_3/d_3 - V_{43} \alpha_4

\alpha_2 = Y_2/d_2 - V_{32} \alpha_3 - V_{42} \alpha_4

\alpha_1 = Y_1/d_1 - V_{21} \alpha_2 - V_{31} \alpha_3 - V_{41} \alpha_4

thus completing the solution to the covariance equations.

From Eq. (8.56) we can substitute for α^t the expression Y^t D^{-1} V^{-1} giving

E_n = \phi_n(0,0) - Y^t D^{-1} V^{-1} \psi    (8.63)

Using Eq. (8.54) we get

E_n = \phi_n(0,0) - Y^t D^{-1} Y    (8.64)

or

E_n = \phi_n(0,0) - \sum_{k=1}^{p} Y_k^2 / d_k    (8.65)

Thus the mean-squared prediction error E_n can be determined directly from the
column vector Y and the matrix D. Furthermore Eq. (8.65) can be used to
give the value of E_n for any value of p up to the value of p used in solving the
matrix equations. Thus one can get an idea as to how the mean-squared predic-
tion error varies with the number of predictor coefficients used in the solution.
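Continuing the sketch started in Section 8.3.1, the two triangular solves of Eqs. (8.54)-(8.60) and the running prediction error of Eq. (8.65) might be coded as follows; psi stands for the right-hand-side vector {φ_n(i,0)} and phi00 for φ_n(0,0), both assumed to be supplied by the caller.

import numpy as np

def ldl_solve(V, d, psi, phi00=None):
    p = len(d)
    Y = np.zeros(p)
    for i in range(p):                         # forward substitution, Eqs. (8.57)-(8.58)
        Y[i] = psi[i] - np.dot(V[i, :i], Y[:i])
    alpha = np.zeros(p)
    for i in range(p - 1, -1, -1):             # back substitution, Eqs. (8.59)-(8.60)
        alpha[i] = Y[i] / d[i] - np.dot(V[i + 1:, i], alpha[i + 1:])
    # Eq. (8.65): prediction error for each predictor order 1..p
    errors = None if phi00 is None else phi00 - np.cumsum(Y ** 2 / d)
    return alpha, errors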
`
`8.3.2 Durbin’s recursive solution
`for the autocorrelation equations [2]
`
For the autocorrelation method the matrix equation for solving for the
predictor coefficients is of the form

\sum_{k=1}^{p} \alpha_k R_n(|i-k|) = R_n(i),    1 \le i \le p    (8.66)
`
By exploiting the Toeplitz nature of the matrix of coefficients, several efficient
recursive procedures have been devised for solving this system of equations.
Although the most popular and well known of these methods are the Levinson
and Robinson algorithms [1], the most efficient method known for solving this
particular system of equations is Durbin's recursive procedure [2] which can be
stated as follows (for convenience of notation we shall omit the subscript on
the autocorrelation function):

E^{(0)} = R(0)    (8.67)

k_i = \left[ R(i) - \sum_{j=1}^{i-1} \alpha_j^{(i-1)} R(i-j) \right] / E^{(i-1)},    1 \le i \le p    (8.68)

\alpha_i^{(i)} = k_i    (8.69)
`
`
`
have also been obtained — i.e., α_j^{(i)} is the j-th predictor coefficient for a predic-
tor of order i.
`
`To illustrate the above procedure, consider an example of obtaining the
`predictor coefficients for a predictor of order 2. The original matrix equation is
\begin{bmatrix} R(0) & R(1) \\ R(1) & R(0) \end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix}
=
\begin{bmatrix} R(1) \\ R(2) \end{bmatrix}
`
`Using Eqs. (8.67)-(8.72), we get
`
E^{(0)} = R(0)

k_1 = R(1)/R(0)

\alpha_1^{(1)} = R(1)/R(0)

E^{(1)} = \frac{R^2(0) - R^2(1)}{R(0)}

k_2 = \frac{R(2) R(0) - R^2(1)}{R^2(0) - R^2(1)}

\alpha_2^{(2)} = \frac{R(2) R(0) - R^2(1)}{R^2(0) - R^2(1)}

\alpha_1^{(2)} = \frac{R(1) R(0) - R(1) R(2)}{R^2(0) - R^2(1)}

\alpha_1 = \alpha_1^{(2)}

\alpha_2 = \alpha_2^{(2)}
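The recursion is easy to program; the sketch below follows the standard statement of Durbin's algorithm (the update and termination steps, Eqs. (8.70)-(8.72), fall on a page not reproduced above, so their standard forms are used here).

import numpy as np

def durbin(R, p):
    # R: autocorrelation values R(0)..R(p).  Returns predictor coefficients alpha_1..alpha_p,
    # the k_i parameters, and E[i], the prediction error for an i-th order predictor.
    alpha = np.zeros(p + 1)   # alpha[1..p]; index 0 unused
    k = np.zeros(p + 1)
    E = np.zeros(p + 1)
    E[0] = R[0]                                                        # Eq. (8.67)
    for i in range(1, p + 1):
        k[i] = (R[i] - np.dot(alpha[1:i], R[i - 1:0:-1])) / E[i - 1]   # Eq. (8.68)
        new = alpha.copy()
        new[i] = k[i]                                                  # Eq. (8.69)
        for j in range(1, i):
            new[j] = alpha[j] - k[i] * alpha[i - j]                    # Eq. (8.70), standard form
        alpha = new
        E[i] = (1.0 - k[i] ** 2) * E[i - 1]                            # Eq. (8.71), standard form
    return alpha[1:], k[1:], E                                         # Eq. (8.72): alpha_j = alpha_j^(p)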
`
It should be noted that the quantity E^{(i)} in Eq. (8.71) is the prediction
error for a predictor of order i. Thus at each stage of the computation the pred-
iction error for a predictor of order i can be monitored. Also, if the autocorre-
lation coefficients R(i) are replaced by a set of normalized autocorrelation
coefficients, i.e., r(k) = R(k)/R(0), then the solution to the matrix equation
remains unchanged. However, the error E^{(i)} is now interpreted as a normalized
error. If we call this normalized error V^{(i)}, then

V^{(i)} = E^{(i)}/R(0) = 1 - \sum_{k=1}^{i} \alpha_k^{(i)} r(k)    (8.73)
`
`
`
This condition on the parameters k_i is important since it can be shown [1,18]
that it is a necessary and sufficient condition for all of the roots of the polyno-
mial A(z) to be inside the unit circle, thereby guaranteeing the stability of the
system H(z). Unfortunately a proof of this result would take us too far afield;
however, the fact that we do not give a proof does not diminish the importance
of this result. Furthermore, it is possible to show that no such guarantee of
stability is available in the covariance method.
`
`8.3.3 Lattice formulations and solutions [11]
`
`As we have seen, both the covariance and the autocorrelation methods
`consist of two steps:
`
`1. Computation of a matrix of correlation values.
`2. Solution of a set of linear equations.
`
These methods have been widely used with great success in speech processing
applications. However, another class of methods, called lattice methods, has
evolved in which the above two steps have in a sense been combined into a
recursive algorithm for determining the linear predictor parameters. To see
how these methods are related, it is helpful to begin with the Durbin algorithm.
First, let us recall that at the i-th stage of this procedure, the set of coefficients
{α_j^{(i)}, j = 1, 2, \ldots, i} are the coefficients of the i-th order optimum linear predic-
tor. Using these coefficients we can define

A^{(i)}(z) = 1 - \sum_{k=1}^{i} \alpha_k^{(i)} z^{-k}    (8.77)
`
to be the system function of the i-th order inverse filter (or prediction error
filter). If the input to this filter is the segment of the signal,
s_n(m) = s(n+m) w(m), then the output would be the prediction error,
e_n^{(i)}(m) = e^{(i)}(n+m), where

e^{(i)}(m) = s(m) - \sum_{k=1}^{i} \alpha_k^{(i)} s(m-k)    (8.78)
`
`Note that for the sake of simplicity we shall henceforth drop the subscript n
`which denotes the fact that we are considering a segment of the signal located
`at sample n.
`In terms of z-transforms Eq. (8.78) is
`
`
`
[Figure: forward prediction of s(m) and backward prediction of s(m-i), both drawing on the same i samples of the signal.]

Fig. 8.2 Illustration of forward and backward prediction using an i-th
order predictor.

The first term in Eq. (8.81) is obviously the z-transform of the prediction error
for an (i-1)-th order predictor. The second term can be given a similar interpre-
tation if we define

B^{(i)}(z) = z^{-i} A^{(i)}(z^{-1}) S(z)    (8.82)

It is easily shown that the inverse transform of B^{(i)}(z) is

b^{(i)}(m) = s(m-i) - \sum_{k=1}^{i} \alpha_k^{(i)} s(m+k-i)    (8.83)

This equation suggests that we are attempting to predict s(m-i) from the i
samples of the input {s(m-i+k), k = 1, 2, \ldots, i} that follow s(m-i). Thus
b^{(i)}(m) is called the backward prediction error sequence. In Fig. 8.2 it is
shown that the i samples involved in the prediction are the same ones used to
predict s(m) in terms of i past samples in Eq. (8.78). Now returning to Eq.
(8.81) we see that the prediction error sequence e^{(i)}(m) can be expressed as

e^{(i)}(m) = e^{(i-1)}(m) - k_i b^{(i-1)}(m-1)    (8.84)

By substituting Eq. (8.80) into Eq. (8.82) we obtain

B^{(i)}(z) = z^{-i} A^{(i-1)}(z^{-1}) S(z) - k_i A^{(i-1)}(z) S(z)    (8.85)

or

B^{(i)}(z) = z^{-1} B^{(i-1)}(z) - k_i E^{(i-1)}(z)    (8.86)

Thus the i-th stage backward prediction error is

b^{(i)}(m) = b^{(i-1)}(m-1) - k_i e^{(i-1)}(m)    (8.87)

[Figure: lattice network of cascaded sections whose upper-branch outputs are e^{(0)}(n), e^{(1)}(n), e^{(2)}(n), ..., e^{(i-1)}(n), ..., e^{(p)}(n) = e(n).]

Such a structure is called a lattice network. It is clear that if we extend the lattice to
p sections, the output of the last upper branch will be the forward prediction
error as shown in Fig. 8.3. Thus, Fig. 8.3 is a digital network implementation
of the prediction error filter with transfer function A(z).

At this point we should emphasize that this structure is a direct conse-
quence of the Durbin algorithm, and the parameters k_i can be obtained as in
Eqs. (8.67)-(8.72). Note also that the predictor coefficients do not appear
explicitly in Fig. 8.3. Itakura [4,6] has shown that the k_i parameters can be
directly related to the forward and backward prediction errors and because of
the nature of the lattice structure the entire set of coefficients
{k_i, i = 1, 2, \ldots, p} can be computed without computing the predictor
coefficients. The relationship is [11]

k_i = \frac{\sum_{m=0}^{N-1} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}
          {\left[ \sum_{m=0}^{N-1} \left( e^{(i-1)}(m) \right)^2 \sum_{m=0}^{N-1} \left( b^{(i-1)}(m-1) \right)^2 \right]^{1/2}}    (8.89)

This expression is in the form of a normalized cross-correlation function; i.e., it
is indicative of the degree of correlation between the forward and backward
prediction error. For this reason the parameters k_i are called the partial correla-
tion coefficients or PARCOR coefficients [4,6]. It is relatively straightforward
to verify that Eq. (8.89) is identical to Eq. (8.68) by substituting Eqs. (8.78)
and (8.83) into Eq. (8.89).

It can be seen that if Eq. (8.89) replaces Eq. (8.68) in the Durbin algo-
rithm, the predictor coefficients can be computed recursively as before. Thus
the PARCOR analysis leads to an alternative to the inversion of a matrix and
gives results identical to the autocorrelation method; i.e., the set of PARCOR
coefficients is equivalent to a set of predictor coefficients that minimize the
mean-squared forward prediction error. More importantly, this approach opens
up a whole new class of procedures based upon the lattice configuration of Fig.
8.3.
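A sketch of the lattice (PARCOR) computation implied by Eqs. (8.84), (8.87), and (8.89): at each stage k_i is the normalized cross-correlation of the current forward and backward prediction errors, after which both errors are updated through one lattice section. The segment is assumed to be a windowed frame, and the summation limits simply follow Eq. (8.89).

import numpy as np

def parcor_lattice(seg, p):
    e = np.asarray(seg, dtype=float).copy()   # forward error  e^(0)(m) = s(m)
    b = e.copy()                              # backward error b^(0)(m) = s(m)
    k = np.zeros(p)
    for i in range(p):
        b1 = np.concatenate(([0.0], b[:-1]))                            # b^(i-1)(m-1)
        k[i] = np.dot(e, b1) / np.sqrt(np.dot(e, e) * np.dot(b1, b1))   # Eq. (8.89)
        e, b = e - k[i] * b1, b1 - k[i] * e                             # Eqs. (8.84) and (8.87)
    return k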
`
In particular, Burg [12] has developed a procedure based upon minimizing
the sum of the mean-squared forward and backward prediction errors in Fig.
8.3, i.e.,

\tilde{E}^{(i)} = \sum_{m=0}^{N-1} \left[ \left( e^{(i)}(m) \right)^2 + \left( b^{(i)}(m) \right)^2 \right]    (8.90)

Substituting Eqs. (8.84) and (8.87) into Eq. (8.90) and differentiating \tilde{E}^{(i)} with
respect to k_i, we obtain

\frac{\partial \tilde{E}^{(i)}}{\partial k_i} = -2 \sum_{m=0}^{N-1} \left[ e^{(i-1)}(m) - k_i b^{(i-1)}(m-1) \right] b^{(i-1)}(m-1)
    - 2 \sum_{m=0}^{N-1} \left[ b^{(i-1)}(m-1) - k_i e^{(i-1)}(m) \right] e^{(i-1)}(m)    (8.91)

Setting the derivative equal to zero and solving for k_i gives

k_i = \frac{2 \sum_{m=0}^{N-1} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}
          {\sum_{m=0}^{N-1} \left[ e^{(i-1)}(m) \right]^2 + \sum_{m=0}^{N-1} \left[ b^{(i-1)}(m-1) \right]^2}    (8.92)

It can be shown [1] that if k_i is estimated using Eq. (8.92) then

-1 \le k_i \le 1    (8.93)
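The same lattice loop with Burg's estimate of Eq. (8.92) substituted for Eq. (8.89) is sketched below; by Eq. (8.93) each k_i so obtained has magnitude at most one. Converting the k_i to predictor coefficients can then be done with the Durbin-style updates, as before.

import numpy as np

def burg_reflection(seg, p):
    e = np.asarray(seg, dtype=float).copy()   # forward error
    b = e.copy()                              # backward error
    k = np.zeros(p)
    for i in range(p):
        b1 = np.concatenate(([0.0], b[:-1]))                            # b^(i-1)(m-1)
        k[i] = 2.0 * np.dot(e, b1) / (np.dot(e, e) + np.dot(b1, b1))    # Eq. (8.92)
        e, b = e - k[i] * b1, b1 - k[i] * e                             # Eqs. (8.84) and (8.87)
    return k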
`
it should be clear that the k_i's estimated using Eq. (8.92) will in gen-
`eral differ from those