Low Delay - Code Excited Linear Predictive (LD-CELP)
Coding of Wide Band Speech at 32kbits/sec

Erik Ordentlich

Submitted in Partial Fulfillment of the
Requirements for the Degrees of

Master of Science in Electrical Engineering and Computer Science

and

Bachelor of Science in Electrical Engineering and Computer Science

at the

Massachusetts Institute of Technology

April 1990

© Massachusetts Institute of Technology

Signature of Author
Department of Electrical Engineering and Computer Science
April 1, 1990

Certified by
Jae S. Lim, Thesis Supervisor

Accepted by
Arthur C. Smith
Chairman, Electrical Engineering Department Committee
on Graduate Students, Massachusetts Institute of Technology

Accepted by
Yair Shoham, AT&T Bell Laboratories
Company Official
`
Low Delay - Code Excited Linear Predictive (LD-CELP) Coding
of Wide Band Speech at 32kbits/sec

by

Erik Ordentlich

Abstract

We investigate the potential of the Linear Predictive Coding (LPC) based algorithm, Low Delay - Code Excited Linear Prediction (LD-CELP), for coding wide band (.05-7khz) speech at a bit rate of 32kbits/sec. This bit rate allows for stereo transmission of wide band speech over a single 64kbits/sec basic rate ISDN channel, thereby permitting the simultaneous transmission of image data over the other 64kbits/sec channel. The low delay feature makes tele-conferencing and other two-way speech communication scenarios particularly inviting applications for this type of coding algorithm.

A simple-minded application of the LD-CELP algorithm to wide band speech resulted in audible high frequency distortion with essentially perfect coding at low frequencies. Three approaches to compensating for this frequency asymmetry in coding quality were investigated. The first approach involves simple modifications to the conventional CELP noise weighting filter in an effort to introduce more degrees of freedom, and thereby allow for a more accurate modeling of both spectral tilt and formant structure. The second approach uses a psycho-acoustically based model of human hearing to determine the noise weighting filter. The third approach decomposes the input speech into high and low frequency bands using a quadrature mirror filter bank and applies the LD-CELP algorithm to each band separately. An additional part of the thesis is devoted to improving the efficiency of the CELP codebook design algorithm, which was found to be slow and cumbersome. Two improvements are proposed and evaluated. Finally we present the results of a small-scale, formal, a/b subjective test comparing our best 32kbits/sec LD-CELP system to the 64kbits/sec G.722 CCITT wide band speech coding standard.

The main conclusion of the thesis is that 32kbits/sec high quality coding of wide band speech is definitely viable, and that the basic LD-CELP algorithm along with some improvements suggested in the thesis is an excellent candidate for this application. The subjective test results indicate that the speech qualities of our 32kbits/sec LD-CELP and the 64kbits/sec G.722 are identical.

Thesis Supervisor: Dr. Jae S. Lim
Title: Professor of Electrical Engineering

Acknowledgements

I would like to thank the Signal Processing Research Department at AT&T Bell Laboratories, Murray Hill, NJ for financial, intellectual and moral support during the course of this research. Deepest appreciation goes to my mentor, Dr. Yair Shoham, and my department head, Dr. N. S. Jayant. I would also like to thank Professor Jae S. Lim for being my thesis advisor at MIT.
`
Contents

1 Introduction
  1.1 Background and Motivation
  1.2 The Topic
  1.3 Organization of Thesis

2 Basics of LPC and LD-CELP
  2.1 Introduction
  2.2 What is LPC?
    2.2.1 The Basic Idea
    2.2.2 LPC Analysis of a Signal s(n)
    2.2.3 Prediction Error
    2.2.4 Long-term Prediction
  2.3 Basic CELP and LD-CELP
    2.3.1 Introduction
    2.3.2 Basic CELP
    2.3.3 LD-CELP
    2.3.4 Our Coder
    2.3.5 Weighting and Synthesis Filter Derivation
    2.3.6 Conclusion

3 Statement of Problem
  3.1 Introduction
  3.2 Quantization Noise in LPC Based Systems
    3.2.1 New CELP Architecture
    3.2.2 Prediction Gain Versus Frequency
    3.2.3 LD-CELP and Implicit Bit Allocation
    3.2.4 Conclusions
  3.3 Return to Problem Statement
  3.4 Objective Criterion - SBSNR
  3.5 Conclusion

4 Simple Noise Weighting
  4.1 Introduction
  4.2 Basic Coder Parameters
  4.3 Inadequacies of the Narrow Band Filter
`
  4.4 Non-adaptive Three Pole Weighting
  4.5 Broad High Frequency Weighting
  4.6 Adaptive Three Pole and Three Zero Shaping
  4.7 Adaptive Two Pole Shaping
  4.8 Conclusion

5 Perceptually Optimum Noise Shaping
  5.1 Introduction
  5.2 Perceptual Transform Coding
    5.2.1 Frequency Domain Masking
  5.3 Perceptual Transform Coding Algorithm
  5.4 Recent Improvements of the Perceptual Model
  5.5 Extension of Algorithm to LPC
    5.5.1 The Threshold Calculator
    5.5.2 LPC Analogue of Threshold Calculator
    5.5.3 CELP and the Psycho-acoustic Noise Model
    5.5.4 The Delay Issue
  5.6 Perceptual Entropy
  5.7 Conclusion

6 Split Band LD-CELP
  6.1 Introduction
  6.2 Split Band Coding and Our System
  6.3 Results
  6.4 Conclusion

7 Codebook Design
  7.1 Introduction
  7.2 Design Algorithm
  7.3 Alternate 'Empty' Vector Processing
  7.4 Eigenvalue Based Codebook Search
    7.4.1 The Proposed Algorithm
    7.4.2 Lower Bound on the Minimum Eigenvalue of a Symmetric Positive Definite Matrix
    7.4.3 Performance Analysis of the Search Algorithm
  7.5 Conclusion

8 Subjective Test

9 Summary and Conclusions
  9.1 Summary
  9.2 Conclusions
`
List of Figures

2.1 Classical LPC architecture.
2.2 Classical LPC coder configuration with pitch loop.
2.3 Basic CELP architecture.
3.1 'Equivalence' of CELP with noise weighting and pre/post filtered CELP without noise weighting.
3.2 CELP coder with DPCM type architecture and pre/post filtering.
3.3 Band partitions for the SBSNR distortion measure.
4.1 Ideal prediction gain versus prediction order for wide band speech.
4.2 Magnitude of 1/W(e^{jω}), γ1 = 1 and γ2 = .85 (dotted), and 1/A(e^{jω}) (solid).
4.3 Frequency response of P1(z) for the case of r1 = r2 = .2 and θ = 58°.
4.4 Comparison of |W(z)|² and |W(z)P1(z)|².
4.5 Ideal square wave weighting filter (upper) and all-pole approximation (lower).
5.1 Tone masking noise (top) and noise masking tone (bottom).
5.2 Absolute threshold of hearing versus frequency.
5.3 Ideal noise shape as given by the psycho-acoustic model (top), the all pole model (bottom dotted) and the pole/zero model (bottom solid).
5.4 Relationship between noise shape analysis window and coding window for the psycho-acoustically based CELP coder.
5.5 Inverse tonality measure at various update rates for a 256 sample analysis window.
6.1 Split band LD-CELP.
7.1 Splitting in the direction of the smallest eigenvalue eigenvector (solid); splitting in the direction of the largest eigenvalue eigenvector (dotted).
`
List of Tables

4.1 Parameter values for the coder which served as a test vehicle for the noise shaping ideas described in this chapter.
4.2 SBSNR values averaged over four files coded using four different pitch configurations.
4.3 Pitch gain codebook entries for the various pitch configurations tested.
4.4 SBSNR's averaged over the same four files for two coding systems using 32 order LPC synthesis filters.
5.1 Parameters of the perceptually based LD-CELP coder.
6.1 Fixed bit allocations for high and low subbands. The bit rates are expressed in bits per sample of the decimated subbands. This is why the total in each row is equal to four.
6.2 SBSNR's for three different fixed bit allocations plus the SBSNR for the broad band LD-CELP with the γ1 = 1, γ2 = .85, P(z) = 1 weighting filter.
7.1 Percentage of codebook for which full matrix multiplication needs to be done for various bandwidths and block sizes. The last column contains the reduction in multiplications based on the eigenvalue lower bound.
8.1 Scores by listeners. Final score in bold.
8.2 Scores by listeners for male file pairs.
8.3 Scores by listeners for female file pairs.
8.4 Scores by file pair. The '~ar' files are female, the 'ke' files are male.
`
Chapter 1

Introduction

1.1 Background and Motivation

Moderate rate digital telephone connectivity such as AT&T's basic rate ISDN (two 64kbit/sec data channels) allows for the possibility of high quality two-way wide band voice communication. Consequently the speech coding community is placing a greater emphasis on coding algorithms which operate at wider than telephone grade bandwidths. One specific aim is the high quality coding of commentary grade or .05-7khz bandwidth speech. The perceived difference between wide band speech and telephone band (.2-3.4khz) speech is enormous, and the improvement in speech quality gained by increasing the bandwidth makes wide band speech coding an important area of research.

Wide band audio coding in general has become an active area of research in recent times with the advent of transform based coding techniques¹ capable of maintaining near perceptual transparency between coded and original signals at bit rates as low as 64kbits/sec [16, 8]. Researchers in this field are interested in coding a variety of audio signals and not just speech. Most of the algorithms owe their high coding quality to quantization noise shaping schemes which incorporate empirical and intuitive notions of human auditory perception. The idea is to take advantage of properties of human hearing such as time and frequency domain masking in order to quantize spectral samples in such a way that the resulting quantization noise is inaudible. The primary disadvantage of transform approaches in general is that they obtain their coding gain by operating on relatively large blocks of the input signal and thereby introduce significant encoding and decoding delay. This greatly limits their potential for network applications such as teleconferencing where large delays across the network can result in annoying feedback induced echoes. Although echo cancelers are available commercially, they would necessarily add to the cost of the network. Furthermore, based on current results in transform coding research it seems like 64kbits/sec represents a lower bound on the bit rate for high quality coding of 20khz signals. This is primarily because of the extensive side information requirements of transform based implementations. Reducing the bandwidth at which these coders operate probably won't help since most of the energy in most sounds is at low frequencies.

¹Transform coding refers to schemes in which the transmitter quantizes the DFT (or a variant thereof) components of blocks of input and sends the resulting information to the receiver, which then reconstructs the DFT components and does an IDFT to obtain an approximation of the input.
`
One possible approach to the problem of coding wide-band speech which has received little consideration to date is Linear Predictive Coding (LPC). Over the years researchers have encountered tremendous success with LPC based techniques for coding narrow, telephone bandwidth speech, a recent example of which is a Bell Labs developed, real time, near perceptually transparent, 16kbit/sec Low Delay Code Excited Linear Predictive (LD-CELP) coder [13]. The concept of LD-CELP is derived from the basic CELP concept introduced by Atal and Schroeder in [5]. CELP in turn is founded in classical LPC theory. The main advantages of LPC based coding approaches are the possibilities of low delay, as demonstrated by the 16kbit/sec LD-CELP coder, which can have an encoding delay as low as 625 usec (the real time implementation actually has a higher delay so that real time constraints can be met by available computational resources), and the fact that LPC provides a theoretically and experimentally justifiable excitation/linear filter decomposition or model of speech.
`
1.2 The Topic

The main focus of this thesis will be the effectiveness of the LPC based algorithm LD-CELP for coding wide-band (.05-7khz) speech at a bit rate of 32kbits/sec. The primary motivation for restricting the scope of the research to LD-CELP is the potential for high coding quality at very low delay as evidenced by the 16kbits/sec narrowband LD-CELP implementation. A low delay high quality wide-band coder would make commentary grade teleconferencing and two way voice communication a reality. The motivation behind looking at a 32kbits/sec bit rate is heuristic. We are approximately doubling the bandwidth in going from telephone bandwidth to commentary bandwidth, hence we should double the bit rate. One may think that it should be trivial to code even wide band speech at 32kbits/sec, as most of the energy is in the lower 3.5khz which can already be coded at 16kbits/sec. As will be shown later, this thinking is justified only if signal to noise ratio (SNR) is the exclusive arbiter of quality. Although the energy in the 3.5-7khz range of the speech spectrum is significantly less than the energy in the lower half of the spectrum, the high frequencies are perhaps as perceptually important as the lower ones. In any event we feel that 32kbits/sec is a good target bit rate.

Currently there is no 32kbit/sec commentary grade coding standard. There is a CCITT 7khz speech standard (G.722), however, at 64kbit/sec [20]. The next lowest useful bit rate is 32kbit/sec² and lower than this is 16kbits/sec. It seems imprudent to set 16kbits/sec as a research goal when there is no solid footing even at 32kbits/sec.

²This bit rate allows for the transmission of two independent or stereo wide band speech channels over a single 64kbit/sec ISDN data line.

After initial experimentation with the basic LD-CELP algorithm applied to wide band speech it was found that one problem is a perceptually inadequate coding of high frequencies. Some high frequency hiss was present in almost all the files tested. The low frequencies on the other hand were being coded extremely well. The specific aim of this thesis is to investigate and evaluate three approaches to reducing the high frequency distortion in the wide band LD-CELP system. These approaches fall under the headings of:

• Simple fixed and adaptive error or noise shaping.

• Auditory model driven noise shaping.

• Split band LD-CELP.

One of the main conclusions of the thesis is that high quality coding of wide band speech at 32kbits/sec is definitely within reach, and all the more so if some of the techniques proposed and evaluated in this thesis are refined and incorporated into the LD-CELP system.
`
1.3 Organization of Thesis

The remainder of the thesis is broken down into eight chapters. Three chapters, specifically 4, 5, and 6, are devoted to explaining and evaluating the three fundamental approaches to the reduction of high frequency distortion pursued in this thesis. Chapter 2 is a long but necessary technical introduction to LPC and LD-CELP. Chapter 3 provides some further discussion on the high frequency distortion problem and gives some theoretical motivation for the three main approaches of the thesis. Chapter 7 presents some ideas and improvements to the VQ design algorithm used to generate the codebooks used in the LD-CELP quantization process. As is explained in chapter 3, each new idea and parameter change requires a codebook redesign. This is the only way that the merits of a particular technique or parameter choice can be evaluated unequivocally. Codebook redesign is a long task, and chapter 7 explores some ways in which it can be made more efficient. Chapter 8 presents the results of an 'a/b' subjective test comparing our best wide band 32kbits/sec LD-CELP algorithm to the 64kbits/sec wide band CCITT standard, G.722 [20]. The intent of this presentation is to give the reader a better idea of the speech quality of our LD-CELP algorithm. The final chapter summarizes the results of the previous chapters, and restates the main conclusions of the thesis.
`
Chapter 2

Basics of LPC and LD-CELP

2.1 Introduction

This chapter is divided into two sections. The first section introduces the basic concept of LPC and defines certain quantities and expressions which will be referred to repeatedly in the thesis. The second section describes the specific LPC based systems of CELP and LD-CELP.

2.2 What is LPC?

2.2.1 The Basic Idea

The notion of predicting future values of a signal based on linear combinations of past values has been around for a while. It was first applied to speech coding by Schroeder and Atal in [6]. The basic idea is that any part of the input signal which can be predicted from past values of the output signal available at the receiver is redundant and need not be transmitted. Only the difference between the predicted value, which the receiver can compute, and the actual input value need be transmitted. If the prediction is good, the power in this difference or residual signal will be much less than the power in the input signal. This results in bit savings if coding quality is fixed, or improved coding quality if the bit rate is fixed. Figure 2.1 depicts a block diagram of the classical LPC architecture.
`
[Figure 2.1: Classical LPC architecture. Transmitter and receiver block diagrams.]

The predictor P(z) is an FIR filter whose coefficients are determined by finding the optimum linear predictor for a block of input speech. The first LPC coders of this nature had fixed predictors, but quickly it was realized that to compensate for the nonstationary nature of speech the predictor has to be updated periodically. The derivation of P(z) is referred to as LPC analysis. If this analysis is done on the input signal the coefficients of P(z) have to be sent to the receiver as side information. There are also techniques for adapting P(z) based on previously reconstructed samples available at both receiver and transmitter. In these schemes no side information is sent, resulting in bit savings. However, the predictors derived in this fashion are less efficient, resulting in a higher variance difference signal which is harder to code. There is a trade off and only experimentation can show which method is more appropriate for a particular application. Backward adaptation of predictor parameters has one added advantage in that it can be achieved with very little or no encoding delay. Sometimes this is essential.

Before we move on it is worthwhile noting a special property of the LPC coder configuration of figure 2.1. This property is that the overall quantization noise is exactly equal to the quantization noise at Q. In other words, referring to the figure,

q(n) = s(n) - \tilde{s}(n) = r(n) - \tilde{r}(n).

This can be easily derived by noting that,

s(n) = \hat{s}(n) + r(n)

and,

\tilde{s}(n) = \hat{s}(n) + \tilde{r}(n).

From these two relations the result follows immediately. Therefore, to characterize the quantization characteristics of the entire LPC configuration we can focus exclusively on the characteristics of Q.

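As a minimal numerical illustration of this property, the following Python sketch simulates the loop of figure 2.1. The fixed first-order predictor and uniform quantizer step size are arbitrary choices made only for the illustration; they are not the configuration studied later in the thesis.

    import numpy as np

    def lpc_loop(s, a=(0.9,), step=0.05):
        # Illustrative fixed first-order predictor and uniform quantizer step;
        # the point is the structure of the loop, not these particular values.
        p = len(a)
        s_tilde = np.zeros(len(s))   # reconstructed signal available at the receiver
        r = np.zeros(len(s))         # prediction residual
        r_tilde = np.zeros(len(s))   # quantized residual (the transmitted quantity)
        for n in range(len(s)):
            # predict s(n) from past *reconstructed* samples
            s_hat = sum(a[k] * s_tilde[n - 1 - k] for k in range(p) if n - 1 - k >= 0)
            r[n] = s[n] - s_hat
            r_tilde[n] = step * np.round(r[n] / step)   # the quantizer Q
            s_tilde[n] = s_hat + r_tilde[n]
        return s_tilde, r, r_tilde

    rng = np.random.default_rng(0)
    s = 0.01 * np.cumsum(rng.normal(size=1000))         # a correlated test signal
    s_tilde, r, r_tilde = lpc_loop(s)
    assert np.allclose(s - s_tilde, r - r_tilde)        # overall noise equals the noise at Q

The assertion at the end is exactly the identity derived above: the reconstruction error of the whole loop coincides with the error introduced at Q.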
`
2.2.2 LPC Analysis of a Signal s(n)

This subsection presents an overview of the basic LPC analysis procedure. Throughout the section, input signal refers to the signal which is undergoing analysis and not necessarily the input to the coding scheme. As pointed out earlier, the analysis can also be done on the output of the coding scheme if backward adaptation of the prediction filter is used.

The z-transform of the linear predictor P(z) has the form,

P(z) = a_1 z^{-1} + \ldots + a_p z^{-p} = \sum_{k=1}^{p} a_k z^{-k}.   (2.1)

If we let P(z) = \hat{S}(z)/S(z), in the time domain this translates into:

\hat{s}(n) = \sum_{k=1}^{p} a_k s(n - k).   (2.2)

Now the idea is to determine the coefficients a_k in such a way that the predictor is optimum according to some criterion for the current input signal. Strictly speaking, the predictor should be designed so that we get the best prediction of the coder input from past values of the coder output. This, however, is an extremely unwieldy optimization problem since the coder output is not available until the predictor is decided upon. Traditionally it is assumed that the output of the coder will be similar enough to the input that designing the predictor to be optimum for predicting future samples of the input from past samples of the input will not be far off the mark. The criterion most often used for optimization is the mean square error between s(n) and the predicted value \hat{s}(n) evaluated over the current section of the input. In the equations which follow, sample index 0 corresponds to the first sample of the current N sample analysis block.

Hence we are out to minimize,

\mathcal{E}(a_1, \ldots, a_p) = \sum_{n=0}^{N-1} [s(n) - \hat{s}(n)]^2 = \sum_{n=0}^{N-1} \Big[ s(n) - \sum_{k=1}^{p} a_k s(n - k) \Big]^2.   (2.3)

If we view \mathcal{E}(\cdot) as a scalar function of a_1, \ldots, a_p we note that it is convex in these variables and hence its local minimum is also its global minimum. This minimum is obtained by requiring that,

\frac{\partial \mathcal{E}}{\partial a_k} = 0, \qquad k = 1, \ldots, p.   (2.4)
`
Upon working out the algebra one obtains a set of linear equations in the a_k's which we shall express in matrix notation as:

\begin{bmatrix} c_{ss}(0,0) & c_{ss}(0,1) & \cdots & c_{ss}(0,p-1) \\ c_{ss}(1,0) & c_{ss}(1,1) & \cdots & c_{ss}(1,p-1) \\ \vdots & \vdots & & \vdots \\ c_{ss}(p-1,0) & c_{ss}(p-1,1) & \cdots & c_{ss}(p-1,p-1) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} c_{ss}(0,1) \\ c_{ss}(0,2) \\ \vdots \\ c_{ss}(0,p) \end{bmatrix}   (2.5)

or,

R_{ss} a = r_{ss}.   (2.6)

The matrix R_{ss} is a local estimate of the auto-correlation matrix (or covariance matrix if the signal is assumed to be zero mean) of the input signal s(n). The entries c_{ss}(i,j) of the matrix are defined as,

c_{ss}(i,j) = \sum_{n=0}^{N-1} s(n - i) s(n - j).   (2.7)
`
There are two ways to deal with the samples with negative indices referred to in (2.7). The first, known as the autocorrelation method, multiplies the input by a window w(n) which is zero outside 0 \le n \le N-1. The window can be uniform over its support (rectangular), but typically it is chosen to minimize the effects of signal discontinuities at n = 0 and n = N-1. The entries of the matrix R_{ss} in (2.5) are then calculated as,

c_{ss}(i,j) = \sum_{n=0}^{N-1} s(n) s(n - |i - j|)

or, letting |i - j| = k,

r_{ss}(k) = \sum_{n=0}^{N-1} s(n) s(n - k).

In other words c_{ss}(i,j) = c_{ss}(i + \Delta, j + \Delta). Once c_{ss}(i,j) is known for i = 0 and j = 0, 1, \ldots, p we have all of R_{ss}. This procedure is the same as replacing s(n) by s(n)w(n) in equation (2.7) with the limits of summation changed to \pm\infty. The matrix R_{ss} obtained in this fashion is both symmetric and Toeplitz. Consequently, equation (2.5) lends itself to an efficient O(n^2) solution by several recursive algorithms which exploit the structure of R_{ss}. The most efficient of these algorithms also takes advantage of the fact that p - 1 of the entries of r_{ss} occur in R_{ss}. All of these algorithms are more efficient than the O(n^3) Gaussian elimination based inversion methods.

The other approach is called the covariance method. In this approach samples with negative indices simply correspond to samples from the previous analysis block. For example s(-1) = s'(N-1), where s'(n) also refers to the input signal but with n = 0 corresponding to the start of the previous analysis block. Calculated in this way, R_{ss} is symmetric, but not Toeplitz. This is the primary disadvantage of the covariance method. No algorithm faster than O(n^3) exists for solving (2.5) if this method is used.

For P(z)'s with orders significantly less than the analysis block length the two methods perform almost equally well. All coding schemes explored during the course of this research are based on the autocorrelation method of LPC analysis.

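For concreteness, the autocorrelation method can be sketched in Python as follows. The Hamming window and the Levinson-Durbin recursion (one of the O(n^2) recursive algorithms referred to above) are common textbook choices and are assumptions of this sketch only, not a description of the analysis used by the coders later in the thesis.

    import numpy as np

    def lpc_autocorrelation(block, p):
        # Window the analysis block to reduce edge effects (Hamming window assumed here)
        sw = block * np.hamming(len(block))
        N = len(sw)
        # r[k] = sum_n sw(n) sw(n-k): windowed autocorrelation lags 0..p, as in r_ss(k)
        r = np.array([np.dot(sw[k:], sw[:N - k]) for k in range(p + 1)])
        # Levinson-Durbin recursion: solves the Toeplitz system (2.5)-(2.6) in O(p^2)
        a = np.zeros(p)
        err = r[0]
        for i in range(p):
            k = (r[i + 1] - np.dot(a[:i], r[i:0:-1])) / err   # reflection coefficient
            a[:i] = a[:i] - k * a[:i][::-1]
            a[i] = k
            err *= 1.0 - k * k      # residual energy of the order-(i+1) predictor
        return a, err               # a[k-1] is the coefficient a_k of P(z); err estimates sigma_r^2

The ratio r[0]/err computed from the quantities inside this routine is a block estimate of the prediction gain discussed in the next section; the recursion is efficient precisely because it exploits the symmetric Toeplitz structure of R_{ss} noted above.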
`
2.2.3 Prediction Error

An important indicator of the coding gain accrued by LPC over simple PCM type quantization is the ratio of the energy of the input signal to the energy of the error or difference signal obtained by passing the input s(n) through the filter 1 - P(z). This error, which we shall denote as r(n), is due to imperfect short term prediction, since after all,

r(n) = s(n) - \hat{s}(n)

where \hat{s}(n) is defined in (2.2) as the predicted value of s(n). The error \mathcal{E}(\cdot) in equation (2.3) is the energy of r(n). The analysis of the prediction error is usually carried out by assuming that s(n) is a stationary random process, and r(n) is the random process resulting from passing s(n) through the system 1 - P(z). The LPC analysis procedures described in the previous section can also be derived from this point of view by simply replacing all the summations with expected values. Thus, instead of talking about the energy of a signal, we'll talk about its variance. From here on we will denote the energy or variance of r(n) as \sigma_r^2 and the variance of s(n) as \sigma_s^2. The predictor coefficients were chosen to minimize \sigma_r^2.
`
The prediction gain G is defined as:

G = \frac{\sigma_s^2}{\sigma_r^2}.

It should be obvious that G \ge 1 (choosing all the a_k's to be 0 results in G = 1). It is well understood in coding theory that for a scalar quantizer, for example the kind used in figure 2.1, the variance of the quantization error or noise, denoted by \sigma_q^2, is proportional to the variance of the input to the quantizer. In other words:

\sigma_q^2 = \epsilon \, 2^{-2b} \sigma_x^2   (2.8)

where x denotes the input to the quantizer, b is the bit rate in bits per sample, and \epsilon is a constant which depends on the probability distribution function (pdf) of x and the particular nature of the quantizer. Uniform and non-uniform quantizers will have different epsilons. Equation (2.8) is valid only if the quantizer is designed to minimize overload distortion. For the signals depicted in figure 2.1 the quantization noise is q(n) = r(n) - \tilde{r}(n) = s(n) - \tilde{s}(n). Also let us denote q'(n) = s(n) - s'(n) as the quantization noise resulting from a straightforward scalar quantization and reconstruction of s(n). Assuming equal \epsilon's and bit rates for both quantizers, and neglecting the fact that in figure 2.1 we are predicting future input samples from previous reconstructed samples, (2.8) implies that \sigma_q^2 will be smaller than \sigma_{q'}^2 by a factor of G. Stated another way, the gain in SNR (signal to noise ratio) of the LPC system over simple quantization is given by G. The actual gain is typically less than this value because predicting input values from previous reconstructed values instead of previous input values results in a higher variance residual or error which is the input to the quantizer.

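As a purely illustrative numerical reading of (2.8) (the numbers below are not tied to any particular coder in this thesis): suppose the predictor achieves a gain of G = \sigma_s^2/\sigma_r^2 = 100, i.e. 20 dB of prediction gain. Then at a fixed bit rate,

\sigma_q^2 = \epsilon \, 2^{-2b} \sigma_r^2 = \epsilon \, 2^{-2b} \frac{\sigma_s^2}{G} = \frac{\sigma_{q'}^2}{100},

so the SNR improves by 10 \log_{10} G = 20 dB; equivalently, the same SNR as direct quantization of s(n) is reached with roughly \frac{1}{2} \log_2 G \approx 3.3 fewer bits per sample.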
`
To gain an appreciation for the benefits accrued by LPC over simple-minded quantization it is worthwhile analyzing the asymptotic behaviour of G in the limit of infinite predictor order. Our discussion closely follows a similar one in [15]. We will use the superscript p to denote the order of the predictor to which a particular parameter or matrix corresponds. It can be shown that,

|{}^{p+1}R_{ss}| = |{}^{p}R_{ss}| \, {}^{p}\sigma_r^2

where |\cdot| denotes determinant. Recognizing the recursive nature of this equation we can write,

|{}^{p+1}R_{ss}| = |{}^{1}R_{ss}| \prod_{i=1}^{p} {}^{i}\sigma_r^2 = \sigma_s^2 \prod_{i=1}^{p} {}^{i}\sigma_r^2.   (2.9)

We also know that \{{}^{p}\sigma_r^2\} is a monotonically decreasing sequence, since we can achieve the lower bound on the residual energy of the p'th stage with a p+1 order predictor by setting the p+1'th coefficient to zero and carrying over the other coefficients from the p order predictor. Furthermore, the sequence {}^{p}\sigma_r^2 is clearly bounded below by zero. Therefore, this sequence has a limiting value. We will call this asymptotic error variance {}^{\infty}\sigma_r^2. Furthermore, since the sequence of error variances converges and is always greater than zero, we also have the following,

\lim_{p \to \infty} \Big[ \sigma_s^2 \prod_{i=1}^{p} {}^{i}\sigma_r^2 \Big]^{\frac{1}{p}} = {}^{\infty}\sigma_r^2.   (2.10)

Using the fact that the determinant of a matrix is equal to the product of its eigenvalues, and denoting the i'th eigenvalue of the p'th autocorrelation matrix by {}^{p}\lambda_i, we can combine equations (2.9) and (2.10),

{}^{\infty}\sigma_r^2 = \lim_{p \to \infty} \Big[ \prod_{i=1}^{p} {}^{p}\lambda_i \Big]^{\frac{1}{p}}

or,

\log {}^{\infty}\sigma_r^2 = \lim_{p \to \infty} \frac{1}{p} \sum_{i=1}^{p} \log({}^{p}\lambda_i).

In [11] it is shown that,

\lim_{p \to \infty} \frac{1}{p} \sum_{i=1}^{p} \log({}^{p}\lambda_i) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log(S_{ss}(\omega)) \, d\omega

where S_{ss} is the power spectrum of s(n). So that finally we can say,

{}^{\infty}\sigma_r^2 = e^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log(S_{ss}(\omega)) \, d\omega}

or that,

{}^{\infty}G = \frac{\sigma_s^2}{{}^{\infty}\sigma_r^2} = \frac{\frac{1}{2\pi} \int_{-\pi}^{\pi} S_{ss}(\omega) \, d\omega}{e^{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log(S_{ss}(\omega)) \, d\omega}}.   (2.11)

Equation (2.11) states that the asymptotic prediction gain is equal to the ratio of the arithmetic and geometric means of the power spectrum of the input. It is easily seen that this ratio is unity only when the input spectrum is flat, indicating that the signal is completely uncorrelated and consequently unpredictable. Because {}^{\infty}G is a strong function of the 'flatness' of the input spectrum, it is also commonly referred to as the spectral flatness measure (sfm).

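A quick numerical illustration of (2.11) in Python follows; the AR(1) test signal and the averaged-periodogram spectrum estimate are assumptions made only for this example.

    import numpy as np
    from scipy.signal import lfilter

    def asymptotic_prediction_gain(psd):
        # Ratio of arithmetic to geometric mean of the power spectrum samples;
        # any constant scaling of the spectrum estimate cancels in this ratio.
        return np.mean(psd) / np.exp(np.mean(np.log(psd)))

    rng = np.random.default_rng(0)
    a1, n_blocks, N = 0.9, 200, 1024
    # AR(1) test process s(n) = a1*s(n-1) + w(n); its ideal prediction gain is 1/(1 - a1**2)
    x = lfilter([1.0], [1.0, -a1], rng.normal(size=n_blocks * N))
    # Averaged periodogram as a crude estimate of S_ss(w) on a uniform frequency grid
    psd = np.mean(np.abs(np.fft.rfft(x.reshape(n_blocks, N), axis=1)) ** 2, axis=0)
    print(asymptotic_prediction_gain(psd))   # should come out close to 1/(1 - 0.9**2) = 5.26

For this first-order autoregressive example a first-order predictor already achieves the asymptotic gain, so the arithmetic-to-geometric mean ratio of the estimated spectrum should land near the theoretical value; the small discrepancy that remains is due to the crude spectrum estimate, not to (2.11).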
`
Another important property of the asymptotic prediction error is that it is completely uncorrelated or white. This is easily shown by appealing to the orthogonality principle of linear least squares estimation theory, which states that the minimum estimation error is uncorrelated with the parameters used in the linear estimation. Prediction is the same thing as estimation, and hence the prediction error at index n is orthogonal to all previous values of s(n) and linear combinations thereof. The prediction error at index n - 1 is a linear combination of previous values of s(n), and hence r(n) and r(n - 1) are orthogonal. The same argument applies for all values of n.

The main conclusions of this analysis of the prediction error or residual, which will be referred to repeatedly throughout this thesis, are:

• LPC provides the greatest coding gains for input signals with increasingly non-flat power spectra.

• The asymptotic prediction error is flat (white).

In real ap
