(12) United States Patent
Gottesman et al.

(10) Patent No.: US 7,010,482 B2
(45) Date of Patent: Mar. 7, 2006

(54) REW PARAMETRIC VECTOR QUANTIZATION AND DUAL-PREDICTIVE SEW VECTOR QUANTIZATION FOR WAVEFORM INTERPOLATIVE CODING

(75) Inventors: Oded Gottesman, Goleta, CA (US); Allen Gersho, Goleta, CA (US)

(73) Assignee: The Regents of the University of California, Oakland, CA (US)

Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 543 days.

(21) Appl. No.: 09/811,187
(22) Filed: Mar. 16, 2001

(65) Prior Publication Data
US 2002/0116184 A1, Aug. 22, 2002

(60) Related U.S. Application Data
Provisional application No. 60/190,371, filed on Mar. 17, 2000.

(51) Int. Cl.: G10L 19/14 (2006.01)
(52) U.S. Cl.: 704/222
(58) Field of Classification Search: 704/230, 704/211, 219-223, 225, 229, 270, 500
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,517,595 A * 5/1996 Kleijn, 704/205
6,493,664 B1 * 12/2002 Udaya Bhaskar et al., 704/222
6,691,092 B1 * 2/2004 Udaya Bhaskar et al., 704/265

OTHER PUBLICATIONS

U. Bhasker et al., “Quantization of SEW and REW components for 3.6 kbits/s coding based on PWI,” IEEE Workshop on Speech Coding Proceedings, pp. 99-101, Jun. 1999.*
D.H. Pham et al., “Quantisation techniques for prototype waveforms,” Fourth International Symposium on Signal Processing and Its Applications ’96, vol. 1, pp. 53-56, Aug. 1996.*
Oded Gottesman et al., “Enhancing Waveform Interpolative Coding With Weighted REW Parametric Quantization,” IEEE Workshop on Speech Coding (2000), pp. 1-3.
I.S. Burnett et al., “Multi-Prototype Waveform Coding Using Frame-By-Frame Analysis-By-Synthesis,” Department of Electrical and Computer Engineering, University of Wollongong, NSW, Australia (1997), pp. 1567-1570.
I.S. Burnett et al., “New Techniques for Multi-Prototype Waveform Coding at 2.84kb/s,” Department of Electrical and Computer Engineering, University of Wollongong, NSW, Australia (1995), pp. 261-264.
I.S. Burnett et al., “Low Complexity Decomposition and Coding of Prototype Waveforms,” Dept. of Electrical and Computer Eng., University of Wollongong, NSW, 2522, Australia, pp. 23-24.
I.S. Burnett et al., “A Mixed Prototype Waveform/CELP Coder for Sub 3KB/S,” School of Electronic and Electrical Engineering, University of Bath, UK, BA2 7AY (1993), pp. II-175-II-178.

(Continued)

Primary Examiner: Susan McFadden
(74) Attorney, Agent, or Firm: Fulbright & Jaworski

(57) ABSTRACT
[Representative drawing not reproduced. Recoverable labels: quantized REW, predictors, quantized REW parameter, vector quantizer codebook, pitch.]

An enhanced analysis-by-synthesis waveform interpolative speech coder able to operate at 2.8 kbps. Novel features include dual-predictive analysis-by-synthesis quantization of the slowly-evolving waveform, efficient parametrization of the rapidly-evolving waveform magnitude, and analysis-by-synthesis vector quantization of the rapidly-evolving waveform parameter. Subjective quality tests indicate that its quality exceeds that of G.723.1 at 5.3 kbps and of G.723.1 at 6.3 kbps.

8 Claims, 6 Drawing Sheets
`
`Saint Lawrence Communications, LLC
`IPR2016-00704
`Exhibit 2016
`
`

`
OTHER PUBLICATIONS (continued)

Oded Gottesman, “Dispersion Phase Vector Quantization for Enhancement of Waveform Interpolative Coder,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-4.
Oded Gottesman et al., “Enhanced Waveform Interpolative Coding at 4 KBPS,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-3.
Oded Gottesman et al., “High Quality Enhanced Waveform Interpolative Coding at 2.8 KBPS,” IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 1-4.
Oded Gottesman et al., “Enhanced Analysis-By-Synthesis Waveform Interpolative Coding at 4 KBPS,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-4.
Daniel W. Griffin et al., “Multiband Excitation Vocoder,” IEEE Transactions on Acoustics, Speech, and Signal Processing (1988) 36(8):1223-1235.
W. Bastiaan Kleijn et al., “A Speech Coder Based on Decomposition of Characteristic Waveforms,” IEEE (1995), pp. 508-511.
W. Bastiaan Kleijn et al., “Waveform Interpolation for Coding and Synthesis,” Speech Coding and Synthesis (1995), pp. 175-207.
W. Bastiaan Kleijn et al., “Transformation and Decomposition of the Speech Signal for Coding,” IEEE Signal Processing Letters 1(9):136-138 (1994).
W. Bastiaan Kleijn, “Encoding Speech Using Prototype Waveforms,” IEEE Transactions on Speech and Audio Processing 1(4):386-399 (1993).
W. Bastiaan Kleijn, “Continuous Representations in Linear Predictive Coding,” Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974 (1991), pp. 201-204.
W. Bastiaan Kleijn et al., “A Low-Complexity Waveform Interpolation Coder,” Speech Coding Research Department, AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA (1996), pp. 212-215.
R.J. McAulay et al., “Sinusoidal Coding,” Speech Coding and Synthesis 4:121-173 (1995).
Yair Shoham, “High-Quality Speech Coding at 2.4 to 4.0 KBPS Based on Time Frequency Interpolation,” IEEE, pp. II-167-II-170 (1993).
Yair Shoham, “Very Low Complexity Interpolative Speech Coding at 1.2 to 2.4 KBPS,” IEEE, pp. 1599-1602 (1997).
Yair Shoham, “Low Complexity Speech Coding at 1.2 to 2.4 kbps Based on Waveform Interpolation,” International Journal of Speech Technology 2:329-341 (1999).

* cited by examiner
`
`

`
U.S. Patent, Mar. 7, 2006, Sheets 1 to 6 of 6, US 7,010,482 B2

[Sheet 1 of 6: drawings not reproduced. Recoverable labels: REW parameter; vector of REW spectra; vector quantizer codebook; vector of quantized REW spectra.]
[Sheet 2 of 6: drawing not reproduced; no labels recoverable.]
[Sheet 3 of 6: drawing not reproduced; no labels recoverable.]
[Sheet 4 of 6: drawing not reproduced; no labels recoverable.]
[Sheet 5 of 6: drawings not reproduced. Recoverable labels: output SEW; mean-removed SEW; bits; FIG. 8; harmonics range legend (9-14, 15-19, 20-24, 25-29, 30-35, 36-69); voiced, intermediate, unvoiced.]
[Sheet 6 of 6: drawings not reproduced. Recoverable labels: FIG. 9 with harmonics range legend (9-14, 15-19, 20-24, 25-29, 30-35, 36-69) and voiced, intermediate, unvoiced; FIG. 10 with panels "Voiced range", "Intermediate range", and "Unvoiced range", each showing a REW predictor and a SEW predictor against harmonics.]

`
REW PARAMETRIC VECTOR QUANTIZATION AND DUAL-PREDICTIVE SEW VECTOR QUANTIZATION FOR WAVEFORM INTERPOLATIVE CODING

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of Provisional Patent Application Ser. No. 60/190,371, filed Mar. 17, 2000, which application is herein incorporated by reference.

BACKGROUND OF THE INVENTION

The present invention relates to vector quantization (VQ) in speech coding systems using waveform interpolation.
In recent years, there has been increasing interest in achieving toll-quality speech coding at rates of 4 kbps and below. Currently, there is an ongoing 4 kbps standardization effort conducted by an international standards body (the International Telecommunication Union Telecommunication Standardization Sector, ITU-T). The expanding variety of emerging applications for speech coding, such as third generation wireless networks and Low Earth Orbit (LEO) systems, is motivating increased research efforts. The speech quality produced by waveform coders such as code-excited linear prediction (CELP) coders degrades rapidly at rates below 5 kbps; see B. S. Atal and M. R. Schroeder, (1984), “Stochastic Coding of Speech at Very Low Bit Rate”, Proc. Int. Conf. Comm., Amsterdam, pp. 1610-1613.
On the other hand, parametric coders, such as the waveform-interpolative (WI) coder, the sinusoidal-transform coder (STC), and the multiband-excitation (MBE) coder, produce good quality at low rates but they do not achieve toll quality; see Y. Shoham, IEEE ICASSP’93, Vol. II, pp. 167-170 (1993); I. S. Burnett and R. J. Holbeche, (1993), IEEE ICASSP’93, Vol. II, pp. 175-178; W. B. Kleijn, (1993), IEEE Trans. Speech and Audio Processing, Vol. 1, No. 4, pp. 386-399; W. B. Kleijn and J. Haagen, (1994), IEEE Signal Processing Letters, Vol. 1, No. 9, pp. 136-138; W. B. Kleijn and J. Haagen, (1995), IEEE ICASSP’95, pp. 508-511; W. B. Kleijn and J. Haagen, (1995), in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 5, pp. 175-207; I. S. Burnett and G. J. Bradley, (1995), IEEE ICASSP’95, pp. 261-263; I. S. Burnett and G. J. Bradley, (1995), IEEE Workshop on Speech Coding for Telecommunications, pp. 23-24; I. S. Burnett and D. H. Pham, (1997), IEEE ICASSP’97, pp. 1567-1570; W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), IEEE ICASSP’96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP’97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341; R. J. McAulay and T. F. Quatieri, (1995), in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 4, pp. 121-173; and D. Griffin and J. S. Lim, (1988), IEEE Trans. ASSP, Vol. 36, No. 8, pp. 1223-1235. This is largely due to the lack of robustness of speech parameter estimation, which is commonly done in open-loop, and to inadequate modeling of non-stationary speech segments.
Commonly in WI coding, the similarity between successive rapidly evolving waveform (REW) magnitudes is exploited by downsampling and interpolation and by constrained bit allocation; see W. B. Kleijn and J. Haagen, (1995), IEEE ICASSP’95, pp. 508-511. In a previous Enhanced Waveform Interpolative (EWI) coder the REW magnitude was quantized on a waveform-by-waveform basis; see O. Gottesman and A. Gersho, (1999), “Enhanced Waveform Interpolative Coding at 4 kbps”, IEEE Speech Coding Workshop, pp. 90-92, Finland; O. Gottesman and A. Gersho, (1999), “Enhanced Analysis-by-Synthesis Waveform Interpolative Coding at 4 kbps”, EUROSPEECH’99, pp. 1443-1446, Hungary.
`
`SUMMARY OF THE INVENTION
`
The present invention describes novel methods that enhance the performance of the WI coder and allow for better coding efficiency, improving on the above 1999 Gottesman and Gersho procedure. The present invention incorporates analysis-by-synthesis (AbS) for parameter estimation, offers higher temporal and spectral resolution for the REW, and more efficient quantization of the slowly-evolving waveform (SEW). In particular, the present invention proposes a novel efficient parametric representation of the REW magnitude, an efficient paradigm for AbS predictive VQ of the REW parameter sequence, and dual-predictive AbS quantization of the SEW.
More particularly, the invention provides a method for interpolative coding of input signals, the signals decomposed into or composed of a slowly evolving waveform and a rapidly evolving waveform having a magnitude, the method incorporating at least one of the following steps, preferably a combination of them, and optionally all of them:
(a) AbS VQ of the REW;
(b) parametrizing the magnitude of the REW;
(c) incorporating temporal weighting in the AbS VQ of the REW;
(d) incorporating spectral weighting in the AbS VQ of the REW;
(e) applying a filter to a vector quantizer codebook in the analysis-by-synthesis vector quantization of the rapidly evolving waveform, whereby to add self correlation to the codebook vectors; and
(f) using a coder in which a plurality of bits therein are allocated to the rapidly evolving waveform magnitude.
In addition, one can combine AbS quantization of the slowly evolving waveform with any or all of the foregoing parameters.
The new method achieves a substantial reduction in the REW bit rate and the EWI achieves very close to toll quality, at least under clean speech conditions. These and other features, aspects, and advantages of the present invention will become better understood with regard to the following detailed description, appended claims, and accompanying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a REW Parametric Representation;
FIG. 2 is a REW Parametric VQ;
FIG. 3 is a REW Parametric Representation AbS VQ;
FIG. 4 is a REW Parametric Representation Simplified AbS VQ;
FIG. 5 is a REW Parametric Representation Simplified Weighted AbS VQ;
FIG. 6 is a block diagram of the Dual Predictive AbS SEW vector quantization;
FIG. 7 is a Weighted Signal-to-Noise Ratio (SNR) for Dual Predictive AbS SEW VQ;
FIG. 8 is an output Weighted SNR for the 18 codebooks, 9-bit AbS SEW VQ;
FIG. 9 is a mean-removed SEW’s Weighted SNR for the 18 codebooks, 9-bit AbS SEW VQ; and
FIG. 10 shows predictors for three REW parameter ranges.
`
`DETAILED DESCRIPTION
`
In very low bit rate WI coding, the relation between the SEW and the REW magnitudes was exploited by computing the magnitude of one as the unity complement of the other; see W. B. Kleijn and J. Haagen, (1995), “A Speech Coder Based on Decomposition of Characteristic Waveforms”, IEEE ICASSP’95, pp. 508-511; W. B. Kleijn and J. Haagen, (1995), “Waveform Interpolation for Coding and Synthesis”, in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 5, pp. 175-207; I. S. Burnett and G. J. Bradley, (1995), “New Techniques for Multi-Prototype Waveform Coding at 2.84 kb/s”, IEEE ICASSP’95, pp. 261-263; I. S. Burnett and G. J. Bradley, (1995), “Low Complexity Decomposition and Coding of Prototype Waveforms”, IEEE Workshop on Speech Coding for Telecommunications, pp. 23-24; I. S. Burnett and D. H. Pham, (1997), “Multi-Prototype Waveform Coding using Frame-by-Frame Analysis-by-Synthesis”, IEEE ICASSP’97, pp. 1567-1570; W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), “A Low-Complexity Waveform Interpolation Coder”, IEEE ICASSP’96, pp. 212-215; Y. Shoham, (1997), “Very Low Complexity Interpolative Speech Coding at 1.2 to 2.4 kbps”, IEEE ICASSP’97, pp. 1599-1602; Y. Shoham, (1999), “Low Complexity Speech Coding at 1.2 to 2.4 kbps Based on Waveform Interpolation”, International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341.
Also, since the sequence of SEW magnitudes evolves slowly, successive SEWs exhibit similarity, offering opportunities for redundancy removal. Additional forms of redundancy that may be exploited for coding efficiency are: (a) for a fixed SEW/REW decomposition filter, the mean SEW magnitude increases with the pitch period, and (b) the similarity between successive SEWs also increases with the pitch period. In this work we introduce a novel “dual predictive” AbS paradigm for quantizing the SEW magnitude that optimally exploits the information about the current quantized REW, the past quantized SEW, and the pitch, in order to predict the current SEW.
`
Introduction to REW Quantization
The REW represents the rapidly changing unvoiced attribute of speech. Commonly in WI systems, the REW is quantized on a waveform-by-waveform basis. Hence, for low rate WI systems having a long frame size and a large number of waveforms per frame, the relative bitrate required for the REW becomes excessive. For example, consider a potential 2 kbps system which uses a 240 sample frame, 12 waveforms per frame, and which quantizes the REW by alternating bit allocation of 3 bits and 1 bit per waveform. The REW bitrate is then 24 bits per frame (12 waveforms at an average of 2 bits each), or 800 bps for a 30 ms frame (240 samples at an 8 kHz sampling rate), which is 40% of the total bitrate. This example demonstrates the need for a more efficient REW quantization.
Efficient REW quantization can benefit from two observations: (1) the REW magnitude is typically an increasing function of the frequency, which suggests that an efficient parametric representation may be used; (2) one can observe a similarity between successive REW magnitude spectra, which may suggest a potential gain by employing predictive VQ on a group of adjacent REWs. The next two sections propose a REW parametric representation and its respective VQ.

REW Parametric Representation
Direct quantization of the REW magnitude is a variable dimension quantization problem, which may result in spending bits and computational effort on perceptually irrelevant information. A simple and practical way to obtain a reduced, and fixed, dimension representation of the REW is with a linear combination of basis functions, such as orthonormal polynomials; see W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), IEEE ICASSP’96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP’97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341. Such a representation usually produces a smoother REW magnitude, and improves the perceptual quality. Suppose the REW magnitude, $R(\omega)$, is represented by a linear combination of orthonormal functions, $\psi_i(\omega)$:

$$R(\omega) = \sum_{i=0}^{I-1} \gamma_i\,\psi_i(\omega), \qquad 0 \le \omega \le \pi \tag{1}$$

where $\omega$ is the angular frequency, and $I$ is the representation order. The REW magnitude is typically an increasing function of frequency, which can be coarsely quantized with a low number of bits per waveform without significant perceptual degradation. Therefore, it may be advantageous to represent the REW magnitude in a simple, but perceptually relevant manner. Consequently we model the REW by the following parametric representation, $\hat{R}(\omega,\xi)$:

$$\hat{R}(\omega,\xi) = \sum_{i=0}^{I-1} \hat{\gamma}_i(\xi)\,\psi_i(\omega), \qquad 0 \le \omega \le \pi;\ 0 \le \xi \le 1 \tag{2}$$

where $\hat{\gamma}(\xi) = [\hat{\gamma}_0(\xi), \ldots, \hat{\gamma}_{I-1}(\xi)]^T$ is a parametric vector in the representation model subspace, and $\xi$ is the “unvoicing” parameter which is zero for a fully voiced spectrum, and one for a fully unvoiced spectrum. Thus $\hat{R}(\omega,\xi)$ defines a two-dimensional surface whose cross sections for each value of $\xi$ give a particular REW magnitude spectrum, which is defined merely by specifying a scalar parameter value.
A simple and practical way for parametric representation of the REW is, for example, by a parametric linear combination of basis functions, such as polynomials with parametric coefficients, namely:

$$\hat{R}(\omega,\xi) = \sum_{i=0}^{I-1} \hat{\gamma}_i(\xi)\,\psi_i(\omega), \qquad 0 \le \omega \le \pi;\ 0 \le \xi \le 1 \tag{3}$$

For practical considerations assume that the parametric representation is a piecewise linear function of $\xi$, and may therefore be represented by a set of $N$ uniformly spaced spectra, as illustrated in FIG. 1.
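As a rough illustration of the piecewise linear surface described above, the following sketch evaluates the coefficient vector for a given unvoicing parameter by linearly interpolating between stored coefficient vectors. It is a minimal sketch under stated assumptions, not the patent's implementation: the names `interp_coeffs`, `rew_magnitude`, `TABLE`, and `BASIS` are hypothetical, the grid is assumed to be $\xi_n = n/(N-1)$, and the two-term monomial basis is only a placeholder (it is not orthonormal).

```python
from typing import Callable, List, Sequence


def interp_coeffs(codebook: Sequence[Sequence[float]], xi: float) -> List[float]:
    """Coefficient vector at xi on a piecewise linear surface.

    codebook[n] is the coefficient vector of the n-th stored spectrum,
    assumed to sit at xi_n = n / (N - 1) on a uniform grid over [0, 1].
    """
    n_spectra = len(codebook)
    if n_spectra < 2:
        raise ValueError("need at least two stored spectra")
    if not 0.0 <= xi <= 1.0:
        raise ValueError("xi must lie in [0, 1]")
    delta = 1.0 / (n_spectra - 1)                 # uniform grid spacing
    n = min(int(xi / delta) + 1, n_spectra - 1)   # index of the upper endpoint
    alpha = (xi - (n - 1) * delta) / delta        # position inside the segment
    lower, upper = codebook[n - 1], codebook[n]
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(lower, upper)]


def rew_magnitude(coeffs: Sequence[float], omega: float,
                  basis: Sequence[Callable[[float], float]]) -> float:
    """Evaluate the modeled magnitude as sum_i coeffs[i] * basis[i](omega)."""
    return sum(c * psi(omega) for c, psi in zip(coeffs, basis))


# Hypothetical 3-spectrum table with a placeholder 2-term basis {1, omega}.
TABLE = [[0.0, 0.0], [1.0, 2.0], [2.0, 2.0]]
BASIS = [lambda w: 1.0, lambda w: w]
```

Storing only coefficient vectors keeps the table at $N \times I$ values, and each evaluation touches just the two vectors that bracket $\xi$.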
`
REW Parametric Vector Quantization
One can observe the similarity between successive REW magnitude spectra, which may suggest a potential gain by VQ of a set of successive REWs. FIG. 2 illustrates a simple parametric VQ system for a vector of REW spectra. The input is an M-dimensional vector of REW magnitude spectra,

$$\bar{R}(\omega) = [R_1(\omega), R_2(\omega), \ldots, R_M(\omega)]^T \tag{4}$$

`
and the VQ output is an index, $j$, which determines a quantized parameter vector, $\hat{\xi}$:

$$\hat{\xi} = [\hat{\xi}_1, \hat{\xi}_2, \ldots, \hat{\xi}_M]^T \tag{5}$$

which parametrically determines a vector of quantized spectra:

$$\hat{\bar{R}}(\omega,\hat{\xi}) = [\hat{R}(\omega,\hat{\xi}_1), \hat{R}(\omega,\hat{\xi}_2), \ldots, \hat{R}(\omega,\hat{\xi}_M)]^T \tag{6}$$

The encoder searches, in the parameter codebook $C_q(\xi)$, for the parameter vector which minimizes the distortion:

$$\hat{\xi} = \arg\min_{\xi \in C_q(\xi)} \sum_{m=1}^{M} D\big(R_m, \hat{R}(\xi_m)\big) \tag{7}$$

For example, suppose the input REW magnitude is represented by an $I$-dimensional vector of function coefficients, $\gamma$, given by:

$$\gamma = [\gamma_0, \ldots, \gamma_{I-1}]^T \tag{8}$$

For a set of $M$ input REWs, each of which is represented by a vector of polynomial coefficients, $\gamma_m$, the vectors form an $I \times M$ input coefficient matrix, $\Gamma$:

$$\Gamma = [\gamma_1, \gamma_2, \ldots, \gamma_M] \tag{9}$$

The inverse VQ output is a vector of $M$ quantized REWs, which form the quantized function coefficient matrix:

$$\hat{\Gamma}(\hat{\xi}) = [\hat{\gamma}(\hat{\xi}_1), \hat{\gamma}(\hat{\xi}_2), \ldots, \hat{\gamma}(\hat{\xi}_M)] \tag{10}$$

which is used by the decoder to compute the quantized spectra.
A. Quantization Using Orthonormal Functions
Orthonormal functions, such as polynomials, may be used for efficient quantization of the REW; see W. B. Kleijn, et al., (1996), IEEE ICASSP’96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP’97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341. Consider REW magnitude, $R(\omega)$, represented by a linear combination of orthonormal functions, $\psi_i(\omega)$:

$$R(\omega) = \sum_{i=0}^{I-1} \gamma_i\,\psi_i(\omega), \qquad 0 \le \omega \le \pi \tag{11}$$

which is modeled using the parametric representation:

$$\hat{R}(\omega,\xi) = \sum_{i=0}^{I-1} \hat{\gamma}_i(\xi)\,\psi_i(\omega), \qquad 0 \le \omega \le \pi;\ 0 \le \xi \le 1 \tag{12}$$

The quantized REW parameter is then given by:

$$\hat{\xi} = \arg\min_{\xi \in C_q(\xi)} \int_0^{\pi} \big|R(\omega) - \hat{R}(\omega,\xi)\big|^2\,d\omega = \arg\min_{\xi \in C_q(\xi)} \sum_{i=0}^{I-1} \big(\gamma_i - \hat{\gamma}_i(\xi)\big)^2 \tag{13}$$

In the VQ case, the quantized parameter vector is given by:

$$\hat{\xi} = \arg\min_{\xi \in C_q(\xi)} \sum_{m=1}^{M} \big\|\gamma_m - \hat{\gamma}(\xi_m)\big\|^2 \tag{14}$$

B. Piecewise Linear Parametric Representation
In order to have a simple representation that is computationally efficient and avoids excessive memory requirements, we model the two-dimensional surface by a piecewise linear parametric representation. Therefore, we introduce a set of $N$ uniformly spaced spectra, $\{\hat{R}(\omega,\hat{\xi}_n)\}_{n=0}^{N-1}$. Then the parametric surface is defined by linear interpolation according to:

$$\hat{R}(\omega,\xi) = (1-\alpha)\,\hat{R}(\omega,\hat{\xi}_{n-1}) + \alpha\,\hat{R}(\omega,\hat{\xi}_n); \qquad \hat{\xi}_{n-1} \le \xi \le \hat{\xi}_n,\quad \alpha = \frac{\xi - \hat{\xi}_{n-1}}{\Delta},\quad \Delta = \hat{\xi}_n - \hat{\xi}_{n-1} \tag{15}$$

Because this representation is linear, the coefficients of $\hat{R}(\omega,\xi)$ are linear combinations of the coefficients of $\hat{R}(\omega,\hat{\xi}_{n-1})$ and $\hat{R}(\omega,\hat{\xi}_n)$. Hence,

$$\hat{\gamma}(\xi) = (1-\alpha)\,\hat{\gamma}_{n-1} + \alpha\,\hat{\gamma}_n \tag{16}$$

where $\hat{\gamma}_n$ is the coefficient vector of the $n$-th REW magnitude function representation:

$$\hat{\gamma}_n = \hat{\gamma}(\hat{\xi}_n) \tag{17}$$

In this case, the distortion may be interpolated by:

$$D\big(R, \hat{R}(\xi)\big) = \int_0^{\pi} \big|R(\omega) - (1-\alpha)\,\hat{R}(\omega,\hat{\xi}_{n-1}) - \alpha\,\hat{R}(\omega,\hat{\xi}_n)\big|^2\,d\omega = \big\|\gamma - (1-\alpha)\,\hat{\gamma}_{n-1} - \alpha\,\hat{\gamma}_n\big\|^2 \tag{18}$$

The above can be easily generalized to the parameter VQ case. The optimal interpolation factor that minimizes the distortion between two representation vectors is given by:

$$\alpha_{opt} = \frac{(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T (\gamma - \hat{\gamma}_{n-1})}{\big\|\hat{\gamma}_n - \hat{\gamma}_{n-1}\big\|^2} \tag{19}$$

and the respective optimal parameter value, which is a continuous variable between zero and one, is given by:

$$\xi(\gamma) = (1-\alpha_{opt})\,\hat{\xi}_{n-1} + \alpha_{opt}\,\hat{\xi}_n \tag{20}$$

This result allows a rapid search for the best unvoicing parameter value needed to transform the coefficient vector to a scalar parameter, followed by the corresponding quantization scheme, as described in the section 4.
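The rapid search described above can be sketched as a loop over the segments of the piecewise linear surface: for each segment, project the input coefficient vector onto the line between the two stored vectors, then keep the segment with the smallest squared error. This is a minimal sketch, assuming a uniform grid $\hat{\xi}_n = n/(N-1)$; the function name `best_unvoicing_parameter` is hypothetical, and clamping the interpolation factor to $[0, 1]$ is an assumption made here so that the result stays inside the segment.

```python
from typing import Sequence, Tuple


def best_unvoicing_parameter(gamma: Sequence[float],
                             codebook: Sequence[Sequence[float]]
                             ) -> Tuple[float, float]:
    """Return (xi, distortion) minimizing ||gamma - gamma_hat(xi)||^2.

    codebook[n] is the stored coefficient vector at xi_n = n / (N - 1).
    """
    n_spectra = len(codebook)
    if n_spectra < 2:
        raise ValueError("need at least two stored spectra")
    delta = 1.0 / (n_spectra - 1)
    best_xi, best_dist = 0.0, float("inf")
    for n in range(1, n_spectra):
        lower, upper = codebook[n - 1], codebook[n]
        diff = [u - l for u, l in zip(upper, lower)]    # segment direction
        resid = [g - l for g, l in zip(gamma, lower)]   # input minus lower end
        denom = sum(d * d for d in diff)
        # Least-squares interpolation factor for this segment.
        alpha = sum(d * r for d, r in zip(diff, resid)) / denom if denom > 0 else 0.0
        alpha = min(1.0, max(0.0, alpha))               # assumed clamp to the segment
        dist = sum((r - alpha * d) ** 2 for r, d in zip(resid, diff))
        if dist < best_dist:
            best_xi, best_dist = (n - 1 + alpha) * delta, dist
    return best_xi, best_dist
```

Each segment costs two inner products of length $I$, so the whole search is $O(NI)$ per input waveform.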
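C. Weighted Distortion Quantization
Commonly in speech coding, the magnitude is quantized using a weighted distortion measure. In this case the quantized REW parameter is then given by:

$$\hat{\xi} = \arg\min_{\xi \in C_q(\xi)} \int_0^{\pi} \big|R(\omega) - \hat{R}(\omega,\xi)\big|^2\,W(\omega)\,d\omega \tag{21}$$

and the orthonormal function simplification, given in equation (13), cannot be used. In this case, the weighted distortion between the input and the parametric representation modeled spectra is equal to:

$$D_w\big(R, \hat{R}(\xi)\big) = \big(\gamma - \hat{\gamma}(\xi)\big)^T\,\Psi\,\big(\gamma - \hat{\gamma}(\xi)\big) \tag{22}$$

where $\Psi$ is the weighted correlation matrix of the orthonormal functions, whose elements are:

$$\Psi_{i,j} = \int_0^{\pi} W(\omega)\,\psi_i(\omega)\,\psi_j(\omega)\,d\omega \tag{23}$$

$\gamma$ is the input coefficient vector, and $\hat{\gamma}(\xi)$ is the modeled parametric coefficient vector. In the VQ case, the quantized parameter vector is given by:

$$\hat{\xi} = \arg\min_{\xi \in C_q(\xi)} \sum_{m=1}^{M} \big(\gamma_m - \hat{\gamma}(\xi_m)\big)^T\,\Psi_m\,\big(\gamma_m - \hat{\gamma}(\xi_m)\big) \tag{24}$$

where $\Psi_m$ denotes the matrix $\Psi$ computed for the $m$-th waveform.
D. Weighted Distortion: Piecewise Linear Parametric Representation
Again, for practical considerations assume that the parametric representation is piecewise linear, and may be represented by a set of $N$ spectra, $\{\hat{R}(\omega,\hat{\xi}_n)\}_{n=0}^{N-1}$. For the piecewise linear representation, the interpolated quantized coefficient vector is:

$$\hat{\gamma}(\xi) = (1-\alpha)\,\hat{\gamma}_{n-1} + \alpha\,\hat{\gamma}_n \tag{25}$$

In the case where parameter VQ is employed, the interpolation allows for a substantial simplification of the search computations. In this case, the distortion can be interpolated:

$$D_w\big(R, \hat{R}(\xi)\big) = \big(\gamma - (1-\alpha)\,\hat{\gamma}_{n-1} - \alpha\,\hat{\gamma}_n\big)^T\,\Psi\,\big(\gamma - (1-\alpha)\,\hat{\gamma}_{n-1} - \alpha\,\hat{\gamma}_n\big) \tag{26}$$

Note that no benefit is obtained here by using orthonormal functions, therefore any function representation may be used. The above can be easily generalized to the parameter VQ case. The optimal parameter that minimizes the spectrally weighted distortion between two representation vectors is given by:

$$\alpha_{opt} = \frac{(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T\,\Psi\,(\gamma - \hat{\gamma}_{n-1})}{(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T\,\Psi\,(\hat{\gamma}_n - \hat{\gamma}_{n-1})} \tag{27}$$

and the respective optimal parameter value, which is a continuous variable between zero and one, is given by equation (20). This result allows a rapid search for the best unvoicing parameter value needed to transform the coefficient vector to a scalar parameter, for encoding or for VQ design. Alternatively, in order to eliminate using the matrix $\Psi$, the scalar product may be redefined to incorporate the time-varying spectral weighting. The respective orthonormal basis functions then satisfy:

$$\int_0^{\pi} W(\omega)\,\psi_i(\omega)\,\psi_j(\omega)\,d\omega = \delta(i-j) \tag{28}$$

where $\delta(i-j)$ denotes the Kronecker delta. The respective parameter vector is given by:

$$\gamma = \int_0^{\pi} W(\omega)\,R(\omega)\,\psi(\omega)\,d\omega \tag{29}$$

where $\psi(\omega) = [\psi_0, \psi_1, \ldots, \psi_{I-1}]^T$ is an $I$-dimensional vector of time-varying orthonormal functions.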
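To make the weighted case concrete, the sketch below computes a quadratic-form distortion with a weighting matrix and the interpolation factor that minimizes it along one segment. It is an illustrative sketch only: the helper names `quad_form`, `weighted_distortion`, and `weighted_alpha` are hypothetical, the matrix is passed in as a plain nested list, and it is assumed to be symmetric and positive semidefinite.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def quad_form(u: Sequence[float], psi: Matrix, v: Sequence[float]) -> float:
    """Compute u^T Psi v for small dense inputs."""
    return sum(u[i] * psi[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def weighted_distortion(gamma: Sequence[float], gamma_hat: Sequence[float],
                        psi: Matrix) -> float:
    """Weighted squared error (gamma - gamma_hat)^T Psi (gamma - gamma_hat)."""
    err = [g - h for g, h in zip(gamma, gamma_hat)]
    return quad_form(err, psi, err)


def weighted_alpha(gamma: Sequence[float], lower: Sequence[float],
                   upper: Sequence[float], psi: Matrix) -> float:
    """Interpolation factor minimizing the weighted distortion on one segment."""
    diff = [u - l for u, l in zip(upper, lower)]
    resid = [g - l for g, l in zip(gamma, lower)]
    denom = quad_form(diff, psi, diff)
    if denom <= 0.0:
        return 0.0          # degenerate segment: fall back to the lower endpoint
    return quad_form(diff, psi, resid) / denom
```

With an identity matrix the result reduces to the unweighted projection; a matrix that emphasizes some coefficients pulls the optimum toward matching those coefficients.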
REW Parameter Analysis-By-Synthesis VQ
This section presents the AbS VQ paradigm for the REW parameter. The first presentation is a system which quantizes the REW parameter by employing spectral based AbS. Then simplified systems, which apply AbS to the REW parameter, are presented.
A. REW Parameter Quantization by Magnitude AbS VQ
The novel Analysis-by-Synthesis (AbS) REW parameter VQ technique is illustrated in FIG. 3. An excitation vector $\hat{c}_{ij}(m)$ ($m = 1, \ldots, M$) is selected from the VQ codebook and is fed through a synthesis filter to obtain a synthesized quantized parameter vector $\hat{\xi}(m)$, which is then mapped to quantized representation coefficient vectors $\hat{\gamma}(\hat{\xi}(m))$. These are compared with a sequence of input representation coefficient vectors $\gamma(m)$, and each error is spectrally weighted. Each spectrally weighted error is then temporally weighted, and a distortion measure is obtained. A search through all candidate excitation vectors determines an optimal choice. The synthesis filter in FIG. 3 can be viewed as a first order predictor in a feedback loop. (While shown here is an auto-regressive synthesis filter, in other arrangements a moving-average (MA) synthesis filter may be used.) By allowing the value of the predictor parameter $P$ to change, it becomes a “switched-predictor” scheme. Switched-prediction is introduced to allow for different levels of REW parameter correlation.
The scheme incorporates both spectral weighting and temporal weighting. The spectral weighting is used for the distortion between each pair of input and quantized spectra. In order to improve SEW/REW mixing, particularly in mixed voiced and unvoiced speech segments, and to increase speech crispness, especially for plosives and onsets, temporal weighting is incorporated in the AbS REW VQ. The temporal weighting is a monotonic function of the temporal gain. Two codebooks are used, and each codebook has an associated predictor coefficient, $P_1$ and $P_2$. The quantization target is an M-dimensional vector of REW spectra. Each REW spectrum is represented by a vector of basis function coefficients denoted by $\gamma(m)$. The search for the minimal weighted mean squared error (WMSE) is performed over all the vectors, $\hat{c}_{ij}(m)$, of the two codebooks for $i = 1, 2$. The quantized REW function coefficients vector, $\hat{\gamma}(\hat{\xi}(m))$, is a function of the quantized parameter $\hat{\xi}(m)$, which is obtained by passing the quantized vector, $\hat{c}_{ij}(m)$, through the synthesis filter. The weighted distortion between each pair of input and quantized REW spectra is calculated. The total distortion is a temporally-weighted sum of the M spectrally weighted distortions. Since the predictor coefficients are known, direct VQ can be used to simplify the computations. For a piecewise linear parametric REW representation, a substantial simplification of the search computations may be obtained by interpolating the distortion between the representation spectra set, as explained in sections 3.B. and 3.D.
A sequence of quantized parameters, such as $\hat{c}(k)$, is formed by concatenating successive quantized vectors, such as $\{\hat{c}_{ij}(m)\}_{m=1}^{M}$. The quantized parameter is computed recursively by:

$$\hat{\xi}(k) = P\,\hat{\xi}(k-1) + \hat{c}(k) \tag{30}$$

where $k$ is the time index of the coded waveform and $P$ is the predictor coefficient associated with the selected codebook.
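The search loop described in this section can be sketched as follows: every excitation vector of every codebook is run through a first order recursive synthesis filter with that codebook's predictor coefficient, and the candidate with the smallest temporally weighted distortion wins. This is a sketch under stated assumptions rather than the patented encoder: `abs_vq_search` and its arguments are hypothetical names, and the per-waveform distortion is supplied by the caller as `dist_fn(m, xi_hat)`, which in a full scheme would map the synthesized parameter to a spectrum and apply spectral weighting.

```python
from typing import Callable, List, Sequence, Tuple


def abs_vq_search(prev_xi: float,
                  codebooks: Sequence[Sequence[Sequence[float]]],
                  predictors: Sequence[float],
                  temporal_weights: Sequence[float],
                  dist_fn: Callable[[int, float], float]
                  ) -> Tuple[int, int, List[float], float]:
    """Exhaustive analysis-by-synthesis search over switched codebooks.

    codebooks[i][j] is an M-dimensional excitation vector tied to
    predictors[i]. Returns (i, j, synthesized parameters, distortion).
    """
    best = None
    for i, (book, p) in enumerate(zip(codebooks, predictors)):
        for j, excitation in enumerate(book):
            state, synthesized, total = prev_xi, [], 0.0
            for m, c in enumerate(excitation):
                state = p * state + c              # first order recursive synthesis
                synthesized.append(state)
                total += temporal_weights[m] * dist_fn(m, state)
            if best is None or total < best[3]:
                best = (i, j, synthesized, total)
    if best is None:
        raise ValueError("codebooks are empty")
    return best
```

The decoder would need only the pair of indices and the previous filter state to rebuild the same parameter sequence, since it can run the same recursion.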
B. Simplified REW Parameter AbS VQ
The above scheme maps each quantized parameter to a coefficient vector, which is used to compute the spectral distortion. This mapping and the spectral distortion computation contribute to the complexity of the scheme, and both may be eliminated by using the simplified scheme described below. For a high rate, and a smooth representation surface $\hat{R}(\omega,\xi)$, the total distortion is equal to the sum of modeling distortion and quantization distortion:

$$\sum_{m=1}^{M} D_w\big(R_m, \hat{R}(\hat{\xi}(m))\big) = \sum_{m=1}^{M} D_w\big(R_m, \hat{R}(\xi(m))\big) + \sum_{m=1}^{M} D_w\big(\hat{R}(\xi(m)), \hat{R}(\hat{\xi}(m))\big) \tag{31}$$

The quantization distortion is related to the quantized parameter by:

$$\sum_{m=1}^{M} D_w\big(\hat{R}(\xi(m)), \hat{R}(\hat{\xi}(m))\big) = \sum_{m=1}^{M} \big[\hat{\gamma}(\xi(m)) - \hat{\gamma}(\hat{\xi}(m))\big]^T\,\Psi(m)\,\big[\hat{\gamma}(\xi(m)) - \hat{\gamma}(\hat{\xi}(m))\big] \tag{32}$$

which, for the piecewise linear representation case, is equal to

$$\sum_{m=1}^{M} \frac{\big[\xi(m) - \hat{\xi}(m)\big]^2}{\Delta^2}\,(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T\,\Psi(m)\,(\hat{\gamma}_n - \hat{\gamma}_{n-1}) \tag{33}$$

which is linearly related to the REW parameter squared quantization error, $[\xi(m) - \hat{\xi}(m)]^2$, and, therefore, justifies direct VQ of the REW parameter.
B.1. Simplified REW Parameter AbS VQ: Non Weighted Distortion
FIG. 4 illustrates a simplified AbS VQ for the REW parametric representation. The encoder maps the REW magnitude to an unvoicing REW parameter, and then quantizes the parameter by AbS VQ. Initially, the magnitudes of the $M$ REWs in the frame are mapped to coefficient vectors, $\{\gamma(m)\}_{m=1}^{M}$. Then, for each coefficient vector, a search is performed to find the optimal representation parameter, $\xi(\gamma)$, using equation (20), to form an M-dimensional parameter vector for the current frame, $\{\xi(\gamma(m))\}_{m=1}^{M}$. Finally, the parameter vector is encoded by AbS VQ. The decoded spectra, $\{\hat{R}(\omega,\hat{\xi}(m))\}_{m=1}^{M}$, are obtained from the quantized parameter vector, $\{\hat{\xi}(m)\}_{m=1}^{M}$, using equation (15). This scheme allows for higher temporal, as well as spectral, REW resolution compared to the common method described in W. B. Kleijn, et al., IEEE ICASSP’95, pp. 508-511 (1995), since no downsampling is performed, and the continuous parameter is vector quantized in AbS.
B.2. Simplified REW Parameter AbS VQ: Weighted Distortion
The simplified quantization scheme is improved to incorporate spectral and temporal weightings, as illustrated in FIG. 5. The REW coefficient vector is first mapped to a REW parameter by minimizing a distortion, which is weighted by the coefficient spectral weighting matrix $\Psi$, as described in section 3.D. Then, the resulting REW parameter is used to compute a weighting, $w_s(\xi(m))$, which we choose to be the spectral sensitivity to the REW parameter squared quantization error, $[\xi(m) - \hat{\xi}(m)]^2$, given by:

$$w_s(\xi(m)) = \left.\frac{\partial D_w\big(\hat{R}(\xi), \hat{R}(\hat{\xi})\big)}{\partial (\xi - \hat{\xi})^2}\right|_{\xi = \xi(m)} \tag{34}$$

For the piecewise linear representation case, using equation (33), the following equation is obtained:

$$w_s(\xi(m)) = \left[\frac{\partial \hat{\gamma}(\xi)}{\partial \xi}\right]^T \Psi(m) \left[\frac{\partial \hat{\gamma}(\xi)}{\partial \xi}\right]_{\xi = \xi(m)} = \frac{1}{\Delta^2}\,(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T\,\Psi(m)\,(\hat{\gamma}_n - \hat{\gamma}_{n-1}) \tag{35}$$

The above derivative can be easily computed off line. Additionally, a temporal weighting, in the form of a monotonic function of the gain, denoted by $w_t(g(m))$, is used to give relatively large weight to waveforms with larger gain values.
`
`

`
The AbS REW parameter quantization is computed by minimizing the combined spectrally and temporally weighted distortion:

$$\{\hat{\xi}(m)\}_{m=1}^{M} = \arg\min_{\hat{c}_{ij}} \sum_{m=1}^{M} w_t(g(m))\,w_s(\xi(m))\,\big[\xi(m) - \hat{\xi}(m)\big]^2 \tag{36}$$

The weighted distortion scheme improves the reconstructed speech quality, most notably in mixed voiced and unvoiced speech segments. This may be explained by an improvement in REW/SEW mixing.
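The combined weighting can be illustrated with a short sketch that computes a per-waveform spectral sensitivity from the segment of a piecewise linear table that contains the parameter, then forms a weighted sum of squared parameter errors. All names here (`spectral_sensitivity`, `weighted_param_distortion`, `temporal_weight`) are hypothetical, the grid is assumed uniform on $[0, 1]$, and the temporal weight function is left to the caller, since the text only requires it to be monotonic in the gain.

```python
from typing import Callable, Sequence

Matrix = Sequence[Sequence[float]]


def spectral_sensitivity(xi: float, codebook: Sequence[Sequence[float]],
                         psi: Matrix) -> float:
    """(1 / delta^2) * d^T Psi d, with d the coefficient step of xi's segment."""
    n_spectra = len(codebook)
    delta = 1.0 / (n_spectra - 1)
    n = min(int(xi / delta) + 1, n_spectra - 1)     # upper endpoint of the segment
    d = [u - l for u, l in zip(codebook[n], codebook[n - 1])]
    quad = sum(d[i] * psi[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    return quad / (delta * delta)


def weighted_param_distortion(xi: Sequence[float], xi_hat: Sequence[float],
                              gains: Sequence[float],
                              codebook: Sequence[Sequence[float]],
                              psis: Sequence[Matrix],
                              temporal_weight: Callable[[float], float]) -> float:
    """Sum over waveforms of w_t(gain) * w_s(xi) * (xi - xi_hat)^2."""
    return sum(temporal_weight(g) * spectral_sensitivity(x, codebook, psi)
               * (x - xh) ** 2
               for x, xh, g, psi in zip(xi, xi_hat, gains, psis))
```

Because the sensitivity depends only on the segment and the weighting matrix, the segment steps can be tabulated ahead of time and the search itself works on scalars.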
Dual Predictive AbS SEW Quantization
FIG. 6 illustrates a Dual Predictive SEW AbS VQ scheme which uses two observables, (a) the quantized REW, and (b) the past quantized SEW, to jointly predict the current SEW. Although we refer to the operator on each observable as a “predictor”, in fact both are components of a single optimized estimator. The SEW and the REW are complex random vectors, and their sum is a residual vector having elements whose magnitudes have a mean value of unity. In low bit-rate WI coding, the relation between the SEW and the REW magnitudes was approximated by computing the magnitude of one as the unity complement of the other.
`Suppose lrMl denotes the spectral magnitude vector
