US007584095B2

(12) United States Patent
     Gottesman et al.

(10) Patent No.: US 7,584,095 B2
(45) Date of Patent: *Sep. 1, 2009

(54) REW PARAMETRIC VECTOR QUANTIZATION AND DUAL-PREDICTIVE SEW VECTOR QUANTIZATION FOR WAVEFORM INTERPOLATIVE CODING

(75) Inventors: Oded Gottesman, Goleta, CA (US); Allen Gersho, Goleta, CA (US)

(73) Assignee: The Regents of the University of California, Oakland, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 98 days.
    This patent is subject to a terminal disclaimer.

(21) Appl. No.: 11/234,631

(22) Filed: Sep. 23, 2005

(65) Prior Publication Data
     US 2006/0069554 A1    Mar. 30, 2006

     Related U.S. Application Data

(62) Division of application No. 09/811,187, filed on Mar. 16, 2001, now Pat. No. 7,010,482.

(60) Provisional application No. 60/190,371, filed on Mar. 17, 2000.

(51) Int. Cl.
     G10L 19/14 (2006.01)

(52) U.S. Cl. 704/222

(58) Field of Classification Search 704/222
     See application file for complete search history.

(56) References Cited

     U.S. PATENT DOCUMENTS

     5,924,061 A *    7/1999  Shoham                  704/218
     6,418,408 B1 *   7/2002  Udaya Bhaskar et al.    704/219
     6,493,664 B1 *  12/2002  Udaya Bhaskar et al.    704/222
     6,691,092 B1 *   2/2004  Udaya Bhaskar et al.    704/265

     OTHER PUBLICATIONS

Oded Gottesman et al., “Enhancing Waveform Interpolative Coding With Weighted REW Parametric Quantization,” IEEE Workshop on Speech Coding (2000), pp. 1-3.
I.S. Burnett et al., “Multi-Prototype Waveform Coding Using Frame-By-Frame Analysis-By-Synthesis,” Department of Electrical and Computer Engineering, University of Wollongong, NSW, Australia (1997), pp. 1567-1570.
I.S. Burnett et al., “New Techniques for Multi-Prototype Waveform Coding at 2.84kb/s,” Department of Electrical and Computer Engineering, University of Wollongong, NSW, Australia (1995), pp. 261-264.
I.S. Burnett et al., “Low Complexity Decomposition and Coding of Prototype Waveforms,” Dept. of Electrical and Computer Eng., University of Wollongong, NSW, 2522, Australia, pp. 23-24.

(Continued)

Primary Examiner: Susan McFadden
(74) Attorney, Agent, or Firm: Berliner & Associates

(57) ABSTRACT

An enhanced analysis-by-synthesis waveform interpolative speech coder able to operate at 2.8 kbps. Novel features include dual-predictive analysis-by-synthesis quantization of the slowly-evolving waveform, efficient parametrization of the rapidly-evolving waveform magnitude, and analysis-by-synthesis vector quantization of the rapidly evolving waveform parameter. Subjective quality tests indicate that its quality exceeds that of G.723.1 at 5.3 kbps and of G.723.1 at 6.3 kbps.

18 Claims, 6 Drawing Sheets

[Representative drawing: block diagram, not recoverable from the scan. Legible labels: VQ, VQ-1, VECTOR QUANTIZER CODEBOOK, VECTOR OF REW SPECTRA, VECTOR OF QUANTIZED REW SPECTRA.]

IPR2017-01075
Saint Lawrence Communications
Exhibit 2017

OTHER PUBLICATIONS

I.S. Burnett et al., “A Mixed Prototype Waveform/CELP Coder for Sub 3KB/S,” School of Electronic and Electrical Engineering, University of Bath, U.K. BA2 7AY (1993), pp. II-175-II-178.
Oded Gottesman, “Dispersion Phase Vector Quantization for Enhancement of Waveform Interpolative Coder,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-4.
Oded Gottesman et al., “Enhanced Waveform Interpolative Coding at 4 KBPS,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-3.
Oded Gottesman et al., “High Quality Enhanced Waveform Interpolative Coding at 2.8 KBPS,” IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 1-4.
Oded Gottesman et al., “Enhanced Analysis-by-Synthesis Waveform Interpolative Coding at 4 KBPS,” Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, California 93106, USA, pp. 1-4.
Daniel W. Griffin et al., “Multiband Excitation Vocoder,” IEEE Transactions on Acoustics, Speech, and Signal Processing (1988) 36(8):1223-1235.
W. Bastiaan Kleijn et al., “A Speech Coder Based on Decomposition of Characteristic Waveforms,” IEEE (1995), pp. 508-511.
W. Bastiaan Kleijn et al., “Waveform Interpolation for Coding and Synthesis,” Speech Coding and Synthesis (1995), pp. 175-207.
W. Bastiaan Kleijn et al., “Transformation and Decomposition of the Speech Signal for Coding,” IEEE Signal Processing Letters 1(9):136-138 (1994).
W. Bastiaan Kleijn, “Encoding Speech Using Prototype Waveforms,” IEEE Transactions on Speech and Audio Processing 1(4):386-399 (1993).
W. Bastiaan Kleijn, “Continuous Representations in Linear Predictive Coding,” Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974 (1991), pp. 201-204.
W. Bastiaan Kleijn et al., “A Low-Complexity Waveform Interpolation Coder,” Speech Coding Research Department, AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA (1996), pp. 212-215.
R.J. McAulay et al., “Sinusoidal Coding,” Speech Coding and Synthesis 4:121-173 (1995).
Yair Shoham, “High-Quality Speech Coding at 2.4 to 4.0 KBPS Based on Time-Frequency Interpolation,” IEEE, pp. II-167-II-170 (1993).
Yair Shoham, “Very Low Complexity Interpolative Speech Coding at 1.2 to 2.4 KBPS,” IEEE, pp. 1599-1602 (1997).
Yair Shoham, “Low Complexity Speech Coding at 1.2 to 2.4 kbps Based on Waveform Interpolation,” International Journal of Speech Technology 2:329-341 (1999).

* cited by examiner

U.S. Patent    Sep. 1, 2009    Sheet 1 of 6    US 7,584,095 B2

[Drawing sheet 1 of 6: figure content not recoverable from the scan. Legible labels: REW PARAMETER, VECTOR OF REW SPECTRA.]

[Drawing sheet 2 of 6: figure text not recoverable from the scan.]

[Drawing sheet 3 of 6: figure text not recoverable from the scan.]

[Drawing sheet 4 of 6: figure text not recoverable from the scan.]

[Drawing sheet 5 of 6: two plots. Legible labels: curves OUTPUT SEW and MEAN-REMOVED SEW; legend HARMONICS RANGE with bins 9-14, 15-19, 20-24, 25-29, 30-35, 36-69; categories VOICED, INTERMEDIATE, UNVOICED.]

[Drawing sheet 6 of 6: FIG. 9 (legend HARMONICS RANGE with bins 9-14, 15-19, 20-24, 25-29, 30-35; categories VOICED, INTERMEDIATE, UNVOICED) and a three-panel plot with panels VOICED RANGE, INTERMEDIATE RANGE, and UNVOICED RANGE, each showing REW PREDICTOR and SEW PREDICTOR curves over a HARMONICS axis.]

REW PARAMETRIC VECTOR QUANTIZATION AND DUAL-PREDICTIVE SEW VECTOR QUANTIZATION FOR WAVEFORM INTERPOLATIVE CODING

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of Provisional Patent Application No. 60/190,371 filed Mar. 17, 2000, which application is herein incorporated by reference. This application is a divisional of U.S. patent application Ser. No. 09/811,187, filed Mar. 16, 2001, now U.S. Pat. No. 7,010,482.

BACKGROUND OF THE INVENTION

The present invention relates to vector quantization (VQ) in speech coding systems using waveform interpolation.

In recent years, there has been increasing interest in achieving toll-quality speech coding at rates of 4 kbps and below. Currently, there is an ongoing 4 kbps standardization effort conducted by an international standards body (the International Telecommunication Union Telecommunication Standardization Sector (ITU-T)). The expanding variety of emerging applications for speech coding, such as third generation wireless networks and Low Earth Orbit (LEO) systems, is motivating increased research efforts. The speech quality produced by waveform coders such as code-excited linear prediction (CELP) coders degrades rapidly at rates below 5 kbps; see B. S. Atal and M. R. Schroeder, (1984), “Stochastic Coding of Speech at Very Low Bit Rate”, Proc. Int. Conf. Comm., Amsterdam, pp. 1610-1613.

On the other hand, parametric coders, such as the waveform-interpolative (WI) coder, the sinusoidal-transform coder (STC), and the multiband-excitation (MBE) coder, produce good quality at low rates but they do not achieve toll quality; see Y. Shoham, IEEE ICASSP'93, Vol. II, pp. 167-170 (1993); I. S. Burnett and R. J. Holbeche, (1993), IEEE ICASSP'93, Vol. II, pp. 175-178; W. B. Kleijn, (1993), IEEE Trans. Speech and Audio Processing, Vol. 1, No. 4, pp. 386-399; W. B. Kleijn and J. Haagen, (1994), IEEE Signal Processing Letters, Vol. 1, No. 9, pp. 136-138; W. B. Kleijn and J. Haagen, (1995), IEEE ICASSP'95, pp. 508-511; W. B. Kleijn and J. Haagen, (1995), in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 5, pp. 175-207; I. S. Burnett and G. J. Bradley, (1995), IEEE ICASSP'95, pp. 261-263; I. S. Burnett and G. J. Bradley, (1995), IEEE Workshop on Speech Coding for Telecommunications, pp. 23-24; I. S. Burnett and D. H. Pham, (1997), IEEE ICASSP'97, pp. 1567-1570; W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), IEEE ICASSP'96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP'97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341; R. J. McAulay and T. F. Quatieri, (1995), in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 4, pp. 121-173; and D. Griffin and J. S. Lim, (1988), IEEE Trans. ASSP, Vol. 36, No. 8, pp. 1223-1235. This is largely due to the lack of robustness of speech parameter estimation, which is commonly done in open-loop, and to inadequate modeling of non-stationary speech segments.

Commonly in WI coding, the similarity between successive rapidly evolving waveform (REW) magnitudes is exploited by downsampling and interpolation and by constrained bit allocation; see W. B. Kleijn and J. Haagen, (1995), IEEE ICASSP'95, pp. 508-511. In a previous Enhanced Waveform Interpolative (EWI) coder the REW magnitude was quantized on a waveform-by-waveform basis; see O. Gottesman and A. Gersho, (1999), “Enhanced Waveform Interpolative Coding at 4 kbps”, IEEE Speech Coding Workshop, pp. 90-92, Finland; O. Gottesman and A. Gersho, (1999), “Enhanced Analysis-by-Synthesis Waveform Interpolative Coding at 4 kbps”, EUROSPEECH'99, pp. 1443-1446, Hungary.

SUMMARY OF THE INVENTION

The present invention describes novel methods that enhance the performance of the WI coder and allow for better coding efficiency, improving on the above 1999 Gottesman and Gersho procedure. The present invention incorporates analysis-by-synthesis (AbS) for parameter estimation, offers higher temporal and spectral resolution for the REW, and more efficient quantization of the slowly-evolving waveform (SEW). In particular, the present invention proposes a novel efficient parametric representation of the REW magnitude, an efficient paradigm for AbS predictive VQ of the REW parameter sequence, and dual-predictive AbS quantization of the SEW.

More particularly, the invention provides a method for interpolative coding of input signals, the signals decomposed into or composed of a slowly evolving waveform and a rapidly evolving waveform having a magnitude, the method incorporating at least one, preferably a combination, or all of the following steps:
(a) AbS VQ of the REW;
(b) parametrizing the magnitude of the REW;
(c) incorporating temporal weighting in the AbS VQ of the REW;
(d) incorporating spectral weighting in the AbS VQ of the REW;
(e) applying a filter to a vector quantizer codebook in the analysis-by-synthesis vector quantization of the rapidly evolving waveform, whereby to add self correlation to the codebook vectors; and
(f) using a coder in which a plurality of bits therein are allocated to the rapidly evolving waveform magnitude.

In addition, one can combine AbS quantization of the slowly evolving waveform with any or all of the foregoing parameters.

The new method achieves a substantial reduction in the REW bit rate and the EWI achieves very close to toll quality, at least under clean speech conditions. These and other features, aspects, and advantages of the present invention will become better understood with regard to the following detailed description, appended claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a REW Parametric Representation;
FIG. 2 is a REW Parametric VQ;
FIG. 3 is a REW Parametric Representation AbS VQ;
FIG. 4 is a REW Parametric Representation Simplified AbS VQ;
FIG. 5 is a REW Parametric Representation Simplified Weighted AbS VQ;
FIG. 6 is a block diagram of the Dual Predictive AbS SEW vector quantization;
FIG. 7 is a Weighted Signal-to-Noise Ratio (SNR) for Dual Predictive AbS SEW VQ;
FIG. 8 is an output Weighted SNR for the 18 codebooks, 9-bit AbS SEW VQ;
FIG. 9 is a mean-removed SEW's Weighted SNR for the 18 codebooks, 9-bit AbS SEW VQ; and
FIG. 10 shows predictors for three REW parameter ranges.

DETAILED DESCRIPTION

In very low bit rate WI coding, the relation between the SEW and the REW magnitudes was exploited by computing the magnitude of one as the unity complement of the other; see W. B. Kleijn and J. Haagen, (1995), “A Speech Coder Based on Decomposition of Characteristic Waveforms”, IEEE ICASSP'95, pp. 508-511; W. B. Kleijn and J. Haagen, (1995), “Waveform Interpolation for Coding and Synthesis”, in Speech Coding and Synthesis by W. B. Kleijn and K. K. Paliwal, Elsevier Science B. V., Chapter 5, pp. 175-207; I. S. Burnett and G. J. Bradley, (1995), “New Techniques for Multi-Prototype Waveform Coding at 2.84 kb/s”, IEEE ICASSP'95, pp. 261-263; I. S. Burnett and G. J. Bradley, (1995), “Low Complexity Decomposition and Coding of Prototype Waveforms”, IEEE Workshop on Speech Coding for Telecommunications, pp. 23-24; I. S. Burnett and D. H. Pham, (1997), “Multi-Prototype Waveform Coding using Frame-by-Frame Analysis-by-Synthesis”, IEEE ICASSP'97, pp. 1567-1570; W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), “A Low-Complexity Waveform Interpolation Coder”, IEEE ICASSP'96, pp. 212-215; Y. Shoham, (1997), “Very Low Complexity Interpolative Speech Coding at 1.2 to 2.4 kbps”, IEEE ICASSP'97, pp. 1599-1602; Y. Shoham, (1999), “Low Complexity Speech Coding at 1.2 to 2.4 kbps Based on Waveform Interpolation”, International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341.

Also, since the sequence of SEW magnitudes evolves slowly, successive SEWs exhibit similarity, offering opportunities for redundancy removal. Additional forms of redundancy that may be exploited for coding efficiency are: (a) for a fixed SEW/REW decomposition filter, the mean SEW magnitude increases with the pitch period and (b) the similarity between successive SEWs also increases with the pitch period. In this work we introduce a novel “dual-predictive” AbS paradigm for quantizing the SEW magnitude that optimally exploits the information about the current quantized REW, the past quantized SEW, and the pitch, in order to predict the current SEW.
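
As a rough illustration of the two ideas in the paragraphs above, the sketch below first applies the unity-complement relation to obtain the SEW magnitude implied by a quantized REW magnitude, and then forms a prediction of the current SEW from that implied SEW and the past quantized SEW. The per-harmonic weights `p_rew` and `p_sew`, the function names, and the simple weighted-sum form are assumptions made for illustration only; the text here does not specify the predictor structure, and in a pitch-dependent scheme a different weight set would presumably be selected per pitch range.

```python
def implied_sew(rew_mag):
    """SEW magnitude implied by a REW magnitude via the unity complement.

    Each REW magnitude value is clamped to [0, 1] before complementing
    (the clamp is an assumption of this sketch).
    """
    return [1.0 - min(max(r, 0.0), 1.0) for r in rew_mag]


def dual_predict_sew(rew_hat, prev_sew_hat, p_rew, p_sew):
    """Hypothetical dual prediction of the current SEW magnitude.

    rew_hat      -- current quantized REW magnitude, one value per harmonic
    prev_sew_hat -- past quantized SEW magnitude, one value per harmonic
    p_rew, p_sew -- per-harmonic predictor weights (illustrative only)
    """
    from_rew = implied_sew(rew_hat)
    return [
        a * s_r + b * s_p
        for a, b, s_r, s_p in zip(p_rew, p_sew, from_rew, prev_sew_hat)
    ]


if __name__ == "__main__":
    # Two harmonics: the first predicted purely from the REW,
    # the second as an equal mix of both sources.
    print(dual_predict_sew([0.25, 0.5], [0.5, 0.5], [1.0, 0.5], [0.0, 0.5]))
```

In an analysis-by-synthesis quantizer, only the difference between the input SEW and such a prediction would need to be searched in the codebook.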

Introduction to REW Quantization

The REW represents the rapidly changing unvoiced attribute of speech. Commonly in WI systems, the REW is quantized on a waveform-by-waveform basis. Hence, for low rate WI systems having long frame size, and a large number of waveforms per frame, the relative bitrate required for the REW becomes significantly excessive. For example, consider a potential 2 kbps system which uses a 240 sample frame, 12 waveforms per frame, and which quantizes the REW by alternating bit allocation of 3 bits and 1 bit per waveform. The REW bitrate is then 24 bits per frame, or 800 bps (assuming 8 kHz sampling, i.e. 30 ms frames), which is 40% of the total bitrate. This example demonstrates the need for a more efficient REW quantization.

Efficient REW quantization can benefit from two observations: (1) the REW magnitude is typically an increasing function of the frequency, which suggests that an efficient parametric representation may be used; (2) one can observe a similarity between successive REW magnitude spectra, which may suggest a potential gain by employing predictive VQ on a group of adjacent REWs. The next two sections propose REW parametric representation, and its respective VQ.

REW Parametric Representation

Direct quantization of the REW magnitude is a variable dimension quantization problem, which may result in spending bits and computational effort on perceptually irrelevant information. A simple and practical way to obtain a reduced, and fixed, dimension representation of the REW is with a linear combination of basis functions, such as orthonormal polynomials; see W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, (1996), IEEE ICASSP'96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP'97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341. Such a representation usually produces a smoother REW magnitude, and improves the perceptual quality. Suppose the REW magnitude, $R(\omega)$, is represented by a linear combination of orthonormal functions, $\psi_i(\omega)$:

$$R(\omega) = \sum_{i=0}^{I-1} \gamma_i \, \psi_i(\omega), \quad 0 \le \omega \le \pi \tag{1}$$

where $\omega$ is the angular frequency, and $I$ is the representation order. The REW magnitude is typically an increasing function of frequency, which can be coarsely quantized with a low number of bits per waveform without significant perceptual degradation. Therefore, it may be advantageous to represent the REW magnitude in a simple, but perceptually relevant manner. Consequently we model the REW by the following parametric representation, $\hat{R}(\omega, \xi)$:

$$\hat{R}(\omega, \xi) = \sum_{i=0}^{I-1} \hat{\gamma}_i(\xi) \, \psi_i(\omega), \quad 0 \le \omega \le \pi; \; 0 \le \xi \le 1 \tag{2}$$

where $\hat{\gamma}(\xi) = [\hat{\gamma}_0(\xi), \ldots, \hat{\gamma}_{I-1}(\xi)]^T$ is a parametric vector of coefficients within the representation model subspace, and $\xi$ is the “unvoicing” parameter which is zero for a fully voiced spectrum, and one for a fully unvoiced spectrum. Thus $\hat{R}(\omega, \xi)$ defines a two-dimensional surface whose cross sections for each value of $\xi$ give a particular REW magnitude spectrum, which is defined merely by specifying a scalar parameter value.

A simple and practical way for parametric representation of the REW is, for example, by a parametric linear combination of basis functions, such as polynomials with parametric coefficients, namely:

[Equation (3) is not recoverable from the scan.]

For practical considerations assume that the parametric representation is a piecewise linear function of $\xi$, and may therefore be represented by a set of N uniformly spaced spectra, as illustrated in FIG. 1.

REW Parametric Vector Quantization

One can observe the similarity between successive REW magnitude spectra, which may suggest a potential gain by VQ of a set of successive REWs. FIG. 2 illustrates a simple parametric VQ system for a vector of REW spectra. The input is an M dimensional vector of REW magnitude spectra,

[Equation (4) is not recoverable from the scan.]

and the VQ output is an index, j, which determines a quantized parameter vector, $\hat{\xi}$:

$$\hat{\xi} = [\hat{\xi}_1, \hat{\xi}_2, \ldots, \hat{\xi}_M]^T \tag{5}$$

which parametrically determines a vector of quantized spectra:

$$\hat{R}(\omega) = \hat{R}(\omega, \hat{\xi}) = [\hat{R}(\omega, \hat{\xi}_1), \hat{R}(\omega, \hat{\xi}_2), \ldots, \hat{R}(\omega, \hat{\xi}_M)]^T \tag{6}$$

The encoder searches, in the parameter codebook $C_q(\xi)$, for the parameter vector which minimizes the distortion:

$$\hat{\xi} = \underset{\xi \in C_q(\xi)}{\arg\min} \sum_{m=1}^{M} D\big(R_m, \hat{R}(\xi_m)\big) = \underset{\xi \in C_q(\xi)}{\arg\min} \sum_{m=1}^{M} \int_0^{\pi} \big| R_m(\omega) - \hat{R}(\omega, \xi_m) \big|^2 \, d\omega \tag{7}$$

For example, suppose the input REW magnitude is represented by an I-th dimensional vector of function coefficients, $\gamma$, given by:

$$\gamma = [\gamma_0, \gamma_1, \ldots, \gamma_{I-1}]^T \tag{8}$$

For a set of M input REWs, each of which is represented by a vector of polynomial coefficients, $\gamma_m$, the vectors form a P×M input coefficient matrix, $\Gamma$:

$$\Gamma = [\gamma_1, \gamma_2, \ldots, \gamma_M] \tag{9}$$

The inverse VQ output is a vector of M quantized REWs, which form the quantized function coefficient matrix:

$$\hat{\Gamma}(\hat{\xi}) = [\hat{\gamma}(\hat{\xi}_1), \hat{\gamma}(\hat{\xi}_2), \ldots, \hat{\gamma}(\hat{\xi}_M)] \tag{10}$$

which is used by the decoder to compute the quantized spectra.

A. Quantization Using Orthonormal Functions

Orthonormal functions, such as polynomials, may be used for efficient quantization of the REW; see W. B. Kleijn, et al., (1996), IEEE ICASSP'96, pp. 212-215; Y. Shoham, (1997), IEEE ICASSP'97, pp. 1599-1602; Y. Shoham, (1999), International Journal of Speech Technology, Kluwer Academic Publishers, pp. 329-341. Consider REW magnitude, $R(\omega)$, represented by a linear combination of orthonormal functions, $\psi_i(\omega)$:

[Equation (11) is not recoverable from the scan.]

which is modeled using the parametric representation:

[Equation (12) is not recoverable from the scan.]

The quantized REW parameter is then given by:

[Equation (13) is not recoverable from the scan.]

In VQ case, the quantized parameter vector is given by:

[Equation (14) is not recoverable from the scan.]

B. Piecewise Linear Parametric Representation

In order to have a simple representation that is computationally efficient and avoids excessive memory requirements, we model the two dimensional surface by a piecewise linear parametric representation. Therefore, we introduce a set of N uniformly spaced spectra, $\{\hat{R}(\omega, \hat{\xi}_n)\}_{n=0}^{N-1}$. Then the parametric surface is defined by linear interpolation according to:

[Equation (15) is not recoverable from the scan.]

Because this representation is linear, the coefficients of $\hat{R}(\omega, \xi)$ are linear combinations of the coefficients of $\hat{R}(\omega, \hat{\xi}_{n-1})$ and $\hat{R}(\omega, \hat{\xi}_n)$. Hence,

[Equation (16) is not recoverable from the scan.]

where $\hat{\gamma}_n$ is the coefficient vector of the n-th REW magnitude function representation:

$$\hat{\gamma}_n = \hat{\gamma}(\hat{\xi}_n) \tag{17}$$

In this case, the distortion may be interpolated by:

[Equation (18) is not recoverable from the scan.]

The above can be easily generalized to the parameter VQ case. The optimal interpolation factor that minimizes the distortion between two representation vectors is given by:

[Equation (19) is not recoverable from the scan.]

and the respective optimal parameter value, which is a continuous variable between zero and one, is given by:

[Equation (20) is not recoverable from the scan.]

This result allows a rapid search for the best unvoicing parameter value needed to transform the coefficient vector to a scalar parameter, followed by the corresponding quantization scheme, as described in section 4.
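
The rapid search described above can be sketched as follows. Because the interpolation and optimal-parameter equations are not legible in this copy, the sketch rests on stated assumptions rather than on the patent's own formulas: the grid points are taken as uniformly spaced over [0, 1], the interpolated vector on segment n is assumed to be (1 - alpha) * grid[n-1] + alpha * grid[n], and the interpolation factor is the ordinary least-squares projection clamped to [0, 1]. The function and argument names are hypothetical.

```python
def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def best_unvoicing_parameter(gamma, grid):
    """Map an input coefficient vector to a scalar parameter in [0, 1].

    gamma -- input coefficient vector (length I)
    grid  -- N >= 2 coefficient vectors, assumed to sit at xi = n / (N - 1)

    Returns (xi, squared_error) for the closest point on the
    piecewise-linear path through the grid vectors.
    """
    n_seg = len(grid) - 1
    best = None
    for n in range(1, len(grid)):
        lo, hi = grid[n - 1], grid[n]
        d = [h - l for h, l in zip(hi, lo)]      # segment direction
        r = [g - l for g, l in zip(gamma, lo)]   # input relative to segment start
        dd = dot(d, d)
        # Least-squares projection onto the segment, clamped to its ends.
        alpha = 0.0 if dd == 0.0 else min(1.0, max(0.0, dot(d, r) / dd))
        err = [ri - alpha * di for ri, di in zip(r, d)]
        dist = dot(err, err)
        xi = (n - 1 + alpha) / n_seg
        if best is None or dist < best[1]:
            best = (xi, dist)
    return best


if __name__ == "__main__":
    grid = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    print(best_unvoicing_parameter([1.0, 0.5], grid))
```

Only one projection per segment is needed, so the cost grows linearly with N rather than with the resolution of the parameter, which is the practical appeal of a piecewise linear model.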

C. Weighted Distortion Quantization

Commonly in speech coding, the magnitude is quantized using a weighted distortion measure. In this case the quantized REW parameter is then given by:

[Equation (21) is not recoverable from the scan.]

and the orthonormal function simplification, given in equation (13), cannot be used. In this case, the weighted distortion between the input and the parametric representation modeled spectra is equal to:

$$D_w\big(R, \hat{R}(\xi)\big) = \int_0^{\pi} \big| R(\omega) - \hat{R}(\omega, \xi) \big|^2 \, W(\omega) \, d\omega = \big(\gamma - \hat{\gamma}(\xi)\big)^T \, \Psi(W(\omega)) \, \big(\gamma - \hat{\gamma}(\xi)\big) \tag{22}$$

where $\Psi(W(\omega))$ is the weighted correlation matrix of the orthonormal functions, its elements are:

[Equation (23) is not recoverable from the scan.]

$\gamma$ is the input coefficient vector, and $\hat{\gamma}(\xi)$ is the modeled parametric coefficient vector. In VQ case, the quantized parameter vector is given by:

$$\hat{\xi} = \underset{\xi \in C_q(\xi)}{\arg\min} \sum_{m=1}^{M} D_w\big(R_m, \hat{R}(\xi_m)\big) = \underset{\xi \in C_q(\xi)}{\arg\min} \sum_{m=1}^{M} \big(\gamma_m - \hat{\gamma}(\xi_m)\big)^T \, \Psi(W_m(\omega)) \, \big(\gamma_m - \hat{\gamma}(\xi_m)\big) \tag{24}$$

D. Weighted Distortion: Piecewise Linear Parametric Representation

Again, for practical considerations assume that the parametric representation is piecewise linear, and may be represented by a set of N spectra, $\{\hat{R}(\omega, \hat{\xi}_n)\}_{n=0}^{N-1}$. For the piecewise linear representation, the interpolated quantized coefficient vector is:

[Equation (25) is not recoverable from the scan.]

In the case where parameter VQ is employed, the interpolation allows for a substantial simplification of the search computations. In this case, the distortion can be interpolated:

[Equation (26) is not recoverable from the scan.]

The above can be easily generalized to the parameter VQ case. The optimal parameter that minimizes the spectrally weighted distortion between two representation vectors is given by:

$$\alpha_{opt} = \frac{(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T \, \Psi \, (\gamma - \hat{\gamma}_{n-1})}{(\hat{\gamma}_n - \hat{\gamma}_{n-1})^T \, \Psi \, (\hat{\gamma}_n - \hat{\gamma}_{n-1})} \tag{27}$$

and the respective optimal parameter value, which is a continuous variable between zero and one, is given by equation (20). This result allows a rapid search for the best unvoicing parameter value needed to transform the coefficient vector to a scalar parameter, for encoding or for VQ design. Alternatively, in order to eliminate using the matrix $\Psi$, the scalar product may be redefined to incorporate the time-varying spectral weighting. The respective orthonormal basis functions then satisfy:

[Equation (28) is not recoverable from the scan.]

where $\delta(i-j)$ denotes the Kronecker delta. The respective parameter vector is given by:

[Equation (29) is not recoverable from the scan.]

where $\psi(\omega) = [\psi_0, \psi_1, \ldots, \psi_{I-1}]^T$ is an I-th dimensional vector of time-varying orthonormal functions.
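
The weighted case can be illustrated with a short sketch that evaluates a quadratic-form distortion of the kind used in this section and the interpolation factor that minimizes it along one segment. It is a minimal sketch under assumptions: `psi` stands for a symmetric weighted correlation matrix supplied by the caller (how it is computed from the spectral weighting is not shown), the interpolated vector is assumed to be (1 - alpha) * lo + alpha * hi, and the factor is clamped to [0, 1]. All names are hypothetical.

```python
def mat_vec(psi, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(p * x for p, x in zip(row, v)) for row in psi]


def weighted_distortion(gamma, gamma_hat, psi):
    """Quadratic form e^T psi e with e = gamma - gamma_hat."""
    e = [g - h for g, h in zip(gamma, gamma_hat)]
    return sum(ei * pe for ei, pe in zip(e, mat_vec(psi, e)))


def weighted_alpha(gamma, lo, hi, psi):
    """Interpolation factor minimizing the weighted distortion on one segment.

    Assumes psi is symmetric, so that d^T psi r equals r^T psi d.
    """
    d = [h - l for h, l in zip(hi, lo)]      # segment direction
    r = [g - l for g, l in zip(gamma, lo)]   # input relative to segment start
    psi_d = mat_vec(psi, d)
    denom = sum(di * x for di, x in zip(d, psi_d))
    if denom == 0.0:
        return 0.0
    alpha = sum(ri * x for ri, x in zip(r, psi_d)) / denom
    return min(1.0, max(0.0, alpha))


if __name__ == "__main__":
    # Weighting the second coefficient four times as heavily pulls the
    # optimum toward the segment start compared with the unweighted 0.5.
    psi = [[1.0, 0.0], [0.0, 4.0]]
    print(weighted_alpha([1.0, 0.0], [0.0, 0.0], [1.0, 1.0], psi))
```

With `psi` set to the identity matrix the same functions reduce to the unweighted squared error and the plain projection, which is a convenient sanity check when experimenting.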

REW Parameter Analysis-By-Synthesis VQ

This section presents the AbS VQ paradigm for the REW parameter. The first presentation is a system which quantizes the REW parameter by employing spectral based AbS. Then simplified systems, which apply AbS to the REW parameter, are presented.

A. REW Parameter Quantization by Magnitude AbS VQ

The novel Analysis-by-Synthesis (AbS) REW parameter VQ technique is illustrated in FIG. 3. An excitation vector $c_{ij}(m)$ (m = 1, ..., M) is selected from the VQ codebook and is fed through a synthesis filter to obtain a parameter vector $\hat{\xi}(m)$ (synthesized quantized), which is then mapped to a quantized representation coefficient vector $\hat{\gamma}(\hat{\xi}(m))$. This is compared with a sequence of input representation coefficient vectors $\gamma(m)$, and each is spectrally weighted. Each spectrally weighted error is then temporally weighted, and a distortion measure is obtained. A search through all candidate excitation vectors determines an optimal choice. The synthesis filter in FIG. 3 can be viewed as a first order predictor in a feedback loop. (While shown here is an auto-regressive synthesis filter, in other arrangements a moving-average (MA) synthesis filter may be used.) By allowing the value of the pre
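
The search loop of subsection A (excitation vector, synthesis filter, mapping to coefficient vectors, spectral and temporal weighting, exhaustive search) can be sketched as below. This is an illustrative sketch under assumptions, not the coder's actual implementation: the synthesis filter is taken as a first-order auto-regressive recursion with a fixed coefficient `p`, the spectral weighting is simplified to per-coefficient (diagonal) weights, the parameter-to-coefficient mapping is a piecewise linear interpolation over a uniformly spaced grid, and every function and argument name is hypothetical.

```python
def interp_coeffs(xi, grid):
    """Piecewise-linear map from a scalar parameter to a coefficient vector.

    xi is clamped to [0, 1]; grid holds N >= 2 coefficient vectors assumed
    to sit at uniformly spaced parameter values.
    """
    xi = min(1.0, max(0.0, xi))
    pos = xi * (len(grid) - 1)
    n = min(int(pos), len(grid) - 2)
    a = pos - n
    return [(1.0 - a) * l + a * h for l, h in zip(grid[n], grid[n + 1])]


def abs_vq_search(codebook, gammas, grid, spec_w, temp_w, p, xi_prev):
    """Exhaustive analysis-by-synthesis search over excitation vectors.

    codebook -- candidate excitation vectors, each of length M
    gammas   -- M input coefficient vectors
    spec_w   -- M vectors of per-coefficient spectral weights (assumed diagonal)
    temp_w   -- M temporal weights
    p        -- first-order predictor coefficient (assumed fixed here)
    xi_prev  -- filter state carried over from the previous frame

    Returns (best_index, best_distortion).
    """
    best_j, best_d = -1, float("inf")
    for j, excitation in enumerate(codebook):
        state, total = xi_prev, 0.0
        for m, c in enumerate(excitation):
            state = p * state + c                 # synthesis filter output
            g_hat = interp_coeffs(state, grid)    # synthesized coefficient vector
            err = sum(
                w * (g - gh) ** 2
                for w, g, gh in zip(spec_w[m], gammas[m], g_hat)
            )
            total += temp_w[m] * err              # temporal weighting
        if total < best_d:
            best_j, best_d = j, total
    return best_j, best_d


if __name__ == "__main__":
    grid = [[0.0], [1.0]]
    codebook = [[0.0, 0.0], [0.5, 0.25], [1.0, 1.0]]
    gammas = [[0.5], [0.5]]
    print(abs_vq_search(codebook, gammas, grid, [[1.0], [1.0]], [1.0, 1.0], 0.5, 0.0))
```

Because each candidate is run through the synthesis filter before the error is measured, the index is chosen on the basis of the decoded result, which is what distinguishes the closed-loop search from an open-loop quantizer.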
