Bessette et al.

(10) Patent No.: US 6,807,524 B1
(45) Date of Patent: Oct. 19, 2004

(54) PERCEPTUAL WEIGHTING DEVICE AND METHOD FOR EFFICIENT CODING OF WIDEBAND SIGNALS

(75) Inventors: Bruno Bessette, Rock Forest (CA); Redwan Salami, Sherbrooke (CA); Roch Lefebvre, Canton de Magog (CA)

(73) Assignee: Voiceage Corporation, Quebec (CA)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/830,276
(22) PCT Filed: Oct. 27, 1999
(86) PCT No.: PCT/CA99/01010
     § 371 (c)(1), (2), (4) Date: Jun. 20, 2001
(87) PCT Pub. No.: WO00/25304
     PCT Pub. Date: May 4, 2000

(30) Foreign Application Priority Data
     Oct. 27, 1998 (CA) ... 2252170

(51) Int. Cl.⁷ ... G10L 19/04
(52) U.S. Cl. ... 704/200.1; 704/219; 704/262; 704/224
(58) Field of Search ... 704/222, 201, 704/219, 262, 224

(56) References Cited

U.S. PATENT DOCUMENTS
4,932,061 A     6/1990   Kroon et al.
5,307,441 A *   4/1994   Tzeng ............ 704/222
5,359,696 A *   10/1994  Gerson et al. .... 704/223
5,444,816 A     8/1995   Adoul et al.
5,519,807 A     5/1996   Cellario et al.
[entries illegible in source] ............ 704/223
5,754,976 A     5/1998   Adoul et al.
5,963,898 A     10/1999  Navarro et al.
6,006,174 A *   12/1999  Lin et al. ....... 704/201
6,064,962 A *   5/2000   Oshikiri et al. .. 704/262
6,192,334 B1    2/2001   Nomura
6,449,590 B1 *  9/2002   Gao .............. 704/219

FOREIGN PATENT DOCUMENTS
EP   0 465 057 A1   1/1992
EP   0 465 057 A    1/1992
EP   0 732 686 A    9/1996
EP   0 732 686 A2   9/1996
JP   02-012300 A    1/1990
JP   03-116199 A    5/1991
JP   6-348300 A     12/1994
JP   10-282997 A    10/1998
WO   WO 96/21220    7/1996

OTHER PUBLICATIONS
“Predictive Coding of Speech Signals and Subjective Error Criteria” by Bishnu S. Atal et al., IEEE Transactions on ASSP, vol. 27, No. 3, pp. 247–254, Jun. 1979.

* cited by examiner

Primary Examiner: Tālivaldis Ivars Smits
Assistant Examiner: James S. Wozniak
(74) Attorney, Agent, or Firm: Birch, Stewart, Kolasch & Birch, LLP

(57) ABSTRACT
A perceptual weighting device for producing a perceptually weighted signal in response to a wideband signal comprises a signal pre-emphasis filter, a synthesis filter calculator, and a perceptual weighting filter. The signal pre-emphasis filter enhances the high frequency content of the wideband signal to thereby produce a pre-emphasized signal. The signal pre-emphasis filter has a transfer function of the form: $P(z) = 1 - \mu z^{-1}$, wherein $\mu$ is a pre-emphasis factor having a value located between 0 and 1. The synthesis filter calculator is responsive to the pre-emphasized signal for producing synthesis filter coefficients. Finally, the perceptual weighting filter processes the pre-emphasized signal in relation to the synthesis filter coefficients to produce the perceptually weighted signal. The perceptual weighting filter has a transfer function, with fixed denominator, of the form: $W(z) = A(z/\gamma_1)/(1 - \gamma_2 z^{-1})$, where $0 < \gamma_2 < \gamma_1 \le 1$.

49 Claims, 4 Drawing Sheets
`
[Representative drawing: encoder block diagram. Legible block labels: LP analysis, quantization and interpolation calculator; impulse response generator; closed-loop pitch search module; innovative excitation search module; zero-input response calculator; memory module.]
`
`Ex. 1001 / Page 1 of 16
`Apple v. Saint Lawrence
`
`
`
[Drawing sheets 1 to 4 of 4 (U.S. Patent, Oct. 19, 2004, US 6,807,524 B1): the graphic content of the figures did not survive text extraction.]
`
`
`
`
PERCEPTUAL WEIGHTING DEVICE AND METHOD FOR EFFICIENT CODING OF WIDEBAND SIGNALS
speech signal. At the encoder end, the synthesis output is computed for all, or a subset, of the codevectors from the codebook (codebook search). The retained codevector is the one producing the synthesis output closest to the original speech signal according to a perceptually weighted distortion measure. This perceptual weighting is performed using a so-called perceptual weighting filter, which is usually derived from the LP synthesis filter.
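As a rough illustration of the codebook search described above, the following Python sketch tries every codevector, filters it, and keeps the index with the smallest squared error against a target. The function names, the short FIR stand-in for the weighted synthesis filter, the fixed unit gain, and the toy codebook are all assumptions made for illustration; they are not taken from the patent.

```python
def convolve_truncated(h, c):
    """Filter codevector c with impulse response h, keeping len(c) samples."""
    n = len(c)
    return [sum(h[j] * c[i - j] for j in range(min(i, len(h) - 1) + 1))
            for i in range(n)]


def search_codebook(target, codebook, h):
    """Return (index, error) of the codevector whose filtered version is
    closest to the target in the squared-error sense (gain fixed to 1)."""
    best_index, best_error = -1, float("inf")
    for index, codevector in enumerate(codebook):
        synthesized = convolve_truncated(h, codevector)
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


# Toy example: a codebook of M = 2**b entries with b = 2 bits, N = 4 samples.
h = [1.0, 0.5, 0.25, 0.125]          # hypothetical weighted impulse response
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
]
target = convolve_truncated(h, codebook[3])   # target built from entry 3
best_index, best_error = search_codebook(target, codebook, h)
print(best_index, best_error)
```

A practical encoder would also optimize a gain per codevector and would normally search only a subset of the codebook, as the text notes; the exhaustive loop is kept here only because it is the simplest correct form.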
`The CELP model has been very successful in encoding
`telephone band sound signals, and several CELP-based
`standards exist in a wide range of applications, especially in
`digital cellular applications. In the telephone band, the sound
`signal is band-limited to 200–3400 Hz and sampled at 8000
`samples/sec. In wideband speech/audio applications, the
`sound signal is band-limited to 50–7000 Hz and sampled at
`16000 samples/sec.
Some difficulties arise when applying the telephone-band optimized CELP model to wideband signals, and additional features need to be added to the model in order to obtain high quality wideband signals. Wideband signals exhibit a much wider dynamic range compared to telephone-band signals, which results in precision problems when a fixed-point implementation of the algorithm is required (which is essential in wireless applications). Furthermore, the CELP model will often spend most of its encoding bits on the low-frequency region, which usually has higher energy content, resulting in a low-pass output signal. To overcome this problem, the perceptual weighting filter has to be modified in order to suit wideband signals, and pre-emphasis techniques which boost the high frequency regions become important to reduce the dynamic range, yielding a simpler fixed-point implementation, and to ensure a better encoding of the higher frequency contents of the signal.
In CELP-type encoders, the optimum pitch and innovative parameters are searched by minimizing the mean squared error between the input speech and synthesized speech in a perceptually weighted domain. This is equivalent to minimizing the error between the weighted input speech and weighted synthesis speech, where the weighting is performed using a filter having a transfer function W(z) of the form:

$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}, \qquad 0 < \gamma_2 < \gamma_1 \le 1$$
`
In analysis-by-synthesis (AbS) coders, analysis shows that the quantization error is weighted by the inverse of the weighting filter, $W^{-1}(z)$, which exhibits some of the formant structure in the input signal. Thus, the masking property of the human ear is exploited by shaping the error, so that it has more energy in the formant regions, where it will be masked by the strong signal energy present in those regions. The amount of weighting is controlled by the factors $\gamma_1$ and $\gamma_2$.

This filter works well with telephone band signals. However, it was found that this filter is not suitable for efficient perceptual weighting when applied to wideband signals. It was found that this filter has inherent limitations in modelling the formant structure and the required spectral tilt concurrently. The spectral tilt is more pronounced in wideband signals due to the wide dynamic range between low and high frequencies. It was suggested to add a tilt filter into filter W(z) in order to control the tilt and formant weighting separately.
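To make the role of the two weighting factors concrete, here is a minimal Python sketch of the conventional weighting filter, commonly written $W(z) = A(z/\gamma_1)/A(z/\gamma_2)$. It relies on the standard identity that the coefficients of $A(z/\gamma)$ are $a_i \gamma^i$. The first-order $A(z)$, the values of the two factors, and the function names are illustrative assumptions, not values from the patent.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): the i-th coefficient is a[i] * gamma**i.
    a[0] is assumed to be 1 (the leading coefficient of A(z))."""
    return [coeff * gamma ** i for i, coeff in enumerate(a)]


def pole_zero_filter(num, den, x):
    """Direct-form filter with den[0] == 1 and zero initial state:
    y[n] = sum(num[k] * x[n-k]) - sum(den[k] * y[n-k] for k >= 1)."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


a = [1.0, -0.9]                     # toy first-order A(z) = 1 - 0.9 z^-1
gamma1, gamma2 = 0.9, 0.6           # illustrative, 0 < gamma2 < gamma1 <= 1
numerator = bandwidth_expand(a, gamma1)     # A(z/gamma1)
denominator = bandwidth_expand(a, gamma2)   # A(z/gamma2)
impulse = [1.0, 0.0, 0.0, 0.0]
response = pole_zero_filter(numerator, denominator, impulse)
print(numerator, denominator, response)
```

Because both numerator and denominator are derived from the same $A(z)$, changing either factor alters formant weighting and tilt together, which is the coupling the text identifies as a limitation for wideband signals.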
OBJECT OF THE INVENTION
An object of the present invention is therefore to provide a perceptual weighting device and method adapted to wideband signals, using a modified perceptual weighting filter to obtain a high quality reconstructed signal, this device and method enabling fixed-point algorithmic implementation.
SUMMARY OF THE INVENTION
More specifically, in accordance with the present invention, there is provided a perceptual weighting device
`
`
This application is the national phase under 35 U.S.C. § 371 of PCT International Application No. PCT/CA99/01010, which has an International filing date of Oct. 27, 1999, which designated the United States of America and was published in English.
`BACKGROUND OF THE INVENTION
`1. Field of the invention
`The present invention relates to a perceptual weighting
`device and method for producing a perceptually weighted
`signal in response to a wideband signal (0–7000 Hz) in order
`to reduce a difference between a weighted wideband signal
`and a subsequently synthesized weighted wideband signal.
`2. Brief description of the prior art
The demand for efficient digital wideband speech/audio encoding techniques with a good subjective quality/bit rate trade-off is increasing for numerous applications such as audio/video teleconferencing, multimedia, and wireless applications, as well as Internet and packet network applications. Until recently, telephone bandwidths filtered in the range 200–3400 Hz were mainly used in speech coding applications. However, there is an increasing demand for wideband speech applications in order to increase the intelligibility and naturalness of the speech signals. A bandwidth in the range 50–7000 Hz was found sufficient for delivering a face-to-face speech quality. For audio signals, this range gives an acceptable audio quality, but is still lower than the CD quality which operates on the range 20–20000 Hz.
A speech encoder converts a speech signal into a digital bitstream which is transmitted over a communication channel (or stored in a storage medium). The speech signal is digitized (sampled and quantized with usually 16 bits per sample) and the speech encoder has the role of representing these digital samples with a smaller number of bits while maintaining a good subjective speech quality. The speech decoder or synthesizer operates on the transmitted or stored bit stream and converts it back to a sound signal.
One of the best prior art techniques capable of achieving a good quality/bit rate trade-off is the so-called Code Excited Linear Prediction (CELP) technique. According to this technique, the sampled speech signal is processed in successive blocks of L samples usually called frames, where L is some predetermined number (corresponding to 10–30 ms of speech). In CELP, a linear prediction (LP) synthesis filter is computed and transmitted every frame. The L-sample frame is then divided into smaller blocks called subframes of size N samples, where L=kN and k is the number of subframes in a frame (N usually corresponds to 4–10 ms of speech). An excitation signal is determined in each subframe, which usually consists of two components: one from the past excitation (also called pitch contribution or adaptive codebook) and the other from an innovative codebook (also called fixed codebook). This excitation signal is transmitted and used at the decoder as the input of the LP synthesis filter in order to obtain the synthesized speech.
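The decoder-side use of the excitation described above can be sketched in a few lines of Python: the two codebook contributions are scaled and summed, and the result drives the all-pole LP synthesis filter. The sign convention $A(z) = 1 + \sum a_i z^{-i}$, the function names, and all numeric values are assumptions chosen for a small worked example.

```python
def build_excitation(adaptive_vec, innovative_vec, pitch_gain, innovation_gain):
    """Excitation = pitch_gain * adaptive part + innovation_gain * innovative part."""
    return [pitch_gain * v + innovation_gain * c
            for v, c in zip(adaptive_vec, innovative_vec)]


def lp_synthesis(a, excitation, past_output=None):
    """Filter the excitation through 1/A(z), with A(z) = 1 + sum(a[i-1] * z^-i).
    `a` holds a_1..a_p; `past_output` holds at least p previous outputs
    (most recent last) and defaults to zeros."""
    p = len(a)
    history = list(past_output) if past_output is not None else [0.0] * p
    out = []
    for u in excitation:
        # y[n] = u[n] - a_1 * y[n-1] - ... - a_p * y[n-p]
        y = u - sum(a[i] * history[-1 - i] for i in range(p))
        out.append(y)
        history.append(y)
    return out


adaptive_vec = [1.0, 0.0, 0.0, 0.0]      # toy adaptive (pitch) codebook vector
innovative_vec = [0.0, 1.0, 0.0, 0.0]    # toy innovative codevector
excitation = build_excitation(adaptive_vec, innovative_vec,
                              pitch_gain=0.5, innovation_gain=2.0)
speech = lp_synthesis([-0.5], excitation)   # first-order filter, a_1 = -0.5
print(excitation)
print(speech)
```

Passing the tail of the previous subframe's output as `past_output` carries the filter state across subframes, which a real decoder has to do to avoid discontinuities.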
An innovative codebook, in the CELP context, is an indexed set of N-sample-long sequences which will be referred to as N-dimensional codevectors. Each codebook sequence is indexed by an integer k ranging from 1 to M, where M represents the size of the codebook, often expressed as a number of bits b, where $M = 2^b$.
To synthesize speech according to the CELP technique, each block of N samples is synthesized by filtering an appropriate codevector from a codebook through time-varying filters modelling the spectral characteristics of the
`
for producing a perceptually weighted signal in response to a wideband signal in order to reduce a difference between a weighted wideband signal and a subsequently synthesized weighted wideband signal. This perceptual weighting device comprises:
a) a signal preemphasis filter responsive to the wideband signal for enhancing the high frequency content of the wideband signal to thereby produce a preemphasised signal;
b) a synthesis filter calculator responsive to the preemphasised signal for producing synthesis filter coefficients; and
c) a perceptual weighting filter, responsive to the preemphasised signal and the synthesis filter coefficients, for filtering the preemphasised signal in relation to the synthesis filter coefficients to thereby produce the perceptually weighted signal. The perceptual weighting filter has a transfer function with fixed denominator whereby weighting of the wideband signal in a formant region is substantially decoupled from a spectral tilt of that wideband signal.
The present invention also relates to a method for producing a perceptually weighted signal in response to a wideband signal in order to reduce a difference between a weighted wideband signal and a subsequently synthesized weighted wideband signal. This method comprises: filtering the wideband signal to produce a preemphasised signal with enhanced high frequency content; calculating, from the preemphasised signal, synthesis filter coefficients; and filtering the preemphasised signal in relation to the synthesis filter coefficients to thereby produce a perceptually weighted speech signal. The filtering comprises processing the preemphasised signal through a perceptual weighting filter having a transfer function with fixed denominator whereby weighting of the wideband signal in a formant region is substantially decoupled from a spectral tilt of the wideband signal.
In accordance with preferred embodiments of the subject invention:
reduction of the dynamic range comprises filtering the wideband signal through a transfer function of the form:

$$P(z) = 1 - \mu z^{-1}$$

wherein $\mu$ is a preemphasis factor having a value located between 0 and 1;
the preemphasis factor $\mu$ is 0.7;
the perceptual weighting filter has a transfer function of the form:

$$W(z) = \frac{A(z/\gamma_1)}{1 - \gamma_2 z^{-1}}$$

where $0 < \gamma_2 < \gamma_1 \le 1$ and $\gamma_1$ and $\gamma_2$ are weighting control values; and
the variable $\gamma_2$ is set equal to $\mu$.
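A minimal Python sketch of the fixed-denominator weighting filter $W(z) = A(z/\gamma_1)/(1 - \gamma_2 z^{-1})$ may help here. It applies the bandwidth-expanded numerator as an FIR filter and then a one-pole recursion for the denominator. The value of the second factor follows the preferred embodiment (set equal to the preemphasis factor 0.7); the first factor of 0.9, the toy first-order $A(z)$, and the function name are assumptions for illustration only.

```python
def weighting_filter(a, signal, gamma1, gamma2):
    """Apply W(z) = A(z/gamma1) / (1 - gamma2 * z^-1) with zero initial state.
    `a` holds the coefficients of A(z) including the leading 1."""
    numerator = [coeff * gamma1 ** i for i, coeff in enumerate(a)]  # A(z/gamma1)
    output = []
    previous = 0.0
    for n in range(len(signal)):
        # FIR part: sum over the numerator taps that fall inside the signal.
        fir = sum(numerator[k] * signal[n - k]
                  for k in range(len(numerator)) if n - k >= 0)
        # Fixed one-pole denominator: y[n] = fir[n] + gamma2 * y[n-1].
        previous = fir + gamma2 * previous
        output.append(previous)
    return output


mu = 0.7                              # preemphasis factor from the text
a = [1.0, -0.9]                       # toy first-order A(z)
weighted = weighting_filter(a, [1.0, 0.0, 0.0, 0.0], gamma1=0.9, gamma2=mu)
print(weighted)
```

Note that the denominator no longer depends on the LP coefficients, so only the numerator tracks the formant structure while the tilt is set by the single fixed pole.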
Therefore, the overall perceptual weighting of the quantization error is obtained by a combination of a preemphasis filter and a modified weighting filter, which enables high subjective quality of the decoded wideband sound signal while the tilt and the formant weighting are controlled separately.
The solution to the problem set out in the brief description of the prior art is accordingly to introduce a preemphasis filter at the input, compute the synthesis filter coefficients based on the preemphasized signal, and use a modified perceptual weighting filter by fixing its denominator. By reducing the dynamic range of the wideband signal, the preemphasis filter renders the wideband signal more suitable for fixed-point implementation, and improves the encoding of the high frequency contents of the spectrum.
The present invention further relates to an encoder for encoding a wideband signal, comprising: a) a perceptual weighting device as described hereinabove; b) a pitch codebook search device responsive to the perceptually weighted signal for producing pitch codebook parameters and an innovative search target vector; c) an innovative codebook search device, responsive to the synthesis filter coefficients and to the innovative search target vector, for producing innovative codebook parameters; and d) a signal forming device for producing an encoded wideband signal comprising the pitch codebook parameters, the innovative codebook parameters, and the synthesis filter coefficients.
`Still further in accordance with the present invention,
`there is provided:
a cellular communication system for servicing a large geographical area divided into a plurality of cells, comprising: a) mobile transmitter/receiver units; b) cellular base stations respectively situated in the cells; c) a control terminal for controlling communication between the cellular base stations; d) a bidirectional wireless communication sub-system between each mobile unit situated in one cell and the cellular base station of this cell, this bidirectional wireless communication sub-system comprising, in both the mobile unit and the cellular base station:
i) a transmitter including an encoder as described hereinabove for encoding a wideband signal and a transmission circuit for transmitting the encoded wideband signal; and
ii) a receiver including a receiving circuit for receiving a transmitted encoded wideband signal and a decoder for decoding the received encoded wideband signal;
`a cellular mobile transmitter/receiver unit comprising:
`a) a transmitter including an encoder as described
`hereinabove for encoding a wideband signal and a
`transmission circuit for transmitting the encoded
`wideband signal; and
`b) a receiver including a receiving circuit for receiving
`a transmitted encoded wideband signal and a decoder
`for decoding the received encoded wideband signal;
`a cellular network element comprising:
`a) a transmitter including an encoder as described
`hereinabove for encoding a wideband signal and a
`transmission circuit for transmitting the encoded
`wideband signal; and
`b) a receiver including a receiving circuit for receiving
`a transmitted encoded wideband signal and a decoder
`for decoding the received encoded wideband signal;
`and
`a bidirectional wireless communication sub-system
`between each mobile unit situated in one cell and the
`cellular base station of this cell, this bidirectional
`wireless communication sub-system comprising, in
`both the mobile unit and the cellular base station:
`a) a transmitter including an encoder as described
`hereinabove for encoding a wideband signal and a
`transmission circuit for transmitting the encoded
`wideband signal; and
`b) a receiver including a receiving circuit for receiving
`a transmitted encoded wideband signal and a decoder
`for decoding the received encoded wideband signal.
The objects, advantages and other features of the present invention will become more apparent upon reading of the following non-restrictive description of preferred embodiments thereof, given by way of example only with reference to the accompanying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`In the appended drawings:
`FIG. 1 is a schematic block diagram of a preferred
`embodiment of wideband encoding device;
`FIG. 2 is a schematic block diagram of a preferred
`embodiment of wideband decoding device;
`FIG. 3 is a schematic block diagram of a preferred
`embodiment of pitch analysis device; and
`FIG. 4 is a simplified, schematic block diagram of a
`cellular communication system in which the wideband
`encoding device of FIG. 1 and the wideband decoding
`device of FIG. 2 can be used.
`
`a receiving circuit 411 for receiving a transmitted
`encoded voice signal usually through the same
`antenna 409; and
`a decoder 412 for decoding the received encoded voice
`signal from the receiving circuit 411.
`The radiotelephone further comprises other conventional
`radiotelephone circuits 413 to which the encoder 407 and
`decoder 412 are connected and for processing signals
`therefrom, which circuits 413 are well known to those of
`ordinary skill in the art and, accordingly, will not be further
`described in the present specification.
`Also, such a bidirectional wireless radio communication
`subsystem typically comprises in the base station 402:
`a transmitter 414 including:
`an encoder 415 for encoding the voice signal; and
`a transmission circuit 416 for transmitting the encoded
`voice signal from the encoder 415 through an
`antenna such as 417; and
`a receiver 418 including:
`a receiving circuit 419 for receiving a transmitted
`encoded voice signal through the same antenna 417
`or through another antenna (not shown); and
`a decoder 420 for decoding the received encoded voice
`signal from the receiving circuit 419.
`The base station 402 further comprises, typically, a base
`station controller 421, along with its associated database
`422, for controlling communication between the control
`terminal 405 and the transmitter 414 and receiver 418.
As well known to those of ordinary skill in the art, voice encoding is required in order to reduce the bandwidth necessary to transmit a sound signal, for example a voice signal such as speech, across the bidirectional wireless radio communication subsystem, i.e., between a radiotelephone 403 and a base station 402.
LP voice encoders (such as 415 and 407) typically operating at 13 kbits/second and below, such as Code-Excited Linear Prediction (CELP) encoders, typically use an LP synthesis filter to model the short-term spectral envelope of the voice signal. The LP information is transmitted, typically, every 10 or 20 ms to the decoder (such as 420 and 412) and is extracted at the decoder end.
The novel techniques disclosed in the present specification may apply to different LP-based coding systems. However, a CELP-type coding system is used in the preferred embodiment for the purpose of presenting a non-limitative illustration of these techniques. In the same manner, such techniques can be used with sound signals other than voice and speech as well as with other types of wideband signals.
FIG. 1 shows a general block diagram of a CELP-type speech encoding device 100 modified to better accommodate wideband signals.
The sampled input speech signal 114 is divided into successive L-sample blocks called “frames”. In each frame, different parameters representing the speech signal in the frame are computed, encoded, and transmitted. LP parameters representing the LP synthesis filter are usually computed once every frame. The frame is further divided into smaller blocks of N samples (blocks of length N), in which excitation parameters (pitch and innovation) are determined. In the CELP literature, these blocks of length N are called “subframes” and the N-sample signals in the subframes are referred to as N-dimensional vectors. In this preferred embodiment, the length N corresponds to 5 ms while the length L corresponds to 20 ms, which means that a frame contains four subframes (N=80 at the sampling rate of 16 kHz and 64 after down-sampling to 12.8 kHz). Various N-dimensional vectors occur in the encoding procedure. A list of the vectors which appear in FIGS. 1 and 2 as well as a list of transmitted parameters are given herein below.
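The frame and subframe sizes quoted in the paragraph above follow directly from the durations and sampling rates. The short Python sketch below reproduces that arithmetic and splits a frame into subframes; the function names are hypothetical and the code is only a check of the numbers given in the text.

```python
def frame_layout(sample_rate_hz, frame_ms=20, subframe_ms=5):
    """Return (frame length L, subframe length N, subframes per frame)."""
    frame_len = sample_rate_hz * frame_ms // 1000
    subframe_len = sample_rate_hz * subframe_ms // 1000
    return frame_len, subframe_len, frame_len // subframe_len


def split_into_subframes(frame, subframe_len):
    """Cut one frame into consecutive blocks of subframe_len samples."""
    return [frame[i:i + subframe_len]
            for i in range(0, len(frame), subframe_len)]


layout_16k = frame_layout(16000)      # input sampling rate
layout_12k8 = frame_layout(12800)     # after down-sampling
subframes = split_into_subframes(list(range(layout_12k8[0])), layout_12k8[1])
print(layout_16k, layout_12k8, len(subframes))
```

With these durations the frame holds 320 samples at 16 kHz and 256 samples at 12.8 kHz, and in both cases it divides into four subframes.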
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
As well known to those of ordinary skill in the art, a cellular communication system such as 401 (see FIG. 4) provides a telecommunication service over a large geographic area by dividing that large geographic area into a number C of smaller cells. The C smaller cells are serviced by respective cellular base stations $402_1, 402_2, \ldots, 402_C$ to provide each cell with radio signalling, audio and data channels.
Radio signalling channels are used to page mobile radiotelephones (mobile transmitter/receiver units) such as 403 within the limits of the coverage area (cell) of the cellular base station 402, and to place calls to other radiotelephones 403 located either inside or outside the base station’s cell or to another network such as the Public Switched Telephone Network (PSTN) 404.
`Once a radiotelephone 403 has successfully placed or
`received a call, an audio or data channel is established
`between this radiotelephone 403 and the cellular base station
`402 corresponding to the cell in which the radiotelephone
`403 is situated, and communication between the base station
`402 and radiotelephone 403 is conducted over that audio or
`data channel. The radiotelephone 403 may also receive
`control or timing information over a signalling channel
`while a call is in progress.
`If a radiotelephone 403 leaves a cell and enters another
`adjacent cell while a call is in progress, the radiotelephone
`403 hands over the call to an available audio or data channel
`of the new cell base station 402. If a radiotelephone 403
`leaves a cell and enters another adjacent cell while no call is
`in progress, the radiotelephone 403 sends a control message
`over the signalling channel to log into the base station 402
`of the new cell. In this manner mobile communication over
`a wide geographical area is possible.
The cellular communication system 401 further comprises a control terminal 405 to control communication between the cellular base stations 402 and the PSTN 404, for example during a communication between a radiotelephone 403 and the PSTN 404, or between a radiotelephone 403 located in a first cell and a radiotelephone 403 situated in a second cell.
`Of course, a bidirectional wireless radio communication
`subsystem is required to establish an audio or data channel
`between a base station 402 of one cell and a radiotelephone
`403 located in that cell. As illustrated in very simplified form
`in FIG. 4, such a bidirectional wireless radio communication
`subsystem typically comprises in the radiotelephone 403:
`a transmitter 406 including:
`an encoder 407 for encoding the voice signal; and
`a transmission circuit 408 for transmitting the encoded
`voice signal from the encoder 407 through an
`antenna such as 409; and
`a receiver 410 including:
`
List of the Main N-dimensional Vectors
$s$ Wideband signal input speech vector (after down-sampling, pre-processing, and preemphasis);
$s_w$ Weighted speech vector;
$s_0$ Zero-input response of weighted synthesis filter;
$s_p$ Down-sampled pre-processed signal;
[symbol illegible in source] Oversampled synthesized speech signal;
$s'$ Synthesis signal before deemphasis;
$s_d$ Deemphasized synthesis signal;
$s_h$ Synthesis signal after deemphasis and postprocessing;
$x$ Target vector for pitch search;
$x'$ Target vector for innovation search;
$h$ Weighted synthesis filter impulse response;
$v_T$ Adaptive (pitch) codebook vector at delay T;
$y_T$ Filtered pitch codebook vector ($v_T$ convolved with $h$);
$c_k$ Innovative codevector at index k (k-th entry from the innovation codebook);
$c_f$ Enhanced scaled innovation codevector;
$u$ Excitation signal (scaled innovation and pitch codevectors);
$u'$ Enhanced excitation;
$z$ Band-pass noise sequence;
$w'$ White noise sequence; and
$w$ Scaled noise sequence.

preemphasized using a filter having the following transfer function:

$$P(z) = 1 - \mu z^{-1}$$
`
where $\mu$ is a preemphasis factor with a value located between 0 and 1 (a typical value is $\mu = 0.7$). A higher-order filter could also be used. It should be pointed out that high-pass filter 102 and preemphasis filter 103 can be interchanged to obtain more efficient fixed-point implementations.

The function of the preemphasis filter 103 is to enhance the high frequency contents of the input signal. It also reduces the dynamic range of the input speech signal, which renders it more suitable for fixed-point implementation. Without preemphasis, LP analysis in fixed-point using single-precision arithmetic is difficult to implement.

Preemphasis also plays an important role in achieving a proper overall perceptual weighting of the quantization error, which contributes to improved sound quality. This will be explained in more detail herein below.
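The effect of the first-order preemphasis filter, $P(z) = 1 - \mu z^{-1}$, is easy to see on two extreme inputs. The Python sketch below applies the difference equation $y[n] = x[n] - \mu x[n-1]$ with the typical value 0.7 from the text; the function name and the two test signals are assumptions chosen for illustration.

```python
def preemphasize(signal, mu=0.7, previous_sample=0.0):
    """Apply P(z) = 1 - mu * z^-1, i.e. y[n] = x[n] - mu * x[n-1]."""
    output = []
    for sample in signal:
        output.append(sample - mu * previous_sample)
        previous_sample = sample
    return output


low = preemphasize([1.0, 1.0, 1.0, 1.0])       # slowly varying (DC-like) input
high = preemphasize([1.0, -1.0, 1.0, -1.0])    # fastest alternating input
print(low)
print(high)
```

After the first sample, the constant input is attenuated to about 0.3 while the alternating input grows to about 1.7 in magnitude, which is the high-frequency boost and low-frequency attenuation that narrows the dynamic range between the two ends of the spectrum.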
The output of the preemphasis filter 103 is denoted s(n). This signal is used for performing LP analysis in calculator module 104. LP analysis is a technique well known to those of ordinary skill in the art. In this preferred embodiment, the autocorrelation approach is used. In the autocorrelation approach, the signal s(n) is first windowed using a Hamming window (having usually a length of the order of 30–40 ms). The autocorrelations are computed from the windowed signal, and Levinson-Durbin recursion is used to compute LP filter coefficients, $a_i$, where $i = 1, \ldots, p$, and where p is the LP order, which is typically 16 in wideband coding. The parameters $a_i$ are the coefficients of the transfer function of the LP filter, which is given by the following relation:

$$A(z) = 1 + \sum_{i=1}^{p} a_i z^{-i}$$
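The autocorrelation approach outlined above (Hamming window, autocorrelations, Levinson-Durbin recursion) can be sketched in plain Python as follows. This is a textbook form of the recursion under the sign convention $A(z) = 1 + \sum a_i z^{-i}$, not the patent's fixed-point implementation; the function names, the order of 2, and the synthetic sine input are assumptions for illustration.

```python
import math


def hamming(length):
    """Symmetric Hamming window of the given length (length >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n-k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return ([1, a_1, ..., a_p], prediction error energy)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


# Known case: autocorrelation of a first-order process with correlation 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)

# Full chain on a synthetic signal: window, autocorrelate, solve.
signal = [math.sin(0.3 * n) for n in range(64)]
windowed = [s * w for s, w in zip(signal, hamming(len(signal)))]
r = autocorrelation(windowed, 2)
a2, e2 = levinson_durbin(r, 2)
print(coeffs, err)
print(a2, e2)
```

For wideband coding the text uses an order of 16 and a window of roughly 30 to 40 ms; the same functions apply with `order=16` and a correspondingly longer input.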
`
List of Transmitted Parameters
STP Short term prediction parameters (defining A(z));
T Pitch lag (or pitch codebook index);
b Pitch gain (or pitch codebook gain);
j Index of the low-pass filter used on the pitch codevector;
k Codevector index (innovation codebook entry); and
g Innovation codebook gain.
In this preferred embodiment, the STP parameters are transmitted once per frame and the rest of the parameters are transmitted four times per frame (every subframe).

Encoder Side

The sampled speech signal is encoded on a block by block basis by the encoding device 100 of FIG. 1 which is broken down into eleven modules numbered from 101 to 111.

The input speech is processed into the above mentioned L-sample blocks called frames.

Referring to FIG. 1, the sampled input speech signal 114 is down-sampled in a down-sampling module 101. For example, the signal is down-sampled from 16 kHz down to 12.8 kHz, using techniques well known to those of ordinary skill in the art. Down-sampling down to another frequency can of course be envisaged. Down-sampling increases the coding efficiency, since a smaller frequency bandwidth is encoded. This also reduces the algorithmic complexity since the number of samples in a frame is decreased. The use of down-sampling becomes significant when the bit rate is

LP analysis is performed in calculator module 104, which also performs the quantization and interpolation of the LP filter coefficients. The LP filter coefficients are first transformed into another equivalent domain more suitable for quantization and interpolation purposes. The line spectral pair (LSP) and immitance spectral pair (ISP) domains are two domains in which quantization and interpolation can be efficiently performed. The 16 LP filter coefficients, $a_i$, can be quantized in the order of 30 to 50 bits using split or multi-stage quantization, or a combination thereof. The purpose of the interpolation is to enable updating the LP filter coefficients every subframe while transmitting them once every frame, which improves the encoder performance without increasing the bit rate. Quantization and interpolation of the LP filter coefficients is believed to be otherwise well known to those of ordinary skill in the art and, accordingly, will not be further described in the present specification.

The following paragraphs will describe the rest of the co