1996 IEEE International Conference on Acoustics,
Speech, & Signal Processing

Conference Proceedings

May 7-10, 1996
Atlanta, Georgia, USA

ICASSP-96

Volume 1

Sponsored by the
Signal Processing Society of the
Institute of Electrical and Electronics Engineers

ZTE EXHIBIT 1005

`The 1996 IEEE International Conference on
`Acoustics, Speech, and Signal Processing
`Conference Proceedings
`
`Sponsored by the Signal Processing Society of the Institute of Electrical and
`Electronics Engineers
`
`May 7-10, 1996
`Marriott Marquis Hotel
`Atlanta, Georgia, USA
`
`The 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing
`Conference Proceedings
`
Copyright and Reprint Permission: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy
beyond the limit of U.S. copyright law for private use of patrons those articles in this volume that carry a code at the bottom
of the first page, provided the per-copy fee indicated in the code is paid through Copyright Clearance Center, 222 Rosewood
Drive, Danvers, MA 01923. For other copying, reprint or republication permission, write to IEEE Copyrights Manager,
IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. All rights reserved. Copyright 1996 by
the Institute of Electrical and Electronics Engineers, Inc.

IEEE Catalog Number: 96CH35903
ISBN 0-7803-3192-3 (softbound)
ISBN 0-7803-3193-1 (casebound edition)
ISBN 0-7803-3194-X (microfiche)
ISBN 0-7803-3195-8 (CD-ROM)
Library of Congress: 84-645139
`
`Additional Proceedings (hard-copy and CD-ROM) may be ordered from:
`
`IEEE Service Center
`445 Hoes Lane
`P.O. Box 1331
`Piscataway, NJ 08855-1331
`1-800-678-IEEE
`
`
`

`
`1996 International Conference on Acoustics,
`Speech, and Signal Processing
`Conference Committee
`
The 1996 International Conference on Acoustics, Speech, and Signal Processing (ICASSP), sponsored by the IEEE Signal
Processing Society, is the 21st in a series of international conferences devoted to experimental and theoretical aspects of
signal processing, speech, and acoustics. Conferences of this scope are possible only because of the continuing interest and
support of the Society membership, expressed both by their submission of papers of high quality and by their attendance at
the conference. The ICASSP 96 Conference Committee is grateful to all the authors, the session chairs, and the volunteers
for contributing to the success of the conference.
`
`Committee Members and Chairs
`
General Chair:
Monson H. Hayes
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-2958
E-mail: icassp96-chair@ece.gatech.edu

Technical Program:
Mark A. Clements
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-4584
E-mail: icassp96-technical@ece.gatech.edu

Finance:
Craig H. Richardson
Atlanta Signal Processors Inc.
1375 Peachtree Rd. NE, Ste. 690
Atlanta, GA 30309-3115 U.S.A.
Tel. (404) 892-7265
E-mail: icassp96-finance@ece.gatech.edu

Exhibits:
John Kalter
Lanier Worldwide, Inc.
4667 North Royal Atlanta Drive
Tucker, GA 30084 U.S.A.
Tel. (770) 493-2201
E-mail: icassp96-exhibits@ece.gatech.edu

Local Arrangements:
Russell M. Mersereau
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-2913
E-mail: icassp96-local@ece.gatech.edu

Registration:
Douglas B. Williams
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-9832
E-mail: icassp96-reg@ece.gatech.edu

Publications:
Vijay K. Madisetti
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-4696
E-mail: icassp96-pubs@ece.gatech.edu

Guotong Zhou
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-2907
E-mail: gtz@eedsp.gatech.edu

Social:
Mary Ann Ingram
Georgia Institute of Technology
Atlanta, GA 30332-0250 U.S.A.
Tel. (404) 894-9482
E-mail: icassp96-social@ece.gatech.edu

Publicity:
Stanley J. Reeves
Dept. of Electrical Engineering
Auburn University
Auburn, AL 36849-1809 U.S.A.
Tel. (334) 844-1821
E-mail: icassp96-publicity@ece.gatech.edu

Tutorial:
John H. L. Hansen
Dept. of Electrical Engineering
Duke University
Durham, NC 27706 U.S.A.
Tel. (919) 660-5256
E-mail: jhlh@ee.duke.edu

European Liaison:
Maurice Bellanger
CNAM/Electronique
292, rue Saint-Martin
75141 Paris Cedex 03 FRANCE
Tel. +(33)-1-4027-2590
E-mail: bellang@cnam.cnam.fr

Far East Liaison:
Sadaoki Furui
Furui Research Laboratory
NTT Human Interface Labs
9-11 Midori-CHO 3-CHOME
Musashino-Shi Tokyo 180 JAPAN
E-mail: furui@speech-sun15.ntt.jp

Conference Secretariat
Meeting Management
2603 Main Street, Suite 690
Irvine, CA 92714 U.S.A.
Tel. (714) 752-8205
Fax: (714) 752-7444
E-mail: 74710.2266@CompuServe.Com
`
`
`

`
`16 KBIT/S WIDEBAND SPEECH CODING BASED ON UNEQUAL SUBBANDS
`
Jürgen W. Paulus and Jürgen Schnitzler
`Institute of Communication Systems and Data Processing (IND)
`RWTH Aachen, University of Technology, D-52056 Aachen, Germany
`phone: +49.241.806961, fax: +49.241.8888186, juergen.paulus@ind.rwth-aachen.de
`
`ABSTRACT
`
In this paper we propose a split-band encoding scheme
for 16 kbit/s wideband speech coding (50-7000 Hz), using
2 unequal subbands from 0-6 kHz and from 6-7 kHz. This
approach was motivated by experimental evaluation of the
signal bandwidth of speech frames. The higher subband
is simply represented by white noise with adjustment of
the short-term energy. For the lower subband, code-excited
linear prediction (CELP) is used. In informal listening
tests the speech quality was rated higher than that
of the CCITT G.722 wideband codec operating at
48 kbit/s.
`
1. INTRODUCTION
`
During the last few years there has been an increasing effort
in wideband speech coding at lower bit rates. This arises not only
from high-quality videophone and digital mobile telephone
applications, but also from the increasing market for
multimedia systems where high-quality speech and audio
are demanded. Compared to narrowband telephone speech,
the reduction of the lower cut-off frequency from 300 Hz to
50 Hz contributes to increased naturalness and fullness. The
high-frequency extension from 3400 Hz to 7000 Hz provides
better fricative differentiation and therefore higher
intelligibility.
In 1986 the International Telegraph and Telephone
Consultative Committee (CCITT, now ITU-T)
recommended the G.722 standard for wideband speech and
audio coding. This wideband speech codec provides high
speech quality at 64 kbit/s with a bandwidth of 50 Hz to
7000 Hz [1]. Slightly reduced quality is achieved at 56
and 48 kbit/s. Since September 1993, the International
Telecommunications Union Study Group 15 (ITU-T SG 15)
has been studying, in Question 6 ("Audio and Wideband Coding for
Public Telecommunication Networks"), new coding schemes
for low-rate wideband speech coding at 16, 24, and 32 kbit/s
[2]. The G.722 standard will serve as a reference for the
development of this alternative coding scheme.
In the past, linear prediction models have been used very
successfully for the coding of telephone speech. Recently,
a new 8 kbit/s narrowband speech coder has been selected
by the ITU-T SG 15 which provides telephone quality at
1 bit/sample [3, 4]. This indicates that very good coding
quality might also be possible for wideband speech signals at
1 bit/sample. However, for audio signals the desired
quality has not yet been achieved using LPC techniques
with long-term prediction (LTP), which are based on a model
of speech production. For those signals, subband coding,
transform coding and various forms of entropy coding have
been used for efficient coding with 2-3 bits per sample, if no
oversampling is applied.
In the following sections an encoding scheme for speech
is presented which consists of a 2-band split-band
scheme with unequal subband bandwidths. This
approach is motivated by the experimental evaluation of the
instantaneous signal bandwidth. First, Section 2 explains a
classification scheme which leads to the unequal
splitting of the subbands, and then describes the analysis
filterbank which performs the unequal band splitting
combined with critical subsampling of the subbands.
Sections 3 and 4 explain the encoding techniques for the two
bands. Section 5 describes a bit error concealment
technique, and Section 6 gives the final bit allocation.
Section 7 discusses the extension of the
coding scheme towards variable bitrate.
`
`2. ANALYSIS FILTERBANK
`
The use of unequal subbands was motivated by the
experimental evaluation of the instantaneous signal bandwidth of
speech frames. During voiced parts of a speech signal, most
of the signal energy is present in the lower frequency region.
Therefore, in such frames it is not necessary to encode the
higher part of the frequency range. Transform coding techniques
behave in a similar way in that, in voiced frames, they allocate
more bits to lower frequency components than to higher frequency
components. For that reason, simulations were performed
to find the actual cut-off frequency necessary to encode
the current frame without loss of perceptual speech quality.
By applying a frame size of 10 ms we found that almost 40%
of the frames could be encoded using a bandwidth of 6 kHz
without loss of perceptual quality. The full bandwidth was
selected mainly during unvoiced parts of the speech signals.
The voice activity of the speech material used was
95%. It was extracted from the European Broadcasting
Union database [5]. The speech material consists of various
languages (English, German, and French), each with male
and female speakers, and was bandlimited to a frequency
range of 50-7000 Hz, according to the specifications in the
G.722 recommendation [1]. As a result of the classification,
a 2-band encoding scheme is proposed which consists
of subbands with unequal bandwidth. The lower subband
has a frequency range of 0-6 kHz and the upper subband
covers a frequency range of 6-7 kHz, i.e. we obtain a
subband coder with 2 bands having bandwidths of 6 kHz and
1 kHz, respectively. Figure 1 shows the analysis filterbank
for unequal subband splitting and critical subsampling of
the subbands.

0-7803-3192-3/96 $5.00 ©1996 IEEE
`
[Figure 1: block diagram. Labels: input s(n) at fs = 16 kHz; modulation by (-1)^n; higher subband 6-7 kHz, xh(n) at fs = 4 kHz; lower subband 0-6 kHz at fs = 12 kHz.]

Figure 1. Analysis filterbank for subband splitting and critical subsampling of the subband signals.
`
The analysis filterbank is implemented using the efficient
structure for sampling rate conversion with a fractional
ratio of the sampling rates, as described for example by
Crochiere et al. [6].
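
The sampling rates quoted for Figure 1 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the ratios 3/4 and 1/4 are inferred from the quoted rates (16 kHz to 12 kHz and 16 kHz to 4 kHz), and the function names are hypothetical, not part of the paper.

```python
from fractions import Fraction

FS_IN_HZ = 16_000  # input sampling rate quoted for Figure 1

# Resampling ratios inferred from the quoted rates (an assumption, not stated as such).
LOWER_RATIO = Fraction(3, 4)  # 16 kHz -> 12 kHz for the 0-6 kHz band
UPPER_RATIO = Fraction(1, 4)  # 16 kHz -> 4 kHz for the 6-7 kHz band


def subband_rate_hz(ratio: Fraction, fs_in_hz: int = FS_IN_HZ) -> int:
    """Output sampling rate after conversion by a rational ratio."""
    rate = ratio * fs_in_hz
    if rate.denominator != 1:
        raise ValueError("ratio does not give an integer sampling rate")
    return int(rate)


def samples_per_frame(fs_hz: int, frame_ms: int) -> int:
    """Number of samples in a frame of frame_ms milliseconds."""
    n = Fraction(fs_hz * frame_ms, 1000)
    if n.denominator != 1:
        raise ValueError("frame length is not an integer number of samples")
    return int(n)


lower_fs = subband_rate_hz(LOWER_RATIO)
upper_fs = subband_rate_hz(UPPER_RATIO)
print(lower_fs, upper_fs, lower_fs + upper_fs)
print(samples_per_frame(lower_fs, 10))
```

The two subband rates add up to the input rate, which is what the text calls critical subsampling, and a 10 ms frame of the lower band holds the 120 samples used in Section 3.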
`
3. ENCODING OF THE 0-6 KHZ BAND
The decimated lower subband is encoded using code-excited
linear prediction (CELP, Atal et al. [7]). The
coder operates on speech frames of 10 ms (120 samples).
In the following, the main parts of the CELP codec are
described: LP analysis, pitch analysis, fixed codebook
structure and perceptual weighting filter.
The subframe lengths used for the different parts of the
codec are indicated in Figure 2: 5 ms for the pitch
analysis and 2.5 ms for the fixed codebook.
`
[Figure 2: timing diagram of one 10 ms frame (time axis in ms: 0, 2.5, 5, 7.5, 10) showing one LPC update, two LTP subframes, and four CB subframes.]

Figure 2. Update of the codec parameters.
`
3.1. LP-analysis
The linear prediction (LP) analysis uses a covariance-lattice
approach as described by Cumani [8]. The analysis
frame length is 15 ms, centered around the middle of the
second LTP subframe, resulting in a look-ahead of 5 ms. In
our realization the order of the LP filter is 14. The prediction
coefficients are updated every 10 ms. Prior to solving
the equations for the coefficients, the covariance matrix is
modified by weighting it with a binomial window having
an effective bandwidth of 80 Hz [9]. This provides a small
amount of bandwidth expansion to the final LP-filter coefficients,
which is advantageous for the subsequent conversion of
the LP-filter parameters to line spectral frequencies (LSFs)
[10], as well as for the quantization of the LSFs.
The LSFs are encoded with 44 bits using interframe
moving-average prediction and split vector quantization,
resulting in an average spectral distortion of 1 dB.
A linear interpolation of the LP-filter coefficients is
performed for the first LTP subframe. This is done in the
LSF domain between the quantized current coefficient set
and the quantized coefficient set of the previous frame. For
the second subframe, no interpolation is performed.
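
A minimal sketch of the per-subframe LSF interpolation described above. The text does not give the interpolation weight, so the equal weighting of 0.5 used here is an assumption, as are the function names and the example values.

```python
def interpolate_lsf(prev_lsf, curr_lsf, weight_curr=0.5):
    """Linear interpolation between two LSF sets.

    weight_curr = 0.5 is an assumed value; the paper only says "linear interpolation".
    """
    if len(prev_lsf) != len(curr_lsf):
        raise ValueError("LSF sets must have the same order")
    return [(1.0 - weight_curr) * p + weight_curr * c
            for p, c in zip(prev_lsf, curr_lsf)]


def subframe_lsf_sets(prev_lsf, curr_lsf):
    """LSF sets for the two 5 ms LTP subframes of one 10 ms frame."""
    first = interpolate_lsf(prev_lsf, curr_lsf)  # 1st subframe: interpolated
    second = list(curr_lsf)                      # 2nd subframe: current set as is
    return first, second


# Hypothetical 3rd-order example (the codec itself uses order 14).
first, second = subframe_lsf_sets([0.10, 0.30, 0.50], [0.30, 0.50, 0.70])
print(first, second)
```

Interpolating in the LSF domain keeps the interpolated set ordered whenever both inputs are ordered, which is one common reason for choosing this domain.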
3.2. Pitch analysis
Every 5 ms, long-term prediction (LTP) is carried out using a
combination of open-loop and closed-loop LT analysis. For
each 10 ms speech frame, an open-loop pitch estimate is
calculated using a weighted correlation measure to avoid
multiples of the pitch period. Thus, a smoothed estimate of the
pitch contour is obtained. In the first subframe a focussed
closed-loop adaptive codebook search is performed around
the open-loop estimate T_ol, and in the second subframe a
restricted search is performed around the pitch lag of the
closed-loop analysis of the first subframe, T_cl,1, as depicted
in Figure 3.
`
[Figure 3: search ranges (in samples) for the 1st subframe and the 2nd subframe.]

Figure 3. Long-term analysis using combined open-loop and closed-loop analysis and a focussed search strategy.
This procedure results in a delta encoding scheme requiring
8+6=14 bits for coding the 2 pitch lags.
The closed-loop search is performed using an adaptive
codebook filled with previously computed excitation
samples. The minimum pitch lag is half of the subframe
length, i.e. τ_min = 30 samples. Additionally, in the lower
delay range a fractional pitch approach is used [11], as
shown in Figure 4.
`
[Figure 4: delay axis from τ_max to τ_min relative to the current speech frame (0 to 120 samples), divided into an integer delay range and a fractional pitch range.]

Figure 4. Combined integer and fractional pitch search ranges during closed-loop adaptive codebook search (τ_max = 193 samples).
`
Informal listening tests indicate that a resolution of 1/2
sample is sufficient for an improvement in speech quality.
The pitch gain is nonuniformly scalar quantized with
4 bits.
`
`

`
3.3. Codebook
Every 2.5 ms (30 samples), an excitation vector is selected
from a modified 16-bit ternary sparse codebook, as
described by Salami et al. [12]. An innovation vector contains
4 nonzero pulses, as shown in Table 1.
`
Amplitude   Position
±1          0, 4, 8, 12, 16, 20, 24, 28
±1          1, 5, 9, 13, 17, 21, 25, 29
±1          2, 6, 10, 14, 18, 22, 26, (30)
±1          3, 7, 11, 15, 19, 23, 27, (31)

Table 1. 16-bit ternary sparse codebook [12].
`
Note that the last position of the 3rd and 4th pulses falls
outside the subframe boundary. This allows
a variable number of pulses per frame.
Each pulse has 8 possible positions, so the position of
each pulse is encoded with 3 bits. Furthermore,
each pulse amplitude is encoded with 1 bit, resulting
in a total of 16 bits for the 4 pulses.
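
The structure of Table 1 can be illustrated by decoding a 16-bit index into a 30-sample ternary vector. This is a sketch only: the bit layout (one 4-bit field per pulse, first pulse in the most significant field, 3 position bits followed by 1 sign bit) is an assumption, since the paper does not specify the packing, and `decode_codevector` is a hypothetical name.

```python
SUBFRAME_LEN = 30        # 2.5 ms at 12 kHz
NUM_PULSES = 4           # one pulse per row of Table 1
POSITIONS_PER_PULSE = 8  # 3 bits per pulse position


def decode_codevector(index: int) -> list:
    """Decode a 16-bit codebook index into a ternary excitation vector.

    Assumed layout per pulse: 3 position bits, then 1 sign bit (1 means -1).
    """
    if not 0 <= index < (1 << 16):
        raise ValueError("index must fit in 16 bits")
    vector = [0] * SUBFRAME_LEN
    for pulse in range(NUM_PULSES):
        field = (index >> (4 * (NUM_PULSES - 1 - pulse))) & 0xF
        position_index = field >> 1
        sign = -1 if field & 1 else 1
        # Row `pulse` of Table 1: positions pulse, pulse + 4, pulse + 8, ...
        position = pulse + NUM_PULSES * position_index
        if position < SUBFRAME_LEN:
            vector[position] = sign
        # Positions 30 and 31 lie outside the subframe, so that pulse is dropped.
    return vector


full = decode_codevector(0x0000)     # all pulses at their first position
reduced = decode_codevector(0x00EE)  # 3rd and 4th pulse at positions 30 and 31
print(sum(abs(v) for v in full), sum(abs(v) for v in reduced))
```

With this layout, an index that places the 3rd and 4th pulses at positions 30 and 31 yields a vector with only two nonzero samples, which is the variable pulse count mentioned above.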
Due to the structured nature of the codebook, a fast
search procedure is possible. Additionally, a focussed search
approach is used to further reduce the computational load
of the codebook search [12].
To reduce the dynamic range of the fixed codebook gain,
a fixed gain predictor is used. The gain predictor
predicts the log energy of the current fixed codebook vector
based on the log energy of the previously selected scaled
fixed codebook vector. This is done in a similar way as in
a preliminary version of ITU-T G.729 [13]. The residual of
the gain predictor is nonuniformly scalar quantized with 4
bits.
`
3.4. Perceptual weighting filter
The perceptual weighting filter W(z) used during the
minimization process has a transfer function of the form

\[ W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)} \tag{1} \]

with A(z) being the LP-analysis filter, using unquantized
LP-filter coefficients. Different sets of weighting factors
{\gamma_1, \gamma_2} are used for the adaptive and fixed codebook
searches. During the adaptive codebook search, weighting
factors {1.0, 0.4} are used, and during the fixed codebook
search {0.9, 0.8} are used. This was found to give better
results than a single fixed weighting filter.
The perceptual weighting filter is updated every 5 ms. In
the first subframe, a linear interpolation between the
current unquantized filter coefficients and the unquantized
filter coefficients of the previous frame is used. In the second
subframe the current unquantized coefficients are used.
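
As a rough illustration of Equation (1), the sketch below forms the coefficients of A(z/γ) by scaling the k-th coefficient of A(z) with γ^k and then filters a signal through the resulting pole-zero filter. The sign convention A(z) = 1 + a_1 z^-1 + ... and all function names are assumptions; only the two factor pairs come from the text.

```python
ADAPTIVE_SEARCH_GAMMAS = (1.0, 0.4)  # {gamma_1, gamma_2} for the adaptive codebook search
FIXED_SEARCH_GAMMAS = (0.9, 0.8)     # {gamma_1, gamma_2} for the fixed codebook search


def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma), given a = [1, a_1, ..., a_p] of A(z)."""
    return [coeff * gamma ** k for k, coeff in enumerate(a)]


def weighting_filter(x, a, gamma1, gamma2):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2), zero initial state."""
    num = bandwidth_expand(a, gamma1)
    den = bandwidth_expand(a, gamma2)
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc / den[0])
    return y


# Hypothetical 2nd-order LP filter, for illustration only (the codec uses order 14).
a = [1.0, -0.9, 0.5]
impulse = [1.0, 0.0, 0.0, 0.0]
print(weighting_filter(impulse, a, *ADAPTIVE_SEARCH_GAMMAS))
print(weighting_filter(impulse, a, *FIXED_SEARCH_GAMMAS))
```

When both factors are equal, numerator and denominator cancel and the filter passes the signal unchanged, which is a convenient sanity check for an implementation.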
`
4. ENCODING OF THE 6-7 KHZ BAND
The classification experiment shows that the full bandwidth
is selected mainly during the unvoiced parts of the
speech signal. This indicates that the higher subband has a
noise-like character. Furthermore, experiments showed
that during unvoiced parts it is sufficient to add some
noise-like spectral components above 6 kHz to obtain the
perceptual speech quality of a 7 kHz speech signal. Therefore,
the higher subband (6-7 kHz) is simply represented
by white noise with adjustment of the short-term energy.
At the output of the analysis filterbank of Section 2 the
subband signal x_h(n) has a sampling rate of 4 kHz, i.e. a
bandwidth of 2 kHz (see Figure 1). Since the input signal is
bandlimited to 7 kHz, a further reduction of the sampling
rate by a factor of 2 can be applied without an anti-aliasing
filter. The input frame of 10 ms (20 samples at the resulting
2 kHz rate) is split into 4 subframes of 2.5 ms, each consisting
of 5 samples.
For each subframe the short-term energy is logarithmically
quantized with 3 bits using MA prediction with a fixed set
of coefficients. This results in a bitrate of 1.2 kbit/s for the
higher subband.
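
The idea of representing the higher subband by energy-adjusted white noise can be sketched as follows. The quantizer and the MA predictor are not specified in the paper and are left out; the Gaussian noise source, the seed, and the function names are assumptions made for this illustration.

```python
import math
import random

SUBFRAMES_PER_FRAME = 4
SAMPLES_PER_SUBFRAME = 5
BITS_PER_ENERGY = 3
FRAME_MS = 10


def highband_bitrate_kbps() -> float:
    """Bits per millisecond equals kbit/s: 4 subframes * 3 bits per 10 ms."""
    return SUBFRAMES_PER_FRAME * BITS_PER_ENERGY / FRAME_MS


def short_term_energy(samples) -> float:
    """Sum of squares over one subframe."""
    return sum(v * v for v in samples)


def synthesize_noise(target_energy: float, rng: random.Random,
                     n: int = SAMPLES_PER_SUBFRAME) -> list:
    """White noise subframe scaled so its energy equals target_energy."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    energy = short_term_energy(noise)
    if energy == 0.0:
        return [0.0] * n
    scale = math.sqrt(target_energy / energy)
    return [scale * v for v in noise]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
subframe = synthesize_noise(2.5, rng)
print(highband_bitrate_kbps(), short_term_energy(subframe))
```

Only the energy value per subframe would need to be transmitted in such a scheme; the noise itself is regenerated at the decoder.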
An informal listening test was performed using a
high-quality loudspeaker. The higher subband was processed
using the encoding scheme described above. The lower
band remained uncoded; however, the sampling rate conversion
of the lower subband was carried out. As a result,
it was difficult to distinguish between the original and the
processed speech signal. Thus, this very simple encoder can
be used to encode the subband from 6-7 kHz.
`
`5. BIT ERROR CONCEALMENT
`
For the previously described scheme, the overall bit rate
sums up to 15.8 kbit/s. This leaves room for
2 parity bits per frame to reduce the sensitivity of the
codec to random bit errors up to BER = 10^-3. Informal
listening tests showed that the
LP coefficients are the most sensitive to bit errors.
Therefore, the first parity bit is computed from the
44 bits of the LP coefficients. This bit is transmitted, and
at the decoder the parity bit is recomputed from the
received LP-filter coefficients. If a parity error occurs, the
LP-coefficient set is replaced by the values of the previous
frame.
The second parity bit is computed from the 8 bits of the
LTP index of the first subframe. If a parity error occurs,
the value of the LTP index is set to the integer delay value
of the previous subframe.
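
The parity-based concealment for the LP coefficients could look roughly like the sketch below. Even parity is an assumption (the paper does not say whether even or odd parity is used), and the function names and the way parameter sets are passed around are hypothetical.

```python
LP_BITS = 44  # bits used for the LP coefficients (LSF indices)


def parity_bit(value: int, num_bits: int) -> int:
    """Even-parity bit over the num_bits least significant bits of value."""
    if not 0 <= value < (1 << num_bits):
        raise ValueError("value does not fit in num_bits")
    return bin(value).count("1") & 1


def conceal_lp_set(lp_bits: int, received_parity: int, decoded_set, previous_set):
    """Keep the decoded LP set if the parity check passes, else reuse the previous one."""
    if parity_bit(lp_bits, LP_BITS) != received_parity:
        return previous_set
    return decoded_set


sent_bits = 0b1011
sent_parity = parity_bit(sent_bits, LP_BITS)
corrupted = sent_bits ^ (1 << 7)  # a single bit error on the channel
print(conceal_lp_set(sent_bits, sent_parity, "current", "previous"))
print(conceal_lp_set(corrupted, sent_parity, "current", "previous"))
```

A single parity bit detects any odd number of bit errors in the protected field and misses even numbers, which fits the stated target of random errors at low bit error rates.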
`
`6. BIT ALLOCATION
`
In the previous sections, the main components of the
wideband codec were presented. According to Table 2, a final
bit rate of 16 kbit/s is achieved.

Band      Parameter     Allocation   Bits/frame   Bit rate
6-7 kHz   Energy        4*3 bit      12 bits      1.2 kbit/s
0-6 kHz   LPC                        44 bits      4.4 kbit/s
          LTP-Index     8+6 bit      14 bits      2.2 kbit/s (LTP-Index and LTP-Gain)
          LTP-Gain      2*4 bit       8 bits
          CB-Index      4*16 bit     64 bits      8.0 kbit/s (CB-Index and CB-Gain)
          CB-Gain       4*4 bit      16 bits
          Parity bits                 2 bits      0.2 kbit/s
Total                                             16.0 kbit/s

Table 2. Bit allocation for a 10 ms frame of the proposed 16 kbit/s split-band wideband codec.
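
The totals in Table 2 follow directly from the per-parameter bit counts. The short check below simply sums them; the dictionary keys are descriptive labels chosen for this sketch.

```python
FRAME_MS = 10

# Bits per 10 ms frame, taken from Table 2.
BITS_PER_FRAME = {
    "6-7 kHz energy": 4 * 3,
    "LPC": 44,
    "LTP index": 8 + 6,
    "LTP gain": 2 * 4,
    "CB index": 4 * 16,
    "CB gain": 4 * 4,
    "parity": 2,
}


def bitrate_kbps(bits: int, frame_ms: int = FRAME_MS) -> float:
    """Bits per millisecond equals kbit/s."""
    return bits / frame_ms


total_bits = sum(BITS_PER_FRAME.values())
bits_without_parity = total_bits - BITS_PER_FRAME["parity"]
print(total_bits, bitrate_kbps(total_bits))
print(bits_without_parity, bitrate_kbps(bits_without_parity))
```

The sum without the two parity bits gives the 15.8 kbit/s mentioned in Section 5, and the full sum gives the 16 kbit/s target.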
`
`
`

`
7. EXTENSION TO VARIABLE BITRATE
One of the results of Section 2 was that 40% of the
speech signal with a voice activity of 95% could be encoded
using just the subband from 0-6 kHz. This means that during
40% of the active talk time it is not necessary to encode the
higher subband. This encourages us to consider different
coding schemes.
The first alternative is to omit the bits used to
encode the higher subband. This leads to a coder with a
variable bitrate. Transmission of 1 bit/frame is necessary
in this case to indicate the encoding mode of the higher
subband.
The second possibility is to use these bits to encode the
lower subband more precisely, resulting in an encoder with
an overall constant bitrate, but variable bitrate in the two
bands. Since this happens most of the time during voiced
parts of a speech signal, this is advantageous with respect to
speech quality. Again, one additional bit/frame is necessary
to indicate the encoding mode of the higher subband.
Another possibility was recently presented by the author
in [14], based on a similar approach in [15] in the context
of wideband ADPCM. The wideband speech signal is
encoded using only the spectral bandwidth from 0-6 kHz and
the higher subband is neglected. The missing components
above 6 kHz are replaced at the receiver by interpolating
the lower subband signal from 12 kHz to 16 kHz using an
interpolation filter with a cut-off frequency of 7 kHz, which
violates the interpolation rules. This is possible because
the signals within the frequency ranges 5-6 kHz and
6-7 kHz exhibit a similar distribution of energy along the time
axis for a given speech sound. In this case a fixed bitrate of
14.8 kbit/s is achieved, with only very small degradations
compared to the fixed bit-rate version of the previous
sections.
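
The rates discussed in this section can be reproduced with simple arithmetic. In the sketch below, the 14.8 kbit/s figure follows from dropping the 12 high-band bits from the 160-bit frame. The average rate of the first alternative is only an illustration under stated assumptions: all other fields (including parity) are kept, one mode bit is sent in every frame, and exactly 40% of the frames omit the higher subband.

```python
FRAME_MS = 10
FULL_FRAME_BITS = 160      # 16.0 kbit/s at 10 ms frames (Table 2)
HIGHBAND_BITS = 4 * 3      # energy values of the 6-7 kHz band
MODE_FLAG_BITS = 1         # signals whether the higher subband is coded
LOWBAND_ONLY_SHARE = 0.40  # share of frames needing only 0-6 kHz (Section 2)


def kbps(bits_per_frame: float) -> float:
    """Bits per millisecond equals kbit/s."""
    return bits_per_frame / FRAME_MS


def lowband_only_rate() -> float:
    """Fixed rate when the higher subband is never transmitted."""
    return kbps(FULL_FRAME_BITS - HIGHBAND_BITS)


def average_variable_rate(share: float = LOWBAND_ONLY_SHARE) -> float:
    """Average rate of the first alternative under the assumptions above."""
    with_highband = FULL_FRAME_BITS + MODE_FLAG_BITS
    without_highband = with_highband - HIGHBAND_BITS
    return kbps((1.0 - share) * with_highband + share * without_highband)


print(lowband_only_rate())
print(round(average_variable_rate(), 2))
```

Under these assumptions the saving of the variable-rate alternative is modest, because the higher subband accounts for only 12 of the 160 bits per frame.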
`
8. CONCLUSION
In this paper a split-band encoding scheme for 16 kbit/s
wideband speech coding has been presented. It is based
on two unequal subbands from 0-6 kHz and 6-7 kHz. This
approach was motivated by experimental evaluation of the
instantaneous signal bandwidth of the speech frames. The
coder operates on speech frames of 10 ms, using a
look-ahead of 5 ms for LP analysis. Together with the 10 ms
delay introduced by the analysis-synthesis filterbank, this
results in an overall algorithmic delay of 25 ms. In
informal listening tests the speech quality was judged to be
better than that of the CCITT G.722 wideband codec operating at
48 kbit/s.
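
The delay budget stated above can be written out explicitly. Reading the 25 ms as the sum of frame buffering, LP look-ahead and filterbank delay is an interpretation of the text, and the names below are chosen for this sketch.

```python
FRAME_MS = 10            # frame buffering
LP_LOOKAHEAD_MS = 5      # look-ahead of the LP analysis
FILTERBANK_DELAY_MS = 10  # analysis-synthesis filterbank


def algorithmic_delay_ms(frame_ms: int = FRAME_MS,
                         lookahead_ms: int = LP_LOOKAHEAD_MS,
                         filterbank_ms: int = FILTERBANK_DELAY_MS) -> int:
    """Overall algorithmic delay as the sum of the three contributions."""
    return frame_ms + lookahead_ms + filterbank_ms


print(algorithmic_delay_ms())
```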
`
`ACKNOWLEDGEMENTS
This work has been supported by the Research Center of
Deutsche Telekom AG. The author would especially like to
thank Mr. G. Schroder. Acknowledgements are made to
Prof. P. Vary and the colleagues of the speech coding group
for inspiring discussions, especially T. Fingscheidt.
`
`REFERENCES
[1] CCITT, "7 kHz Audio Coding within 64 kbit/s", in
Recommendation G.722, vol. Fascicle III.4 of Blue Book,
pp. 269-341. Melbourne, 1988.
[2] Study Group 15 ITU-T, "Report February 1995 Meeting
Working Party 2/15", February 1995, Geneva, Switzerland.
[3] S. Dimolitsas, "ITU Voice Coding Standards: Standardization
of Voice Coding Milestones Reached",
comp.speech Newsgroup, February 1995.
[4] ITU-T SG15 COM 15-152, "G.729 - Coding of
Speech at 8 kbps using conjugate-structure
algebraic-code-excited linear-prediction (CS-ACELP)".
[5] European Broadcasting Union (EBU), Sound Quality
Assessment Material (Recordings for Subjective Test),
no. 422 204-2 edition.
[6] R.E. Crochiere and L.R. Rabiner, Multirate Digital
Signal Processing, Signal Processing. Prentice-Hall,
1983.
[7] B.S. Atal and M.R. Schroeder, "Stochastic Coding
of Speech Signals at Very Low Bit Rates", in Proc.
Int. Conf. Communication (ICC), May 1984, pp. 1610-1613.
[8] A. Cumani, "On a Covariance-Lattice Algorithm for
Linear Prediction", in Proc. Int. Conf. Acoust., Speech,
Signal Processing, ICASSP, Paris, France, 1982, pp.
651-654.
[9] Y. Tohkura, F. Itakura, and S. Hashimoto,
"Spectral Smoothing Technique in PARCOR Speech
Analysis-Synthesis", IEEE Trans. Acoust., Speech,
Signal Processing, vol. 26, no. 6, pp. 587-596, December 1978.
[10] P. Kabal and R.P. Ramachandran, "The Computation
of Line Spectral Frequencies Using Chebyshev Polynomials",
IEEE Trans. Acoust., Speech, Signal Processing,
vol. 34, no. 6, pp. 1419-1426, December 1986.
[11] J. S. Marques, J. M. Tribolet, I. M. Trancoso, and L. B.
Almeida, "Pitch Prediction with Fractional Delays in
CELP Coding", in Proc. EUROSPEECH, Genoa, Italy,
1989, pp. 509-513.
[12] R. Salami, C. Laflamme, J-P. Adoul, A. Kataoka,
S. Hayashi, C. Lamblin, D. Massaloux, S. Proust,
P. Kroon, and Y. Shoham, "Description of the Proposed
ITU-T 8 kb/s Speech Coding Standard", in Proc.
IEEE Workshop on Speech Coding, Annapolis, Maryland,
USA, September 1995, pp. 3-4.
[13] R. Salami, C. Laflamme, J.-P. Adoul, and D. Massaloux,
"A Toll Quality 8 kb/s Speech Codec for
the Personal Communications System (PCS)", IEEE
Trans. Vehicular Technology, vol. 43, no. 3, pp. 808-816,
August 1994.
[14] J. Paulus, "Variable Bitrate Wideband Speech Coding
Using Perceptually Motivated Thresholds", in Proc.
IEEE Workshop on Speech Coding for Telecommunications,
Annapolis, Maryland, USA, September 1995,
pp. 35-36.
[15] M. Dietrich, "Performance and Implementation of a
Robust ADPCM Algorithm for Wideband Speech Coding
with 64 kbit/s", in Proc. Int. Zurich Seminar on
Digital Communications, Zürich, Switzerland, March
1984.
`