INTERNATIONAL TELECOMMUNICATION UNION

ITU-T G.729 (03/96)
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

GENERAL ASPECTS OF DIGITAL TRANSMISSION SYSTEMS

CODING OF SPEECH AT 8 kbit/s USING CONJUGATE-STRUCTURE
ALGEBRAIC-CODE-EXCITED LINEAR-PREDICTION (CS-ACELP)

ITU-T Recommendation G.729

(Previously “CCITT Recommendation”)
`
`Ex. 1038 / Page 1 of 39
`Apple v. Saint Lawrence
`
`

`

`FOREWORD
`
The ITU-T (Telecommunication Standardization Sector) is a permanent organ of the International Telecommunication
Union (ITU). The ITU-T is responsible for studying technical, operating and tariff questions and issuing
Recommendations on them with a view to standardizing telecommunications on a worldwide basis.
`
`The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics
`for study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.
`
`The approval of Recommendations by the Members of the ITU-T is covered by the procedure laid down in WTSC
`Resolution No. 1 (Helsinki, March 1-12, 1993).
`
`ITU-T Recommendation G.729 was prepared by ITU-T Study Group 15 (1993-1996) and was approved under the WTSC
`Resolution No. 1 procedure on the 19th of March 1996.
`
`___________________
`
`In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication
`administration and a recognized operating agency.
`
`NOTE
`
`All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or
`mechanical, including photocopying and microfilm, without permission in writing from the ITU.
`
© ITU 1996
`

`

`

`Recommendation G.729 (03/96)
`
CONTENTS

1 Introduction
2 General description of the coder
    2.1 Encoder
    2.2 Decoder
    2.3 Delay
    2.4 Speech coder description
    2.5 Notational conventions
3 Functional description of the encoder
    3.1 Pre-processing
    3.2 Linear prediction analysis and quantization
    3.3 Perceptual weighting
    3.4 Open-loop pitch analysis
    3.5 Computation of the impulse response
    3.6 Computation of the target signal
    3.7 Adaptive-codebook search
    3.8 Fixed codebook – Structure and search
    3.9 Quantization of the gains
    3.10 Memory update
4 Functional description of the decoder
    4.1 Parameter decoding procedure
    4.2 Post-processing
    4.3 Encoder and decoder initialization
    4.4 Concealment of frame erasures
5 Bit-exact description of the CS-ACELP coder
    5.1 Use of the simulation software
    5.2 Organization of the simulation software
`
`

`

`Recommendation G.729
`
`
`CODING OF SPEECH AT 8 kbit/s USING CONJUGATE-STRUCTURE
`ALGEBRAIC-CODE-EXCITED LINEAR-PREDICTION (CS-ACELP)
`
`(Geneva, 1996)
`
1 Introduction
`
`This Recommendation contains the description of an algorithm for the coding of speech signals at 8 kbit/s using
`Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP).
`
`This coder is designed to operate with a digital signal obtained by first performing telephone bandwidth filtering
`(Recommendation G.712) of the analogue input signal, then sampling it at 8000 Hz, followed by conversion to 16-bit
`linear PCM for the input to the encoder. The output of the decoder should be converted back to an analogue signal by
`similar means. Other input/output characteristics, such as those specified by Recommendation G.711 for 64 kbit/s PCM
`data, should be converted to 16-bit linear PCM before encoding, or from 16-bit linear PCM to the appropriate format after
`decoding. The bitstream from the encoder to the decoder is defined within this Recommendation.
`
`This Recommendation is organized as follows: Clause 2 gives a general outline of the CS-ACELP algorithm. In clauses 3
`and 4, the CS-ACELP encoder and decoder principles are discussed, respectively. Clause 5 describes the software that
`defines this coder in 16 bit fixed-point arithmetic.
`
2 General description of the coder
`
`The CS-ACELP coder is based on the Code-Excited Linear-Prediction (CELP) coding model. The coder operates on
`speech frames of 10 ms corresponding to 80 samples at a sampling rate of 8000 samples per second. For every 10 ms
`frame, the speech signal is analysed to extract the parameters of the CELP model (linear-prediction filter coefficients,
`adaptive and fixed-codebook indices and gains). These parameters are encoded and transmitted. The bit allocation of the
`coder parameters is shown in Table 1. At the decoder, these parameters are used to retrieve the excitation and synthesis
`filter parameters. The speech is reconstructed by filtering this excitation through the short-term synthesis filter, as is
`shown in Figure 1. The short-term synthesis filter is based on a 10th order Linear Prediction (LP) filter. The long-term, or
pitch synthesis filter is implemented using the so-called adaptive-codebook approach. After the reconstructed
speech is computed, it is further enhanced by a postfilter.
`
TABLE 1/G.729

Bit allocation of the 8 kbit/s CS-ACELP algorithm (10 ms frame)

Parameter                  Codeword          Subframe 1   Subframe 2   Total per frame
Line spectrum pairs        L0, L1, L2, L3                              18
Adaptive-codebook delay    P1, P2            8            5            13
Pitch-delay parity         P0                1                         1
Fixed-codebook index       C1, C2            13           13           26
Fixed-codebook sign        S1, S2            4            4            8
Codebook gains (stage 1)   GA1, GA2          3            3            6
Codebook gains (stage 2)   GB1, GB2          4            4            8
Total                                                                  80
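
As an informal cross-check of Table 1 (not part of the bit-exact description of clause 5), the per-frame totals can be summed and converted to a bit rate. The function and array names below are illustrative assumptions, not identifiers from the ITU simulation software.

```c
#include <assert.h>
#include <stddef.h>

/* Total bits per 10 ms frame for each parameter group, copied from Table 1:
 * LSP, adaptive-codebook delay, pitch-delay parity, fixed-codebook index,
 * fixed-codebook sign, codebook gains stage 1, codebook gains stage 2. */
static const int g729_bits_per_frame[] = { 18, 13, 1, 26, 8, 6, 8 };

/* Sum the per-parameter totals to obtain the frame size in bits. */
int g729_frame_bits(void)
{
    int total = 0;
    size_t n = sizeof g729_bits_per_frame / sizeof g729_bits_per_frame[0];

    for (size_t i = 0; i < n; i++)
        total += g729_bits_per_frame[i];
    return total;
}

/* Bit rate in bit/s when frame_bits bits are sent every frame_ms milliseconds. */
int g729_bit_rate(int frame_bits, int frame_ms)
{
    assert(frame_ms > 0);
    return frame_bits * 1000 / frame_ms;
}
```

With the 10 ms frame of this Recommendation, the 80 bits of Table 1 correspond to the nominal 8 kbit/s.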
`
`
`

`

FIGURE 1/G.729
Block diagram of conceptual CELP synthesis model

[Blocks, in signal order: excitation codebook, long-term synthesis filter, short-term synthesis filter, postfilter,
output speech. The received bitstream feeds the parameter decoding block.]
`
2.1 Encoder
`
`The encoding principle is shown in Figure 2. The input signal is high-pass filtered and scaled in the pre-processing block.
`The pre-processed signal serves as the input signal for all subsequent analysis. LP analysis is done once per 10 ms frame
`to compute the LP filter coefficients. These coefficients are converted to Line Spectrum Pairs (LSP) and quantized using
`predictive two-stage Vector Quantization (VQ) with 18 bits. The excitation signal is chosen by using an analysis-by-
`synthesis search procedure in which the error between the original and reconstructed speech is minimized according to a
`perceptually weighted distortion measure. This is done by filtering the error signal with a perceptual weighting filter,
`whose coefficients are derived from the unquantized LP filter. The amount of perceptual weighting is made adaptive to
`improve the performance for input signals with a flat frequency-response.
`
`The excitation parameters (fixed and adaptive-codebook parameters) are determined per subframe of 5 ms (40 samples)
`each. The quantized and unquantized LP filter coefficients are used for the second subframe, while in the first subframe
`interpolated LP filter coefficients are used (both quantized and unquantized). An open-loop pitch delay is estimated once
`per 10 ms frame based on the perceptually weighted speech signal. Then the following operations are repeated for each
`subframe. The target signal x(n) is computed by filtering the LP residual through the weighted synthesis filter W(z)/Â(z).
`The initial states of these filters are updated by filtering the error between LP residual and excitation. This is equivalent to
`the common approach of subtracting the zero-input response of the weighted synthesis filter from the weighted speech
`signal. The impulse response h(n) of the weighted synthesis filter is computed. Closed-loop pitch analysis is then done (to
`find the adaptive-codebook delay and gain), using the target x(n) and impulse response h(n), by searching around the
`value of the open-loop pitch delay. A fractional pitch delay with 1/3 resolution is used. The pitch delay is encoded with
`8 bits in the first subframe and differentially encoded with 5 bits in the second subframe. The target signal x(n) is updated
by subtracting the (filtered) adaptive-codebook contribution, and this new target, x′(n), is used in the fixed-codebook
search to find the optimum excitation. An algebraic codebook with 17 bits is used for the fixed-codebook excitation. The
gains of the adaptive and fixed-codebook contributions are vector quantized with 7 bits (with MA prediction applied to
the fixed-codebook gain). Finally, the filter memories are updated using the determined excitation signal.
`
`
`

`

FIGURE 2/G.729
Encoding principle of the CS-ACELP encoder

[Blocks: input speech, pre-processing, LP analysis/quantization/interpolation, fixed codebook (gain GC), adaptive
codebook (gain GP), synthesis filter, perceptual weighting, pitch analysis, fixed CB search, gain quantization,
parameter encoding, transmitted bitstream. LPC info is passed from the LP analysis to the other blocks.]
`
2.2 Decoder

The decoder principle is shown in Figure 3. First, the parameter indices are extracted from the received bitstream. These
`indices are decoded to obtain the coder parameters corresponding to a 10 ms speech frame. These parameters are the LSP
`coefficients, the two fractional pitch delays, the two fixed-codebook vectors, and the two sets of adaptive and fixed-
`codebook gains. The LSP coefficients are interpolated and converted to LP filter coefficients for each subframe. Then, for
`each 5 ms subframe the following steps are done:
`
• the excitation is constructed by adding the adaptive and fixed-codebook vectors scaled by their respective
  gains;
• the speech is reconstructed by filtering the excitation through the LP synthesis filter;
• the reconstructed speech signal is passed through a post-processing stage, which includes an adaptive
  postfilter based on the long-term and short-term synthesis filters, followed by a high-pass filter and scaling
  operation.
`
`

`

FIGURE 3/G.729
Principle of the CS-ACELP decoder

[Blocks: fixed codebook (gain GC), adaptive codebook (gain GP), short-term filter, post-processing.]
`
2.3 Delay

This coder encodes speech and other audio signals with 10 ms frames. In addition, there is a look-ahead of 5 ms, resulting
in a total algorithmic delay of 15 ms. All additional delays in a practical implementation of this coder are due to:

• processing time needed for encoding and decoding operations;
• transmission time on the communication link;
• multiplexing delay when combining audio data with other data.
`
2.4 Speech coder description

The description of the speech coding algorithm of this Recommendation is made in terms of bit-exact, fixed-point
mathematical operations. The ANSI C code indicated in clause 5, which constitutes an integral part of this
Recommendation, reflects this bit-exact, fixed-point descriptive approach. The mathematical descriptions of the encoder
(clause 3) and decoder (clause 4) can be implemented in several other fashions, possibly leading to a codec
implementation not complying with this Recommendation. Therefore, the algorithm description of the ANSI C code of
clause 5 shall take precedence over the mathematical descriptions of clauses 3 and 4 whenever discrepancies are found. A
non-exhaustive set of test signals, which can be used with the ANSI C code, is available from the ITU.
`
2.5 Notational conventions

Throughout this Recommendation, the following notational conventions are maintained as far as possible:

• Codebooks are denoted by calligraphic characters (e.g. 𝒞).
• Time signals are denoted by their symbol and a sample index between parentheses [e.g. s(n)]. The symbol n
  is used as sample index.
• Superscript indices between parentheses [e.g. g^(m)] are used to indicate time-dependency of variables. The
  variable m refers, depending on the context, to either a frame or subframe index, and the variable n to a
  sample index.
• Recursion indices are identified by a superscript between square brackets (e.g. E^[k]).
• Subscript indices identify a particular element in a coefficient array.
• The symbol ^ identifies a quantized version of a parameter (e.g. ĝc).
• Parameter ranges are given between square brackets, and include the boundaries (e.g. [0.6, 0.9]).
• The function log denotes a logarithm with base 10.
• The function int denotes truncation to its integer value.
• The decimal floating-point numbers used are rounded versions of the values used in the 16 bit fixed-point
  ANSI C implementation.
`
`Table 2 lists the most relevant symbols used throughout this Recommendation. A glossary of the most relevant signals is
`given in Table 3. Table 4 summarizes relevant variables and their dimension. Constant parameters are listed in Table 5.
`The acronyms used in this Recommendation are summarized in Table 6.
`
TABLE 2/G.729

Glossary of most relevant symbols

Name      Reference       Description
1/Â(z)    Equation (2)    LP synthesis filter
Hh1(z)    Equation (1)    Input high-pass filter
Hp(z)     Equation (78)   Long-term postfilter
Hf(z)     Equation (84)   Short-term postfilter
Ht(z)     Equation (86)   Tilt-compensation filter
Hh2(z)    Equation (91)   Output high-pass filter
P(z)      Equation (46)   Pre-filter for fixed codebook
W(z)      Equation (27)   Weighting filter
`
TABLE 3/G.729

Glossary of most relevant signals

Name      Reference   Description
c(n)      3.8         Fixed-codebook contribution
d(n)      3.8.1       Correlation between target signal and h(n)
ew(n)     3.10        Error signal
h(n)      3.5         Impulse response of weighting and synthesis filters
r(n)      3.6         Residual signal
s(n)      3.1         Pre-processed speech signal
ŝ(n)      4.1.6       Reconstructed speech signal
s′(n)     3.2.1       Windowed speech signal
sf(n)     4.2         Postfiltered output
sf′(n)    4.2         Gain-scaled postfiltered output
sw(n)     3.6         Weighted speech signal
x(n)      3.6         Target signal
x′(n)     3.8.1       Second target signal
u(n)      3.10        Excitation to LP synthesis filter
v(n)      3.7.1       Adaptive-codebook contribution
y(n)      3.7.3       Convolution v(n) * h(n)
z(n)      3.9         Convolution c(n) * h(n)
`
`

`

TABLE 4/G.729

Glossary of most relevant variables

Name     Size   Description
gp       1      Adaptive-codebook gain
gc       1      Fixed-codebook gain
gl       1      Gain term for long-term postfilter
gf       1      Gain term for short-term postfilter
gt       1      Gain term for tilt postfilter
G        1      Gain for gain normalization
Top      1      Open-loop pitch delay
ai       11     LP coefficients (a0 = 1.0)
ki       10     Reflection coefficients
k′1      1      Reflection coefficient for tilt postfilter
oi       2      LAR coefficients
ωi       10     LSF normalized frequencies
p̂i,j     40     MA predictor for LSF quantization
qi       10     LSP coefficients
r(k)     11     Auto-correlation coefficients
r′(k)    11     Modified auto-correlation coefficients
wi       10     LSP weighting coefficients
l̂i       10     LSP quantizer output
`
TABLE 5/G.729

Glossary of most relevant constants

Name    Value              Description
fs      8000               Sampling frequency
f0      60                 Bandwidth expansion
γ1      0.94/0.98          Weight factor perceptual weighting filter
γ2      0.60/[0.4 - 0.7]   Weight factor perceptual weighting filter
γn      0.55               Weight factor postfilter
γd      0.70               Weight factor postfilter
γp      0.50               Weight factor pitch postfilter
γt      0.90/0.2           Weight factor tilt postfilter
C       Table 7            Fixed (algebraic) codebook
L0      3.2.4              Moving-average predictor codebook
L1      3.2.4              First stage LSP codebook
L2      3.2.4              Second stage LSP codebook (low part)
L3      3.2.4              Second stage LSP codebook (high part)
GA      3.9                Gain codebook (first stage)
GB      3.9                Gain codebook (second stage)
wlag    Equation (6)       Correlation lag window
wlp     Equation (3)       LP analysis window
`

`

TABLE 6/G.729

Glossary of acronyms

Acronym     Description
CELP        Code-Excited Linear-Prediction
CS-ACELP    Conjugate-Structure Algebraic-CELP
MA          Moving Average
MSB         Most Significant Bit
MSE         Mean-Squared Error
LAR         Log Area Ratio
LP          Linear Prediction
LSP         Line Spectral Pair
LSF         Line Spectral Frequency
VQ          Vector quantization
`
3 Functional description of the encoder
`
`In this clause the different functions of the encoder represented in the blocks of Figure 2 are described. A detailed signal
`flow is shown in Figure 4.
`
3.1 Pre-processing

As stated in clause 2, the input to the speech encoder is assumed to be a 16 bit PCM signal. Two pre-processing functions
are applied before the encoding process:
1) signal scaling; and
2) high-pass filtering.
`
`The scaling consists of dividing the input by a factor 2 to reduce the possibility of overflows in the fixed-point
`implementation. The high-pass filter serves as a precaution against undesired low-frequency components. A second order
`pole/zero filter with a cut-off frequency of 140 Hz is used. Both the scaling and high-pass filtering are combined by
`dividing the coefficients at the numerator of this filter by 2. The resulting filter is given by:

$$H_{h1}(z) = \frac{0.46363718 - 0.92724705\,z^{-1} + 0.46363718\,z^{-2}}{1 - 1.9059465\,z^{-1} + 0.9114024\,z^{-2}} \tag{1}$$
`
`The input signal filtered through Hh1(z) is referred to as s(n), and will be used in all subsequent coder operations.
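
For illustration only, Equation (1) can be realized as a direct-form difference equation. The sketch below uses double-precision arithmetic and is therefore not the bit-exact 16 bit fixed-point routine of clause 5; the type and function names are assumptions introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Filter memory: the two previous input and output samples. */
typedef struct {
    double x1, x2;
    double y1, y2;
} g729_hp_state;

/* Apply Hh1(z) of Equation (1) to n input samples.
 * The numerator coefficients already include the division by 2,
 * so scaling and high-pass filtering happen in one pass. */
void g729_preprocess(g729_hp_state *st, const short *in, double *out, size_t n)
{
    static const double b0 = 0.46363718, b1 = -0.92724705, b2 = 0.46363718;
    static const double a1 = -1.9059465, a2 = 0.9114024;

    assert(st != NULL && in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        double x = (double)in[i];
        double y = b0 * x + b1 * st->x1 + b2 * st->x2
                 - a1 * st->y1 - a2 * st->y2;

        st->x2 = st->x1;  st->x1 = x;   /* shift input history  */
        st->y2 = st->y1;  st->y1 = y;   /* shift output history */
        out[i] = y;                     /* s(n) in the notation of 3.1 */
    }
}
```

The state must be zeroed before the first frame and then carried over from frame to frame, so that the filter runs continuously across frame boundaries.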
`
3.2 Linear prediction analysis and quantization
`
`The short-term analysis and synthesis filters are based on 10th order Linear Prediction (LP) filters.
`
`The LP synthesis filter is defined as:
`
$$\frac{1}{\hat{A}(z)} = \frac{1}{1 + \sum_{i=1}^{10} \hat{a}_i z^{-i}} \tag{2}$$
`
`where âi, i = 1,...,10, are the (quantized) Linear Prediction (LP) coefficients. Short-term prediction, or linear prediction
`analysis is performed once per speech frame using the autocorrelation method with a 30 ms asymmetric window. Every
`80 samples (10 ms), the autocorrelation coefficients of windowed speech are computed and converted to the LP
`coefficients using the Levinson algorithm. Then the LP coefficients are transformed to the LSP domain for quantization
`and interpolation purposes. The interpolated quantized and unquantized filters are converted back to the LP filter
`coefficients (to construct the synthesis and weighting filters for each subframe).
`
`
`

`

FIGURE 4/G.729
Signal flow at the CS-ACELP encoder

[Stages shown, with the subclauses that describe them: pre-processing (3.1); LP analysis with windowing,
autocorrelation and Levinson-Durbin (3.2.1, 3.2.2), LP to LSP conversion (3.2.3), LSP quantization (3.2.4),
interpolation and LSP to LP conversion (3.2.5, 3.2.6); perceptual weighting (3.3); open-loop pitch search (3.4);
impulse response computation (3.5); target signal computation (3.6); closed-loop pitch search (3.7); algebraic
codebook search (3.8); gain quantization (3.9); memory update (3.10). Operations are grouped as per frame or per
subframe. Transmitted codewords: L0, L1, L2, L3, P0, P1, P2, S1, C1, S2, C2, GA1, GB1, GA2, GB2.]
`

`

`3.2.1 Windowing and autocorrelation computation
`
`The LP analysis window consists of two parts: the first part is half a Hamming window and the second part is a quarter of
`a cosine function cycle. The window is given by:
`
$$w_{lp}(n) = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{399}\right) & n = 0,\ldots,199 \\ \cos\left(\dfrac{2\pi (n-200)}{159}\right) & n = 200,\ldots,239 \end{cases} \tag{3}$$
`
`There is a 5 ms lookahead in the LP analysis which means that 40 samples are needed from the future speech frame. This
`translates into an extra algorithmic delay of 5 ms at the encoder stage. The LP analysis window applies to 120 samples
`from past speech frames, 80 samples from the present speech frame, and 40 samples from the future frame. The
`windowing procedure is illustrated in Figure 5.
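
As a quick sanity check of Equation (3), a few window values can be evaluated by hand (decimals rounded; this worked example is illustrative and not part of the Recommendation):

```latex
\begin{aligned}
w_{lp}(0)   &= 0.54 - 0.46\cos(0) = 0.08 \\
w_{lp}(199) &= 0.54 - 0.46\cos\!\left(\tfrac{2\pi \cdot 199}{399}\right) \approx 0.99999 \\
w_{lp}(200) &= \cos(0) = 1 \\
w_{lp}(239) &= \cos\!\left(\tfrac{2\pi \cdot 39}{159}\right) \approx 0.0296
\end{aligned}
```

The two parts therefore join almost seamlessly at n = 200, and the window falls quickly over the 40 look-ahead samples, which is what makes it asymmetric. The sample counts also add up: 120 + 80 + 40 = 240 samples, i.e. 30 ms at 8000 Hz.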
`
FIGURE 5/G.729
Windowing procedure in LP analysis

[Figure rows: LP windows; subframes.]
`
`The different shading patterns identify corresponding excitation and LP analysis windows.
`
The windowed speech:

$$s'(n) = w_{lp}(n)\, s(n) \qquad n = 0,\ldots,239 \tag{4}$$

is used to compute the autocorrelation coefficients:

$$r(k) = \sum_{n=k}^{239} s'(n)\, s'(n-k) \qquad k = 0,\ldots,10 \tag{5}$$
`
`To avoid arithmetic problems for low-level input signals the value of r(0) has a lower boundary of r(0) = 1.0. A 60 Hz
`bandwidth expansion is applied, by multiplying the autocorrelation coefficients with:
`
$$w_{lag}(k) = \exp\left[-\frac{1}{2}\left(\frac{2\pi f_0 k}{f_s}\right)^2\right] \qquad k = 1,\ldots,10 \tag{6}$$
`
`

`

`where f0 = 60 Hz is the bandwidth expansion and fs = 8000 Hz is the sampling frequency. Furthermore, r(0) is multiplied
by a white-noise correction factor 1.0001, which is equivalent to adding a noise floor at −40 dB. The modified
autocorrelation coefficients are given by:

$$r'(0) = 1.0001\, r(0)$$
$$r'(k) = w_{lag}(k)\, r(k) \qquad k = 1,\ldots,10 \tag{7}$$
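
The steps of Equations (5) and (7) can be sketched in floating-point C as follows. This is an informal illustration, not the fixed-point code of clause 5: the function names are assumptions, the input is assumed to be already windowed per Equation (4), and the lag window values are assumed to be supplied by the caller (for instance tabulated from Equation (6)).

```c
#include <assert.h>
#include <stddef.h>

#define LP_WINDOW 240   /* LP analysis window length in samples */
#define LP_ORDER  10    /* LP filter order */

/* Equation (5): autocorrelation of the windowed speech sw[0..239],
 * followed by the lower bound r(0) >= 1.0 for low-level inputs. */
void lp_autocorr(const double sw[LP_WINDOW], double r[LP_ORDER + 1])
{
    assert(sw != NULL && r != NULL);
    for (int k = 0; k <= LP_ORDER; k++) {
        double acc = 0.0;
        for (int n = k; n < LP_WINDOW; n++)
            acc += sw[n] * sw[n - k];
        r[k] = acc;
    }
    if (r[0] < 1.0)
        r[0] = 1.0;
}

/* Equation (7): white-noise correction on r(0) and lag windowing of
 * r(1..10). wlag[0] is not used. */
void lp_modify_autocorr(const double r[LP_ORDER + 1],
                        const double wlag[LP_ORDER + 1],
                        double rmod[LP_ORDER + 1])
{
    assert(r != NULL && wlag != NULL && rmod != NULL);
    rmod[0] = 1.0001 * r[0];
    for (int k = 1; k <= LP_ORDER; k++)
        rmod[k] = wlag[k] * r[k];
}
```

The output rmod[] plays the role of r′(k) in the Levinson-Durbin recursion of 3.2.2.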
`
3.2.2 Levinson-Durbin algorithm

The modified autocorrelation coefficients r′(k) are used to obtain the LP filter coefficients, ai, i = 1,...,10, by solving the
set of equations:

$$\sum_{i=1}^{10} a_i\, r'(|i-k|) = -r'(k) \qquad k = 1,\ldots,10 \tag{8}$$
`
The set of equations in (8) is solved using the Levinson-Durbin algorithm. This algorithm uses the following recursion:

    E^{[0]} = r'(0)
    for i = 1 to 10
        a_0^{[i-1]} = 1
        k_i = -\left[ \sum_{j=0}^{i-1} a_j^{[i-1]} r'(i-j) \right] / E^{[i-1]}
        a_i^{[i]} = k_i
        for j = 1 to i-1
            a_j^{[i]} = a_j^{[i-1]} + k_i a_{i-j}^{[i-1]}
        end
        E^{[i]} = (1 - k_i^2) E^{[i-1]}
    end

The final solution is given as a_j = a_j^{[10]}, j = 0,...,10, with a_0 = 1.0.
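
A minimal floating-point sketch of the recursion above is given below for illustration. It follows the same sign convention as Equation (8), but it is not the fixed-point routine of clause 5, and the function name is an assumption.

```c
#include <assert.h>

#define LP_ORDER 10

/* Solve Equation (8) for a[1..10] given the modified autocorrelations
 * rmod[0..10]. a[0] is set to 1.0. Returns the final prediction error E^[10]. */
double levinson_durbin(const double rmod[LP_ORDER + 1], double a[LP_ORDER + 1])
{
    double prev[LP_ORDER + 1];   /* a_j^[i-1]: solution of the previous order */
    double err = rmod[0];        /* E^[0] = r'(0) */

    assert(err > 0.0);           /* holds because r(0) is bounded below by 1.0 */
    a[0] = prev[0] = 1.0;

    for (int i = 1; i <= LP_ORDER; i++) {
        double acc = 0.0;
        for (int j = 0; j < i; j++)
            acc += prev[j] * rmod[i - j];
        double k = -acc / err;               /* reflection coefficient k_i */

        a[i] = k;
        for (int j = 1; j < i; j++)
            a[j] = prev[j] + k * prev[i - j];
        err *= 1.0 - k * k;                  /* E^[i] = (1 - k_i^2) E^[i-1] */

        for (int j = 1; j <= i; j++)         /* keep order-i solution */
            prev[j] = a[j];
    }
    return err;
}
```

For example, an autocorrelation sequence of the form r′(k) = 0.5^k describes a first-order process, so the recursion is expected to return a1 = −0.5 and zeros for all higher coefficients.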
`
3.2.3 LP to LSP conversion

The LP filter coefficients ai, i = 0,...,10 are converted to Line Spectral Pair (LSP) coefficients for quantization and
interpolation purposes. For a 10th order LP filter, the LSP coefficients are defined as the roots of the sum and difference
polynomials:

$$F_1'(z) = A(z) + z^{-11} A(z^{-1}) \tag{9}$$

and:

$$F_2'(z) = A(z) - z^{-11} A(z^{-1}) \tag{10}$$
`
`

`

respectively. The polynomial F1′(z) is symmetric, and F2′(z) is antisymmetric. It can be proven that all roots of these
polynomials are on the unit circle and that they alternate with each other. F1′(z) has a root z = −1 (ω = π) and F2′(z)
has a root z = 1 (ω = 0). These two roots are eliminated by defining the new polynomials:

$$F_1(z) = F_1'(z) / (1 + z^{-1}) \tag{11}$$

and:

$$F_2(z) = F_2'(z) / (1 - z^{-1}) \tag{12}$$

Each polynomial has five conjugate roots on the unit circle ($e^{\pm j\omega_i}$), and they can be written as:

$$F_1(z) = \prod_{i=1,3,\ldots,9} (1 - 2 q_i z^{-1} + z^{-2}) \tag{13}$$

and:

$$F_2(z) = \prod_{i=2,4,\ldots,10} (1 - 2 q_i z^{-1} + z^{-2}) \tag{14}$$

where qi = cos(ωi). The coefficients ωi are the Line Spectral Frequencies (LSF) and they satisfy the ordering property
0 < ω1 < ω2 < ... < ω10 < π. The coefficients qi are referred to as the LSP coefficients in the cosine domain.
Since both polynomials F1(z) and F2(z) are symmetric, only the first five coefficients of each polynomial need to be
computed. The coefficients of these polynomials are found by the recursive relations:

$$f_1(i+1) = a_{i+1} + a_{10-i} - f_1(i) \qquad i = 0,\ldots,4$$
$$f_2(i+1) = a_{i+1} - a_{10-i} + f_2(i) \qquad i = 0,\ldots,4 \tag{15}$$
`
where f1(0) = f2(0) = 1.0. The LSP coefficients are found by evaluating the polynomials F1(z) and F2(z) at 60 points
equally spaced between 0 and π and checking for sign changes. A sign change signifies the existence of a root and the
sign change interval is then divided four times to allow better tracking of the root. The Chebyshev polynomials are used to
evaluate F1(z) and F2(z). In this method the roots are found directly in the cosine domain. The polynomials F1(z) or F2(z),
evaluated at z = e^{jω}, can be written as:

$$F(\omega) = 2 e^{-j5\omega}\, C(x) \tag{16}$$

with:

$$C(x) = T_5(x) + f(1)T_4(x) + f(2)T_3(x) + f(3)T_2(x) + f(4)T_1(x) + f(5)/2 \tag{17}$$

where T_m(x) = cos(mω) is the mth order Chebyshev polynomial, and f(i), i = 1,...,5, are the coefficients of either F1(z) or
F2(z), computed using Equation (15). The polynomial C(x) is evaluated at a certain value of x = cos(ω) using the recursive
relation:

    for k = 4 down to 1
        b_k = 2x b_{k+1} - b_{k+2} + f(5 - k)
    end
    C(x) = x b_1 - b_2 + f(5)/2

with initial values b_5 = 1 and b_6 = 0.
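
The coefficient recursion of Equation (15) and the Chebyshev evaluation of C(x) can be sketched as follows. This is a floating-point illustration with assumed function names; the grid search over 60 points and the interval bisection are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Equation (15): first six coefficients f1(0..5) and f2(0..5) of F1(z) and
 * F2(z), computed from the LP coefficients a[0..10] (a[0] = 1.0). */
void lsp_poly_coeffs(const double a[11], double f1[6], double f2[6])
{
    assert(a != NULL && f1 != NULL && f2 != NULL);
    f1[0] = f2[0] = 1.0;
    for (int i = 0; i <= 4; i++) {
        f1[i + 1] = a[i + 1] + a[10 - i] - f1[i];
        f2[i + 1] = a[i + 1] - a[10 - i] + f2[i];
    }
}

/* Evaluate C(x) of Equation (17) at x = cos(w) with the backward recursion
 * of 3.2.3. f[1..5] are the coefficients of F1(z) or F2(z); f[0] is unused. */
double cheb_eval(double x, const double f[6])
{
    double b_k2 = 0.0;   /* b_{k+2}, starts as b_6 = 0 */
    double b_k1 = 1.0;   /* b_{k+1}, starts as b_5 = 1 */

    assert(f != NULL);
    for (int k = 4; k >= 1; k--) {
        double b_k = 2.0 * x * b_k1 - b_k2 + f[5 - k];
        b_k2 = b_k1;
        b_k1 = b_k;
    }
    /* After the loop b_k1 holds b_1 and b_k2 holds b_2. */
    return x * b_k1 - b_k2 + f[5] / 2.0;
}
```

A root search would call cheb_eval() on successive grid points x = cos(ω) and look for a sign change between neighbouring values, as described in the text above.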
`
`

`

3.2.4 Quantization of the LSP coefficients

The LSP coefficients qi are quantized using the LSF representation ωi in the normalized frequency domain [0, π]; that is:

$$\omega_i = \arccos(q_i) \qquad i = 1,\ldots,10 \tag{18}$$
`
`A switched 4th order MA prediction is used to predict the LSF coefficients of the current frame. The difference between
`the computed and predicted coefficients is quantized using a two-stage vector quantizer. The first stage is a
`10-dimensional VQ using codebook L1 with 128 entries (7 bits). The second stage is a 10 bit VQ which has been
`implemented as a split VQ using two 5-dimensional codebooks, L2 and L3 containing 32 entries (5 bits) each.
`
`To explain the quantization process, it is convenient to first describe the decoding process. Each coefficient is obtained
`from the sum of two codebooks:
`
$$\hat{l}_i = \begin{cases} \mathcal{L}1_i(L1) + \mathcal{L}2_i(L2) & i = 1,\ldots,5 \\ \mathcal{L}1_i(L1) + \mathcal{L}3_{i-5}(L3) & i = 6,\ldots,10 \end{cases} \tag{19}$$
`
where L1, L2 and L3 are the codebook indices. To avoid sharp resonances in the quantized LP synthesis filter, the
coefficients l̂i are arranged such that adjacent coefficients have a minimum distance of J. The rearrangement routine is
shown below:

    for i = 2,...,10
        if (\hat{l}_{i-1} > \hat{l}_i - J)
            \hat{l}_{i-1} = (\hat{l}_i + \hat{l}_{i-1} - J)/2
            \hat{l}_i = (\hat{l}_i + \hat{l}_{i-1} + J)/2
        end
    end
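
The rearrangement routine can be sketched in C as below. The function name is an assumption, and the sketch makes one interpretation explicit: both new values are computed from the pair as it stood before the update, so that the two coefficients end up exactly J apart around their common midpoint.

```c
#include <assert.h>
#include <stddef.h>

#define LSF_ORDER 10

/* Enforce a minimum distance J between adjacent quantizer outputs l[0..9]
 * (l[0] corresponds to index i = 1 in the text). */
void lsf_rearrange(double l[LSF_ORDER], double J)
{
    assert(l != NULL);
    for (int i = 1; i < LSF_ORDER; i++) {
        if (l[i - 1] > l[i] - J) {
            /* Both results use the values from before the update. */
            double lower = (l[i] + l[i - 1] - J) / 2.0;
            double upper = (l[i] + l[i - 1] + J) / 2.0;
            l[i - 1] = lower;
            l[i] = upper;
        }
    }
}
```

Pairs that are already at least J apart are left untouched, so the routine only modifies coefficients that are too close together or out of order.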
`
This rearrangement process is done twice. First with a value of J = 0.0012, then with a value of J = 0.0006. After this
rearrangement process, the quantized LSF coefficients $\hat{\omega}_i^{(m)}$ for the current frame m are obtained from the
weighted sum of previous quantizer outputs $\hat{l}_i^{(m-k)}$ and the current quantizer output $\hat{l}_i^{(m)}$:

$$\hat{\omega}_i^{(m)} = \left(1 - \sum_{k=1}^{4} \hat{p}_{i,k}\right) \hat{l}_i^{(m)} + \sum_{k=1}^{4} \hat{p}_{i,k}\, \hat{l}_i^{(m-k)} \qquad i = 1,\ldots,10 \tag{20}$$

where $\hat{p}_{i,k}$ are the coefficients of the switched MA predictor. Which MA predictor to use is defined by a separate bit L0.
At start-up the initial values of $\hat{l}_i^{(k)}$ are given by $\hat{l}_i = i\pi/11$ for all k < 0.
`
After computing $\hat{\omega}_i$, the corresponding filter is checked for stability. This is done as follows:

1) order the coefficients $\hat{\omega}_i$ in increasing value;
2) if $\hat{\omega}_1 < 0.005$ then $\hat{\omega}_1 = 0.005$;
3) if $\hat{\omega}_{i+1} - \hat{\omega}_i < 0.0391$ then $\hat{\omega}_{i+1} = \hat{\omega}_i + 0.0391$, i = 1,...,9;
4) if $\hat{\omega}_{10} > 3.135$ then $\hat{\omega}_{10} = 3.135$.
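The four stability steps can be sketched as follows, assuming that step 2 applies to the first (smallest) coefficient after ordering. The function name `stabilize` and the sample input are illustrative.

```python
def stabilize(omega_hat):
    """Apply the four stability steps to 10 quantized LSF coefficients."""
    w = sorted(omega_hat)                  # 1) increasing order
    if w[0] < 0.005:                       # 2) lower bound on the first value
        w[0] = 0.005
    for i in range(9):                     # 3) minimum spacing of 0.0391
        if w[i + 1] - w[i] < 0.0391:
            w[i + 1] = w[i] + 0.0391
    if w[9] > 3.135:                       # 4) upper bound on the last value
        w[9] = 3.135
    return w

# Sample input (hypothetical): unordered, too small, too close, too large.
w = stabilize([0.5, 0.001, 0.51, 1.0, 1.3, 1.6, 1.9, 2.2, 2.5, 3.14])
```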
`

The procedure for encoding the LSF parameters can be outlined as follows. For each of the two MA predictors the best
approximation to the current LSF coefficients has to be found. The best approximation is defined as the one that
minimizes the weighted mean-squared error:

$$E_{lsf} = \sum_{i=1}^{10} w_i\,(\omega_i - \hat{\omega}_i)^2 \tag{21}$$
`
The weights $w_i$ are made adaptive as a function of the unquantized LSF coefficients,

$$\begin{aligned}
w_1 &= \begin{cases} 1.0 & \text{if } \omega_2 - 0.04\pi - 1 > 0 \\ 10\,(\omega_2 - 0.04\pi - 1)^2 + 1 & \text{otherwise} \end{cases} \\
w_i &= \begin{cases} 1.0 & \text{if } \omega_{i+1} - \omega_{i-1} - 1 > 0 \\ 10\,(\omega_{i+1} - \omega_{i-1} - 1)^2 + 1 & \text{otherwise} \end{cases} \qquad 2 \le i \le 9 \\
w_{10} &= \begin{cases} 1.0 & \text{if } -\omega_9 + 0.92\pi - 1 > 0 \\ 10\,(-\omega_9 + 0.92\pi - 1)^2 + 1 & \text{otherwise} \end{cases}
\end{aligned} \tag{22}$$

In addition, the weights $w_5$ and $w_6$ are multiplied by 1.2 each.
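A minimal sketch of the weight computation of Equation (22) and the error measure of Equation (21) is given below. The function names `lsf_weights` and `weighted_error`, the helper `weight`, and the sample LSF vector are assumptions for illustration.

```python
import math

def lsf_weights(omega):
    """Adaptive weights of Eq. (22) from 10 unquantized LSF coefficients."""
    def weight(x):
        # 1.0 when x > 0, otherwise 10*x^2 + 1
        return 1.0 if x > 0 else 10.0 * x * x + 1.0

    w = [0.0] * 10
    w[0] = weight(omega[1] - 0.04 * math.pi - 1.0)          # w_1
    for i in range(1, 9):                                   # w_2 .. w_9
        w[i] = weight(omega[i + 1] - omega[i - 1] - 1.0)
    w[9] = weight(-omega[8] + 0.92 * math.pi - 1.0)         # w_10
    w[4] *= 1.2                                             # w_5
    w[5] *= 1.2                                             # w_6
    return w

def weighted_error(omega, omega_hat, w):
    """Weighted squared error of Eq. (21)."""
    return sum(wi * (a - b) ** 2 for wi, a, b in zip(w, omega, omega_hat))

# Sample LSFs (hypothetical): evenly spaced in (0, pi).
omega = [(i + 1) * math.pi / 11 for i in range(10)]
w = lsf_weights(omega)
```

Closely spaced neighbours make the argument negative and so raise the weight above 1.0, which puts more emphasis on coefficients in regions where the LSFs cluster.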
`
The vector to be quantized for the current frame m is obtained from

$$l_i = \left[\omega_i^{(m)} - \sum_{k=1}^{4} \hat{p}_{i,k}\,\hat{l}_i^{(m-k)}\right] \Big/ \left(1 - \sum_{k=1}^{4} \hat{p}_{i,k}\right), \qquad i = 1,\dots,10 \tag{23}$$
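Equation (23) is Equation (20) solved for the current quantizer output. The sketch below computes the target vector and then substitutes it back into the weighted sum as a consistency check. The function name `target_vector`, the array layout, and the predictor coefficients are illustrative assumptions, and the code assumes the predictor coefficients for each i do not sum to 1.

```python
import math

def target_vector(omega, l_prev, p):
    """Eq. (23): remove the MA prediction from the LSFs and rescale.

    omega:  unquantized LSFs of the current frame, 10 values
    l_prev: l_prev[k-1][i] is the quantizer output of frame m-k, k = 1..4
    p:      p[i][k-1] is the MA predictor coefficient for coefficient i, lag k
    """
    target = []
    for i in range(10):
        s = sum(p[i])
        past = sum(p[i][k] * l_prev[k][i] for k in range(4))
        target.append((omega[i] - past) / (1.0 - s))
    return target

init = [(i + 1) * math.pi / 11 for i in range(10)]
l_prev = [list(init) for _ in range(4)]
p = [[0.25, 0.15, 0.10, 0.05] for _ in range(10)]   # placeholder values
omega = [x + 0.01 for x in init]
l = target_vector(omega, l_prev, p)

# Substituting l into the weighted sum of Eq. (20) should return omega.
back = [(1.0 - sum(p[i])) * l[i]
        + sum(p[i][k] * l_prev[k][i] for k in range(4))
        for i in range(10)]
```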
`
The first codebook L1 is searched and the entry L1 that minimizes the (unweighted) mean-squared error is selected. This
is followed by a search of the second codebook L2, which defines the lower part of the second stage. For each possible
candidate, the partial vector $\hat{\omega}_i$, i = 1,...,5, is reconstructed using Equation (20), and rearranged to guarantee a minimum
distance of 0.0012. The weighted MSE of Equation (21) is computed, and the vector L2 which re
