EUROPEAN
`TELECOMMUNICATION
`STANDARD
`
`DRAFT
`pr ETS 300 726
`
`March 1996
`
`Source: ETSI TC-SMG
`
`Reference: DE/SMG-020660
`
`ICS: 33.060.50
`
`Key words: EFR, digital cellular telecommunications system, Global System for Mobile communications
`(GSM), speech
`
`Digital cellular telecommunications system;
`Enhanced Full Rate (EFR) speech transcoding
`(GSM 06.60)
`
`ETSI
`
`European Telecommunications Standards Institute
`
`ETSI Secretariat
`
`Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
`Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
`X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
`
`Tel.: +33 92 94 42 00 - Fax: +33 93 65 47 16
`
`Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the
`foregoing restriction extend to reproduction in all media.
`
`© European Telecommunications Standards Institute 1996. All rights reserved.
`
`
`

`
`Page 2
`Draft prETS 300 726: March 1996 (GSM 06.60 version 5.0.0)
`
`Whilst every care has been taken in the preparation and publication of this document, errors in content,
`typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to
`"ETSI Editing and Committee Support Dept." at the address shown on the title page.
`
`
Contents

Foreword

1    Scope

2    Normative references

3    Definitions, symbols and abbreviations
     3.1    Definitions
     3.2    Symbols
     3.3    Abbreviations

4    Outline description
     4.1    Functional description of audio parts
     4.2    Preparation of speech samples
            4.2.1    PCM format conversion
     4.3    Principles of the GSM enhanced full rate speech encoder
     4.4    Principles of the GSM enhanced full rate speech decoder
     4.5    Sequence and subjective importance of encoded parameters

5    Functional description of the encoder
     5.1    Pre-processing
     5.2    Linear prediction analysis and quantisation
            5.2.1    Windowing and autocorrelation computation
            5.2.2    Levinson-Durbin algorithm
            5.2.3    LP to LSP conversion
            5.2.4    LSP to LP conversion
            5.2.5    Quantisation of the LSP coefficients
            5.2.6    Interpolation of the LSPs
     5.3    Open-loop pitch analysis
     5.4    Impulse response computation
     5.5    Target signal computation
     5.6    Adaptive codebook search
     5.7    Algebraic codebook structure and search
     5.8    Quantisation of the fixed codebook gain
     5.9    Memory update
     5.10   CRC-calculation

6    Functional description of the decoder
     6.1    Decoding and speech synthesis
     6.2    Post-processing
            6.2.1    Adaptive postfiltering
            6.2.2    Up-scaling

7    Variables, constants and tables in the C-code of the GSM EFR codec
     7.1    Description of the constants and variables used in the C code

8    Homing sequences
     8.1    Functional description
     8.2    Definitions
     8.3    Encoder homing
     8.4    Decoder homing
     8.5    Encoder home state
     8.6    Decoder home state

9    Bibliography

History
`
`
`Foreword
`
`This draft European Telecommunication Standard (ETS) has been produced by the Special Mobile Group
`(SMG) Technical Committee of the European Telecommunications Standards Institute (ETSI) and is now
`submitted for the Public Enquiry phase of the ETSI standards approval procedure.
`
`This draft ETS describes the detailed mapping between input blocks of 160 speech samples in 13-bit
`uniform PCM format to encoded blocks of 260 bits and from encoded blocks of 260 bits to output blocks
`of 160 reconstructed speech samples within the digital cellular telecommunications system.
`
This draft ETS corresponds to GSM technical specification GSM 06.60, version 5.0.0.
`
Proposed transposition dates

Date of latest announcement of this ETS (doa):                    3 months after ETSI publication

Date of latest publication of new National Standard
or endorsement of this ETS (dop/e):                               6 months after doa

Date of withdrawal of any conflicting National Standard (dow):    6 months after doa
`
`
1    Scope
`
`This Draft European Telecommunication Standard (ETS) describes the detailed mapping between input
`blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 260 bits and from
`encoded blocks of 260 bits to output blocks of 160 reconstructed speech samples. The sampling rate is
`8000 sample/s leading to a bit rate for the encoded bit stream of 13 kbit/s. The coding scheme is the
`so-called Algebraic Code Excited Linear Prediction Coder, hereafter referred to as ACELP.
`
`This ETS also specifies the conversion between A-law PCM and 13-bit uniform PCM. Performance
`requirements for the audio input and output parts are included only to the extent that they affect the
`transcoder performance. This part also describes the codec down to the bit level, thus enabling the
`verification of compliance to the part to a high degree of confidence by use of a set of digital test
`sequences. These test sequences are described in GSM 06.54 [7] and are available on disks.
`
`In case of discrepancy between the requirements described in this ETS and the fixed point computational
`description (ANSI-C code) of these requirements contained in GSM 06.53 [6], the description in
`GSM 06.53 [6] will prevail.
`
`The transcoding procedure specified in this ETS is applicable for the enhanced full rate speech traffic
`channel (TCH) in the GSM system.
`
`In GSM 06.51 [5], a reference configuration for the speech transmission chain of the GSM enhanced full
`rate (EFR) system is shown. According to this reference configuration, the speech encoder takes its input
`as a 13-bit uniform PCM signal either from the audio part of the Mobile Station or on the network side,
`from the PSTN via an 8-bit/A-law to 13-bit uniform PCM conversion. The encoded speech at the output of
`the speech encoder is delivered to a channel encoder unit which is specified in GSM 05.03 [3]. In the
`receive direction, the inverse operations take place.
`
2    Normative references
`
`This ETS incorporates by dated and undated reference, provisions from other publications. These
`normative references are cited at the appropriate places in the text and the publications are listed
`hereafter. For dated references, subsequent amendments to or revisions of any of these publications
`apply to this ETS only when incorporated in it by amendment or revision. For undated references, the
`latest edition of the publication referred to applies.
`
[1]    GSM 01.04 (ETR 100): "Digital cellular telecommunication system (Phase 2);
       Abbreviations and acronyms".

[2]    GSM 03.50 (ETS 300 540): "Digital cellular telecommunication system (Phase
       2); Transmission planning aspects of the speech service in the GSM Public
       Land Mobile Network (PLMN) system".

[3]    GSM 05.03 (ETS 300 575): "Digital cellular telecommunication system
       (Phase 2); Channel coding".

[4]    GSM 06.32 (ETS 300 580-6): "Digital cellular telecommunication system (Phase
       2); Voice Activity Detection (VAD)".

[5]    GSM 06.51 (prETS 300 723): "Digital cellular telecommunications system;
       Enhanced Full Rate (EFR) speech processing functions; General description".

[6]    GSM 06.53 (prETS 300 724): "Digital cellular telecommunications system;
       ANSI-C code for the GSM Enhanced Full Rate (EFR) speech codec".

[7]    GSM 06.54 (Work item DE/SMG-020654 prETS 300 725): "Digital cellular
       telecommunications system; Test vectors for the GSM Enhanced Full Rate
       (EFR) speech codec".

[8]    ITU-T Recommendation G.711 (1988): "Coding of analogue signals by pulse
       code modulation; Pulse code modulation (PCM) of voice frequencies".

[9]    ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s adaptive differential pulse
       code modulation (ADPCM)".

3    Definitions, symbols and abbreviations

3.1    Definitions
`
`For the purpose of this ETS the following definitions apply.
`
`adaptive codebook:
`
`The adaptive codebook contains excitation vectors that are adapted for every
`subframe. The adaptive codebook is derived from the long term filter state. The
`lag value can be viewed as an index into the adaptive codebook.
`
`adaptive postfilter:
`
`This filter is applied to the output of the short term synthesis filter to enhance the
`perceptual quality of the reconstructed speech. In the GSM enhanced full rate
`codec, the adaptive postfilter is a cascade of two filters: a formant postfilter and
a tilt compensation filter.
`
`algebraic codebook:
`
A fixed codebook where an algebraic code is used to populate the excitation
vectors (innovation vectors). The excitation contains a small number of nonzero
pulses with predefined interlaced sets of positions.
`
closed-loop pitch analysis: This is the adaptive codebook search, i.e., a process of estimating the pitch
(lag) value from the weighted input speech and the long term filter state. In the
closed-loop search, the lag is searched using an error minimisation loop
(analysis-by-synthesis). In the GSM enhanced full rate codec, closed-loop pitch search is
performed for every subframe.
`
`direct form coefficients: One of the formats for storing the short term filter parameters. In the GSM
`enhanced full rate codec, all filters which are used to modify speech samples
`use direct form coefficients.
`
`fixed codebook:
`
`The fixed codebook contains excitation vectors for speech synthesis filters. The
`contents of the codebook are non-adaptive (i.e., fixed). In the GSM enhanced
`full rate codec, the fixed codebook is implemented using an algebraic codebook.
`
`fractional lags:
`
`A set of lag values having sub-sample resolution. In the GSM enhanced full rate
`codec a sub-sample resolution of 1/6th of a sample is used.
`
`frame:
`
`A time interval equal to 20 ms (160 samples at an 8 kHz sampling rate).
`
`integer lags:
`
`A set of lag values having whole sample resolution.
`
`interpolating filter:
`
`An FIR filter used to produce an estimate of sub-sample resolution samples,
`given an input sampled with integer sample resolution.
`
inverse filter:

This filter removes the short term correlation from the speech signal. The filter
models an inverse frequency response of the vocal tract.

lag:

The long term filter delay. This is typically the true pitch period, or a multiple or
sub-multiple of it.

Line Spectral Frequencies:

(see Line Spectral Pair)

Line Spectral Pair:

Transformation of LPC parameters. Line Spectral Pairs are obtained by
decomposing the inverse filter transfer function A(z) into a set of two transfer
functions, one having even symmetry and the other having odd symmetry. The
Line Spectral Pairs (also called Line Spectral Frequencies) are the roots of
these polynomials on the z-unit circle.

LP analysis window:

For each frame, the short term filter coefficients are computed using the high
pass filtered speech samples within the analysis window. In the GSM enhanced
full rate codec, the length of the analysis window is 240 samples. For each
frame, two asymmetric windows are used to generate two sets of LP
coefficients. No samples of the future frames are used (no lookahead).
`
`LP coefficients:
`
Linear Prediction (LP) coefficients (also referred to as Linear Predictive Coding
(LPC) coefficients) is a generic descriptive term for the short term
filter coefficients.
`
open-loop pitch search: A process of estimating the near optimal lag directly from the weighted speech
`input. This is done to simplify the pitch analysis and confine the closed-loop
`pitch search to a small number of lags around the open-loop estimated lags. In
`the GSM enhanced full rate codec, open-loop pitch search is performed every
`10 ms.
`
`residual:
`
`The output signal resulting from an inverse filtering operation.
`
`short term synthesis filter: This filter introduces, into the excitation signal, short term correlation which
`models the impulse response of the vocal tract.
`
`perceptual weighting filter: This filter is employed in the analysis-by-synthesis search of the codebooks.
`The filter exploits the noise masking properties of the formants (vocal tract
`resonances) by weighting the error less in regions near the formant frequencies
`and more in regions away from them.
`
`subframe:
`
`A time interval equal to 5 ms (40 samples at an 8 kHz sampling rate).
`
`vector quantisation:
`
`A method of grouping several parameters into a vector and quantising them
`simultaneously.
`
`zero input response:
`
`The output of a filter due to past inputs, i.e. due to the present state of the filter,
`given that an input of zeros is applied.
`
`zero state response:
`
The output of a filter due to the present input, given that no past inputs have
been applied, i.e. given that the state information in the filter is all zeroes.
`
3.2    Symbols

For the purpose of this ETS the following symbols apply.

$A(z)$                        The inverse filter with unquantised coefficients
$\hat{A}(z)$                  The inverse filter with quantised coefficients
$H(z) = 1/\hat{A}(z)$         The speech synthesis filter with quantised coefficients
$a_i$                         The unquantised linear prediction parameters (direct form coefficients)
$\hat{a}_i$                   The quantised linear prediction parameters
$W(z)$                        The perceptual weighting filter (unquantised coefficients)
$\gamma_1, \gamma_2$          The perceptual weighting factors
$F(z)$                        Adaptive prefilter
$\beta$                       The adaptive prefilter coefficient
$H_f(z)$                      The formant postfilter
$\gamma_d$                    Control coefficient for the amount of the formant postfiltering
$H_t(z)$                      Tilt compensation filter
$\gamma_t$                    Control coefficient for the amount of the tilt compensation filtering
$\mu = \gamma_t k_1$          A tilt factor, with $k_1$ being the first reflection coefficient
$H_{h1}(z)$                   Pre-processing high-pass filter
$w_1(n), w_2(n)$              LP analysis windows
$w_{lag}(i)$                  Lag window for the autocorrelations (60 Hz bandwidth expansion)
$f_s$                         The sampling frequency
$F_1(z)$                      Symmetric LSF polynomial
$F_2(z)$                      Antisymmetric LSF polynomial
$T_m(x)$                      An $m$th order Chebyshev polynomial
$f(i)$                        The coefficients of either $F_1(z)$ or $F_2(z)$
$z^{(1)}(n), z^{(2)}(n)$      The mean-removed LSF vectors
$r^{(1)}(n), r^{(2)}(n)$      The LSF prediction residual vectors
$p(n)$                        The predicted LSF vector
$w_i, i = 1, \ldots, 10$      LSP-quantisation weighting factors
$h(n)$                        The impulse response of the weighted synthesis filter
$s'(n)$                       The windowed speech signal
$s_w(n)$                      The weighted speech signal
$\hat{s}(n)$                  Reconstructed speech signal
$\hat{s}'(n)$                 The gain-scaled postfiltered signal
$\hat{s}_f(n)$                Postfiltered speech signal (before scaling)
$x(n)$                        The target signal for adaptive codebook search
$x_2(n)$                      The target signal for algebraic codebook search
$r(n)$                        The LP residual signal
$c(n)$                        The fixed codebook vector
$v(n)$                        The adaptive codebook vector
$y(n) = v(n) * h(n)$          The filtered adaptive codebook vector
$y_k(n)$                      The past filtered excitation
$u(n)$                        The excitation signal
$u'(n)$                       The gain-scaled emphasised excitation signal
$T_{op}$                      Open-loop lag
$t_{min}$                     Minimum lag search value
$t_{max}$                     Maximum lag search value
$R(k)$                        Correlation term to be maximised in the adaptive codebook search
$T_k$                         Correlation term to be maximised in the algebraic codebook search
$d = H^t x_2$                 The correlation between the target signal $x_2(n)$ and the impulse response
                              $h(n)$, i.e., backward filtered target
$H$                           The lower triangular Toeplitz convolution matrix with diagonal $h(0)$ and lower
                              diagonals $h(1), \ldots, h(39)$
$\Phi = H^t H$                The matrix of correlations of $h(n)$
$d(n)$                        The elements of the vector $d$
$\phi(i, j)$                  The elements of the symmetric matrix $\Phi$
$c_k$                         The innovation vector
$m_i$                         The position of the $i$th pulse
$N_p$                         The number of pulses
$b(n)$                        The weighted sum of the normalised $d(n)$ vector and normalised long-term
                              prediction residual
$d'(n)$                       Sign extended backward filtered target
$E(n)$                        The mean-removed innovation energy (in dB)
$\bar{E}$                     The mean of the innovation energy
$\tilde{E}(n)$                The predicted energy
$[b_1\ b_2\ b_3\ b_4]$        The MA prediction coefficients
$g_c$                         The fixed-codebook gain
$g_c'$                        The predicted fixed-codebook gain
$\hat{g}_c$                   The quantised fixed codebook gain
$g_p$                         The adaptive codebook gain
$\hat{g}_p$                   The quantised adaptive codebook gain
$\gamma_{gc} = g_c / g_c'$    A correction factor between the gain $g_c$ and the estimated one $g_c'$
$\hat{\gamma}_{gc}$           The optimum value for $\gamma_{gc}$
$\gamma_{sc}$                 Gain scaling factor
$g(D)$                        A cyclic generator polynomial
$GF(2)$                       Galois field of 2 elements

3.3    Abbreviations
`
`For the purposes of this ETS the following abbreviations apply. Further GSM related abbreviations may be
`found in GSM 01.04 [1].
`
ACELP    Algebraic Code Excited Linear Prediction
AGC      Adaptive Gain Control
CELP     Code Excited Linear Prediction
CRC      Cyclic Redundancy Check
FIR      Finite Impulse Response
ISPP     Interleaved Single-Pulse Permutation
LP       Linear Prediction
LPC      Linear Predictive Coding
LSF      Line Spectral Frequency
LSP      Line Spectral Pair
LTP      Long Term Predictor (or Long Term Prediction)
MA       Moving Average

4    Outline description
`
`This ETS is structured as follows:
`
`Section 4.1 contains a functional description of the audio parts including the A/D and D/A functions.
`Section 4.2 describes the conversion between 13-bit uniform and 8-bit A-law samples. Sections 4.3 and
`4.4 present a simplified description of the principles of the GSM EFR encoding and decoding process
`respectively. In section 4.5, the sequence and subjective importance of encoded parameters are given.
`
`Section 5 presents the functional description of the GSM EFR encoding, whereas section 6 describes the
`decoding procedures. Section 7 describes variables, constants and tables of the C-code of the GSM EFR
`codec.
`
4.1    Functional description of audio parts

The analogue-to-digital and digital-to-analogue conversion will in principle comprise the following
elements:

1) Analogue to uniform digital PCM
   - microphone;
   - input level adjustment device;
   - input anti-aliasing filter;
   - sample-hold device sampling at 8 kHz;
   - analogue-to-uniform digital conversion to 13-bit representation.

The uniform format shall be represented in two's complement.

2) Uniform digital PCM to analogue
   - conversion from 13-bit/8 kHz uniform PCM to analogue;
   - a hold device;
   - reconstruction filter including x/sin( x ) correction;
   - output level adjustment device;
   - earphone or loudspeaker.

In the terminal equipment, the A/D function may be achieved either

   - by direct conversion to 13-bit uniform PCM format;
   - or by conversion to 8-bit/A-law companded format, based on a standard A-law codec/filter
     according to ITU-T Recommendations G.711 [8] and G.714, followed by the 8-bit to 13-bit
     conversion as specified in section 4.2.1.
`
`For the D/A operation, the inverse operations take place.
`
`In the latter case it should be noted that the specifications in ITU-T G.714 (superseded by G.712) are
`concerned with PCM equipment located in the central parts of the network. When used in the terminal
`equipment, this ETS does not on its own ensure sufficient out-of-band attenuation. The specification of
`out-of-band signals is defined in GSM 03.50 [2] in section 2.
`
4.2    Preparation of speech samples

The encoder is fed with data comprising samples with a resolution of 13 bits left justified in a 16-bit
word. The three least significant bits are set to '0'. The decoder outputs data in the same format. Outside
the speech codec further processing must be applied if the traffic data occurs in a different representation.
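
The word layout described above can be illustrated with a short Python sketch. The helper names `pack_sample` and `unpack_sample` are hypothetical and not part of this ETS; the sketch only assumes a 13-bit two's complement sample in the range -4096 to 4095.

```python
def pack_sample(value13: int) -> int:
    """Left-justify a 13-bit two's complement sample in a 16-bit word."""
    if not -4096 <= value13 <= 4095:
        raise ValueError("sample does not fit in 13 bits")
    # Shifting left by 3 leaves the three least significant bits at '0';
    # the mask yields the 16-bit word pattern for negative values.
    return (value13 << 3) & 0xFFFF


def unpack_sample(word16: int) -> int:
    """Recover the signed 13-bit sample from a 16-bit word pattern."""
    if word16 & 0x0007:
        raise ValueError("three least significant bits must be '0'")
    # Interpret the word as two's complement before shifting back.
    signed16 = word16 - 0x10000 if word16 & 0x8000 else word16
    return signed16 >> 3
```

For example, `pack_sample(-1)` gives the word `0xFFF8`, and `unpack_sample(0xFFF8)` returns `-1` again.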
`
4.2.1    PCM format conversion
`
`The conversion between 8-bit A-Law compressed data and linear data with 13-bit resolution at the speech
`encoder input shall be as defined in ITU-T Rec. G.711 [8].
`
`ITU-T Rec. G.711 [8] specifies the A-Law to linear conversion and vice versa by providing table entries.
`Examples on how to perform the conversion by fixed-point arithmetic can be found in ITU-T Rec. G.726
`[9]. Section 4.2.1 of G.726 [9] describes A-Law to linear expansion and section 4.2.7 of G.726 [9] provides
`a solution for linear to A-Law compression.
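
As an illustration only, the A-law to linear expansion can be sketched in Python using the commonly used segment/mantissa formulation of the decoder. The function name `alaw_to_linear` is hypothetical, and the tables in ITU-T Rec. G.711 [8] remain the normative definition. The sketch assumes that the result is wanted directly in the 13-bit left-justified 16-bit format of section 4.2, so every output is a multiple of 8.

```python
def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit A-law code to a 13-bit sample left-justified in 16 bits."""
    code ^= 0x55                       # undo the even-bit inversion
    mantissa = (code & 0x0F) << 4      # 4-bit step index within the segment
    segment = (code & 0x70) >> 4       # 3-bit segment number
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # After the inversion, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude
```

With this formulation the smallest positive code `0xD5` expands to 8 and the largest positive code `0xAA` expands to 32256, i.e. 1 and 4032 in 13-bit units.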
`
4.3    Principles of the GSM enhanced full rate speech encoder

The codec is based on the code-excited linear predictive (CELP) coding model (see Bibliography). A 10th
order linear prediction (LP), or short-term, synthesis filter is used which is given by

$$H(z) = \frac{1}{\hat{A}(z)} = \frac{1}{1 + \sum_{i=1}^{m} \hat{a}_i z^{-i}}, \qquad (1)$$

where $\hat{a}_i$, $i = 1, \ldots, m$, are the (quantised) linear prediction (LP) parameters, and $m = 10$ is the predictor
order. The long-term, or pitch, synthesis filter is given by

$$\frac{1}{B(z)} = \frac{1}{1 - g_p z^{-T}}, \qquad (2)$$

where $T$ is the pitch delay and $g_p$ is the pitch gain. The pitch synthesis filter is implemented using the
so-called adaptive codebook approach.
`
`The CELP speech synthesis model is shown in figure 2. In this model, the excitation signal at the input of
`the short-term LP synthesis filter is constructed by adding two excitation vectors from adaptive and fixed
`(innovative) codebooks. The speech is synthesised by feeding the two properly chosen vectors from these
`codebooks through the short-term synthesis filter. The optimum excitation sequence in a codebook is
`chosen using an analysis-by-synthesis search procedure in which the error between the original and
`synthesised speech is minimised according to a perceptually weighted distortion measure.
`
`The perceptual weighting filter used in the analysis-by-synthesis search technique is given by
`
$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}, \qquad (3)$$

where $A(z)$ is the unquantised LP filter and $0 < \gamma_2 < \gamma_1 \le 1$ are the perceptual weighting factors. The
values $\gamma_1 = 0.9$ and $\gamma_2 = 0.6$ are used. The weighting filter uses the unquantised LP parameters while
the formant synthesis filter uses the quantised ones.
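
Replacing z by z/γ in A(z) scales the i-th coefficient by γ^i, so the numerator and denominator of the weighting filter can be derived directly from the unquantised LP coefficients. The following Python sketch shows this under stated assumptions: the function names are hypothetical, the coefficient list starts with the leading 1 of A(z), and the short coefficient set in the usage note is illustrative rather than taken from real speech.

```python
def bandwidth_expand(a, gamma):
    """Return the coefficients of A(z/gamma) given those of A(z).

    a[0] is the leading 1 and a[i] multiplies z^-i, so substituting
    z/gamma for z multiplies a[i] by gamma**i.
    """
    return [coeff * gamma ** i for i, coeff in enumerate(a)]


def weighting_filter(a, gamma1=0.9, gamma2=0.6):
    """Numerator and denominator coefficients of W(z) = A(z/gamma1) / A(z/gamma2)."""
    return bandwidth_expand(a, gamma1), bandwidth_expand(a, gamma2)
```

For an illustrative second-order filter with coefficients `[1.0, -0.5, 0.25]`, the numerator becomes approximately `[1.0, -0.45, 0.2025]` and the denominator approximately `[1.0, -0.3, 0.09]`.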
`
`The coder operates on speech frames of 20 ms corresponding to 160 samples at the sampling frequency
`of 8000 sample/s. At each 160 speech samples, the speech signal is analysed to extract the parameters
`of the CELP model (LP filter coefficients, adaptive and fixed codebooks' indices and gains). These
`parameters are encoded and transmitted. At the decoder, these parameters are decoded and speech is
`synthesised by filtering the reconstructed excitation signal through the LP synthesis filter.
`
`
`The signal flow at the encoder is shown in figure 3. LP analysis is performed twice per frame. The two
`sets of LP parameters are converted to line spectrum pairs (LSP) and jointly quantised using split matrix
`quantisation (SMQ) with 38 bits. The speech frame is divided into 4 subframes of 5 ms each (40
`samples). The adaptive and fixed codebook parameters are transmitted every subframe. The two sets of
`quantised and unquantised LP filters are used for the second and fourth subframes while in the first and
`third subframes interpolated LP filters are used (both quantised and unquantised). An open-loop pitch lag
`is estimated twice per frame (every 10 ms) based on the perceptually weighted speech signal.
`
`Then the following operations are repeated for each subframe:
`
- The target signal $x(n)$ is computed by filtering the LP residual through the weighted synthesis filter
  $W(z)H(z)$ with the initial states of the filters having been updated by filtering the error between
  LP residual and excitation (this is equivalent to the common approach of subtracting the zero input
  response of the weighted synthesis filter from the weighted speech signal).

- The impulse response, $h(n)$, of the weighted synthesis filter is computed.

- Closed-loop pitch analysis is then performed (to find the pitch lag and gain), using the target $x(n)$
  and impulse response $h(n)$, by searching around the open-loop pitch lag. Fractional pitch with
  1/6th of a sample resolution is used. The pitch lag is encoded with 9 bits in the first and third
  subframes and relatively encoded with 6 bits in the second and fourth subframes.

- The target signal $x(n)$ is updated by removing the adaptive codebook contribution (filtered
  adaptive codevector), and this new target, $x_2(n)$, is used in the fixed algebraic codebook search
  (to find the optimum innovation). An algebraic codebook with 37 bits is used for the innovative
  excitation.

- The gains of the adaptive and fixed codebook are scalar quantised with 4 and 5 bits respectively
  (with moving average (MA) prediction applied to the fixed codebook gain).

- Finally, the filter memories are updated (using the determined excitation signal) for finding the target
  signal in the next subframe.
`
`The bit allocation of the codec is shown in table 1. In each 20 ms speech frame, 260 bits are produced,
`corresponding to a bit rate of 13 kbit/s. Within these 260 bits, 8 bits are used for CRC error checking.
`More detailed bit allocation is presented in table 6. Note that the most significant bits (MSB) are always
`sent first.
`
Table 1: Bit allocation of the 13 kbit/s coding algorithm for 20 ms frame.

Parameter         1st & 3rd subframes    2nd & 4th subframes    total per frame

2 LSP sets                                                       38
Parity bits                                                      8
Pitch delay       9                      6                       30
Pitch gain        4                      4                       16
Algebraic code    37                     37                      148
Codebook gain     5                      5                       20
Total                                                            260
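
The totals in table 1 can be cross-checked with a few lines of Python. This is only an arithmetic sketch; the dictionary and function names are hypothetical, and the bit counts are copied from the table.

```python
# Bits per subframe (1st, 2nd, 3rd, 4th), copied from table 1.
SUBFRAME_BITS = {
    "Pitch delay": (9, 6, 9, 6),
    "Pitch gain": (4, 4, 4, 4),
    "Algebraic code": (37, 37, 37, 37),
    "Codebook gain": (5, 5, 5, 5),
}

# Bits sent once per 20 ms frame, copied from table 1.
FRAME_BITS = {"2 LSP sets": 38, "Parity bits": 8}


def bits_per_frame() -> int:
    """Sum the per-frame and per-subframe allocations."""
    per_subframe = sum(sum(bits) for bits in SUBFRAME_BITS.values())
    return sum(FRAME_BITS.values()) + per_subframe


def bit_rate_kbit_s(bits: int, frame_ms: float = 20.0) -> float:
    """Bits per millisecond is numerically equal to kbit/s."""
    return bits / frame_ms
```

Here `bits_per_frame()` gives 260 and `bit_rate_kbit_s(260)` gives 13.0, matching the frame size and bit rate stated in the text.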
`
`4.4
`
`Principles of the GSM enhanced full rate speech decoder
`
`The signal flow at the decoder is shown in figure 4. At the decoder, the transmitted indices are extracted
`from the received bitstream. The indices are decoded to obtain the coder parameters at each
`transmission frame. These parameters are the two LSP vectors, the 4 fractional pitch lags, the 4
`innovative codevectors, and the 4 sets of pitch and innovative gains. The LSP vectors are converted to the
`LP filter coefficients and interpolated to obtain LP filters at each subframe. Then, at each 40-sample
`subframe:
`
- the excitation is constructed by adding the adaptive and innovative codevectors scaled by their
  respective gains;

- the speech is reconstructed by filtering the excitation through the LP synthesis filter.
`
`Finally, the reconstructed speech signal is passed through an adaptive postfilter.
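The two per-subframe decoder steps above can be sketched as follows. This is a floating-point illustration, not the bit-exact fixed-point procedure; it assumes the LP filter is written as A(z) = 1 + a_1 z^-1 + ... + a_m z^-m so that the synthesis filter is 1/A(z). The function names and the memory layout are hypothetical, and the adaptive postfilter is not included.

```python
def build_excitation(v, c, g_p, g_c):
    """Excitation = adaptive codevector v scaled by g_p plus
    innovative codevector c scaled by g_c (one subframe)."""
    if len(v) != len(c):
        raise ValueError("v and c must have the same length")
    return [g_p * vn + g_c * cn for vn, cn in zip(v, c)]


def lp_synthesis(exc, a, mem):
    """Filter the excitation through 1/A(z).

    a   : coefficients a_1..a_m of A(z) = 1 + sum(a_i * z^-i) (assumed convention)
    mem : the m previous output samples, most recent first
    Returns (speech, updated_mem) so the memory carries over to the next subframe.
    """
    mem = list(mem)
    speech = []
    for e in exc:
        s = e - sum(ai * mi for ai, mi in zip(a, mem))
        speech.append(s)
        mem = ([s] + mem)[:len(a)]  # shift the filter memory by one sample
    return speech, mem
```

Returning the updated memory lets a caller process consecutive 40-sample subframes without discontinuities at the subframe boundaries.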
`
4.5 Sequence and subjective importance of encoded parameters
`
The encoder will produce the output information in a unique sequence and format, and the decoder must
receive the same information in the same way. Table 6 shows the sequence of output bits s1 to s260 and the
bit allocation for each parameter.
`
`The different parameters of the encoded speech and their individual bits have unequal importance with
`respect to subjective quality. Before being submitted to the channel encoding function the bits have to be
`rearranged in the sequence of importance as given in table 7.
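The rearrangement amounts to applying a fixed permutation to the encoder output bits before channel encoding and inverting it at the receiving side. The sketch below shows the mechanism only; the `order` argument stands in for the ordering given in table 7, which is not reproduced here, and the function names are hypothetical.

```python
def reorder_by_importance(bits, order):
    """Rearrange encoder bits s1..sN into importance order.

    order[k] is the 1-based index j of the bit s_j placed at position k
    (a stand-in for the ordering of table 7).
    """
    if sorted(order) != list(range(1, len(bits) + 1)):
        raise ValueError("order must be a permutation of 1..len(bits)")
    return [bits[j - 1] for j in order]


def restore_sequence(reordered, order):
    """Inverse operation: recover the original s1..sN sequence."""
    bits = [0] * len(reordered)
    for k, j in enumerate(order):
        bits[j - 1] = reordered[k]
    return bits
```

For a real frame, `bits` would hold the 260 bits s1 to s260 and `order` would be a permutation of 1 to 260.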
`
5 Functional description of the encoder
`
`In this section, the different functions of the encoder represented in figure 3 are described.
`
5.1 Pre-processing
`
`Two pre-processing functions are applied prior to the encoding process: high-pass filtering and signal
`down-scaling.
`
`Down-scaling consists of dividing the input by a factor of 2 to reduce the possibility of overflows in the
`fixed-point implementation.
`
`The high-pass filter serves as a precaution against undesired low frequency components. A filter with a
`cut off frequency of 80 Hz is used, and it is given by
`
$$H_{h1}(z) = \frac{0.92727435 - 1.8544941\,z^{-1} + 0.92727435\,z^{-2}}{1 - 1.9059465\,z^{-1} + 0.9114024\,z^{-2}}. \qquad (4)$$
`
Down-scaling and high-pass filtering are combined by dividing the coefficients at the numerator of
$H_{h1}(z)$ by 2.
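The combined pre-processing step can be sketched as a second-order recursive filter whose numerator coefficients are halved. This is a floating-point illustration using the coefficients of equation (4); the specification targets a fixed-point implementation, so the function name `preprocess` and the state layout are assumptions made here, not the reference procedure.

```python
# Coefficients of the high-pass filter of equation (4).
B = (0.92727435, -1.8544941, 0.92727435)   # numerator
A = (1.0, -1.9059465, 0.9114024)           # denominator


def preprocess(samples, state=None):
    """High-pass filter and down-scale by 2 in a single pass.

    state is (x[n-1], x[n-2], y[n-1], y[n-2]); pass the returned state
    back in to continue filtering across frame boundaries.
    """
    b0, b1, b2 = (b / 2.0 for b in B)      # down-scaling folded into the numerator
    _, a1, a2 = A
    x1, x2, y1, y2 = state or (0.0, 0.0, 0.0, 0.0)
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out, (x1, x2, y1, y2)
```

A constant (DC) input is almost entirely suppressed once the filter has settled, while components well above the cut-off frequency pass with a gain of roughly one half because of the down-scaling.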
`
5.2 Linear prediction analysis and quantisation
`
`Short-term prediction, o
