Nakamura et al.

US005295224A
[11] Patent Number: 5,295,224
[45] Date of Patent: Mar. 15, 1994
`
[54] LINEAR PREDICTION SPEECH CODING WITH HIGH-FREQUENCY PREEMPHASIS
[75] Inventors: Makio Nakamura; Yoshihiro Unno, both of Tokyo, Japan
[73] Assignee: NEC Corporation, Tokyo, Japan
[21] Appl. No.: 765,737
[22] Filed: Sep. 26, 1991
[30] Foreign Application Priority Data
Sep. 26, 1990 [JP] Japan 2-256493

[51] Int. Cl. G10L 3/02; H04B 1/66
[52] U.S. Cl. 395/2.32; 395/2.28; 395/2.29
[58] Field of Search 395/2; 381/29-36, 381/46, 47
`
[56] References Cited

U.S. PATENT DOCUMENTS
4,899,385 2/1990 Ketchum et al.
4,933,957 6/1990 Bottau et al.
4,965,789 10/1990 Bottau et al.
5,007,092 4/1991 Galand et al.
5,142,583 8/1992 Galand et al.

FOREIGN PATENT DOCUMENTS
0331858 8/1988 Japan G10L 9/14
WO86/02726 5/1986 World Int. Prop. O. G10L 5/00

OTHER PUBLICATIONS
"Code-excited linear prediction: High quality speech at very low bit rates", ICASSP, vol. 3, pp. 937-940, Mar. 1985.
Signal Processing IV: Theories and Applications (Proceedings of EUSIPCO-88, 4th European Signal Processing Conference, Grenoble, Sep. 5-8, 1988), vol. II, pp. 871-874, F. Bottau et al.: "On different vector predictive coding schemes and their application to low bit rates speech coding."
ICASSP'89 (1989 Intl. Conf. on Acoustics, Speech, and Signal Processing, Glasgow, May 23-26, 1989), vol. 1, pp. 132-135, IEEE, J. Menez et al.: "Adaptive code excited linear predictive coder."
IEEE Journal on Selected Areas in Communications, vol. 6, No. 2, Feb. 1988, pp. 353-363, P. Kroon et al.: "A class of analysis-by-synthesis predictive coders for high quality speech coding at rates between 4.8 and 16 kbits/s."
ICASSP'86 (IEEE-IECEJ-ASJ Int. Conf. on Acoustics, Speech, and Signal Processing, Apr. 7-11, 1986), vol. 4, pp. 3055-3057, IEEE, G. Davidson et al.: "Complexity reduction methods for vector excitation coding."
ICASSP'89 (1989 Intl. Conf. on Acoustics, Speech and Signal Processing, Glasgow, May 23-26, 1989), vol. 1, pp. 53-56, IEEE, A. Bergstrom et al.: "Codebook driven glottal pulse analysis."
ICASSP'90 (1990 Int. Conf. on Acoustics, Speech, and Signal Processing, Albuquerque, N.M., Apr. 3-6, 1990), vol. 1, pp. 241-244, T. Taniguchi et al.: "Principal axis extracting vector excitation coding: high quality speech at 8 kb/s."

Primary Examiner-David D. Knepper
Attorney, Agent, or Firm-Sughrue, Mion, Zinn, Macpeak & Seas

[57] ABSTRACT
High frequency components of input digital speech samples are emphasized by a preemphasis filter (11). From the preemphasized samples a spectral parameter (a_i) is derived at frame intervals. The input digital samples are weighted by a weighting filter (13) according to a characteristic that is inverse to the characteristic of the preemphasis filter (11) and is a function of the spectral parameter (a_i). A codebook (18, 19) is searched for an optimum fricative value in response to a pitch parameter that is derived by an adaptive codebook (16) from a previous fricative value (v(n)) and a difference between the weighted speech samples and synthesized speech samples which are, in turn, derived from past pitch parameters and optimum fricative values, whereby the difference is reduced to a minimum. Index signals representing the spectral parameter, pitch parameter and optimum fricative value are multiplexed into a single data stream.

4 Claims, 4 Drawing Sheets
`
`
`Ex. 1021 / Page 1 of 11
`Apple v. Saint Lawrence
`
`
`
U.S. Patent    Mar. 15, 1994    Sheet 1 of 4    5,295,224

[Drawing sheet 1 of 4: figure content not recoverable from this copy]
`
`
`
U.S. Patent    Mar. 15, 1994    Sheet 2 of 4    5,295,224

[Drawing sheet 2 of 4: figure content not recoverable from this copy]
`
`
`
U.S. Patent    Mar. 15, 1994    Sheet 3 of 4    5,295,224

[Drawing sheet 3 of 4: figure content not recoverable from this copy]
`
`
`
U.S. Patent    Mar. 15, 1994    Sheet 4 of 4    5,295,224

[Drawing sheet 4 of 4: figure content not recoverable from this copy]
`
`
`
LINEAR PREDICTION SPEECH CODING WITH HIGH-FREQUENCY PREEMPHASIS

RELATED APPLICATION
This application is related to co-pending U.S. patent application Ser. No. 07/658,473, K. Ozawa, filed Feb. 20, 1991, titled "Speech Coder", and assigned to the same assignee as the present application.
`
BACKGROUND OF THE INVENTION
The present invention relates generally to speech coding techniques, and more specifically to a speech conversion system using a low-rate linear prediction speech coding/decoding technique.
As described in a paper by M. Schroeder and B. Atal, "Code-excited linear prediction: High quality speech at very low bit rates" (ICASSP, Vol. 3, pages 937-940, March 1985), speech samples digitized at an 8-kHz sampling rate are converted to digital samples of 4.8 to 8 kbps rates by extracting spectral parameters representing the spectral envelope of the speech samples from frames at 20-ms intervals and deriving pitch parameters representing the long-term correlations of pitch intervals from subframes at 5-ms intervals. Fricative components of speech are stored in a codebook. Using the pitch parameter a search is made through the codebook for an optimum value that minimizes the difference between the input speech samples and speech samples which are synthesized from a sum of the optimum codebook values and the pitch parameters. Signals indicating the spectral parameter, pitch parameter, and codebook value are transmitted or stored as index signals at bit rates in the range between 4.8 and 8 kbps.
However, one disadvantage of linear prediction coding is that it requires a large amount of computations for analyzing voiced sounds, an amount that exceeds the capability of the state-of-the-art hardware implementation such as 16-bit fixed point DSP (digital signal processing) LSI packages. With the current technology, LPC analysis is not satisfactory for high-pitched voiced sounds.

SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a speech encoder having reduced computations for LPC analysis to enable hardware implementation with limited computational capability.
In a speech encoder of the present invention, high frequency components of input digital speech samples of an underlying analog speech signal are preemphasized according to a predefined frequency response characteristic. From the preemphasized speech samples a spectral parameter is derived at frame intervals to represent the spectrum envelope of the preemphasized speech samples. The input digital samples are weighted according to a characteristic that is inverse to the preemphasis characteristic and is a function of the spectral parameter. A search is made through a codebook for an optimum fricative value in response to a pitch parameter which is derived by an adaptive codebook from a previous fricative value and a difference between the weighted speech samples and synthesized speech samples which are, in turn, derived from pitch parameters and optimum fricative values. The optimum fricative value is one that reduces the difference to a minimum. Index signals representing the spectral parameter, pitch parameter and optimum fricative value are generated at frame intervals and multiplexed into a single data bit stream at low bit rates for transmission or storage. In a speech decoder, the data bit stream is decomposed into individual index signals. A codebook is accessed with a corresponding index signal to recover the optimum fricative value which is combined with a pitch parameter derived from an adaptive codebook in response to the pitch parameter index signal, thus forming an input signal to a synthesis filter having a characteristic that is a function of the decomposed spectral parameter. The output of the synthesis filter is deemphasized according to a characteristic inverse to the preemphasis characteristic.
In a preferred embodiment of the speech encoder, the amount of computations is reduced by converting the spectral parameter to a second spectral parameter according to a prescribed relationship between the second parameter and a combined value of the first spectral parameter and a parameter representing the response of the high-frequency preemphasis. The second spectral parameter is used to weight the digital speech samples and the first spectral parameter is multiplexed with the other index signals. In the speech decoder of the preferred embodiment, the first spectral parameter is converted to the second spectral parameter in the same manner as in the speech encoder. A synthesis filter is provided having a characteristic that is inverse to the preemphasis characteristic and is a function of the second spectral parameter to synthesize speech samples from a sum of the pitch parameter and the optimum fricative value.

BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be described in further detail with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a speech encoder according to the present invention;
FIG. 2 is a block diagram of a speech decoder according to the present invention;
FIG. 3 is a block diagram of a modified speech encoder of the present invention; and
FIG. 4 is a block diagram of a modified speech decoder associated with the speech encoder of FIG. 3.
`
DETAILED DESCRIPTION
Referring now to FIG. 1, there is shown a speech encoder according to one embodiment of the present invention. An analog speech signal is sampled at 8 kHz, converted to digital form and formatted into frames of 20-ms duration each containing N speech samples. The speech samples of each frame are stored in a buffer memory 10 and applied to a preemphasis high-pass filter 11. Preemphasis filter 11 has a transfer function H(z) of the form:

$$H(z) = 1 - \beta z^{-1} \qquad (1)$$

where β is a preemphasis filter coefficient (0 < β < 1) and z is a delay operator. The effect of this high frequency emphasis is to make signal processing less difficult for high frequency speech components which are abundant in utterances from women and children.
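
The framing and the preemphasis of Equation (1) can be sketched as follows. This is only an illustration under stated assumptions: the value `BETA = 0.4` is hypothetical (the text requires only that the coefficient lie between 0 and 1), the function names are invented for the example, and the first sample of a frame is filtered as if the preceding sample were zero. The `deemphasize` function applies the mathematical inverse 1/(1 - βz^-1), which is the kind of inverse characteristic the summary attributes to the decoder output.

```python
import math

FS_HZ = 8000      # sampling rate given in the text
FRAME_MS = 20     # frame duration given in the text
BETA = 0.4        # assumed value; the text only states 0 < beta < 1


def frame_length(fs_hz=FS_HZ, frame_ms=FRAME_MS):
    """Number of samples N in one frame."""
    return fs_hz * frame_ms // 1000


def preemphasize(x, beta=BETA):
    """Apply H(z) = 1 - beta*z^-1: y(n) = x(n) - beta*x(n-1), with x(-1) = 0."""
    y = []
    prev = 0.0
    for sample in x:
        y.append(sample - beta * prev)
        prev = sample
    return y


def deemphasize(y, beta=BETA):
    """Apply the inverse 1/(1 - beta*z^-1): x(n) = y(n) + beta*x(n-1)."""
    x = []
    prev = 0.0
    for sample in y:
        prev = sample + beta * prev
        x.append(prev)
    return x


if __name__ == "__main__":
    n = frame_length()
    frame = [math.sin(2 * math.pi * 300 * k / FS_HZ) for k in range(n)]
    restored = deemphasize(preemphasize(frame))
    worst = max(abs(a - b) for a, b in zip(frame, restored))
    print(n, worst < 1e-9)
```

With these figures a frame holds 8000 x 0.020 = 160 samples. The filter gain is 1 - β at zero frequency and 1 + β at half the sampling rate, which is why the high band is emphasized.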
To the output of buffer memory 10 is connected a weighting filter 13 having a weighting function W(z) of the form:

[Equation (2): not recoverable from this copy]
`
`
`
where a_i represents the spectral envelope of the ith speech sample of the frame, or the ith-order linear predictor, γ is a coefficient (0 < γ < 1), and P represents the order of the spectral parameter.
The output of LPC analyzer 12 is applied to weighting filter 13 to control its weighting coefficient, so that the N samples x(n) of each frame are scaled by weighting filter 13 according to Equation (2) as a function of the spectral parameter a_i. Since the LPC analysis is performed on the high-frequency emphasized speech samples, weighting filter 13 compensates for this emphasis by the inverse filter function represented by a term of Equation (2).
The output of weighting filter 13 is applied to a subtractor 14 in which it is combined with the output of a synthesis filter 15 having a filter function given by:

[Equation (3): not recoverable from this copy]

Subtractor 14 produces a difference signal indicating the power of error between a current frame and a synthesized frame. The difference signal is applied to a known adaptive codebook 16 to which the output of an adder 17 is also applied. Adaptive codebook 16 divides each frame of the output of subtractor 14 into subframes of 5-ms duration. Between the two input signals of previous subframes the adaptive codebook 16 provides cross-correlation and auto-correlation and derives at subframe intervals a pitch parameter ε·b(n) representative of the long-term correlation between past and present pitch intervals (where ε indicates the pitch gain and b(n) the pitch interval) and further generates at subframe intervals a signal x(n) - ε·b(n) which is proportional to the residual difference {x(n) - ε·b(n)}w(n). Adaptive codebook 16 further generates a pitch parameter index signal I_a at frame intervals to represent the pitch parameters of each frame and supplies it to a multiplexer 23 for transmission or storage. Details of the adaptive codebook are described in a paper by Kleijn et al., titled "Improved speech quality and efficient vector quantization in SELP", ICASSP, Vol. 1, pages 155-158, 1988.
The pitch parameter ε·b(n) is applied to adder 17 and the signal x(n) - ε·b(n) is applied to first and second searching circuits 18 and 19, which are known in the speech coding art, for making a search through first and second codebooks 21 and 22, respectively. The first codebook 21 stores codewords representing fricatives which are obtained by a long-term learning process in a manner as described in a paper by Buzo et al., titled "Speech coding based upon vector quantization" (IEEE Transaction ASSP, Vol. 28, No. 5, pages 562-574, October 1980). The second codebook 22 is generally similar to the first codebook 21. However, it stores codewords of random numbers to make the searching circuit 19 less dependent on the training data. As described in detail below, codebooks 21 and 22 are searched for optimum codewords c_1j(n), c_2k(n) and optimum gains r_1, r_2 so that an error signal E given below is reduced to a minimum (where j is a variable in the range between 1 and a maximum number of codewords for codewords c_1 and k is a variable in the range between 1 and a maximum number of codewords for codewords c_2). The codeword signal indicating the optimum codeword c_1j(n) and its gain r_1 is supplied from searching circuit 18 to a second searching circuit 19 as well as to an adder 20 in which it is summed with a codeword signal representing the optimum codeword c_2k(n) and its gain r_2 from searching circuit 19 to produce a sum v(n) given by:

$$v(n) = r_1 \cdot c_{1j}(n) + r_2 \cdot c_{2k}(n) \qquad (4)$$

The output of adder 20 is fed to the adder 17 and summed with the pitch parameter ε·b(n). On the other hand, the address signals used by the searching circuits 18 and 19 for accessing the optimum codewords and gain values are supplied as codebook index signals I_1 and I_2, respectively, to multiplexer 23 at frame intervals.
Searching circuits 18 and 19 operate to detect optimum codewords and gain values from codebooks 21 and 22 so that the error E given by the following formula is reduced to a minimum:

[Equation (5): not recoverable from this copy]

where s(n) is an impulse response of the filter function S(z) of synthesis filter 15.
More specifically, searching circuit 18 makes a search for data r_1 and c_1j(n) which minimize the following error component E_1:

$$E_1 = \sum_{n=0}^{N-1} \left[ e_w(n) - r_1 \cdot c_{1j}(n) \cdot s(n) \right]^2 \qquad (6)$$

where e_w(n) is the residual difference {x(n) - ε·b(n)}w(n). By partially differentiating Equation (6) with respect to gain r_1 and equating it to zero, the following Equations hold:

$$r_1 = G_j / C_j \qquad (7)$$

where G_j and C_j are given respectively by:

$$G_j = \sum_{n=0}^{N-1} e_w(n) \cdot c_{1j}(n) \cdot s(n)$$

$$C_j = \sum_{n=0}^{N-1} \left[ c_{1j}(n) \cdot s(n) \right]^2$$

Equation (6) can be rewritten as:

$$E_1 = \sum_{n=0}^{N-1} e_w(n)^2 - \frac{G_j^2}{C_j} \qquad (8)$$

Since the first term of Equation (8) is a constant, a codeword c_1j(n) is selected from codebook 21 such that it maximizes the second term of Equation (8).
The second searching circuit 19 receives the codeword signal from the first searching circuit as well as the residual difference x(n) - ε·b(n) from the adaptive codebook 16 to make a search through the second codebook 22 in a known manner and detects the optimum
`
`
`
codeword c_2k(n) and the optimum gain r_2 of the codeword.
With regard to the searching circuits 18 and 19, the aforesaid co-pending U.S. Patent Application is incorporated herein as a reference material for implementation.
The output of adder 17 is supplied at subframe intervals to the synthesis filter 15 in which synthesized N speech samples x'(n) are derived from successive frames according to the following known formula:

$$x'(n) = b(n) + \sum_{i=1}^{p} a_i' \cdot x'(n-i) \qquad (9)$$

where a_i' is a spectral parameter obtained from interpolations between successive frames and p represents the order of the interpolated spectral parameter, and b(n) is given by:

[Equation (10): not recoverable from this copy]

It is seen from Equations (9) and (10) that the synthesized speech samples contain a sequence of data bits representing v(n) and a sequence of binary zeros which appear at alternate frame intervals. The alternate occurrence of zero-bit sequences is to ensure that a current frame of synthesized speech samples is not adversely affected by a previous frame. The synthesis filter 15 proceeds to weight the synthesized speech samples x'(n) with the filter function S(z) of Equation (3) to synthesize weighted speech samples of a previous frame for coupling to the subtractor 14 by which the power of error E is produced, representing the difference between the previous frame and a current frame from weighting filter 13 having the filter function W(z) of Equation (2).
The output a_i of LPC analyzer 12 and the residual difference x(n) - ε·b(n) are supplied to multiplexer 23 as index signals and multiplexed with the index signals I_1 and I_2 from searching circuits 18, 19 into a single data bit stream at a bit rate in the range between 4.8 kbps and 8 kbps and sent over a transmission line to a site of signal reception or recorded into a suitable storage medium.
At the site of signal reception or storage, a speech decoder as shown in FIG. 2 is provided. The speech decoder includes a demultiplexer 30 in which the multiplexed data bit stream is decomposed into the individual components I_a, I_1, I_2 and a_i, which are applied respectively to an adaptive codebook 31, a first codebook 32, a second codebook 33 and a synthesis filter 36. Codeword signals r_1·c_1j(n) and r_2·c_2k(n) are respectively recovered by codebooks 32 and 33 and summed with the output of adaptive codebook 31 and applied via a delay circuit 34 to adaptive codebook 31 so that it reproduces the pitch parameter ε·b(n). As a function of the spectral parameter a_i supplied from demultiplexer 30, the synthesis filter 36 transforms the output of adder 34 according to the following transfer function:

[Equation (11): not recoverable from this copy]

The output of synthesis filter 36 is coupled to a deemphasis low-pass filter 37 having the following transfer function which is inverse to that of preemphasis filter 11:

$$S_2(z) = \frac{1}{1 - \beta z^{-1}} \qquad (12)$$

Since the combined transfer function of the synthesis filter 36 and deemphasis filter 37 is equal to the transfer function S(z) of the encoder's weighting filter 13, a replica of the original digital speech samples x(n) appears at the output of deemphasis low-pass filter 37. A buffer memory 38 is coupled to the output of this deemphasis filter to store the recovered speech samples at frame intervals for conversion to analog form.
A modification of the present invention is shown in FIG. 3. This modification differs from the previous embodiment by the provision of a weighting filter shown at 41 instead of the filter 13 and a coefficient converter 40 connected between LPC analyzer 12 and weighting filter 41. Coefficient converter 40 transforms the spectral parameter a_i to δ_i according to the following Equations:

[Equations (13a), (13b), (13c): not recoverable from this copy]

Since the coefficient conversion incorporates the high-frequency preemphasis factor β, the function W'(z) of weighting filter 41 can be expressed as follows:

[Equation for W'(z): not recoverable from this copy]

By coupling the output of coefficient converter 40 as a spectral parameter to weighting filter 41, the speech samples x(n) are weighted according to the function W'(z) and supplied to subtractor 14. In this way, the amount of computations which the weighting filter 41 is required to perform can be reduced significantly in comparison with the computations required by the previous embodiment.
As shown in FIG. 4, the speech decoder associated with the speech encoder of FIG. 3 differs from the embodiment of FIG. 1 in that it includes a coefficient converter 50 identical to the encoder's coefficient converter 40 and a synthesis filter 51 having the filter function S_3(z) of the form:

[Equation for S_3(z): not recoverable from this copy]

This speech decoder further differs from the previous embodiment in that it dispenses with the deemphasis low-pass filter 37 by directly coupling the output of synthesis filter 51 to buffer memory 38. The spectral parameter a_i from the demultiplexer 30 is converted by coefficient converter 50 to δ_i according to Equations (13a), (13b), (13c) and supplied to synthesis filter 51 as a spectral parameter. The output of adder 34 is weighted with the filter function S_3(z) by filter 51 as a function of the spectral parameter δ_i. As a result of the coefficient conversion, the amount of computations required for the speech decoder of this embodiment is significantly
`
`
`
reduced in comparison with the speech decoder of FIG.

What is claimed is:
1. A speech encoder comprising:
preemphasis means for receiving input digital speech samples of an underlying analog speech signal and emphasizing higher frequency components of the speech samples according to a predefined frequency response characteristic;
linear prediction analyzer means for receiving said preemphasized speech samples and deriving therefrom at frame intervals a spectral parameter representing a spectrum envelope of said preemphasized speech samples;
weighting means for weighting said input digital speech samples according to a characteristic inverse to the characteristic of said preemphasis means as a function of said spectral parameter;
a subtractor for detecting a difference between the weighted speech samples and synthesized speech samples;
codebook means for storing data representing fricatives;
search means for detecting optimum data from said codebook means as a function of a pitch parameter representing a pitch interval of said input speech samples so that said difference is reduced to a minimum and generating a codebook index signal representing said optimum data at frame intervals;
adaptive codebook means for deriving said pitch parameter at subframe intervals from said difference and said optimum data and generating a pitch parameter index signal at frame intervals;
speech synthesis means for deriving said synthesized speech samples from said pitch parameter and said optimum data; and
means for multiplexing said spectral parameter, said pitch parameter index signal and said codebook index signal into a single data stream.
2. A speech encoder comprising:
preemphasis means for receiving input digital speech samples of an underlying analog speech signal and emphasizing higher frequency components of the speech samples according to a predefined frequency response characteristic;
linear prediction analyzer means for receiving said preemphasized speech samples and deriving therefrom at frame intervals a first spectral parameter representing a spectrum envelope of said preemphasized speech samples;
parameter conversion means for converting the first spectral parameter to a second spectral parameter according to a prescribed relationship between said second parameter and a combined value of said first spectral parameter and a parameter representing the frequency response of said preemphasis means;
weighting means for weighting said input digital speech samples according to a characteristic inverse to the characteristic of said preemphasis means as a function of said second spectral parameter;
a subtractor for detecting a difference between the weighted speech samples and synthesized speech samples;
codebook means for storing data representing fricatives;
search means for detecting optimum data from said codebook means as a function of a pitch parameter representing a pitch interval of said input speech samples so that said difference is reduced to a minimum and generating a codebook index signal representing said optimum data at frame intervals;
adaptive codebook means for deriving said pitch parameter at subframe intervals from said difference and said optimum data and generating a pitch parameter index signal at frame intervals;
speech synthesis means for deriving said synthesized speech samples from said pitch parameter and said optimum data; and
means for multiplexing said first spectral parameter, said pitch parameter index signal and said codebook index signal into a single data stream.
3. A speech conversion system comprising:
preemphasis means for receiving input digital speech samples of an underlying analog speech signal and emphasizing higher frequency components of the speech samples according to a predefined frequency response characteristic;
linear prediction analyzer means for receiving said preemphasized speech samples and deriving therefrom at frame intervals a spectral parameter representing a spectrum envelope of said preemphasized speech samples;
weighting means for weighting said input digital speech samples according to a characteristic inverse to the characteristic of said preemphasis means as a function of said spectral parameter;
a subtractor for detecting a difference between the weighted speech samples and synthesized speech samples;
first codebook means for storing data representing fricatives;
search means for detecting optimum data from said first codebook means as a function of a pitch parameter representing a pitch interval of said speech samples so that said difference is reduced to a minimum and for generating a codebook index signal representing said optimum data at frame intervals;
second, adaptive codebook means for deriving said pitch parameter at subframe intervals from said difference and said optimum data and for generating a pitch parameter index signal at frame intervals;
first speech synthesis means for deriving said synthesized speech samples from said pitch parameter and said optimum data;
multiplexer means for multiplexing said spectral parameter, said pitch parameter index signal and said codebook index signal into a single data stream;
demultiplexer means for demultiplexing said data stream into said spectral parameter, said pitch parameter index signal and said codebook index signal;
third codebook means for writing said optimum data therefrom at subframe intervals as a function of the demultiplexed codebook index signal;
fourth adaptive codebook means for writing a pitch parameter at subframe intervals in response to the demultiplexed pitch parameter index signal and a sum of an output of the fourth adaptive codebook means and said optimum data of the third codebook means; optimum data from the stored fricatives representative data at subframe intervals as a function of the demultiplexed codebook index signal;
`
`
`
second speech synthesis means for synthesizing speech samples from the optimum data of said third codebook means and from said pitch parameter from said fourth adaptive codebook means; and
deemphasis means for emphasizing the speech samples synthesized by the second speech synthesis means according to a characteristic inverse to the characteristic of said preemphasis means.
4. A speech conversion system comprising:
preemphasis means for receiving input digital speech samples of an underlying analog speech signal and emphasizing higher frequency components of the speech samples according to a predefined frequency response characteristic;
linear prediction analyzer means for receiving said preemphasized speech samples and deriving therefrom at frame intervals a first spectral parameter representing a spectrum envelope of said preemphasized speech samples;
first parameter conversion means for converting the first spectral parameter to a second spectral parameter according to a prescribed relationship between said second spectral parameter and a combined value of said first spectral parameter and a parameter representing the frequency response of said preemphasis means;
weighting means for weighting said input digital speech samples according to a characteristic inverse to the characteristic of said preemphasis means as a function of said second spectral parameter;
a subtractor for detecting a difference between the weighted speech samples and synthesized speech samples;
first codebook means for storing data representing fricatives;
search means for detecting optimum data from said first codebook means as a function of a pitch parameter representing a pitch interval of said input speech samples so that said difference is reduced to a minimum and generating a codebook index signal representing said optimum data at frame intervals;
second, adaptive codebook means for deriving said pitch parameter at subframe intervals from said difference and said optimum data and for generating a pitch parameter index signal at frame intervals;
first speech synthesis means for deriving said synthesized speech samples from said pitch parameter and said optimum data;
multiplexer means for multiplexing said first spectral parameter, said pitch parameter index signal and said codebook index signal into a single data stream;
demultiplexer means for demultiplexing said data stream into said spectral parameter, said pitch parameter index signal and said codebook index signal;
third codebook means for writing said optimum data therefrom at subframe intervals as a function of the demultiplexed codebook index signal;
second parameter conversion means for converting the demultiplexed first spectral parameter to said second spectral parameter in a manner identical to said first parameter conversion means;
fourth adaptive codebook means for writing a pitch parameter at subframe intervals in response to the demultiplexed pitch parameter index signal and a sum of an output of the fourth adaptive codebook means and said optimum data of the third codebook means; and
second speech synthesis means having a characteristic that is inverse to the characteristic of said preemphasis means and is a function of said second spectral parameter of the second parameter conversion means for deriving synthesized speech samples from the optimum data of said second codebook means and from the pitch parameter from said fourth adaptive codebook means.

* * * * *
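
The first codebook search described around Equations (6) to (8) can be illustrated with a short sketch. Several points are assumptions made for the example and are not taken from the patent: the product of a codeword with s(n) is interpreted as the codeword filtered by the synthesis impulse response (a truncated convolution), the function names are hypothetical, and all-zero codewords are simply skipped. For each candidate j the code forms G_j and C_j, takes the gain G_j / C_j, and keeps the candidate with the largest G_j^2 / C_j, which is the term the description says is to be maximized.

```python
def convolve_truncated(codeword, impulse_response):
    """First len(codeword) samples of codeword convolved with impulse_response."""
    n_samples = len(codeword)
    out = [0.0] * n_samples
    for n in range(n_samples):
        acc = 0.0
        for m in range(min(n, len(impulse_response) - 1) + 1):
            acc += impulse_response[m] * codeword[n - m]
        out[n] = acc
    return out


def search_codebook(target, codebook, impulse_response):
    """Return (index j, gain r1, error E1) minimizing sum (target - r1*filtered_j)^2."""
    energy = sum(t * t for t in target)          # first (constant) term of Eq. (8)
    best = None
    for j, codeword in enumerate(codebook):
        filtered = convolve_truncated(codeword, impulse_response)
        g = sum(t * f for t, f in zip(target, filtered))   # G_j
        c = sum(f * f for f in filtered)                    # C_j
        if c == 0.0:
            continue                                        # skip all-zero codewords
        score = g * g / c                                   # second term of Eq. (8)
        if best is None or score > best[0]:
            best = (score, j, g / c)                        # gain r1 = G_j / C_j
    if best is None:
        raise ValueError("codebook contains no usable codeword")
    score, j, gain = best
    return j, gain, energy - score


if __name__ == "__main__":
    s = [1.0, 0.5]                                # toy impulse response
    book = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [1.0, -1.0, 0.0, 0.0]]
    target = [0.0, 2.0, 1.0, 0.0]                 # 2 x (codeword 1 filtered by s)
    print(search_codebook(target, book, s))
```

Because the gain is solved in closed form for every candidate, the search needs only two inner products per codeword rather than a separate optimization over the gain.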
`
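
The excitation sum of Equation (4) and the synthesis recursion of Equation (9) can be sketched in the same way. The names `excitation` and `synthesize` are hypothetical, and the filter input b(n) is taken as a given sequence because its defining equation is not reproduced here; the coefficient list stands in for the interpolated spectral parameters.

```python
def excitation(r1, c1, r2, c2):
    """Eq. (4): v(n) = r1*c1j(n) + r2*c2k(n)."""
    return [r1 * a + r2 * b for a, b in zip(c1, c2)]


def synthesize(b, coeffs, history=None):
    """Eq. (9): x'(n) = b(n) + sum_{i=1..p} coeffs[i-1] * x'(n-i).

    `history` holds the p most recent past outputs, newest last
    (all zeros if omitted).
    """
    p = len(coeffs)
    past = list(history) if history is not None else [0.0] * p
    if len(past) != p:
        raise ValueError("history must contain exactly p samples")
    out = []
    for sample in b:
        y = sample + sum(coeffs[i] * past[p - 1 - i] for i in range(p))
        out.append(y)
        past = (past + [y])[-p:] if p else []    # keep only the last p outputs
    return out


if __name__ == "__main__":
    v = excitation(2.0, [1.0, 0.0, 0.0, 0.0], 1.0, [0.0, 1.0, 0.0, 0.0])
    print(v)                       # [2.0, 1.0, 0.0, 0.0]
    print(synthesize(v, [0.5]))    # [2.0, 2.0, 1.0, 0.5]
```

Passing the last p outputs of one call as `history` to the next call carries the filter state across subframes; starting from zeros instead isolates the current block from earlier ones.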
`
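
The description states that a spectral parameter is derived from the preemphasized samples at frame intervals but does not say how the LPC analyzer computes it. One common approach, shown here purely as an assumed stand-in, is the autocorrelation method solved with the Levinson-Durbin recursion. The function names are hypothetical, and the sign convention (x(n) predicted as a weighted sum of past samples) is chosen to match the form of Equation (9).

```python
def autocorrelation(x, max_lag):
    """r(k) = sum_n x(n) * x(n-k) for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) with x(n) predicted as sum_{i=1..order} a[i-1]*x(n-i)."""
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros): leave remaining terms at 0
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err                      # reflection coefficient of order m+1
        updated = a[:]
        updated[m] = k
        for i in range(m):
            updated[i] = a[i] - k * a[m - 1 - i]
        a = updated
        err *= 1.0 - k * k                 # remaining prediction error energy
    return a, err


if __name__ == "__main__":
    frame = [0.5 ** n for n in range(64)]  # decaying exponential test signal
    coeffs, err = levinson_durbin(autocorrelation(frame, 2), 2)
    print(coeffs, err)
```

In an encoder of the kind described, the input to `autocorrelation` would be the preemphasized frame rather than the raw samples, so the resulting coefficients describe the emphasized spectrum.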
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`CERTIFICATE OF CORRECTION
`
PATENT NO. : 5,295,224
DATED : March 15, 1994
INVENTOR(S) : Makio Nakamura, et al.

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

Title page, item [75], inventor: delete "Makamura" and insert --Nakamura--.
Col. 4, line 58, delete "Cj2" and insert --Gj2--.
Col. 5, lines 42 and 43, delete "11 and 12" and insert --I1 and I2--.
Col. 5, line 51, delete "la, l1, l2" and insert --Ia, I1, I2--.
Col. 5, line 54, delete "C13" and insert --C1j--.
Col. 6, line 20, delete "aj to δj" and insert --ai
Col. 6, line 62, delete "δj" and insert --δi--.
Col. 6, line 66, delete "δj" and insert --δi--.
`
`Signed and Sealed this
`Twenty-fourth Day of J
`
`1995
`
Attesting Officer
`
`Commissioner of Patents and Trademarks
`
`BRUCE LEHMAN
`
`
`