United States Patent [19]
Diethorn

US006035048A
[11] Patent Number: 6,035,048
[45] Date of Patent: Mar. 7, 2000

[54] METHOD AND APPARATUS FOR REDUCING NOISE IN SPEECH AND AUDIO SIGNALS

[75] Inventor: Eric John Diethorn, Morristown, N.J.

[73] Assignee: Lucent Technologies Inc., Murray Hill, N.J.

[21] Appl. No.: 08/877,909
[22] Filed: Jun. 18, 1997

[51] Int. Cl.7: H04B 15/00
[52] U.S. Cl.: 381/94.3; 704/226
[58] Field of Search: 381/94.1, 94.2, 381/94.3, 72, 94.5, 94.7, 98, 73.1, 71.1; 704/225, 226

[56] References Cited

U.S. PATENT DOCUMENTS

5,251,263  10/1993  Andrea et al.  381/71.6
5,550,924   8/1996  Helf et al.

OTHER PUBLICATIONS

R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing, Prentice-Hall, Englewood Cliffs, New Jersey, Jan. 1983, Chapter 7, "Multirate Techniques in Filter Banks and Spectrum Analyzers and Synthesizers," pp. 289-400.
W. Etter and G. S. Moschytz, "Noise Reduction by Noise-Adaptive Spectral Magnitude Expansion," J. Audio Eng. Soc. 42 (May 1994) 341-349.
J. B. Allen, "Short Term Spectral Analysis, Synthesis, and Modification by Discrete Fourier Transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 3, Jun. 1977.

Primary Examiner: Vivian Chang
Attorney, Agent, or Firm: Martin I. Finston; Ozer M.N. Teitelbaum

[57] ABSTRACT

A method and apparatus are disclosed for enhancing, within a signal bandwidth, a corrupted audio-frequency signal. The signal which is to be enhanced is analyzed into plural sub-band signals, each occupying a frequency sub-band smaller than the signal bandwidth. A respective signal gain function is applied to each sub-band signal, and the respective sub-band signals are then synthesized into an enhanced signal of the signal bandwidth. The signal gain function is derived, in part, by measuring speech energy and noise energy, and from these determining a relative amount of speech energy, within the corresponding sub-band. In certain embodiments of the invention, the signal gain function is also derived, in part, by determining a relative amount of speech energy within a frequency range greater than, but centered on, the corresponding sub-band. In other embodiments of the invention, the sub-band noise energy is determined from a noise estimate that is updated at periodic intervals, but is not updated if the newest sample of the signal to be enhanced exceeds the current noise estimate by a multiplicative threshold (i.e., a threshold expressible in decibels). In still other embodiments of the invention, the value of the noise estimate is limited by an upper bound that is matched to the dynamic range of the signal to be enhanced.

12 Claims, 4 Drawing Sheets
`
[Front-page drawing: the block diagram shown as FIG. 2. Noisy speech x(i) passes through SUBBAND ANALYSIS (1); each subband sample c(k,m), k = 0, 1, ..., M-1, passes through SIGNAL ESTIMATION (2), NOISE ESTIMATION (3), NARROW-BAND DEFLECTION (4), BROAD-BAND DEFLECTION (5), LUMPED DEFLECTION (6) and GAIN COMPUTATION (7); the gain g(k,m) is applied and SUBBAND SYNTHESIS (8) produces the noise-reduced speech y(n). ALL M SUBBANDS INDEPENDENTLY PROCESSED BY BLOCKS (2) THROUGH (7).]
`RTL345-1_1026-0001
`
`

`
U.S. Patent    Mar. 7, 2000    Sheet 1 of 4    6,035,048

FIG. 1 (PRIOR ART)
x(i) -> ANALYZER 10 -> c(0,m), c(1,m), c(2,m), ..., c(M-1,m) -> SPECTRAL MODIFIER 20, with gains g(0,m), g(1,m), ..., g(M-1,m) -> SYNTHESIZER 30 -> y(i)

FIG. 3
EACH PROCESSING EPOCH, m:
SHIFT IN L NEW SAMPLES OF x(i) -> SHIFT REGISTER (N) 130 -> DISCARD L OLDEST SAMPLES OF x(i)
SHIFT REGISTER (N) 130 -> LENGTH N VECTOR -> ANALYSIS WINDOW (N) 140 -> LENGTH N VECTOR -> FFT (N) 150 -> c(0,m), c(1,m), ..., c(M-1,m)
1 COMPLEX TIME SERIES SAMPLE, c(k,m), FOR EACH OF M = N/2 + 1 SUBBANDS
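
The analysis stage of FIG. 3 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the Hann window and the naive DFT loop are hypothetical choices made for readability, not details taken from the patent, which names only a shift register, an analysis window and a length-N transform.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def analyze_epoch(register, new_samples, window):
    """One processing epoch m of the sub-band analysis stage.

    register    -- the N most recent input samples, oldest first
    new_samples -- the L new samples of x(i)
    window      -- N analysis-window coefficients
    Returns (updated register, [c(0,m), ..., c(M-1,m)]) with M = N/2 + 1.
    """
    n = len(register)
    num_new = len(new_samples)
    # Discard the L oldest samples and shift in the L new ones.
    register = register[num_new:] + list(new_samples)
    frame = [w * x for w, x in zip(window, register)]
    m_bands = n // 2 + 1
    # Naive DFT written out for clarity; a practical system would use an FFT.
    c = [
        sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        for k in range(m_bands)
    ]
    return register, c
```

Only the first N/2 + 1 bins are kept because the input is real, so the remaining bins are complex conjugates of these.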
`
U.S. Patent    Mar. 7, 2000    Sheet 2 of 4    6,035,048

FIG. 2
x(i) (NOISY SPEECH) -> SUBBAND ANALYSIS (1) 40 -> c(k,m), SUBBAND INDEX k = 0, 1, ..., M-1
c(k,m) -> SIGNAL ESTIMATION (2) 50 -> NOISE ESTIMATION (3) 60 -> NARROW-BAND DEFLECTION (4) 70 -> BROAD-BAND DEFLECTION (5) 80 -> LUMPED DEFLECTION (6) 90 -> GAIN COMPUTATION (7) 100 -> GAIN g(k,m)
g(k,m) x c(k,m), k = 0, 1, ..., M-1 -> SUBBAND SYNTHESIS (8) -> y(n) (NOISE-REDUCED SPEECH)
ALL M SUBBANDS INDEPENDENTLY PROCESSED BY BLOCKS (2) THROUGH (7)
`
U.S. Patent    Mar. 7, 2000    Sheet 3 of 4    6,035,048

FIG. 4
c(k,m) -> s(k,m) = A s(k,m-1) + (1-A) |c(k,m)|, with A = ALPHA_ATTACK or A = ALPHA_DECAY -> s(k,m)
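
The recursion of FIG. 4 can be sketched as a one-pole smoother with two time constants. The rule used here to choose between the attack and decay constants (attack when the new magnitude exceeds the previous estimate) is an assumption, since the comparison in the drawing is not legible in this copy, and the default constants are hypothetical placeholder values.

```python
def update_signal_estimate(s_prev, c, alpha_attack=0.5, alpha_decay=0.9):
    """Return s(k,m) = A*s(k,m-1) + (1-A)*|c(k,m)| for one sub-band.

    s_prev -- previous estimate s(k,m-1)
    c      -- current complex sub-band sample c(k,m)
    The attack/decay selection rule is an assumption, not taken from the text.
    """
    mag = abs(c)
    # Assumed rule: follow rising energy with ALPHA_ATTACK, falling with ALPHA_DECAY.
    a = alpha_attack if mag > s_prev else alpha_decay
    return a * s_prev + (1.0 - a) * mag
```

A smaller attack constant than decay constant makes the estimate rise quickly at speech onsets and fall slowly afterwards.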
`
FIG. 5
c(k,m) -> n(k,m) = B n(k,m-1) + (1-B) |c(k,m)|, with B = BETA_ATTACK or B = BETA_DECAY
INHIBIT UPDATE ON PROBABLE SPEECH SAMPLES: DON'T UPDATE n(k,m)
LIMIT MAXIMUM ATTAINABLE NOISE LEVEL: n(k,m) = min[n(k,m), NOISE_PROFILE(k)]
-> n(k,m)

FIG. 6
s(k,m), n(k,m) -> d(k,m) = s(k,m) / n(k,m) -> d(k,m)
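
A minimal sketch of the noise estimate of FIG. 5 and the narrow-band deflection of FIG. 6 follows. The inhibit test mirrors the abstract (no update when the newest sample exceeds the current estimate by a multiplicative threshold). The parameter names, the default values, the attack/decay selection rule and the division guard are assumptions added for illustration.

```python
def update_noise_estimate(n_prev, c, noise_profile,
                          beta_attack=0.999, beta_decay=0.9,
                          inhibit_ratio=4.0):
    """Return n(k,m) for one sub-band.

    n_prev        -- previous noise estimate n(k,m-1); must start positive,
                     otherwise every sample is treated as probable speech
    c             -- current complex sub-band sample c(k,m)
    noise_profile -- upper bound NOISE_PROFILE(k) on the estimate
    inhibit_ratio -- hypothetical multiplicative threshold
    """
    mag = abs(c)
    # Probable speech sample: leave the estimate unchanged.
    if mag > inhibit_ratio * n_prev:
        return n_prev
    # Assumed rule: BETA_ATTACK when rising, BETA_DECAY when falling.
    b = beta_attack if mag > n_prev else beta_decay
    n_new = b * n_prev + (1.0 - b) * mag
    # Limit the maximum attainable noise level.
    return min(n_new, noise_profile)


def narrowband_deflection(s, n, floor=1e-12):
    """Return d(k,m) = s(k,m) / n(k,m); the floor is an added divide guard."""
    return s / max(n, floor)
```

With an attack constant close to one, the estimate climbs slowly, so short bursts below the inhibit threshold barely move it.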
`
U.S. Patent    Mar. 7, 2000    Sheet 4 of 4    6,035,048

FIG. 7
s(k,m), n(k,m) -> D(k,m) = [s(k-K,m) + s(k-K+1,m) + ... + s(k+K,m)] / [(2K+1) * n(k,m)] -> D(k,m)
FOR SUBBAND INDICES k = K, K+1, ..., M-K-1
FOR k IN (0, 1, ..., K-1) AND k IN (M-K, M-K+1, ..., M-1), D(k,m) IS NOT COMPUTED

FIG. 8A
d(k,m), D(k,m) -> PHI(k,m) = {max[d(k,m)/GAMMA_NB, D(k,m)/GAMMA_BB]}**p -> PHI(k,m)
FOR SUBBAND INDICES k = K, K+1, ..., M-K-1

FIG. 8B
d(k,m) -> PHI(k,m) = [d(k,m)/GAMMA_NB]**p -> PHI(k,m)
FOR k IN [0, 1, ..., K-1] AND k IN [M-K, M-K+1, ..., M-1]

FIG. 9
PHI(k,m) -> g(k,m) = min[1.0, PHI(k,m)] -> g(k,m)
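
The chain from FIG. 7 through FIG. 9 can be sketched in one pass over the sub-bands. The formulas follow the drawings; the function name, argument layout and the parameter values used in the usage note are hypothetical, and the sketch assumes every n(k,m) is positive.

```python
def compute_gains(s, n, K, gamma_nb, gamma_bb, p):
    """Return [g(0,m), ..., g(M-1,m)] for one epoch.

    s, n     -- lists of signal and noise estimates s(k,m), n(k,m)
    K        -- half-width of the broad-band averaging range
    gamma_nb -- narrow-band threshold GAMMA_NB
    gamma_bb -- broad-band threshold GAMMA_BB
    p        -- exponent applied to the lumped deflection
    """
    m_bands = len(s)
    gains = []
    for k in range(m_bands):
        d = s[k] / n[k]  # narrow-band deflection d(k,m), FIG. 6
        if K <= k <= m_bands - K - 1:
            # Broad-band deflection D(k,m) over 2K+1 neighbouring bands, FIG. 7.
            big_d = sum(s[k - K:k + K + 1]) / ((2 * K + 1) * n[k])
            phi = max(d / gamma_nb, big_d / gamma_bb) ** p  # FIG. 8A
        else:
            # Edge bands: D(k,m) is not computed, FIG. 8B.
            phi = (d / gamma_nb) ** p
        gains.append(min(1.0, phi))  # FIG. 9
    return gains
```

For example, `compute_gains([1.0] * 5, [1.0] * 5, 1, 2.0, 1.0, 2)` attenuates only the two edge bands, because the interior bands pass the broad-band test.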
`
FIG. 10
1 COMPLEX TIME SERIES SAMPLE, g(k,m) * c(k,m), FOR EACH OF M = N/2+1 SUBBANDS
g(0,m) * c(0,m), g(1,m) * c(1,m), ..., g(M-1,m) * c(M-1,m) -> IFFT (N) 160 -> LENGTH N VECTOR -> SYNTHESIS WINDOW (N) 170 -> LENGTH N VECTOR -> ACCUMULATE & SHIFT REGISTER (N)
ACCUMULATE & SHIFT REGISTER (N): SHIFT IN L ZEROES; L NEWEST PROCESSED SAMPLES
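
The synthesis stage of FIG. 10 can be sketched as an inverse transform followed by windowed overlap-add. This is an illustration under assumptions: the function names are hypothetical, the inverse DFT is written out naively rather than as an IFFT, and the accumulate-and-shift step is read here as emitting the L samples at the front of the register while shifting L zeroes in at the back.

```python
import cmath
import math


def synthesize_frame(c, g, window):
    """Inverse-transform g(k,m)*c(k,m) for one epoch and apply the window.

    c, g   -- M = N/2 + 1 complex sub-band samples and their real gains
    window -- N synthesis-window coefficients
    """
    m_bands = len(c)
    n = 2 * (m_bands - 1)
    half = [gk * ck for gk, ck in zip(g, c)]
    # Rebuild bins M..N-1 by conjugate symmetry so the output is real.
    full = half + [half[n - k].conjugate() for k in range(m_bands, n)]
    frame = [
        sum(full[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)).real / n
        for i in range(n)
    ]
    return [w * x for w, x in zip(window, frame)]


def accumulate_and_shift(register, frame, hop):
    """Add the frame into the register, emit hop samples, shift in hop zeroes."""
    acc = [a + f for a, f in zip(register, frame)]
    return acc[hop:] + [0.0] * hop, acc[:hop]
```

With unit gains and a rectangular window, `synthesize_frame` returns the windowed analysis frame unchanged, which is a convenient sanity check.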
`
METHOD AND APPARATUS FOR REDUCING NOISE IN SPEECH AND AUDIO SIGNALS

FIELD OF THE INVENTION

This invention relates to the use of digital filtering techniques to improve the audibility or intelligibility of speech or other audio-frequency signals that are corrupted with noise. More particularly, the invention relates to those techniques that seek to reduce stationary, or slowly varying, background noise.

ART BACKGROUND

It is a matter of daily experience for speech (or other audible information) received over a communication channel to be corrupted with background noise. Such noise may arise, e.g., from circuitry within the communication system, or from environmental conditions at the source of the audible signal. Environmental noise may come, for example, from fans, automobile engines, other vibrating machines, or nearby vehicular traffic. Although noise components that occupy narrow, discrete frequency bands are often advantageously removed by filtering, there are many cases in which this does not provide an adequate solution. Instead, the background noise often exhibits a frequency spectrum that overlaps substantially with the spectrum of the desired signal. In such a case, a narrow frequency-rejection filter may not reject enough of the noise, whereas a broad such filter may unacceptably distort the desired signal.

What is needed in such a case is a filter whose frequency characteristics strike an appropriate balance between rejecting frequency components characteristic of unwanted noise, and preserving the esthetic quality or intelligibility of the desired signal. Among the various audible signals of interest, it is fortuitous that speech, at least, is marked by frequent pauses of sufficient length to be captured and analyzed using digital sampling techniques. Consequently, it is possible to apply different filter characteristics depending whether, according to some criterion, the current signal is more probably speech or more probably noise. (Although the desired signal will often be referred to below as speech, it should be noted that this usage is purely for convenience. Those skilled in the art will readily appreciate that the techniques to be described here apply more generally to audible signals of various kinds.)

Recently, a number of investigators have described approaches to this problem using digital filter banks for sub-band filtering. The filter-bank methods used include, e.g., the DFT (Discrete Fourier Transform) filter-bank method and the polyphase filter-bank method. (As is well known in the art, these two methods are essentially the same, but differ in certain details of the computational implementation.) Sub-band filtering in general, and in particular the DFT and polyphase filter-bank methods, are described in detail in R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing, Prentice-Hall, Englewood Cliffs, N.J., 1983, hereinafter referred to as CROCHIERE, particularly at Chapter 7, "Multirate Techniques in Filter Banks and Spectrum Analyzers and Synthesizers," pages 289-400. I hereby incorporate CROCHIERE by reference.

In a broad sense, these and similar approaches can be described in terms of the processing stages depicted in FIG. 1. A digitally sampled input signal is denoted in the figure by x(i). Here, x typically represents the amplitude of an audio-frequency signal, and i is the time variable, referred to in this digitized form as a time index.

The input data are fed into filter-bank analyzer 10. The output of this analyzer consists of a respective sub-band signal c(0,m), c(1,m), c(2,m), ..., c(M-1,m) at each of M respective output ports of the analyzer, M a positive integer. (The time index is shown as changed from i to m because the effective sampling rate may differ between the respective processing stages.)

At short-time spectral modifier 20, each of the sub-band signals is subjected to gain modification according to a respective signal gain function g(k,m), k=0,1,2, ..., M-1, which may differ between respective sub-bands. (In this context, "short-time" refers to a time scale typical of that over which speech utterances evolve. Such a time scale is generally on the order of 20 ms in applications for processing human speech.)

The sub-band signals are recombined at filter-bank synthesizer 30 into modified full-band signal y(i).

One application of methods of this kind to the problem of noise reduction is described in W. Etter and G. S. Moschytz, "Noise Reduction by Noise-Adaptive Spectral Magnitude Expansion," J. Audio Eng. Soc. 42 (May 1994) 341-349. This article discusses a signal gain function (for each respective sub-band) that varies inversely according to a power of the fractional contribution made by an estimated noise level to the total signal (i.e., speech plus noise). At relatively high signal-to-noise ratios, this signal gain function assumes a maximum value of unity. The exponent in the power-function relationship is referred to as an expansion factor. An expansion factor controls the rate at which the gain decays as the signal-to-noise ratio decreases.

Although the article by Etter et al. provides useful insights of a general nature, it does not teach how to estimate the noise level or how to discriminate between incidents of speech and background noise that is free of speech. Thus it does not suggest any practical implementation of the ideas discussed there.

Another application of methods of this kind is described in U.S. Pat. No. 5,550,924, "Reduction of Background Noise for Speech Enhancement," issued Aug. 27, 1996 to B. M. Helf and P. L. Chu. This patent describes two methods for estimating the noise level. Both methods involve detecting sequences of input data that satisfy some criterion that signifies the likely presence of background noise without speech. In one method, the processor observes the frequency spectrum of the input data and detects data sequences for which this spectrum is stationary for a relatively long time interval. In the other method, the input stream is divided into ten-second intervals, and within these intervals, the processor observes the energy content of multiple sub-intervals. Within each interval, the processor takes as representative of speech-free background noise that sub-interval having the least energy.

The method of Helf et al. further involves making a binary decision whether speech is present, based on the ratio of input signal to noise estimate. A confidence level is assigned to each of these decisions. These confidence levels determine, in part, the corresponding values of the signal gain function.

Although useful, the method of Helf et al. involves relatively complex procedures for estimating the noise level, establishing the presence of speech, and establishing values for the signal gain function. Complexity is disadvantageous because it increases demands on computational resources, and often leads to greater product costs.

Moreover, it is significant that human speech includes intervals of narrowband, multicomponent energy, referred to
as "voiced speech," and intervals of broadband energy, referred to as "unvoiced speech." Methods of sub-band processing, such as those described here, tend to be most effective in detecting voiced speech, because speech detection can take place within the specific frequency sub-bands where speech energy is concentrated. However, such methods are generally less sensitive to incidents of unvoiced speech, because the speech energy is distributed over relatively many frequency bands.

Thus, what has been lacking until now is a sub-band method for enhancing speech (or other audible signals) that is computationally relatively simple, and is at least as effective for detecting unvoiced speech (or other incidents of broadband energy) as it is for detecting voiced speech (or other incidents of narrowband, multicomponent energy).

SUMMARY OF THE INVENTION

I have invented an improved sub-band method for enhancing speech or other audible signals in the presence of background noise. My method is computationally relatively simple, and thus can achieve economy in the use of, and demand for, computational resources. In contrast to methods of the prior art, my method includes separate speech-detection stages, one directed primarily to voiced speech or the like, and the other directed primarily to unvoiced speech or the like.

In a broad aspect, my invention involves a method for enhancing, within a signal bandwidth, a corrupted audio-frequency signal having a signal component and a noise component. In accordance with this method, the corrupted signal is analyzed into plural sub-band signals, each occupying a frequency sub-band smaller than the signal bandwidth. A respective signal gain function is applied to the sub-band signal corresponding to each sub-band, thereby to yield respective gain-modified signals. The gain-modified signals are synthesized into an enhanced signal of the signal bandwidth.

Within each frequency sub-band, the step of applying the signal gain function to the sub-band signal includes: evaluating a function that is preferentially sensitive to energy in the signal component; and applying, to the sub-band signal,

FIG. 2 is a high-level, schematic diagram showing signal flow through various processing stages of the invention in an exemplary embodiment.
FIG. 3 is a more detailed, schematic representation of the sub-band analysis stage of FIG. 2.
FIG. 4 is a more detailed, schematic representation of the signal-estimation stage of FIG. 2.
FIG. 5 is a more detailed, schematic representation of the noise-estimation stage of FIG. 2.
FIG. 6 is a more detailed, schematic representation of the narrowband deflection stage of FIG. 2.
FIG. 7 is a more detailed, schematic representation of the broadband deflection stage of FIG. 2.
FIGS. 8A and 8B provide a more detailed, schematic representation of the lumped deflection stage of FIG. 2.
FIG. 9 is a more detailed, schematic representation of the gain computation stage of FIG. 2.
FIG. 10 is a more detailed, schematic representation of the sub-band synthesis stage of FIG. 2.
