ISCA Archive
http://www.isca-speech.org/archive
`
`3rd European Conference on
`Speech Communication and Technology
`EUROSPEECH'93
`Berlin, Germany, September 19-23, 1993
`
`AN EFFICIENT ALGORITHM TO ESTIMATE THE
`INSTANTANEOUS SNR OF SPEECH SIGNALS
`
`Rainer Martin
`
`Institute for Communication Systems and Data Processing (IND), Aachen University of Technology,
`Templergraben 55, 52056 Aachen, Germany, Phone: +49 241 806984, Fax: +49 241 806985
`
`ABSTRACT
This contribution presents an efficient algorithm to estimate the instantaneous signal-to-noise ratio of speech signals. The algorithm is capable of tracking non stationary noise signals and has a low computational complexity. It needs neither a speech activity detector nor histograms to learn signal statistics. The algorithm is based on the observation that a noise power estimate can be obtained using minimum values of a smoothed power estimate. This paper presents the algorithm, its performance, its limits, and some applications.
Keywords: SNR, time delay estimation, speech enhancement
`
`1. INTRODUCTION
Instantaneous SNR estimation is an essential component of speech processing algorithms which are sensitive to varying noise levels. An instantaneous SNR estimate is based on short time power estimates with time constants of integration in the range of 0.02 - 0.1 s. Typical applications are time delay estimation and speech enhancement (e.g. spectral subtraction).
To acquire noise statistics, the conventional approach to SNR estimation employs a voice activity detector to extract the noise-only segments of the disturbed speech signal. The identification of noise segments might be based on the signal power, on a statistical evaluation by means of histograms, or on combinations thereof [1]. In all cases the update of the noise power estimate requires a signal segment where no speech is present. Depending on the method, tracking of varying noise levels might be slow and confined to periods of no speech activity.
The proposed algorithm, however, does not need an explicit speech/no-speech decision to gather noise statistics and is capable of tracking varying noise levels during speech activity. The algorithm is based on the observation that the smoothed power estimate of a noisy speech signal exhibits distinct peaks and valleys (see Figure 1). While the peaks correspond to speech activity, the valleys of the smoothed power estimate can be used to obtain a noise power estimate. To estimate the noise floor our algorithm takes the minimum of a smoothed power estimate within a window of finite length. The SNR estimates obtained by this method are fairly accurate.
`
In sections 2 and 3 we present the algorithm and discuss some of its statistical properties. Section 4 presents experimental results. We conclude in section 5 with two applications.
`
`2. DESCRIPTION OF ALGORITHM
In what follows we assume that the bandlimited and sampled disturbed signal $x(i)$ is a sum of a speech signal $s(i)$ and a noise signal $n(i)$, $x(i) = s(i) + n(i)$, where $i$ denotes the time index. We further assume that $s(i)$ and $n(i)$ are statistically independent, hence $E\{x^2(i)\} = E\{s^2(i)\} + E\{n^2(i)\}$.
$SNR_x(i)$ will denote the estimated signal-to-noise ratio of signal $x(i)$ at time $i$. The algorithm works on a sample basis, i.e. a new output sample $SNR_x(i)$ is computed for each input sample $x(i)$.
`
[Figure 1 plot: estimated short time power and noise floor, vertical axis scaled by 10^7]

Figure 1: Smoothed power and estimated noise floor of noisy speech signal ($f_s$ = 8 kHz, segmental SNR ca. 5 dB, car noise)
`
The computation of $SNR_x(i)$ is based on a noise power estimate $P_n(i)$ which is obtained as the minimum of the smoothed short time power estimate $\bar{P}_x(i)$ within a window of $L$ samples.
`
`Petitioner Apple Inc.
`Ex. 1006, p. 1093
`
`

`
Besides initialization, the algorithm can be split into three major parts, which are discussed below (see Figure 2):

1. Computation of a smoothed short time power estimate $\bar{P}_x(i)$ of signal $x(i)$
2. Computation of the noise power estimate $P_n(i)$
3. Computation of $SNR_x(i)$
`
[Figure 2 flowchart; its final block computes $SNR_x(i) = \frac{P_x(i) - \min(ofactor \cdot P_n(i),\, P_x(i))}{ofactor \cdot P_n(i)}$]

Figure 2: Flowchart of the SNR estimation algorithm
`
`Computation of a smoothed power estimate
Computation of the short time signal power $P_x(i)$ and smoothing of the power estimate are done in two steps. The power estimate may be obtained recursively or non-recursively. Here we use a sliding rectangular window of length $N$ with $N = 128$. In many applications, however, a power estimate is already available.
Let $\bar{P}_x(i)$ denote the smoothed short time power estimate at time $i$. Smoothing of the power estimate is done by means of a first order recursive system. The smoothing constant $\alpha$ is typically set to values between 0.95 and 0.98. The recursion for $i > N$ is given by equation 1:

$$P_x(i) = P_x(i-1) + x(i) \cdot x(i) - x(i-N) \cdot x(i-N)$$
$$\bar{P}_x(i) = \alpha \cdot \bar{P}_x(i-1) + (1-\alpha) \cdot P_x(i) \qquad (1)$$
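The recursion of equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `smoothed_power`, the default `alpha`, and the choice to start both the window sum and the smoothed value at zero are assumptions of this sketch, since the paper does not describe the initialization.

```python
def smoothed_power(x, n=128, alpha=0.95):
    """Return the smoothed short time power for every sample of x.

    Follows equation 1: a sliding rectangular window of n samples gives
    the short time power, and a first order recursion with constant
    alpha smooths it. Starting both quantities at zero is an assumption.
    """
    window_power = 0.0   # sum of squares over the last n samples
    smoothed = 0.0       # smoothed power, assumed to start at zero
    result = []
    for i, sample in enumerate(x):
        window_power += sample * sample
        if i >= n:
            # Drop the sample that just left the sliding window.
            window_power -= x[i - n] * x[i - n]
        smoothed = alpha * smoothed + (1.0 - alpha) * window_power
        result.append(smoothed)
    return result
```

For a constant input of amplitude 1 and a window of four samples, the output settles at 4, the sum of squares inside the window. Note that the power is a sum over the window, as in equation 1, and is not divided by the window length.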
`
`Noise power estimation
The noise power estimate is based on the minimum of signal power within a window of $L$ samples. For reasons of computational complexity and delay the data window of length $L$ is decomposed into $W$ windows of length $M$ such that $M \cdot W = L$. For a sampling rate of $f_s$ = 8 kHz typical window parameters are $M = 1250$ and $W = 4$, thus $L = 5000$, corresponding to a time window of 0.625 s.
The minimum power of the last $M$ samples is found by a samplewise comparison of the current minimum $P_{Mmin}(i)$ and the smoothed power $\bar{P}_x(i)$. Whenever $M$ samples have been read, i.e. $i = r \cdot M$, we store the minimum power of the last $M$ samples and reset $P_{Mmin}$ to its maximum value: $P_{Mmin}(i = r \cdot M+) = P_{max}$.
To determine the noise power estimate we distinguish two cases:

1. slowly varying noise power,
2. rapidly varying noise power.
`
If the minimum power of the last $W$ windows with $M$ samples each is monotonically increasing we decide on rapid noise power variation. In this case the noise power estimate equals the power minimum of the last $M$ samples: $P_n(i) = P_{Mmin}(i = r \cdot M)$.
In case of non monotonic power the noise power estimate is set to the minimum of the length $L$ window, i.e. $P_n(i) = P_{Lmin}(i)$. The minimum power of the length $L$ window is easily obtained as the minimum of the last $W$ minimum power estimates:

$$P_{Lmin}(i) = \min\bigl(P_{Mmin}(i = r \cdot M),\; P_{Mmin}(i = (r-1) \cdot M),\; \ldots,\; P_{Mmin}(i = (r-W+1) \cdot M)\bigr) \qquad (2)$$

If the current smoothed power is smaller than the estimated noise power $P_n(i)$ the noise power is updated immediately, independent of window adjustment: $P_n(i) = \min(\bar{P}_x(i), P_n(i))$.
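The subwindow bookkeeping described above can be sketched as follows. This is an illustrative reading of the text rather than the author's code: the name `track_noise_floor` is hypothetical, "monotonically increasing" is interpreted as strictly increasing over a full history of W subwindow minima, and the maximum value used for the reset is taken to be infinity. All three choices are assumptions.

```python
from collections import deque


def track_noise_floor(smoothed_power, m=1250, w=4):
    """Return a noise power estimate for every smoothed power sample.

    m is the subwindow length M and w the number of subwindows W.
    """
    p_max = float("inf")          # assumed reset value for the running minimum
    sub_minima = deque(maxlen=w)  # minima of the last W completed subwindows
    p_mmin = p_max                # running minimum of the current subwindow
    p_n = p_max                   # noise estimate, undefined until data arrives
    estimates = []
    for i, p in enumerate(smoothed_power, start=1):
        p_mmin = min(p_mmin, p)
        if i % m == 0:
            # A subwindow of M samples is complete: store its minimum.
            sub_minima.append(p_mmin)
            history = list(sub_minima)
            rising = len(history) == w and all(
                a < b for a, b in zip(history, history[1:])
            )
            # Rapid variation: trust the newest subwindow only.
            # Otherwise: minimum over the whole length L window.
            p_n = p_mmin if rising else min(history)
            p_mmin = p_max
        # Immediate update whenever the smoothed power drops below the estimate.
        p_n = min(p_n, p)
        estimates.append(p_n)
    return estimates
```

With a power step from 1 to 5 and `m=5`, `w=2`, the estimate follows the increase once the first subwindow lying entirely after the step is complete, whereas a step downward is followed on the very next sample.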
`Computation of SNR
The estimated SNR is computed on the basis of the estimated minimum noise power $P_n(i)$. A factor ofactor accounts for the fact that the minimum power estimate is smaller than the true noise power. ofactor is typically set to values between 1.3 and 2 (see section 3):

$$SNR_x(i) = 10 \log_{10} \frac{P_x(i) - \min(ofactor \cdot P_n(i),\, P_x(i))}{ofactor \cdot P_n(i)} \qquad (3)$$

Figure 1 plots the smoothed power estimate and the estimated noise floor for a noisy speech sample. The window length $L = M \cdot W$ must be large enough to bridge any peak of speech activity, but short enough to follow non stationary noise variations. Experiments with different speakers, different languages, and modulated noise signals have shown that a window length of 0.625 s is a good value.
In case of slowly varying noise power the update of noise estimates is delayed by $L + M$ samples. If a rapid noise power increase is detected this delay is reduced to $M$ samples, thus improving the noise tracking capability of the algorithm.
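The SNR computation of this subsection amounts to one logarithm per sample. The sketch below is a hedged illustration: the name `snr_db` is hypothetical, the noise power is assumed to be strictly positive, and returning negative infinity when the power does not exceed the scaled noise floor is a choice of this sketch that the paper does not spell out.

```python
import math


def snr_db(p_x, p_n, ofactor=1.5):
    """Return the estimated SNR in dB for one sample.

    p_x is the power estimate of the noisy signal, p_n the minimum based
    noise power estimate (assumed > 0), ofactor the overestimation factor.
    """
    scaled_noise = ofactor * p_n
    # The min() keeps the numerator from becoming negative.
    signal_part = p_x - min(scaled_noise, p_x)
    if signal_part <= 0.0:
        # No power above the scaled noise floor: the logarithm diverges.
        return float("-inf")
    return 10.0 * math.log10(signal_part / scaled_noise)
```

As a worked case, a power of 16.5 over a noise estimate of 1.0 with ofactor = 1.5 leaves 15.0 above the scaled floor of 1.5, a ratio of 10, hence 10 dB.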
`
`
`

`
`3. STATISTICS OF MINIMUM ESTIMATES
In this section we compute the density function of the minimum noise power estimate and justify our choice of the overestimation factor ofactor. To facilitate the analytical evaluation of minimum estimates we assume that the noise process $n$ is zero mean white Gaussian noise with variance $\sigma^2$ and that the computation of the smoothed power estimate is entirely done by means of non recursive accumulation, i.e.:

$$P_x(i) = \sum_{m=0}^{N-1} x^2(i-m) \qquad (4)$$

Then, the power estimate $P_x(i)$ is chi-square distributed [2] with mean $N \cdot \sigma^2$ and density:

$$f_{P_x}(y) = \frac{1}{(\sigma\sqrt{2})^N \, \Gamma(N/2)} \cdot y^{N/2-1} \cdot e^{-y/2\sigma^2} \cdot U(y) \qquad (5)$$

where $\Gamma()$ and $U()$ denote the Gamma function and the unit step function, respectively.
The density of the minimum of $L_w$ independent power estimates is given by [2]:

$$f_{min}(y) = L_w \cdot \bigl(1 - F_{P_x}(y)\bigr)^{L_w-1} \cdot f_{P_x}(y) \qquad (6)$$

where $F_{P_x}(y)$ denotes the distribution function of the chi-square density:

$$F_{P_x}(y) = \left[1 - e^{-y/2\sigma^2} \cdot \sum_{m=0}^{N/2-1} \frac{1}{m!} \left(\frac{y}{2\sigma^2}\right)^m \right] \cdot U(y) \qquad (7)$$

Clearly, successive values of $P_x(i)$ are correlated, but if we shift the sliding window of equ. 4 by $\Delta i > N/2$ we obtain sufficiently uncorrelated power estimates.
Figure 3 plots the density functions $f_{P_x}(y)$ and $f_{min}(y)$ and corresponding histograms of $P_x(i)$ and $P_n(i)$ for a car noise signal.

[Figure 3 plots: densities (left graph) and histograms (right graph), vertical axes from 0 to 0.2]

Figure 3: Density functions $f_{P_x}(y)$ (dotted) and $f_{min}(y)$ (solid) for $\sigma^2 = 0.09$, $N = 80$, and $L_w = 20$ (left graph) and corresponding histograms of $P_x(i)$ (dotted) and $P_n(i)$ (solid) for car noise signals (right graph)

We now choose the overestimation factor ofactor such that the noise power estimate is approximately unbiased, i.e. $E\{P_n\} \cdot ofactor \approx E\{P_x\}$. Since $f_{P_x}(y)$ and $f_{min}(y)$ are scaled by the noise variance $\sigma^2$, ofactor does not depend on $\sigma^2$. Figure 4 shows the dependency of ofactor on $N$ and $L_w$ and allows the selection of an appropriate overestimation factor.

[Figure 4 plot: ofactor as a function of $N$ and $L_w$]

Figure 4: Overestimation factor ofactor versus $N$ and $L_w$
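Under the white Gaussian assumptions of this section, the overestimation factor can be evaluated numerically. The sketch below is one possible way to do so and is not taken from the paper: it uses the standard identity that the mean of a non-negative random variable equals the integral of its survival function, so that the mean of the minimum is the integral of $(1 - F_{P_x}(y))^{L_w}$, with $1 - F_{P_x}(y)$ taken from equation 7. The function names, the trapezoidal rule, the step count, and the upper integration limit are all assumptions.

```python
import math


def survival(y, n, sigma2):
    """Return 1 - F(y) of equation 7 for even n."""
    if y <= 0.0:
        return 1.0
    t = y / (2.0 * sigma2)
    term = 1.0    # t**m / m! for m = 0
    total = 1.0
    for m in range(1, n // 2):
        term *= t / m
        total += term
    return math.exp(-t) * total


def overestimation_factor(n, l_w, sigma2=1.0, steps=4000):
    """Return E{P_x} / E{min of l_w independent power estimates}."""
    # Heuristic upper limit well beyond the bulk of the chi-square density.
    upper = (n + 20.0 * math.sqrt(2.0 * n) + 40.0) * sigma2
    h = upper / steps
    # Trapezoidal integration of survival(y) ** l_w over [0, upper].
    area = 0.5 * (1.0 + survival(upper, n, sigma2) ** l_w)
    for k in range(1, steps):
        area += survival(k * h, n, sigma2) ** l_w
    expected_minimum = area * h
    return n * sigma2 / expected_minimum
```

Calling `overestimation_factor(80, 20)` evaluates the setting of Figure 3. With `l_w=1` the minimum is the estimate itself and the factor is 1, and the result is unchanged when `sigma2` is varied, in line with the statement that ofactor does not depend on the noise variance.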
`
`4. EXPERIMENTAL RESULTS
`Figure 5 plots the true and the estimated instantaneous
`SNR of the same noisy speech signal as in Figure 1. The
`true SNR was computed on the basis of separate speech and
`noise signals. Our SNR estimate shows good agreement with
`the true SNR during speech activity. In agreement with the
`statistical evaluation the estimate is biased when no speech
`is present.
`
[Figure 5 plots: true and estimated SNR versus samples]

Figure 5: True and estimated instantaneous SNR of noisy speech signal (ofactor = 1.5)
To test the algorithm with non stationary noise, the noise signal was modulated with a sine function and then added to a speech signal: $x(i) = s(i) + n(i) \cdot (1.5 + \sin(2\pi \cdot f_m \cdot i / f_s))$. The modulation frequency was set to $f_m$ = 0.33 Hz.
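A test signal of this kind can be generated as sketched below. The modulation formula is partly garbled in this copy of the paper, so the sketch assumes the argument of the sine is $2\pi f_m i / f_s$ with the 8 kHz sampling rate used elsewhere in the paper. The name `modulate_noise` is hypothetical.

```python
import math


def modulate_noise(speech, noise, f_m=0.33, f_s=8000.0):
    """Return speech plus sine modulated noise, sample by sample.

    The noise gain 1.5 + sin(2*pi*f_m*i/f_s) swings between 0.5 and 2.5.
    The exact form of the sine argument is an assumption of this sketch.
    """
    mixed = []
    for i, (s, n) in enumerate(zip(speech, noise)):
        gain = 1.5 + math.sin(2.0 * math.pi * f_m * i / f_s)
        mixed.append(s + n * gain)
    return mixed
```

At 0.33 Hz the gain completes one cycle in about three seconds, which is several times the 0.625 s window, so the minimum tracker has to follow a noise floor that changes within a few window lengths.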
`
`

`
`Figure 6 plots the corresponding short time power and
`the estimated noise floor. Note the delay of the noise power
`values in case of increasing noise power. Figure 7 shows
`the true and estimated SNR. Due to the window length of
`0.625 s rapid noise variations might result in erroneous SNR
`estimates.
`
[Figure 6 plot: estimated short time power and noise floor (modulated noise) versus samples, vertical axis scaled by 10^7]

Figure 6: Short time power of modulated noisy speech signal and noise estimate for $f_m$ = 0.33 Hz

[Figure 7 plot: true and estimated SNR versus samples]

Figure 7: True and estimated SNR of modulated noisy speech signal for $f_m$ = 0.33 Hz

5. APPLICATIONS
The algorithm was tested with varying noise levels and successfully incorporated in several speech processing systems. In what follows we briefly discuss two applications, namely time delay estimation and spectral subtraction.

TIME DELAY ESTIMATION
Time delayed speech signals originate e.g. from microphone arrays where the speaker is in a non symmetric position relative to the array and possibly moving. In-phase summation or adaptive processing of these microphone signals usually requires a time delay compensation.
The SNR estimator was implemented to support time delay estimation by means of (generalized) correlation. To determine the delay between microphone signals we compute the maximum of a smoothed cross correlation estimate. Whenever the SNR is below a preset threshold the update of smoothed correlation functions is frozen. Figure 8 plots the delay estimate without and with SNR estimation. The enhanced algorithm clearly eliminates all large deviations of the time delay estimate.

[Figure 8 plots: delay estimate versus samples, upper and lower graph]

Figure 8: Time delay of microphone channel 1 with respect to channel 2 of a noisy speech sample with moving speaker without (upper graph) and with (lower graph) SNR estimation.
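The freezing rule for the smoothed correlation can be illustrated as follows. The paper gives neither the smoothing method nor the threshold, so the first order recursion, the constant `beta`, the default `threshold_db`, and both function names are assumptions made only for this sketch.

```python
def update_smoothed_correlation(smoothed, instantaneous, snr_db,
                                threshold_db=0.0, beta=0.9):
    """Return the next smoothed cross correlation, one value per lag.

    The update is skipped (frozen) when the SNR estimate is below the
    threshold, so noise dominated frames cannot move the maximum.
    """
    if snr_db < threshold_db:
        return list(smoothed)
    return [beta * c + (1.0 - beta) * r
            for c, r in zip(smoothed, instantaneous)]


def estimate_delay(smoothed, max_lag):
    """Return the lag in samples at which the smoothed correlation peaks.

    smoothed[k] is assumed to hold the correlation at lag k - max_lag.
    """
    best_index = max(range(len(smoothed)), key=lambda k: smoothed[k])
    return best_index - max_lag
```

In use, each new frame would supply an instantaneous cross correlation between the two channels together with the current SNR estimate, and the delay would be read from the smoothed function after the gated update.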
`SPECTRAL SUBTRACTION
`To reduce the noise level within a disturbed speech
`signal the spectral subtraction method modifies the short time
`spectral magnitude of the disturbed speech signal. In our
`experiments we used a filter bank with 256 channels and
`estimated the minimum power in each of these channels.
Our informal listening tests reveal relatively few annoying musical tones. However, because we subtract slightly biased noise power estimates (ofactor = 1.5), the noise suppression is limited. Power spectra of the disturbed and of the improved signal show an improvement of about 10 dB.
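How a per-channel minimum estimate might enter spectral subtraction can be sketched as below. The paper does not state its subtraction rule, so this sketch assumes the common power subtraction with the result floored at zero. The name `subtraction_gains` and the zero floor are assumptions and not the method used in the experiments.

```python
import math


def subtraction_gains(channel_power, noise_power, ofactor=1.5):
    """Return one magnitude gain per filter bank channel.

    Each gain is sqrt(max(1 - ofactor * P_n / P_x, 0)), i.e. the scaled
    noise power estimate is subtracted from the channel power and the
    result is floored at zero (an assumption of this sketch).
    """
    gains = []
    for p_x, p_n in zip(channel_power, noise_power):
        if p_x <= 0.0:
            gains.append(0.0)   # nothing to scale in an empty channel
            continue
        gains.append(math.sqrt(max(1.0 - ofactor * p_n / p_x, 0.0)))
    return gains
```

Multiplying the short time spectral magnitude of each channel by its gain would then yield the modified magnitude. In the setup described above, the two lists would have 256 entries each.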
`
`6. CONCLUSION
Varying noise levels have a significant impact on the performance of many speech processing algorithms. The algorithm proposed in this paper provides a computationally inexpensive and effective means to cope with this problem. The algorithm is accurate for medium to high SNR conditions but necessarily biased when no speech is present. A priori knowledge of noise variation and noise correlation is helpful to adapt window length and to control the estimation bias.
`
`ACKNOWLEDGMENTS
Part of this work was supported by Philips Kommunikations Industrie, Germany. Spectral subtraction using minimum power estimates was investigated by Peter Kocybik.
`
`References
`
[1] R. McAulay and M. Malpass: "Speech Enhancement Using a Soft-Decision Noise Suppression Filter", IEEE Trans. ASSP, Vol. 28, No. 2, pp. 137-145, April 1980.
[2] A. Papoulis: "Probability, Random Variables, and Stochastic Processes", 2nd ed., McGraw-Hill, 1984.
`