An Adaptive Microphone Array for Hands–Free Communication
`
`Klaus Uwe Simmer
`Sven Fischer
` University of Bremen
`Department of Physics and Electrical Engineering
`P.O. Box 330 440, D–28334 Bremen, Germany
`e-mail: fischer@comm.uni-bremen.de
` Houpert Digital Audio
`Wiener Str. 5, D–28359 Bremen, Germany, Fax: +49 421 705675
`
2. CONSTRAINED ADAPTIVE BEAMFORMING WITH ADAPTIVE LOOK-DIRECTION RESPONSE
The task of constrained minimum-variance beamforming is to minimize the total output power of the array subject to the constraint of maintaining an a priori specified impulse response in the look direction [4]. In beamforming for speech applications, however, the performance of the array can be improved by allowing the look-direction impulse response to vary with time. As the adaptive look-direction response we use the impulse response of a non-causal Wiener filter. The implementation is straightforward if we choose a generalized sidelobe cancelling structure as shown in figure 1. The constraints are included in the signal blocking matrix, so an unconstrained update algorithm can be used [5]. In our application we use the short-time Fourier transform and the overlap-add method to estimate the transfer functions. The transfer functions $H_i$ of the sidelobe cancelling part (lower signal track in figure 1) are given by [8]:

$$H_i(z) = \frac{\Phi_{\psi_i y_w}(z)}{\Phi_{\psi_i \psi_i}(z)}, \qquad i = 0, \ldots, L-1 \tag{1}$$

where the number $L$ depends on the blocking matrix structure ($L$ can be $M-1$ or less, where $M$ is the number of microphones). The cross power density spectrum $\Phi_{\psi_i y_w}$ and the auto power density spectrum $\Phi_{\psi_i \psi_i}$ of equation (1) are estimated using the recursive update formulas:
`ABSTRACT
In this paper we present an adaptive microphone array that suppresses coherent as well as incoherent noise in disturbed speech signals. We use a generalized sidelobe cancelling (GSC) structure implemented in the frequency domain, because it allows the adaptive look-direction response, which suppresses incoherent noise, to be determined separately from the adaptive filters that cancel coherent noise. The transfer function in the look direction is an adaptive Wiener filter, which is estimated using the short-time Fourier transform and the Nuttall/Carter method for spectrum estimation.
`
1. INTRODUCTION
Various methods for noise reduction and speech enhancement with the aid of microphone arrays have previously been described in the literature. The different approaches can be classified into three main categories:
• conventional beamforming [1], [2], [3],
• adaptive beamforming [4], [5],
• microphone arrays with adaptive postfiltering [6], [7].
`The performance of these array techniques for noise
`reduction depends on the acoustical environment in
`which they have to operate. For example, adaptive
`beamforming works well if the number of point noise
`sources is smaller than the number of sensors. How-
`ever, in closed environments, noise is influenced by
`multipath propagation and reverberation which yields
`a multi–source noise field. In such noise fields the third
`method yields a much better noise reduction perfor-
`mance, but theoretically requires a completely inco-
`herent noise field.
`Realistic noise fields are neither perfectly diffuse
`nor do they consist of direct–path noise only. The re-
`flection coefficients of the walls as well as the distance
`between the noise sources and the array determine the
`ratio of coherent and incoherent noise components re-
`ceived by a microphone array. Therefore, a practical
`system for noise reduction must operate independently
`of the correlation properties of the noise field.
`The method presented here is able to suppress co-
`herent (i.e. direct path) noise and incoherent (i.e. dif-
`fuse) noise and can be conceived as a unification of
`the above mentioned three array techniques for noise
`reduction.
`
$$\hat{\Phi}^{\,l}_{\psi_i y_w}(k) = \alpha\, \hat{\Phi}^{\,l-1}_{\psi_i y_w}(k) + \Psi_i^{*}(l,k)\, Y_w(l,k) \tag{2}$$

$$\hat{\Phi}^{\,l}_{\psi_i \psi_i}(k) = \alpha\, \hat{\Phi}^{\,l-1}_{\psi_i \psi_i}(k) + |\Psi_i(l,k)|^2 \tag{3}$$

where $k$ is the frequency index, $l$ is the time segment index, $\Psi_i(l,k)$ is the short-time spectrum at the output of the signal blocking unit and $Y_w(l,k)$ is the postfiltered output spectrum of the conventional beamformer (see also figure 1). In equations (2) and (3), $\alpha$ is a number close to one that sets the averaging time. The GSC approach to noise reduction is closely related to the adaptive noise cancelling proposed by Widrow et al. [8]. The noise reduction that can be achieved by this type of processor is completely specified by the spatial coherence of the noise field. A reasonable noise reduction can be achieved if the noise signals at adjacent microphones are highly correlated [9]. Therefore, this part of our noise reduction system is able to suppress
`Proc. IWAENC-95, Røros, Norway, June 1995
`
adaptation of all the transfer functions takes place simultaneously.
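The look-direction estimate of equations (5) to (7) can be sketched for a single frequency bin as follows. This is a minimal illustration, not the authors' implementation: the function name `update_postfilter`, the default `alpha`, the guard `eps`, the use of the channel mean as the conventional beamformer output, and taking the real part of the cross term are all assumptions of the sketch. The input spectra are assumed to be already time-aligned to the look direction.

```python
def update_postfilter(phi_ss, phi_xx, spectra, alpha=0.9, eps=1e-12):
    """One recursive update of the look-direction filter for one frequency bin.

    spectra: list of M complex short-time spectra X_i(l, k), one per microphone.
    Returns the updated (phi_ss, phi_xx) and the bounded filter value w.
    """
    m = len(spectra)
    # Sum the cross terms over all microphone pairs i < j, as in eq. (6).
    cross = 0j
    for i in range(m - 1):
        for j in range(i + 1, m):
            cross += spectra[i].conjugate() * spectra[j]
    # Assumption of this sketch: keep only the real part of the cross term.
    phi_ss = alpha * phi_ss + (2.0 / (m * (m - 1))) * cross.real
    # Assumption: the conventional beamformer is the mean of the channels.
    x_bf = sum(spectra) / m
    phi_xx = alpha * phi_xx + abs(x_bf) ** 2          # eq. (7)
    w = phi_ss / max(phi_xx, eps)                     # eq. (5)
    w = min(1.0, max(0.0, w))                         # keep w between 0 and 1
    return phi_ss, phi_xx, w


# Identical spectra in all channels (fully coherent input): w stays at one.
_, _, w_coherent = update_postfilter(0.0, 0.0, [1 + 1j] * 4)
# Mutually cancelling spectra: the cross term is negative, w is clipped to 0.
_, _, w_incoherent = update_postfilter(0.0, 0.0, [1, -1, 1j, -1j])
```

In a full system the update would run for every bin `k` of every segment `l`, carrying `phi_ss` and `phi_xx` from one segment to the next.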
2.2. Improvement of the Transfer Function Estimate $\hat{W}$
The identity in equation (4) holds only in a statistical sense. In practice, only estimates $\hat{\Phi}_{x_i x_j}$ of the spatial cross power densities are available, and these would satisfy equation (4) only with infinitely long periodogram time-averaging. Because speech signals are nonstationary, only short time intervals are available for spectrum estimation. Therefore, the transfer function $\hat{W}$ is only a rough estimate of the true Wiener filter.
To improve the estimate $\hat{W}$ we use the combined time and lag weighting technique for periodogram smoothing introduced by Nuttall and Carter [11], adapted to our application. The starting point is equations (6) and (7), which describe a short-time Weighted Overlapped Segment Averaging (WOSA) method (excluding constant factors). These estimates are subjected to an inverse Fourier transform to yield the correlation function estimates $\hat{R}_{ss}$ and $\hat{R}_{xx}$, respectively. In the next step, the correlation estimates are multiplied by a symmetric real lag weighting function $w_{lag}$, which takes into account the windowing of the input data prior to the computation of the FFTs and is calculated according to the following expression [11]:

$$w_{lag}(n) = \frac{w_d(n)\, R_{ww}(0)}{R_{ww}(n)} \tag{8}$$

In equation (8), $w_d$ is the desired lag window (in our case a Hanning window of one fourth the FFT length, to perform the desired smoothing), $R_{ww}$ is the autocorrelation function of the data window, and $w_{lag}$ is the reshaped lag window. In a final step the weighted correlation estimates are transformed back into the frequency domain to yield the desired power density spectra used in determining $\hat{W}$ according to equation (5).
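The reshaped lag window of equation (8) can be computed directly from the data window. The sketch below is illustrative only: it assumes a periodic Hann data window whose length equals the FFT length (64 here), and a desired Hanning lag window spanning one fourth of that length, so lags 0 to 8 are covered on each side. All function names are hypothetical.

```python
import math


def hann_periodic(length):
    """Periodic Hann data window of the given length (assumed data window)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / length) for t in range(length)]


def autocorrelation(window, max_lag):
    """R_ww(n) for lags n = 0 .. max_lag."""
    return [sum(window[t] * window[t + n] for t in range(len(window) - n))
            for n in range(max_lag + 1)]


def desired_lag_window(half_width):
    """w_d(n) for n = 0 .. half_width: one side of a symmetric Hanning window."""
    return [0.5 + 0.5 * math.cos(math.pi * n / half_width)
            for n in range(half_width + 1)]


def reshaped_lag_window(data_window, half_width):
    """w_lag(n) = w_d(n) * R_ww(0) / R_ww(n), as in eq. (8)."""
    w_d = desired_lag_window(half_width)
    r_ww = autocorrelation(data_window, half_width)
    return [wd * r_ww[0] / r for wd, r in zip(w_d, r_ww)]


fft_length = 64
half_width = fft_length // 8          # total lag window length = fft_length / 4
w_lag = reshaped_lag_window(hann_periodic(fft_length), half_width)
```

Because $R_{ww}(n) \le R_{ww}(0)$, each value of `w_lag` is at least as large as the corresponding value of the desired window. The division undoes the taper that the data window has already imposed on the correlation estimate.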
It should be noted that the lag multiplication and Fourier transform can be replaced with a frequency-domain convolution. But as stated in [11], the unusual lag weighting is important for achieving good mean behaviour and is unlikely to be apparent when considering a frequency-domain convolution approach.
Finally, because a conventional beamformer perfectly cancels single frequencies in several directions, $\hat{W}$ is bounded between values of zero and one to avoid poles in the transfer function estimate.
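The remark about cancelled frequencies can be illustrated with the response of a delay-and-sum beamformer. The sketch assumes a far-field plane wave, a uniform linear array steered to broadside, equal weights, and a sound speed of 343 m/s. None of these modelling choices are taken from the paper, and the function name is hypothetical. At a null the beamformer output power goes to zero, so an unbounded ratio such as equation (5) would grow without limit.

```python
import cmath
import math


def delay_and_sum_response(freq_hz, angle_rad, num_mics=7, spacing_m=0.05, c=343.0):
    """Magnitude response of a broadside delay-and-sum beamformer.

    angle_rad is the arrival angle of a plane wave measured from broadside.
    """
    phase = 2.0 * math.pi * freq_hz * spacing_m * math.sin(angle_rad) / c
    total = sum(cmath.exp(-1j * phase * n) for n in range(num_mics))
    return abs(total) / num_mics


freq = 2000.0
# First null: the inter-element phase step equals 2*pi / num_mics.
null_angle = math.asin(343.0 / (7 * freq * 0.05))
gain_look = delay_and_sum_response(freq, 0.0)
gain_null = delay_and_sum_response(freq, null_angle)
```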
`3. SPATIOTEMPORAL BLOCKING MATRIX
`In the Generalized Sidelobe Canceller the constraints
`are included in the signal blocking matrix. In its sim-
`plest form, the signal blocking is realized by taking the
`difference between adjacent sensor signals to yield (in
`the ideal case) noise only reference signals. The major
`drawback of this approach is that, if the desired speech
`signal is not blocked out completely at the input of the
`
Fig. 1. Block diagram of the noise reduction system. (Recoverable block labels: Window and FFT for each microphone input 0, 1, ..., M-1; Blocking Matrix; output subtraction followed by IFFT and OLA.)
`
`the coherent direct path noise, but is inefficient for in-
`coherent noise.
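The sidelobe cancelling filter of equations (1) to (3) can be sketched for one reference channel and one frequency bin. This is a minimal sketch under stated assumptions: the function name, the default `alpha`, and the guard `eps` against a zero denominator are not from the paper, and the conjugate is placed on the blocking-matrix output so that the ratio is the Wiener solution for cancelling `y_w` from `psi`.

```python
def update_sidelobe_filter(phi_cross, phi_auto, psi, y_w, alpha=0.9, eps=1e-12):
    """One recursive step of eqs. (2) and (3), then the ratio of eq. (1).

    psi: complex spectrum at the blocking-matrix output for this bin.
    y_w: complex postfiltered beamformer spectrum for this bin.
    """
    phi_cross = alpha * phi_cross + psi.conjugate() * y_w   # eq. (2)
    phi_auto = alpha * phi_auto + abs(psi) ** 2             # eq. (3)
    h = phi_cross / max(phi_auto, eps)                      # eq. (1)
    return phi_cross, phi_auto, h


# Toy case: the beamformer output contains exactly 0.5 times the reference,
# i.e. fully coherent noise, so the estimate should settle at 0.5.
phi_cross, phi_auto = 0j, 0.0
for psi in [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 + 0.7j]:
    y_w = 0.5 * psi
    phi_cross, phi_auto, h = update_sidelobe_filter(phi_cross, phi_auto, psi, y_w)
    residual = y_w - h * psi      # output after subtracting the cancelling signal
```

With perfectly coherent noise the residual vanishes. When `psi` and `y_w` are uncorrelated, the cross term averages towards zero and the filter removes nothing, which matches the statement that this path is inefficient for incoherent noise.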
2.1. Transfer Function in Look Direction
The transfer function $W$ in look direction contains the constraint values and is designed to suppress spatially incoherent noise signals only. It is based on the assumption of a spatially white noise field, where in the ideal case of uncorrelated speech and noise the spatial cross power density spectrum $\Phi_{x_i x_j}(z)$ of the received signals equals the auto power density spectrum $\Phi_{ss}(z)$ of the desired speech signal [6]:

$$\Phi_{x_i x_j}(z) = \Phi_{ss}(z) \tag{4}$$

This fact is utilized to estimate the Wiener filter $W$ in look direction. The transfer function $\hat{W}$ is given in our application by [10]:

$$\hat{W}(z) = \frac{\hat{\Phi}_{ss}(z)}{\Phi_{xx}(z)} = \frac{\dfrac{2}{M(M-1)} \displaystyle\sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \Phi_{x_i x_j}(z)}{\Phi_{xx}(z)} \tag{5}$$

Here $\Phi_{xx}(z)$ is the auto power density spectrum of the output signal $x$ of the conventional beamformer. It can be shown that this transfer function $\hat{W}$ is identical to the transfer function of a non-causal Wiener filter in the case of zero spatial correlation of the noise signals [10]. In the case of a completely coherent noise field the transfer function $\hat{W}$ equals one, and the noise reduction is due only to the sidelobe cancelling path of the system shown in figure 1. The power density spectra in the numerator and denominator of equation (5) can be estimated in a manner similar to equations (2) and (3) from the short-time spectra:
`
$$\hat{\Phi}^{\,l}_{ss}(k) = \alpha\, \hat{\Phi}^{\,l-1}_{ss}(k) + \frac{2}{M(M-1)} \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} X_i^{*}(l,k)\, X_j(l,k) \tag{6}$$

$$\hat{\Phi}^{\,l}_{xx}(k) = \alpha\, \hat{\Phi}^{\,l-1}_{xx}(k) + |X(l,k)|^2 \tag{7}$$

The transfer functions $\hat{W}$ and $H_i$ are determined as the data arrives at the input microphones. Thus, the
`
`
`4. EXPERIMENTAL RESULTS
`4.1. Simulation Description
`To test the noise reduction performance of the de-
`scribed system, a computer program has been de-
`veloped which allows easy changing of the acous-
`tical properties of the enclosure. The input signals
`were generated by convolving one channel anechoic
`recordings of speech and noise with the source–to–
`microphone impulse responses. These room impulse
responses were simulated using the image method described by Allen and Berkley [13]. The room dimensions were 3.50 × 7.10 × 2.96 m and the wall reflection coefficients were varied to simulate different
`reverberation times, i.e. different ratios of direct path
`noise and diffuse noise. The desired speaker was posi-
`tioned 50 cm in front of the array and as noise source
`we used a hair–drier positioned 4.3 m away from the
`array center. The input SNR was 3 dB.
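One way to relate a target reverberation time to a wall reflection coefficient for the image method is Sabine's formula. The paper does not state how the coefficients were chosen, so the sketch below is an assumption. It uses a single uniform coefficient for all six walls, the common constant 0.161 s/m, and the relation between absorption and pressure reflection coefficient $\bar{\alpha} = 1 - \beta^2$. The function names are hypothetical.

```python
import math


def room_volume_and_surface(dims_m):
    """Volume (m^3) and total wall surface (m^2) of a rectangular room."""
    lx, ly, lz = dims_m
    return lx * ly * lz, 2.0 * (lx * ly + lx * lz + ly * lz)


def sabine_reflection_coefficient(t60_s, dims_m):
    """Uniform reflection coefficient beta giving reverberation time t60_s."""
    volume, surface = room_volume_and_surface(dims_m)
    absorption = 0.161 * volume / (surface * t60_s)   # mean absorption coefficient
    if not 0.0 < absorption < 1.0:
        raise ValueError("Sabine's formula gives no valid coefficient here")
    return math.sqrt(1.0 - absorption)


def sabine_reverberation_time(beta, dims_m):
    """Inverse mapping: reverberation time for a uniform reflection coefficient."""
    volume, surface = room_volume_and_surface(dims_m)
    return 0.161 * volume / (surface * (1.0 - beta ** 2))


room = (3.50, 7.10, 2.96)
beta_300ms = sabine_reflection_coefficient(0.3, room)
beta_500ms = sabine_reflection_coefficient(0.5, room)
```

For very short reverberation times Sabine's formula yields a mean absorption above one, and the sketch raises an error. For this room that happens at around 0.1 s, where a different formula would be needed.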
`4.2. Choosing the Array Aperture
An array of discrete sensors can be conceived as a sampled continuous aperture. If the sampling period is not chosen appropriately, this sampling introduces spatial aliasing in the form of grating lobes [14], [15]. On the other hand, the estimation of the transfer function $W$ in look direction assumes a spatially white noise field. In practice the noise field can at best be diffuse, with a spatial coherence function given by a sinc function. To obtain spatially uncorrelated noise signals, the continuous aperture is usually undersampled. This works well for purely diffuse noise fields and if the desired speaker is close to the array. Our proposed system for noise reduction is in principle an adaptive beamformer which includes the method proposed in [7] as a special case. An undersampled aperture yields poor system performance in the case of direct path noise.
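The sinc-shaped coherence of a diffuse field shows why sensor spacing matters for the postfilter. The sketch below evaluates the standard diffuse-field coherence $\sin(x)/x$ with $x = 2\pi f d / c$. The sound speed of 343 m/s and the function name are assumptions of the example.

```python
import math


def diffuse_coherence(freq_hz, spacing_m, c=343.0):
    """Spatial coherence of an ideal diffuse noise field between two sensors."""
    x = 2.0 * math.pi * freq_hz * spacing_m / c
    return 1.0 if x == 0.0 else math.sin(x) / x


spacing = 0.05                                    # 5 cm, as used for the array
low_freq = diffuse_coherence(500.0, spacing)      # close to one: correlated noise
first_zero_hz = 343.0 / (2.0 * spacing)           # coherence first reaches zero
at_zero = diffuse_coherence(first_zero_hz, spacing)
wide = diffuse_coherence(500.0, 0.30)             # wider spacing decorrelates
```

At 5 cm spacing the low-frequency noise is almost fully coherent between adjacent sensors, so the assumption of spatially white noise fails there. A wider spacing lowers the coherence at the same frequency, which is the motivation for undersampling mentioned above.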
`We used a seven element linear, equally spaced ar-
`ray with 5 cm inter–element spacing and total aperture
`length of 35 cm to avoid spatial aliasing in the fre-
`quency band below 3400 Hz. Experiments with var-
`ious sensor configurations led us to revert to the linear
`array, which yielded the best performance under the
`constraint of a maximal number of seven sensors.
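The link between the inter-element spacing and the alias-free band follows from the half-wavelength rule $d \le \lambda/2$, so the highest alias-free frequency is $c/(2d)$. The sketch assumes a sound speed of 343 m/s. With a value of 340 m/s the limit for 5 cm spacing is 3400 Hz.

```python
def spatial_aliasing_limit_hz(spacing_m, c=343.0):
    """Highest frequency without grating lobes for a uniform linear array."""
    return c / (2.0 * spacing_m)


limit_5cm = spatial_aliasing_limit_hz(0.05)
limit_10cm = spatial_aliasing_limit_hz(0.10)
```

Doubling the spacing halves the limit, so a 10 cm spacing would already alias inside the telephone band.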
`A better performance is expected by splitting the ar-
`ray in subarrays [3] and performing the noise reduction
`in each subarray with the system shown in figure 1, and
`combining the outputs. However, this will increase the
`cost.
`4.3. Performance Measure
`In speech communications, the ultimate recipient of
`information is the human being. The artefacts gen-
`erated by many speech enhancement techniques de-
`crease the user acceptance for voice communication
`systems. Therefore, for performance evaluation, sub-
`jective listening tests are absolutely necessary. But be-
`cause this is time and cost intensive, we used for per-
`formance evaluation the Log Area Ratio (LAR) Dis-
`tance (L norm without energy weighting) as objective
`measure for speech quality which is found to correlate
`well with the subjective sensation [16].
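A Log Area Ratio distance between a reference frame and a processed frame can be sketched from the reflection coefficients of their linear-prediction models. The sketch leaves the norm order as a parameter `p`, because its value is not legible in the source. The per-coefficient averaging and the function names are assumptions, and the reflection coefficients are taken as given rather than computed from audio.

```python
import math


def log_area_ratios(reflection_coeffs):
    """Log area ratios for reflection coefficients with |k| < 1."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in reflection_coeffs]


def lar_distance(k_reference, k_processed, p=2):
    """L_p distance between the log area ratios of two frames (no weighting)."""
    lar_ref = log_area_ratios(k_reference)
    lar_proc = log_area_ratios(k_processed)
    diffs = [abs(a - b) for a, b in zip(lar_ref, lar_proc)]
    return (sum(d ** p for d in diffs) / len(diffs)) ** (1.0 / p)


clean = [0.9, -0.4, 0.2, -0.1]
noisy = [0.7, -0.2, 0.1, 0.0]
d_same = lar_distance(clean, clean)
d_noisy = lar_distance(clean, noisy)
```

A distance of zero means identical spectral envelopes. Larger values indicate stronger envelope distortion, which is why a low LAR distance is read as high speech quality.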
`
Fig. 2. Magnitude response of the lowpass filter for the signals $\psi_i$ to increase the noise reduction performance in the low temporal frequency region. (Axes: magnitude response in dB versus frequency in Hertz, 0 to 4000 Hz.)
`
noise cancelling filters $H_i$, the filters will adapt to the desired speech signal and, as a consequence, the latter will be partially cancelled in the output signal $y$. In beamforming for speech applications, this signal cancellation is often observed.
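The simplest blocking matrix, the difference of adjacent sensor signals, can be sketched in a few lines. The example assumes the channels are already time-aligned to the look direction, so the desired signal is identical in every channel. The function name and the toy data are hypothetical.

```python
def block_adjacent(channels):
    """Return the M-1 reference signals x_i - x_(i+1) from M aligned channels."""
    return [[a - b for a, b in zip(channels[i], channels[i + 1])]
            for i in range(len(channels) - 1)]


speech = [1.0, -2.0, 0.5, 3.0]                 # identical in all channels
noise = [[0.25, 0.0, 0.0, 0.0],                # different in each channel
         [0.0, 0.5, 0.0, 0.0],
         [0.0, 0.0, -0.75, 0.0]]
channels = [[s + n for s, n in zip(speech, ch)] for ch in noise]

references = block_adjacent(channels)          # speech cancels, noise remains
speech_only = block_adjacent([speech, speech, speech])
```

If the alignment is imperfect, the desired signal no longer cancels exactly and leaks into the references, which is the cause of the signal cancellation described above.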
The rows of the blocking matrix can be interpreted as fixed beamformers, each of them forming a spatial null in the look direction [5]. Based on this interpretation, a blocking matrix using a spatial filtering technique was proposed in [12] to broaden the look direction and thereby prevent the adaptive filters $H_i$ from cancelling signals coming from an area around the focal point. This approach can reduce the signal cancellation due to steering delay errors or widespread signal sources, but in general requires many sensors.
To increase the overall performance of the noise reduction system shown in figure 1, we propose, in addition to the spatial filter approach, a temporal filtering in the blocking matrix. The motivation behind this is as follows:
The low-frequency components of the noise field can be suppressed neither by the conventional beamformer nor by the postfilter $\hat{W}$, because of the good spatial correlation in this frequency region. The summation of the array signals has the effect of a temporal lowpass filter on the noise signals. On the other hand, the low temporal frequencies of the noise field will be attenuated at the output of the blocking matrix. Therefore, the signal blocking has the effect of a temporal highpass filter on the array signals. The transformation filters $H_i$ have to compensate for this opposed behaviour to form a proper cancelling signal $Y_S(k)$. The transformation filters are theoretically given by equation (1) for time-stationary signals. But in practice only estimates are available and the filter order is limited. There is always a potential for mismatch in the transfer functions $H_i$, and because these filters have to operate over a large range of gain values, a mismatch can result in a very distorted output signal. Therefore, we include a fixed temporal lowpass filter in the blocking matrix, with the effect that the low-frequency components in the signals $\psi_i$ are emphasized. The lowpass filter used, whose magnitude response is shown in figure 2, is a one-pole filter with transfer function $G(z) = 1/(1 - a z^{-1})$ and $a$ = [value lost in extraction].
To avoid poles in the estimated transfer functions $H_i$ due to zero power densities, their magnitudes were in addition constrained between values of zero and one.
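A one-pole lowpass of the form $G(z) = 1/(1 - a z^{-1})$ can be sketched both as a recursion and as a magnitude response. The pole value `a = 0.9` and the sampling rate of 8000 Hz used below are placeholders chosen for illustration, not values from the paper. The function names are hypothetical.

```python
import cmath
import math


def one_pole_lowpass(samples, a):
    """Apply y[n] = x[n] + a * y[n-1] to a list of samples."""
    out, prev = [], 0.0
    for x in samples:
        prev = x + a * prev
        out.append(prev)
    return out


def one_pole_gain_db(freq_hz, fs_hz, a):
    """Magnitude of G(z) = 1 / (1 - a z^-1) on the unit circle, in dB."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs_hz)
    return 20.0 * math.log10(abs(1.0 / (1.0 - a * z_inv)))


a_assumed = 0.9          # placeholder pole, not the paper's value
fs_assumed = 8000.0      # placeholder sampling rate
impulse_response = one_pole_lowpass([1.0, 0.0, 0.0], 0.5)
gain_dc = one_pole_gain_db(0.0, fs_assumed, a_assumed)
gain_nyquist = one_pole_gain_db(fs_assumed / 2.0, fs_assumed, a_assumed)
```

The closer the pole is to one, the stronger the low-frequency emphasis in the reference signals. This offsets the highpass character of the signal blocking, so the filters $H_i$ need a smaller range of gains.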
`
`
`REFERENCES
`[1] J.L. Flanagan, J.D. Johnston, R. Zahn, and G.W. Elko,
`“Computer–steered microphone arrays for sound
`transduction in large rooms,” J. Acoust. Soc. Amer.,
`vol. 78, no. 5, pp. 1508–1518, Nov. 1985.
`[2] W. Kellermann, “A self–steering digital microphone
`array,” in Proc. of
`the Internat. Conference on
`Acoustics, Speech and Signal Processing ICASSP–91,
`pp. 3581–3584, 1991.
[3] Y. Mahieux, G. Le Tourneur, A. Gilloire, A. Saliou, and J.P. Jullien, “A microphone array for multimedia workstations,” in Proc. of the 3rd International Workshop on Acoustic Echo Control, (Plestin les Grèves, France), pp. 145–149, Sep. 1993.
`[4] O.L. Frost, “An algorithm for linearly constrained
`adaptive array processing,” Proc. IEEE, vol. 60, no. 8,
`pp. 926–935, Aug. 1972.
`[5] L.J. Griffiths and C.W. Jim, “An alternative approach
`to linearly constrained adaptive beamforming,” IEEE
`Trans. Antennas Propagat., vol. AP-30, no. 1, pp. 27–
`34, Jan. 1982.
`[6] R. Zelinski, “A microphone array with adaptive post–
`filtering for noise reduction in reverberant rooms,”
`in Proc. of the Internat. Conference on Acoustics,
`Speech and Signal Processing ICASSP–88, (New
`York), pp. 2578–2581, Apr. 1988.
`[7] K.U. Simmer and A. Wasiljeff, “Adaptive microphone
`arrays for noise suppression in the frequency domain,”
`in Second Cost 229 Workshop on Adaptive Algorithms
`in Communications, (Bordeaux, France), pp. 185–
`194, 30.9.–2.10 1992.
[8] B. Widrow, J.R. Glover, J.M. McCool, J. Kaunitz, Ch.S. Williams, R.H. Hearn, J.R. Zeidler, E. Dong, and R.C. Goodlin, “Adaptive noise cancelling: Principles and applications,” Proc. IEEE, vol. 63, no. 12, pp. 1692–1716, Dec. 1975.
[9] W. Armbrüster, R. Czarnach, and P. Vary, “Adaptive noise cancellation with reference input – possible applications and theoretical limits,” in Proc. European Signal Processing Conf. EUSIPCO–86, (The Hague), pp. 391–394, Sep. 1986.
[10] K.U. Simmer, S. Fischer, and A. Wasiljeff, “Suppression of coherent and incoherent noise using a microphone array,” Annals of telecommunications, vol. 49, no. 7/8, pp. 439–446, 1994.
`[11] A. H. Nuttall and G.C. Carter, “Spectral estimation us-
`ing combined time and lag weighting,” Proc. IEEE,
`vol. 70, no. 9, pp. 1115–1125, Sep. 1982.
`[12] I. Claesson and S. Nordholm, “A spatial filtering ap-
`proach to robust adaptive beamforming,” IEEE Trans.
`Antennas Propagat., vol. 40, no. 9, pp. 1093–1096,
`Sep. 1992.
`[13] J.B. Allen and D.A. Berkley, “Image method for ef-
`ficiently simulating small–room acoustics,” J. Acoust.
`Soc. Amer., vol. 65, no. 4, pp. 943–950, Apr. 1979.
`[14] Don H. Johnson and Dan E. Dudgeon, Array Signal
`Processing — Concepts and Techniques. Englewood
`Cliffs: Prentice Hall, 1993.
`[15] L.J. Ziomek, Fundamentals of Acoustic Field Theory
`and Space–Time Signal Processing. Boca Raton: CRC
`Press, 1995.
[16] S.R. Quackenbush, T.P. Barnwell, and M.A. Clements, Objective Measures of Speech Quality. Englewood Cliffs: Prentice Hall, 1988.
`
Fig. 3. Log Area Ratio Distance as function of reverberation time of the enclosure. Solid line: input LAR, dotted line: output LAR, dashed line: output LAR without temporal filtering in the blocking matrix. (Axes: Log Area Ratio Distance, 1.5 to 6, versus reverberation time, 100 to 500 msec.)
`
`4.4. Results
Figure 3 shows the LAR improvement as a function of Sabine's reverberation time $T$ (a low LAR corresponds to high speech quality). The solid line shows the input LAR and the dotted line shows the output LAR of the proposed noise reduction system. We can deduce from figure 3 that the proposed method works well for a large range of reverberation times and is therefore able to operate independently of the acoustical properties of the enclosure. The speech quality is considerably increased at the output of the noise reduction system. The dashed line in figure 3 shows the LAR at the output of the noise reduction system without the temporal lowpass filter in the blocking matrix. The overall performance can thus be increased by the proposed spatiotemporal signal blocking matrix, especially for reverberation times below 400 msec.
`
`5. CONCLUSION
In this paper we proposed a noise reduction system for the suppression of coherent and incoherent noise in disturbed speech signals, based on a Generalized Sidelobe Canceller with two adaptive portions. Improvements to the estimation of the adaptive transfer function in look direction and to the design of the signal blocking matrix were given. The experimental results demonstrated that the proposed method works well for a large range of reverberation times and is therefore able to operate independently of the acoustical properties of the enclosure.
`
`ACKNOWLEDGEMENT
`The authors would like to thank Mr. E. Ochieng–
`Ogolla of the University of Bremen for his helpful ad-
`vice and suggestions that contributed to the improve-
`ment of this paper.
`