Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 1 538 867 A1

(12) EUROPEAN PATENT APPLICATION
`
`(43) Date of publication:
`08.06.2005 Bulletin 2005/23
`
`(21) Application number: 03022273.1
`
`(22) Date of filing: 01.10.2003
`
`(84) Designated Contracting States:
`AT BE BG CH CY CZ DE DK EE ES FI FR GB GR
`HU IE IT LI LU MC NL PT RO SE SI SK TR
`Designated Extension States:
`AL LT LV MK
`
`(30) Priority: 30.06.2003 EP 03014846
`
`(71) Applicant: Harman Becker Automotive Systems
`GmbH
`76307 Karlsbad (DE)
`
(51) Int Cl.7: H04R 3/00, H04R 1/40

(72) Inventor: Christoph, Markus
94315 Straubing (DE)

(74) Representative: Grünecker, Kinkeldey,
Stockmair & Schwanhäusser Anwaltssozietät
Maximilianstrasse 58
80538 München (DE)

(54) Handsfree system for use in a vehicle

(57) The invention is directed to a handsfree system for use in a vehicle, comprising a microphone array with at least two microphones, a signal processing means, and an adaptive post-filter, the signal processing means comprising a beamformer having an input connected to the at least two microphones and an output connected to the input of the adaptive post-filter.
`
`Printed by Jouve, 75001 PARIS (FR)
`
`
`Amazon v. Jawbone U.S.
`Patent 8,280,072
`Amazon Ex. 1006
`
`
`
`Description
`
`[0001] The invention is directed to a handsfree system for use in a vehicle comprising a microphone array with at
`least two microphones and a signal processing means.
`[0002] For making telephone calls in a car, handsfree systems are used more and more since they provide increased
`comfort and reduce the risk of an accident as the driver is distracted only marginally. Because of that, in many countries,
`handsfree devices are even required by law.
`[0003] Usually, a handsfree system comprises a microphone that can be fastened to a user such as the driver.
[0004] Due to the relatively large distance between the speaker's mouth and the microphone, many handsfree devices today suffer from poor speech quality. This is particularly because, in a car, considerable ambient noise is usually present and interferes with the speech signal. The noise stems from different sources such as the motor, wind, or car radio.
`[0005] However, common methods for noise reduction are often costly to implement and require a large amount of
`memory and computing power. In particular, a signal processed by conventional noise reduction systems has a relatively
`large delay time which makes these systems unsuitable for real time applications, i.e. telephone applications.
[0006] It is, therefore, the problem underlying the invention to overcome the above drawbacks and provide a handsfree system for use in a vehicle with improved speech quality.
`[0007] This problem is solved by a handsfree system according to claim 1. Accordingly, the invention provides a
`handsfree system for use in a vehicle comprising a microphone array with at least two microphones, a signal processing
`means and an adaptive post-filter, the signal processing means comprising a beamformer having an input connected
`to the at least two microphones and an output connected to the input of the adaptive post-filter.
[0008] In the context of this invention, the term "connected" also includes the case that a filter or another signal processing means is provided along the signal path between two devices or means. A beamformer processes signals emanating from a microphone array to obtain a combined signal. A beamformer comprises a beamsteering means, which is responsible for time delay compensation of the different microphones, and a summing means. In its simplest form (Delay-and-Sum beamformer), beamforming only comprises delay compensation and summing of the compensated signals. Beamforming makes it possible to provide a specific directivity pattern for a microphone array. Usually, a beamformer can be implemented as a digital system with a plurality of digital filters using, for example, digital signal processors (DSP). A beamformer can be configured as an adaptive or a non-adaptive beamformer. Adaptive means that relevant parameters such as filter coefficients can be re-calculated during use of the system in order to adapt the beamformer to changing conditions. In the non-adaptive case, the system parameters are determined once by calibrating the beamformer and are then kept unchanged. For both a non-adaptive and an adaptive beamformer, the beamforming can, in principle, be performed in the time domain or in the frequency domain.
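
The Delay-and-Sum case described above can be illustrated with a minimal sketch. It assumes integer sample delays and equal-length signals; the function name `delay_and_sum` and the averaging by the number of microphones are illustrative choices, not taken from the application.

```python
from typing import List, Sequence


def delay_and_sum(signals: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Delay each microphone signal by an integer number of samples, then average.

    Assumption: delays are non-negative integers chosen so that the wanted
    signal is time-aligned across all microphones after compensation.
    """
    if len(signals) != len(delays):
        raise ValueError("one delay per microphone is required")
    length = len(signals[0])
    output = [0.0] * length
    for signal, delay in zip(signals, delays):
        for k in range(length):
            src = k - delay  # index of the input sample that appears at time k
            if 0 <= src < length:
                output[k] += signal[src]
    # Normalise so that a perfectly aligned signal keeps its amplitude.
    return [value / len(signals) for value in output]
```

For example, if a pulse reaches the second microphone two samples earlier than the first, delaying the second signal by two samples aligns both pulses before summation.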
[0009] A handsfree system in accordance with the invention shows an excellent acoustic performance in a vehicular environment. Due to the beamformer, an improved directivity is obtained and, furthermore, speech signals are enhanced and ambient noise is reduced. The adaptive post-filter (responsible for filtering a signal after the beamforming) further reduces the noise in the signal.
[0010] According to a preferred embodiment, the adaptive post-filter can be a filter in the time domain. If the post-filtering is performed in the time domain, the delay time is reduced and the implementation is simplified.
`[0011] According to a preferred embodiment, the adaptive post-filter can be a Wiener filter. It turns out that a Wiener
`filter is particularly suitable for filtering in a car environment.
[0012] In order to reduce spectral distortions of the filtered signal, preferably, the adaptive post-filter can be a linear-phase filter. Advantageously, the adaptive post-filter can be a linear-phase Wiener filter.
`[0013] According to a preferred embodiment, the signal processing means can further comprise at least two adaptive
`filters having an input connected to the output of the beamsteering means and an output connected to the adaptive
`post-filter, wherein the at least two adaptive filters are configured to determine adaptive filter parameters for the adaptive
`post-filter.
[0014] In this way, background filters are provided for adaptively estimating the filter parameters for the adaptive post-filter.
[0015] Preferably, for each of the at least two microphones, an adaptive filter can be provided having an input connected to the output of the beamsteering means. Thus, for each output signal of the beamsteering means corresponding to a microphone, adaptive filter parameters can be determined for the adaptive post-filter. The actual filter parameters of the post-filter can be given, for example, by the filter parameters determined by one of the adaptive filters or by the mean of the filter parameters determined by several different adaptive filters.
`[0016] Advantageously, an input of each of the at least two adaptive filters can be further connected to the output of
`the beamformer. This allows for an adaption of the respective filter parameters directly with respect to the beamformed
`signal.
[0017] According to a preferred embodiment, the signal processing means can further comprise a pre-emphasis filter, in particular, comprising a pre-whitening filter, having an input connected to an output of the adaptive post-filter and/or a pre-emphasis filter, in particular, comprising a pre-whitening filter, having an input connected to the output of the beamsteering means and an output connected to the at least two adaptive filters.
`[0018] Such a pre-emphasis filter, on the one hand, emphasizes high frequencies and, on the other hand, attenuates
`low frequencies which is particularly useful to reduce low frequency correlated noise. Preferably, the pre-emphasis
`filter can comprise a pre-whitening filter. A pre-whitening filter whitens the spectral distribution of a signal. The filter
`coefficients of such a pre-whitening filter can be determined using a linear predictive coding (LPC) analysis, for example,
`via an adaptive lattice predictor (ALP) algorithm.
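
As a rough illustration of the high-frequency emphasis described above, the following sketch applies a fixed first-order pre-emphasis filter y[k] = x[k] - alpha * x[k-1]. This is a simplification: the text determines pre-whitening coefficients through an LPC analysis, whereas the fixed coefficient `alpha` and the function name `pre_emphasis` here are assumptions made for illustration only.

```python
from typing import List, Sequence


def pre_emphasis(x: Sequence[float], alpha: float = 0.95) -> List[float]:
    """First-order high-pass: y[k] = x[k] - alpha * x[k-1].

    Assumption: alpha is a fixed, hand-picked value in (0, 1); an LPC-based
    pre-whitening filter would instead adapt its coefficients to the signal.
    """
    y = []
    previous = 0.0  # x[-1] is taken to be zero
    for sample in x:
        y.append(sample - alpha * previous)
        previous = sample
    return y
```

A constant (low-frequency) input is attenuated after the first sample, while a sign-alternating (high-frequency) input is amplified.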
`[0019] According to a preferred embodiment of the above handsfree systems, the signal processing means can
`further comprise an inverse filter, particularly a warped inverse filter. These filters are especially useful to adjust the
`microphone transfer function and to match the microphones of the array in this way. Preferably, the beamformer can
`comprise at least one inverse filter, in particular, having an output for providing an inversely filtered signal to a summing
`means.
[0020] In order to overcome the matching problem, alternatively or additionally, matched microphones on the basis of silicon or paired microphones may be used.
[0021] The susceptibility of microphone arrays often increases with decreasing frequency. Due to this, a higher matching precision is preferred for low frequencies compared to high frequencies. A frequency-dependent adjustment of the microphone transfer functions with the use of warped filters reduces the required memory compared to the case of conventional FIR filters.
`[0022] Preferably, each inverse filter can be an approximate inverse of a non-minimum phase filter. This results in
`an inverse filter which is both stable and has no phase error.
`[0023] According to a preferred embodiment, an inverse filter may be combined with another filter of the handsfree
`system, for example, a filter of the beamformer. Such a combination in one filter results in a simplified implementation.
[0024] Preferably, the signal processing means of the above handsfree systems can comprise a non-adaptive post-filter having an input connected to an output of the adaptive post-filter. The non-adaptive post-filter may directly follow the adaptive post-filter. Such a filter is used to compensate for the ambient acoustics of a speaker. Thus, the non-adaptive post-filter may have the form of an inverse room filter.
[0025] In order to further reduce low frequency noise, according to a preferred embodiment, the signal processing means may further comprise an adaptive noise canceller (ANC), for electrical ANC implementations.
`[0026] Preferably, the ANC can be connected to a non-acoustic sensor to determine a noise signal, for example, by
`using the tachometer of the vehicle. The ANC, advantageously, can have an output connected to the input of the
`beamformer and/or of the adaptive post-filter.
[0027] For a further improvement of the speech signal quality, the signal processing means of the previously described handsfree systems can comprise an acoustic echo canceller (AEC). Preferably, the AEC can comprise an echo shaping filter. In this way, a frequency-selective echo attenuation may be obtained. As in the case of an ANC, the AEC can have an output connected to the input of the beamformer and/or of the adaptive post-filter.
[0028] According to a preferred embodiment of all previously described handsfree systems, the beamformer can be a non-adaptive beamformer. By using a non-adaptive beamformer with fixed filters, the computing power required during operation of the system is reduced.
[0029] Preferably, the beamformer may be a superdirective beamformer which further improves the acoustic performance.
[0030] Advantageously, the beamformer may be a regularized superdirective beamformer using a finite regularization parameter µ. The regularization parameter usually enters the equation for computing the filter coefficients or, alternatively, is inserted into the cross-power spectrum matrix or the coherence matrix. In contrast to the maximum superdirective beamformer (µ = 0), the regularized superdirective beamformer has reduced noise and is less sensitive to an imperfect matching of the microphones.
`[0031] The finite regularization parameter µ, preferably, may depend on the frequency. This achieves an improved
`gain of the array compared to a regularized superdirective beamformer with fixed regularization parameter µ. According
`to a preferred embodiment, each superdirective filter may result from an iterative design based on a predetermined
`maximum susceptibility. This enables an optimal adjustment of the microphones, particularly with respect to the transfer
`function and the position of each microphone.
`[0032] By using a predetermined maximum susceptibility, defective parameters of the microphone array can be taken
`into account to further improve the gain. The maximum susceptibility may be determined as a function of the error in
`the transfer characteristic of the microphones, the error in the microphone positions and a predetermined (required)
`maximum deviation in the directional diagram of the microphone array. The time-invariant impulse response of the
`filters will be determined iteratively only once; there is no adaption of the filter coefficients during operation.
[0033] According to a preferred embodiment, each superdirective filter can be a filter in the time domain. Filtering in the frequency domain is a possible alternative; however, it requires a Fourier transform (FFT) and an inverse Fourier transform (IFFT), which increases the required memory.
`[0034] Advantageously, the beamformer may have the structure of a generalized sidelobe canceller (GSC). In this
`way, at least one filter can be saved. The implementation in the GSC structure, however, is only possible in the frequency
`domain.
[0035] In order to obtain an optimal adaption of the handsfree system to a particular noise situation, according to a preferred embodiment, the beamformer can be a minimum variance distortionless response (MVDR) beamformer.
[0036] According to a preferred embodiment, the microphone array can comprise at least two microphones being arranged in endfire orientation with respect to a first position. An array in endfire orientation has a better directivity and is less sensitive to a mismatched propagation or delay time compensation. The first position can be the location of the driver's head, for example.
`[0037] Preferably, the microphone array can comprise at least two microphones being arranged in endfire orientation
`with respect to a second position. Thus, the handsfree system of the invention has a good directivity in two directions.
`Speech signals coming from two different positions, for example, from the driver and the front seat passenger, can
`both be recorded in good quality.
`[0038] According to a preferred embodiment, the signal processing means may comprise at least two beamformers.
`A first beamformer may be used for signals from a first position and a second beamformer may be used for signals
`from a second position. In this case, advantageously, the handsfree system may further comprise a voice activity
`detector (VAD) and/or a switch control means. The switch control and the VAD are used to determine how to combine
`the output of the at least two beamformers.
`[0039] Advantageously, the handsfree system can comprise a residual echo suppression (RES) means and/or a
`dynamic volume control (DVC). A RES means serves for suppression of residual echoes, in particular, being present
`in the signal resulting from the adaptive post filter. Thus, a residual echo suppression means can comprise an input
`connected to the output of the adaptive post filter. Furthermore, a RES means can comprise an input for receiving a
`far end signal. A DVC is intended for dynamically adapting the output volume of a far end signal depending on the
`ambient noise level being present in the vehicle.
`[0040] According to a preferred embodiment, the at least two microphones in the first endfire orientation (endfire
`orientation with respect to a first position) and the at least two microphones in the second endfire orientation (endfire
`orientation with respect to a second position) can have a microphone in common. In this way, already a microphone
`array consisting of only three microphones provides an excellent directivity for use in a vehicular environment.
`[0041] According to a preferred embodiment of all previously discussed handsfree systems, the microphone array
`may comprise at least two subarrays. Each subarray of microphones may be optimized for a specific frequency band
`yielding an improved overall directivity.
[0042] To decrease the total number of microphones, preferably, at least two subarrays may have at least one microphone in common.
`[0043] According to a preferred embodiment, the above handsfree systems may comprise a frame wherein each
`microphone of the microphone array is arranged in a predetermined, in particular fixed, position in or on the frame.
`This ensures that after manufacture of the frame with the microphone, the relative positions of the microphones are
`known. Such an array can be easily mounted in a vehicular cabin.
`[0044] According to a preferred embodiment, at least one microphone may be a directional microphone. The use of
`directional microphones improves the array gain.
`[0045] Preferably, at least one directional microphone may have a cardioid characteristic. This further improves the
`array gain. More preferred, the cardioid characteristic is a hyper-cardioid characteristic.
[0046] Advantageously, at least one directional microphone may be a differential microphone. This results in a microphone array with excellent directivity and small dimensions. In particular, the differential microphone may be a first order differential microphone.
[0047] The invention is further directed to a vehicle, particularly a car, comprising any of the above-described handsfree systems.
`[0048] The invention is also directed to the use of any of the previously described handsfree systems in a vehicle,
`in particular, a car.
`[0049] Furthermore, the invention provides a method for noise reduction in a vehicular handsfree system, comprising
`receiving input signals resulting from a microphone array with at least two microphones, processing the input signals
`by a beamformer to provide a beamformed signal, and adaptively filtering a signal resulting from the beamformed
`signal by an adaptive post-filter.
`[0050] This method results in an excellent acoustic performance of a handsfree system in a vehicular environment.
`[0051] According to a preferred embodiment, the adaptive filtering can be performed in the time domain. In this way,
`particularly the delay time is reduced.
[0052] Preferably, the method can further comprise

providing at least two adaptive filters, particularly Wiener filters, wherein

processing the input signals by a beamformer comprises beamsteering the input signals for providing beamsteered signals corresponding to one of the at least two microphones and summing the signals, and

adaptively filtering comprises receiving and processing at least one beamsteered signal by at least one of the at least two adaptive filters to determine adaptive filter parameters for the adaptive post-filter.
`
`[0053] According to a preferred embodiment, adaptively filtering can further comprise receiving a signal resulting
`from the beamformed signal by at least one adaptive filter and wherein processing the beamsteered signal can comprise
`determining adaptive filter parameters using the at least one beamsteered signal and the signal resulting from the
`beamformed signal.
`[0054] Preferably, for each beamsteered signal, an adaptive filter can be provided for determining adaptive filter
`parameters using the beamsteered signal and the signal resulting from the beamformed signal.
[0055] In order to reduce low frequency correlated noise, receiving at least one beamsteered signal by at least one of the at least two adaptive filters can comprise processing the at least one beamsteered signal by a pre-emphasis filter, in particular, comprising a pre-whitening filter.
`[0056] According to an advantageous embodiment, the above methods can further comprise processing a signal
`resulting from the microphone array by an inverse filter, in particular, a warped inverse filter.
`[0057] Preferably, the methods can further comprise non-adaptively filtering a signal resulting from the adaptively
`filtered signal and/or processing a signal resulting from the adaptively filtered signal by a pre-emphasis filter.
`[0058] The above method, advantageously, can further comprise processing a signal resulting from the microphone
`array, particularly resulting from the beamformed signal, by an adaptive noise canceller (ANC) and/or an acoustic echo
`canceller (AEC) and/or a residual echo suppression (RES) means.
[0059] According to a preferred embodiment, the input signals can be processed by a non-adaptive and/or superdirective and/or minimum variance distortionless response (MVDR) beamformer.
`[0060] The invention also provides a computer program product comprising one or more computer readable media
`having computer-executable instructions for performing the steps of the above described methods.
[0061] Additional features and advantages will be described with reference to the examples illustrated in the drawings:

Fig. 1 illustrates the structure of a handsfree system according to the invention with an adaptive post-filter in the time domain;
Fig. 2 shows the structure of a beamformer in the frequency domain;
Fig. 3 illustrates an FXLMS algorithm;
Fig. 4 shows the structure of a beamformer in the time domain;
Figs. 5A, 5B illustrate preferred embodiments of arrangements of the microphone array in a vehicle;
Figs. 6A, 6B illustrate preferred embodiments of arrangements of a microphone array in a mirror;
Fig. 7 shows a microphone array consisting of three subarrays;
Fig. 8 illustrates a superdirective beamformer in a GSC structure;
Fig. 9 illustrates a microphone array with two microphones in a noise field with a noise free sector;
Fig. 10 shows the structure of a superdirective beamformer comprising four first order gradient microphones;
Fig. 11 illustrates the structure of a handsfree system with an electrical ANC;
Fig. 12 shows the structure of an ANC;
Fig. 13 shows the structure of an embodiment of a handsfree system according to the invention with an ANC and AEC;
Fig. 14 illustrates the structure of an AEC; and
Fig. 15 shows another embodiment of a handsfree system according to the invention.
`
[0062] An example of the handsfree system in accordance with the present invention is shown in Fig. 1. In the following, the general structure will first be described briefly, and the different components will then be explained in more detail. Note that, in the figures, the dotted lines encasing some elements simply serve for better understanding of the figures without necessarily implying any actual combination or separation of different elements.
`[0063] The main components of the system are a microphone array, a beamformer and an adaptive post-filter in the
`time domain. The microphone array 101, in this example, comprises four microphones 102. Each microphone 102
`yields an output signal xi[k]. The microphone signals may be filtered by an optional high-pass filter 103.
[0064] Then, the signals are passed to a beamformer. This beamformer may be a conventional delay and sum beamformer. However, in the present example, a preferred superdirective beamformer is shown. Such a beamformer comprises beamsteering means 104 and filters 105. The output signals of the beamformer may be passed through optional inverse filters 106 and, then, are summed by summing means 107 to yield a resulting beamformed signal x[k].
[0065] This signal is passed through an adaptive post-filter 108 in the time domain which may be followed by an optional non-adaptive post-filter 109 and/or by an optional pre-emphasis filter (not shown). The adaption of the post-filter 108 is performed using a set of Wiener filters 110. The input signals of the Wiener filters 110 comprise, on the one hand, the individual signals resulting from the different microphones and, on the other hand, the summed signal x[k]. In the present example, the microphone signals are taken after the beamsteering. However, if the beamformer comprises further (superdirective) filters 105 as in the present case, it is also possible to take the microphone signals after this additional filtering. Before being presented to the Wiener filters, the microphone signals are passed through an optional pre-emphasis filter 111.
[0066] In the following, the functioning of a Wiener filter will be explained. A microphone signal x[k] is the sum of the speech signal s[k] and the noise n[k]. The microphone signal is filtered by an impulse response w(i) to obtain a noise reduced signal ŝ[k]. The aim is to minimize the mean square error between the undisturbed speech signal s[k] and the output signal ŝ[k]:

$$E\{(s[k] - \hat{s}[k])^2\} \rightarrow \min.$$

[0067] In other words, the partial derivative of the mean square error with respect to the coefficients of the impulse response has to vanish, yielding the Wiener-Hopf equation:

$$\sum_i w(i)\, r_{xx}(l - i) = r_{sx}(l),$$

wherein rxx(l) and rsx(l) are the auto-correlation function and the cross-correlation function of the microphone signal and the undisturbed speech signal. One may assume that the speech signal and the noise are statistically independent, i.e. rsx(l) = rss(l), thus,

$$\sum_i w(i)\, r_{xx}(l - i) = r_{ss}(l).$$
[0068] A transformation of this equation into the frequency domain yields the frequency response of the Wiener filter:

$$W(\omega) = \frac{\Phi_{ss}(\omega)}{\Phi_{xx}(\omega)} = \frac{\Phi_{ss}(\omega)}{\Phi_{ss}(\omega) + \Phi_{nn}(\omega)}.$$
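
The frequency response above can be evaluated per frequency bin once the two power spectral densities are known. The sketch below assumes they are given as non-negative numbers; the function name `wiener_gain` and the zero-gain fallback for an empty bin are illustrative assumptions.

```python
def wiener_gain(phi_ss: float, phi_nn: float) -> float:
    """Wiener gain W = phi_ss / (phi_ss + phi_nn) for one frequency bin.

    Assumption: both power spectral densities are non-negative; a bin with
    no power at all is given zero gain to avoid a division by zero.
    """
    total = phi_ss + phi_nn
    return phi_ss / total if total > 0.0 else 0.0
```

The gain approaches 1 where speech dominates and 0 where noise dominates, so noisy bins are attenuated while clean bins pass almost unchanged.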
`
[0069] In order to obtain a time variant filter, the power spectral densities in the above equation may be replaced by the corresponding short-time estimated values that may be obtained, for example, by a recursive averaging:

$$W(\kappa,\nu) = \frac{\hat{E}\{|S(\kappa,\nu)|^2\}}{\hat{E}\{|X(\kappa,\nu)|^2\}},$$

wherein S(κ,ν) and X(κ,ν) are short-time spectra that may be determined, for example, with the help of DFT filter banks or an FFT. Here, κ is the time index and ν the frequency index; Ê{·} represents the short-time average that may be obtained, for example, with the help of a first order IIR filter.
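
A short-time average of this kind can be sketched with a first order recursive (IIR) smoother applied to one frequency bin. The sketch assumes that the noise power spectral density is already known and estimates the speech power by spectral subtraction; the smoothing constant `beta`, the zero initial state, and the function name are illustrative assumptions.

```python
from typing import List, Sequence


def short_time_wiener_gains(
    power_frames: Sequence[float],
    noise_psd: float,
    beta: float = 0.9,
) -> List[float]:
    """Time-variant Wiener gain for one frequency bin.

    power_frames holds |X(kappa, nu)|^2 for successive frames kappa.
    Assumptions: noise_psd is a known constant, the smoother starts at zero,
    and the speech PSD is estimated by spectral subtraction.
    """
    phi_xx = 0.0  # recursive estimate of the microphone signal PSD
    gains = []
    for power in power_frames:
        # First order IIR averaging: new = beta * old + (1 - beta) * input.
        phi_xx = beta * phi_xx + (1.0 - beta) * power
        # Spectral subtraction, floored at zero to keep the PSD non-negative.
        phi_ss = max(phi_xx - noise_psd, 0.0)
        gains.append(phi_ss / phi_xx if phi_xx > 0.0 else 0.0)
    return gains
```

A larger `beta` gives smoother but slower-reacting estimates, which is the usual trade-off in recursive averaging.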
`[0070] The short-time auto power spectral density of the speech signal in the numerator of the above equation is to
`be estimated in a suitable way. Appropriate estimation methods include spectral subtraction (estimating the auto power
`spectral density of the noise), minimum mean square error short-time spectral amplitude (MMSE STSA) estimator or
`MMSE log-SA estimator or a speech pause detector, for example.
[0071] It is also possible to estimate the short-time auto power spectral density of the noise signal with the help of the coherence between two or more microphones. In a second step, the estimated short-time auto power spectral density of the noise signal may be used to estimate the absolute value of the most probable Fourier coefficient (using, for example, a spectral subtraction algorithm or an MMSE log-SA estimator) and to reconstruct the absolute value of the spectrum of the speech signal. For the multi-channel noise reduction, one assumes that the coherence or the cross power spectral density of the noise signals received by the microphones vanishes. In the case of two microphones, for example, the microphone signals have the form:

$$x_1[k] = \sum_i h_1(i)\, s[k-i] + n_1[k], \qquad x_2[k] = \sum_i h_2(i)\, s[k-i] + n_2[k],$$

wherein h1(i) and h2(i) are the impulse responses representing the acoustic transfer between the source of speech and the microphones. Both parts of the speech signal filtered in this way are superimposed with the uncorrelated noise signals n1[k] and n2[k].
`[0072] Since Φn1n2 (ω) = 0 and assuming that the Fourier transforms (H1(ω) and H2 (ω)) of the impulse responses
`of the acoustic transfer (h1(i) and h2(i)) obey |H1 (ω)| = |H2 (ω)|, one obtains for the short-time auto power spectral density
`
`wherein
`
[0073] The corresponding Wiener filter, then, has the form

$$W(\omega) = \frac{|\Phi_{x_1x_2}(\omega)|}{\Phi_{x_1x_1}(\omega)} = \frac{\Phi_{ss}(\omega)\,|H_1(\omega)|^2}{\Phi_{x_1x_1}(\omega)}.$$
[0074] In Fig. 1, the adaption of the post-filter w(k,i), k being the time index and i denoting the coefficient within the impulse response, is performed in the time domain, for example, with the help of the LMS algorithm. The background Wiener filters w1(k,i),...,w4(k,i) are to minimize the error signals e1[k],...,e4[k] such that, for example, the filter w4(k,i) tends towards the frequency response

wherein

[0075] The form of the other three Wiener filters is obtained by a cyclic permutation of the indices.
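
A time-domain LMS adaptation of one background filter can be sketched as follows. Which signal serves as filter input and which as the desired signal in Fig. 1 is not spelled out here, so the roles of `inputs` and `desired`, the step size, and the function name `lms_filter` are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def lms_filter(
    inputs: Sequence[float],
    desired: Sequence[float],
    num_taps: int,
    step_size: float,
) -> Tuple[List[float], List[float]]:
    """Adapt FIR weights with the LMS rule w <- w + step_size * e[k] * u[k].

    Returns the final weights and the error signal e[k] = d[k] - y[k].
    Assumption: step_size is small enough for the adaptation to be stable.
    """
    weights = [0.0] * num_taps
    buffer = [0.0] * num_taps  # most recent input sample first
    errors = []
    for u, d in zip(inputs, desired):
        buffer = [u] + buffer[:-1]
        y = sum(w * b for w, b in zip(weights, buffer))
        e = d - y
        weights = [w + step_size * e * b for w, b in zip(weights, buffer)]
        errors.append(e)
    return weights, errors
```

With a desired signal that is a scaled copy of the input, the single weight converges to the scale factor and the error decays towards zero.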
[0076] It is to be understood that the system is not restricted to a particular number of Wiener filters 110. Furthermore, not every Wiener filter 110 is always to be used to determine the adaptive post-filter 108. For example, one may use only the Wiener filter which uses the microphone signal of the microphone proximal to the source of speech.
[0077] Preferably, however, the adaptive post-filter 108 is determined as

$$w[k,i] = \frac{1}{4}\left(w_1[k,i] + w_2[k,i] + w_3[k,i] + w_4[k,i]\right).$$
`
`[0078] The filter is linear-phase if the filter coefficients satisfy
`
`w[k,i] = w[k,L - i].
`
`[0079] Using this symmetry condition, the filter coefficients of a linear-phase post-filter (with length L) can be obtained.
`Accordingly, the linear-phase post-filter has twice the length of one of the background filters 110 (with length L/2). Such
`a linear-phase filter only modifies the amplitude spectrum of the input signal of the filter without a frequency dependent
`distortion of the phase spectrum.
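
The averaging of the background filters and the symmetric extension to a linear-phase filter can be sketched as below. The text gives the symmetry condition but not the exact construction, so mirroring the averaged coefficients about the centre, the zero-based convention w[i] = w[L-1-i], and all function names are assumptions.

```python
from typing import List, Sequence


def average_filters(filters: Sequence[Sequence[float]]) -> List[float]:
    """Coefficient-wise mean of several equal-length background filters."""
    count = len(filters)
    return [sum(taps) / count for taps in zip(*filters)]


def make_linear_phase(half: Sequence[float]) -> List[float]:
    """Mirror a filter of length L/2 into a symmetric filter of length L.

    Assumption: the symmetric filter is the reversed half followed by the
    half itself, so that w[i] == w[L - 1 - i] with zero-based indices.
    """
    return list(reversed(half)) + list(half)


def is_linear_phase(w: Sequence[float]) -> bool:
    """Check the coefficient symmetry that guarantees linear phase."""
    return all(w[i] == w[len(w) - 1 - i] for i in range(len(w)))
```

The resulting filter has twice the length of a background filter, in line with the length relation stated above.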
`[0080] The performance of the filter can be further improved by smoothing its frequency response. This can be
`achieved by weighting the filter coefficients with a window function.
`[0081] The inverse filters 106 serve to compensate for the acoustic transfer function of the path between the source
`of speech and the microphones.
[0082] In Figure 2, the structure of a superdirective beamformer is shown. The beamformer shown in this figure performs the filtering in the frequency domain, in contrast to the case of Figure 1. If a beamformer in the frequency domain were used in Figure 1, an inverse Fourier transform would have to be performed on the signals before passing them to the Wiener filters 110 or the pre-emphasis filter 111.
`
[0083] In Figure 2, the microphone array consists of M microphones 102, each yielding a signal xi(t). The signals xi(t) are transferred to the frequency domain by fast Fourier transform (FFT) means 201, resulting in a signal Xi(ω). In general, the beamforming consists of a beamsteering and a filtering. The beamsteering is responsible for the propagation time compensation. The beamsteering is performed by a steering vector

$$d(\omega) = \left(a_0 e^{-j2\pi f\tau_0},\; a_1 e^{-j2\pi f\tau_1},\; \ldots,\; a_{M-1} e^{-j2\pi f\tau_{M-1}}\right),$$

with

$$a_n = \frac{\lVert q - p_{ref}\rVert}{\lVert q - p_n\rVert}$$

and

$$\tau_n = \frac{\lVert q - p_{ref}\rVert - \lVert q - p_n\rVert}{c},$$

wherein pref denotes the position of a reference microphone, pn the position of microphone n, q the position of the source of sound (for example, the speaker), f the frequency and c the velocity of sound. In the far field, one has

$$a_0 = a_1 = \cdots = a_{M-1} = 1.$$
`
[0084] According to a rule of thumb, one has the far field situation if the source of the useful signal is more than twice as far from the microphone array as the maximum dimension of the array. In Figure 2, a far field beamformer is shown since only a phase factor $e^{j\omega\tau_k}$ denoted by reference sign 202 is applied to the signals Xk(ω).
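
The steering vector and the far field rule of thumb can be sketched numerically as follows. Positions are taken as Cartesian coordinates in metres; the default speed of sound of 343 m/s, the use of the array centroid for the source distance, and the function names are assumptions for illustration.

```python
import cmath
import math
from typing import List, Sequence


def steering_vector(
    source: Sequence[float],
    mic_positions: Sequence[Sequence[float]],
    ref_index: int,
    frequency: float,
    speed_of_sound: float = 343.0,
) -> List[complex]:
    """Elements a_n * exp(-j 2 pi f tau_n) for every microphone n."""
    ref_dist = math.dist(source, mic_positions[ref_index])
    vector = []
    for position in mic_positions:
        dist = math.dist(source, position)
        amplitude = ref_dist / dist               # a_n
        tau = (ref_dist - dist) / speed_of_sound  # tau_n
        vector.append(amplitude * cmath.exp(-2j * math.pi * frequency * tau))
    return vector


def is_far_field(source: Sequence[float], mic_positions: Sequence[Sequence[float]]) -> bool:
    """Rule of thumb: source distance exceeds twice the largest array dimension.

    Assumption: the source distance is measured from the array centroid.
    """
    centre = [sum(c) / len(mic_positions) for c in zip(*mic_positions)]
    max_dim = max(math.dist(a, b) for a in mic_positions for b in mic_positions)
    return math.dist(source, centre) > 2.0 * max_dim
```

The element belonging to the reference microphone is always 1, and in the far field all amplitude factors approach 1 so that only the phase terms remain.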
`[0085] After the beamsteering, the signals are filtered by superdirective filters 203 that are filters in the frequency
`domain. The filtered signals are summed yielding a signal Y(ω). After an inverse fast Fourier transform (IFFT) by means
`204, the resulting signal y[k] is obtained.
[0086] The optimal filter coefficients Ai(ω) may be computed according to

$$A_i(\omega) = \frac{\Gamma(\omega)^{-1} d(\omega)}{d(\omega)^H\, \Gamma(\omega)^{-1} d(\omega)},$$

wherein the superscript H denotes Hermitian transposing and Γ(ω) is the complex coherence matrix

the entries of which are the coherence functions that are defined as the normalized cross-power spectral density of two signals

$$\Gamma_{X_iX_j}(\omega) = \frac{P_{X_iX_j}(\omega)}{\sqrt{P_{X_iX_i}(\omega)\, P_{X_jX_j}(\omega)}}.$$
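
For two microphones, the design equation can be evaluated in closed form because the 2x2 coherence matrix is easy to invert. The sketch below applies the regularization parameter µ as diagonal loading of the coherence matrix, which is one common choice and an assumption here; the function name is likewise illustrative.

```python
from typing import List, Sequence


def superdirective_weights_2mic(
    gamma12: complex, d: Sequence[complex], mu: float = 0.0
) -> List[complex]:
    """Filter coefficients A = G^-1 d / (d^H G^-1 d) for one frequency.

    G = [[1 + mu, gamma12], [conj(gamma12), 1 + mu]] is the (loaded)
    coherence matrix. Assumption: mu is added to the diagonal.
    Note: for |gamma12| = 1 and mu = 0 the matrix is singular and the
    division below fails.
    """
    d0, d1 = complex(d[0]), complex(d[1])
    a = 1.0 + mu
    b = complex(gamma12)
    c = b.conjugate()
    det = a * a - b * c
    # G^-1 d, using the closed-form inverse of a 2x2 matrix.
    g0 = (a * d0 - b * d1) / det
    g1 = (a * d1 - c * d0) / det
    # d^H G^-1 d normalises the response in the steering direction to 1.
    denom = d0.conjugate() * g0 + d1.conjugate() * g1
    return [g0 / denom, g1 / denom]
```

By construction the weights satisfy the distortionless condition d^H A = 1; a larger µ pulls the solution towards the plain Delay-and-Sum weights.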
`
[0087] Preferably, the beamsteering is separated from the filtering step which reduces the steering vector in the design equation for the filter coefficients Ai(ω) to the unity vector

$$d(\omega) = (1, 1, \ldots, 1)^T.$$

(The superscript T denotes transposing.)

[0088] In the case of an isotropic noise field in three dimensions (diffuse noise field), the coherence is