A comparison of hearing-aid array-processing techniques
James M. Kates a) and Mark R. Weiss
`Center for Research in Speech and Hearing Sciences, City University of New York, Graduate Center,
`Room 901, 33 West 42nd Street, New York, New York 10036
(Received 8 May 1995; revised 5 September 1995; accepted 9 January 1996)
`Microphone arrays have proven effective in improving speech intelligibility in noise for
`hearing-impaired listeners, and several array processing techniques have been proposed for hearing
`aids. Among the signal-processing approaches are classical delay-and-sum beamforming,
`superdirective arrays, and adaptive arrays. To directly compare the effectiveness of these different
`processing strategies, a 10-cm-long linear array was built using five uniformly spaced
`omnidirectional microphones. This array was used in the end-fire orientation to acquire speech and
`noise signals for a variety of array placements in two representative rooms. Both digital and
`simulated analog processing techniques were considered, with the array processing implemented in
`the frequency domain. The performance metric was the steady-state array gain weighted to represent
`the relative importance of the different frequency regions in understanding speech. The processing
`comparison indicates that digital systems are more effective than the simulated analog processing,
`and that both superdirective and adaptive digital array processing can provide more than 9 dB of
`weighted array gain. © 1996 Acoustical Society of America.
`PACS numbers: 43.66.Ts, 43.60.Gk
`
`INTRODUCTION
`
`In this paper, several fixed-coefficient and adaptive pro-
`cessing algorithms are compared for a short microphone ar-
`ray suitable for hearing-aid applications. The processing ef-
`fectiveness is evaluated using acoustic data acquired in two
`representative rooms, with the processing performed off-line.
`End-fire array placements on the side of the head or near
`reflecting surfaces were used to give conditions similar to
`those that could be experienced in everyday use. The data
`from a real room avoids the limitations of computer simula-
`tions that are typically used in evaluating array-processing
`algorithms, and permits an accurate comparison of the dif-
`ferent processing strategies that have been proposed for hear-
`ing aids.
`A short microphone array is attractive for hearing-aid
`applications since it is one of the few approaches, among the
`many that have been proposed, that has actually improved
`speech intelligibility in noise for the hearing impaired. The
improvement in signal-to-noise ratio (SNR) for a 10-cm-long
array using five uniformly spaced cardioid microphones with
delay-and-sum beamforming is 5–12 dB (Soede et al.,
1993a, 1993b), with the greatest improvement occurring at
`the highest frequencies. Such an array can be hand-held or
`can be built into an eyeglass frame, and the performance of
`the array does not appear to be affected by the head to any
great extent. The directional arrays used by Soede et al. improved
the speech reception threshold (SRT) by 7 dB in a
`diffuse noise field, so the improvement in SNR is directly
`related to a comparable improvement in speech intelligibility
`in noise.
`The performance offered by delay-and-sum beamform-
`ing can be bettered by using superdirective array processing
`
a) Corresponding author. Tel: 212-642-2179; Fax: 212-642-2379; E-mail:
`jkates@email.gc.cuny.edu
`
(Cheng, 1971; Cox et al., 1986), in which the array performance is optimized for
noise coming uniformly from all directions. A sensitivity constraint (Newman et al.,
1978; Cox et al., 1986) can be used in designing the superdirective array weights to
reduce the effects of microphone position errors, wavefront perturbations, and the
sensor internal noise. The constraint, however, causes a small reduction in the array
gain. Simulation studies (Kates, 1993; Stadler and Rabinowitz, 1993) have shown that
a constrained superdirective array can offer substantially more array gain than
classical delay-and-sum beamforming, but the performance in a real room has not been
ascertained. A further processing option is an oversteered array, similar to
delay-and-sum beamforming except that the time delays used in combining the
microphone output signals are greater than the acoustic propagation times between
the microphones. An oversteered array can offer performance very close to that of the
optimal superdirective array (Cox et al., 1986), and can be realized with a
relatively simple analog system.
Adaptive algorithms have also been proposed for hearing-aid arrays (Peterson et al.,
1987; Greenberg and Zurek, 1992; Link and Buckley, 1993; McKinney and DeBrunner,
1993; Hoffman et al., 1994). Adaptive array processing offers the possibility of
improved performance over arrays using fixed coefficients, but a perturbed wavefront,
as can be caused by sensor misalignment or by a specular reflection, can result in
signal cancellation (Cox, 1973). The scaled projection algorithm (Cox et al., 1987)
can be used to prevent signal cancellation, and its application to adaptive
hearing-aid arrays (Link and Buckley, 1993; Hoffman et al., 1994) has resulted in
improved performance. However, the improvement in speech SNR due to the array
processing can be substantially reduced at low ratios of direct to reverberant sound
even when the scaled projection constraint is used (Hoffman et al., 1994; Greenberg,
1994).
`
`0001-4966/96/99(5)/3138/11/$6.00
`
`J. Acoust. Soc. Am. 99 (5), May 1996
`© 1996 Acoustical Society of America
`
The desire for immunity to correlated interference such as specular reflections has
led to modifications of the basic adaptive array-processing algorithms. One technique
is to force the correlation matrix used in the array processing to have a Toeplitz
structure (Godara and Gray, 1989; Godara, 1991) in which the entries of the
correlation matrix are replaced by the values averaged along the diagonals.
Simulation studies have shown that the resulting structured correlation matrix offers
improved performance in the presence of correlated interference for an array several
wavelengths long (Godara, 1991). A further modification is to form a composite
correlation matrix, using the structured correlation matrix
`in the scaled projection algorithm at a low estimated input
`SNR value and gradually changing to the correlation matrix
`corresponding to an ideal isotropic noise field at high input
`SNR values. This approach is designed to give the benefits of
`an adaptive system at low input SNR values, but to smoothly
`shift to a superdirective array at high input SNR values
`where adaptive systems have exhibited reduced performance.
`In this paper, five frequency-domain processing algo-
`rithms are compared for the same set of microphone data.
`The algorithms are classical delay-and-sum beamforming, an
`oversteered superdirective array, an optimal superdirective
`array, an adaptive system using the scaled projection algo-
`rithm, and an adaptive system using the scaled projection
`algorithm combined with the composite structured correla-
`tion matrix. In order to directly compare the effectiveness of
`these different processing strategies in a real room, a 10-cm-
`long linear array was built using five uniformly spaced om-
`nidirectional microphones. This array was used in the end-
`fire orientation to acquire speech and noise signals for three
`array placements in two representative rooms. The methods
`used for the data acquisition, signal processing, and perfor-
`mance evaluation are described in the remainder of the pa-
`per, along with the performance results.
`
`I. METHOD
`
`A. Data acquisition
`The array used for experiments was 10 cm long and
`consisted of five uniformly spaced Knowles EK-3033 omni-
`directional microphones. The array was used in the end-fire
`orientation. The outputs of the microphones were found to be
matched to within ±1 dB, and no amplitude or phase equalization was provided. The
microphone outputs were sampled at 10 kHz using an A/D converter with simultaneous
sample-and-hold circuits having a ±25 ns aperture uncertainty.
`Stimuli were presented one at a time over a loudspeaker
`with the microphone responses sampled and stored on the
`computer for later processing. Speech stimuli were presented
`at an azimuth of 0 deg, and the noise stimuli were presented
`at azimuths of 60, 105, 180, 255, and 300 deg counterclock-
`wise around the array. The speech stimulus consisted of the
sentence "The candy shop was empty." spoken by a male
`talker. The uncorrelated noise stimuli at the other azimuths
`consisted of multitalker speech babble. A combined noise
`source was formed by summing the babble signals from the
`five noise azimuths at equal intensities; this combination pro-
`duced a diffuse noise field of the sort that would be found in
`
`FIG. 1. Floor plan of the office used for the array measurements. The array
`position and orientation for the floor-standing and KEMAR measurements
`are indicated by the arrow within the circle, and the loudspeaker positions
`for the speech and noise are indicated by the crosses. Angles are measured
`counterclockwise from the array orientation, with the speech loudspeaker
`position at 0 deg. The desk used for the desk-top measurements is also
identified. For the desk measurements, the array was positioned at the "U"
in "USED" with the speech loudspeaker positioned at the "Y" in "ARRAY."
`
`a restaurant or similar environment where several people are
`talking simultaneously. The test stimuli were bandlimited to
`5 kHz.
`Two rooms, an office and a conference room, were used
`for the measurements. A floor plan of the office showing the
`location of the furniture, the microphone array, and the loud-
`speaker positions, is presented in Fig. 1. The office walls are
`painted plasterboard, the floor is carpeting over a concrete
`slab, and the ceiling is acoustical tile beneath a plenum. Two
`of the walls are covered with bookshelves, and the office
`contains several desks, tables, and chairs, thus providing a
`complex acoustical environment. A floor plan of the confer-
`ence room showing the location of the furniture, the micro-
`phone array, and the loudspeaker positions, is presented in
`Fig. 2. The construction of the conference room is the same
`as for the office with the exception that the floor is covered
`with cork tiles instead of carpeting.
`Three array positions were used for the data acquisition
in each room. A quasi-"free-field" position was obtained by placing the array at a
height of 1.4 m on a floor stand near the middle of the room and as far as possible
from any reflecting surface. A desktop position was obtained by placing the array on
a microphone stand at a height of 15 cm above the surface of a desk (office) or group
of tables (conference room), with the array at one end and the speech loudspeaker at
the opposite end of the desk or tables. Measurements were also made using the KEMAR
anthropometric manikin (Burkhard and Sachs, 1975) positioned near the center of the
`
`FIG. 2. Floor plan of the conference room used for the array measurements.
`The array position and orientation for the desk-top measurements is indi-
`cated by the arrow within the circle, and the loudspeaker positions for the
`speech and noise are indicated by the crosses. For the floor-standing and
`KEMAR measurements, the tables were moved to the periphery of the room
`and approximately the same loudspeaker and array positions within the
`room were used.
`
`room with the array positioned just above the left ear at a
`height of 1.2 m above the floor. The power for each speech
`or noise test signal at each microphone array position was
`normalized by forming the rms average across the five mi-
`crophones in the array and setting this average to 1 V.
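The level normalization described above can be sketched as follows. This is a minimal illustration, assuming that the "rms average across the five microphones" means the square root of the mean-square level averaged over microphones; the function name and array layout are hypothetical and not taken from the paper.

```python
import numpy as np

def normalize_array_recording(x, target_rms=1.0):
    """Scale an (n_mics, n_samples) recording so that the rms level,
    averaged in power across microphones, equals target_rms (1 V in the text)."""
    x = np.asarray(x, dtype=float)
    per_mic_power = np.mean(x ** 2, axis=1)       # mean-square level of each microphone
    array_rms = np.sqrt(np.mean(per_mic_power))   # assumed meaning of "rms average"
    return x * (target_rms / array_rms)           # one common scale factor for all mics
```

A single scale factor is applied to all five channels, so the level differences between microphones are preserved.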
`The physical and acoustic properties of the rooms are
`summarized in Table I. The reverberation time was estimated
`by observing the decay of a speech-shaped noise signal that
`was allowed to reach steady state in the room and was then
`switched off. The test signal was output by the speech loud-
`speaker in the room and the response was measured at the
`floor microphone array position. Since the ambient noise lev-
`els in the rooms did not permit an accurate measurement of
`the entire 60-dB decay of the test signal, the time to reach a
`level 20 dB below the steady-state level was measured and
`then tripled to give the indicated 60-dB reverberation time.
`The calculated quantities were computed from the physical
measurements and reverberation time using the room acoustics formulae given by
Beranek (1954).
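The extrapolation from a 20-dB decay to the 60-dB reverberation time can be sketched as below. This is an illustration only, assuming a decay curve already expressed in dB relative to the steady-state level and beginning at the instant the source is switched off; the function name and the input format are hypothetical.

```python
import numpy as np

def t60_from_20db_decay(level_db, fs_hz):
    """Estimate T60 (seconds) as three times the time taken for the level
    to fall 20 dB below steady state.

    level_db : decay curve in dB re: steady-state level, sample 0 = switch-off
    fs_hz    : sampling rate of the decay curve in Hz
    """
    level_db = np.asarray(level_db, dtype=float)
    below = np.nonzero(level_db <= -20.0)[0]      # samples at or below -20 dB
    if below.size == 0:
        raise ValueError("decay curve never reaches -20 dB")
    t20 = below[0] / fs_hz                        # time to reach -20 dB
    return 3.0 * t20                              # assumes a linear decay in dB
```

Tripling the 20-dB time is exact only if the decay is linear in dB over the full 60-dB range.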
`
`B. Array processing
`All of the array processing was implemented using a
`block frequency-domain approach as shown in Fig. 3. A
`frequency-domain implementation of the adaptive processing
`
TABLE I. Acoustic properties of the rooms used for the processing evaluation.

Property                      Office    Conference room

Measured:
  Length, m                   5.1       10.7
  Width, m                    4.5       6.2
  Height, m                   2.8       2.8
  Volume, m^3                 60        185
  Reverb. time T60, ms        250       600
Calculated:
  Ave. absorption coef.       0.332     0.197
  Mean free path, m           2.50      3.26
  Critical distance, m        0.97      1.05
  Speech direct/reverb., dB   -6.3      -10.5
  Noise direct/reverb., dB    -0.2      -5.0
`
`FIG. 3. Block diagram for the frequency-domain array processing.
`
generally offers faster convergence than a time-domain version due to the reduced
eigenvalue spread in the correlation matrices (Narayan et al., 1983). A block
frequency-domain implementation was chosen to reduce the computational burden
(Mansour and Gray, 1982), and a time-domain constraint was added to the weight
computation to ensure a causal adaptive filter (Clark et al., 1983). To implement the
equivalent of an L-tap time-domain filter, a 2L-sample block of data is acquired from
each microphone. A fast Fourier transform (FFT) of size 2L is performed on each
2L-sample data buffer, after which the weights are computed independently for each
positive FFT frequency bin. The frequency-domain signal is multiplied by the weights,
summed across microphones at each frequency, and a 2L-point inverse FFT returns the
weighted signal to the time domain. An overlap-save implementation (Clark et al.,
1983) was used, with the buffer contents and weights updated every L input samples.
Relatively short adaptive filters, varying in length from L = 8 to L = 32 samples,
were used in the experiments since work on adaptive microphone arrays (Sondhi and
Elko, 1986) has indicated that a short filter offers better immunity to deleterious
reflection effects than does a long filter.
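The block structure described above (2L-sample buffers, 2L-point FFTs, weighting and summing across microphones, and an update every L samples) can be sketched for fixed weights as follows. This is a simplified illustration rather than the processing used in the experiments: the weights are obtained here by transforming given L-tap filters, no adaptation is performed, and the function name is hypothetical.

```python
import numpy as np

def overlap_save_filter_and_sum(x, h):
    """Filter-and-sum beamformer using 2L-point overlap-save blocks.

    x : (M, N) array of microphone signals
    h : (M, L) array of L-tap time-domain filters, one per microphone
    Returns the length-N summed output.
    """
    M, N = x.shape
    L = h.shape[1]
    W = np.fft.rfft(h, n=2 * L, axis=1)        # weights at the non-negative FFT bins
    n_blocks = int(np.ceil(N / L))
    xp = np.zeros((M, (n_blocks + 1) * L))     # L zeros of history, then the data
    xp[:, L:L + N] = x
    y = np.zeros(n_blocks * L)
    for b in range(n_blocks):
        block = xp[:, b * L:b * L + 2 * L]     # 2L-sample buffer, advanced by L
        X = np.fft.rfft(block, axis=1)
        Y = np.sum(W * X, axis=0)              # weight each bin, sum across microphones
        yb = np.fft.irfft(Y, n=2 * L)
        y[b * L:(b + 1) * L] = yb[L:]          # keep the L samples free of wraparound
    return y[:N]
```

Discarding the first L output samples of each block removes the circular-convolution wraparound, so the result matches a direct time-domain filter-and-sum.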
`The weight vectors for the different processing ap-
`proaches can all be expressed using the same basic equation
(Cox, 1973; Monzingo and Miller, 1980). The set of microphone
weights in each FFT frequency bin is chosen to optimize
the array output SNR subject to a constraint that a
`signal from the end-fire direction be passed with unit gain.
`The processing strategies differ primarily in the description
`of the noise field. The equation for the steady-state weights
`for all of the processing approaches is given by
$$ w(k) = \frac{R^{-1}(k)\, d(k)}{d^{*}(k)\, R^{-1}(k)\, d(k)}, \tag{1} $$

where d(k) is the steering vector (vector giving the phase shift from one microphone
to the next as a wave arriving from 0 deg propagates across the array) for FFT
frequency index k, R(k) is the noise correlation matrix, and the asterisk denotes the
conjugate transpose of the vector.
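Equation (1) can be evaluated for one frequency bin as in the sketch below. The function name is hypothetical, and R is assumed to be Hermitian and positive definite so that the linear solve is well posed.

```python
import numpy as np

def constrained_weights(R, d):
    """Eq. (1): w = R^{-1} d / (d^H R^{-1} d) for a single FFT bin.

    R : (M, M) Hermitian positive-definite noise correlation matrix
    d : (M,) steering vector for the end-fire look direction
    """
    Rinv_d = np.linalg.solve(R, d)           # R^{-1} d without forming the inverse
    return Rinv_d / (d.conj() @ Rinv_d)      # normalize for unit gain toward d
```

By construction the weights satisfy w*(k)d(k) = 1, and substituting the identity matrix for R gives d/M when every element of d has unit magnitude.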
`
`
`Classical delay-and-sum beamforming is based on the
`assumption that the dominant source of noise is the self-
`noise of the microphones and not the ambient noise field.
This assumption leads to the system noise correlation matrix being the identity
matrix, that is, R(k) = I(k), since the assumed Gaussian noise has equal intensity and
is completely independent at each microphone (Cox, 1973). The solution of Eq. (1) for
this form of assumed interference reduces to
w(k) = d(k)/M, where M is the number of microphones in
`the array. The weight vector is independent in each FFT
`frequency bin. The oversteered weight vector is similar to
`that for delay-and-sum beamforming, but uses a modified
`steering vector having time delays multiplied by a scale fac-
`tor greater than one. The oversteered delay factor was imple-
`mented by approximating the phase response of a cascade of
`analog one-zero/one-pole all-pass networks, with the group
`delay chosen to double the normal propagation time between
`the microphones at low frequencies and to reduce to the nor-
`mal propagation time at 5 kHz. This degree of oversteering
`gave a minimum white noise gain of 0 dB.
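The delay-and-sum and oversteered weights can be illustrated with the sketch below. Several values are assumptions rather than details from the paper: a sound speed of 343 m/s, a 2.5-cm spacing (a 10-cm array of five uniformly spaced microphones), and a single frequency-independent oversteering factor in place of the frequency-dependent all-pass delay described above. All function names are hypothetical.

```python
import numpy as np

def endfire_steering(freq_hz, n_mics=5, spacing_m=0.025, c=343.0, oversteer=1.0):
    """Phase shifts across a uniform line array for a plane wave from end-fire.
    oversteer > 1 scales the inter-microphone delays (assumed constant factor)."""
    delays = oversteer * np.arange(n_mics) * spacing_m / c   # seconds
    return np.exp(-2j * np.pi * freq_hz * delays)

def delay_and_sum(d):
    """Eq. (1) with R = I reduces to w = d / M."""
    return d / len(d)

def white_noise_gain_db(w, d):
    """Gain against spatially uncorrelated sensor noise, in dB."""
    return 10.0 * np.log10(abs(w.conj() @ d) ** 2 / np.real(w.conj() @ w))
```

For delay-and-sum weights the white noise gain equals 10 log10(M), about 7 dB for five microphones; oversteering trades some of this margin for directivity.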
`The weights for the superdirective and adaptive algo-
`rithms are also similar in form, but use correlation matrices
`that optimize the array performance for the ambient noise
`field rather than for the sensor self-noise. The superdirective
processing is based on the correlation matrix R(k) calculated a priori for an assumed
ideal spherically isotropic noise field (Cheng, 1971), while the adaptive system uses
for R(k) the signal-plus-noise correlation matrix estimated directly from the
incoming microphone signals (Cox et al., 1987). The superdirective processing
therefore determines the array weights based on assumed noise-field characteristics,
while
`the adaptive processing determines the array weights in re-
`sponse to the actual noise field found in the room.
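A sketch of the fixed superdirective design follows. It uses the standard result that the coherence between two omnidirectional sensors a distance r apart in a spherically isotropic field is sin(2πfr/c)/(2πfr/c). The sound speed, the 2.5-cm spacing, and the small diagonal loading (a stand-in for the sensitivity constraint, not the scaled projection procedure used in the paper) are assumptions, as are the function names.

```python
import numpy as np

def isotropic_noise_matrix(freq_hz, n_mics=5, spacing_m=0.025, c=343.0):
    """Correlation matrix for a spherically isotropic noise field."""
    pos = np.arange(n_mics) * spacing_m
    r = np.abs(pos[:, None] - pos[None, :])     # pairwise microphone distances
    return np.sinc(2.0 * freq_hz * r / c)       # np.sinc(x) = sin(pi x)/(pi x)

def endfire_steering(freq_hz, n_mics=5, spacing_m=0.025, c=343.0):
    delays = np.arange(n_mics) * spacing_m / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def superdirective_weights(freq_hz, loading=1e-2):
    """Eq. (1) with the isotropic matrix plus an assumed diagonal loading."""
    Q = isotropic_noise_matrix(freq_hz)
    d = endfire_steering(freq_hz)
    v = np.linalg.solve(Q + loading * np.eye(len(d)), d)
    return v / (d.conj() @ v)

def directivity_gain(w, freq_hz):
    """Array gain of weights w against the isotropic field."""
    Q = isotropic_noise_matrix(freq_hz)
    d = endfire_steering(freq_hz)
    return abs(w.conj() @ d) ** 2 / np.real(w.conj() @ Q @ w)
```

With any non-negative loading, the resulting gain against the isotropic field is at least that of delay-and-sum weights at the same frequency.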
`The adaptive processing was implemented using the
scaled projection algorithm of Cox et al. (1987). This algorithm imposes a constraint
on the magnitude of the weights so that

$$ w^{*}(k)\, w(k) \le 1/\delta^{2}(k), \tag{2} $$
`
`which has been shown to minimize the amount of signal
`cancellation that will occur under perturbed wavefront con-
`ditions. The constraint is equivalent to adding a constant to
`the elements of the main diagonal of the system correlation
`matrix R(k), with the result that the array response ap-
`proaches that of delay-and-sum processing when tightly con-
`strained. Because of the frequency-domain implementation,
`the weight constraint can easily be made frequency-
`dependent. At low frequencies, where the array is shortest
`with respect to the acoustic wavelength and thus has the
`poorest directivity, the constraint can be adjusted to allow a
`higher degree of directionality in the array response. Con-
`versely, at high frequencies, where delay-and-sum beam-
`forming can give adequate amounts of array gain, the con-
`straint can be tightened to guarantee that no signal
`cancellation will occur. The weight constraint was thus set to
`
$$ \delta^{2}(k) = \begin{cases} 0\ \text{dB re: 1}, & f < 1\ \text{kHz}, \\ f - 1\ \text{dB re: 1}, & f > 1\ \text{kHz}. \end{cases} \tag{3} $$
`
`The correlation matrix was computed separately at each FFT
`analysis frequency, and each matrix was smoothed by a low-
`pass filter having a time constant of 500 ms. The weight
`adaptation was independent at each of the FFT bin frequen-
`cies, with convergence for the equivalent of a 16-tap filter
`taking about 200 ms. The algorithm was allowed to adapt for
`2 s to ensure full convergence prior to computing the perfor-
`mance metrics.
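The equivalence between the weight-norm constraint of Eq. (2) and diagonal loading can be illustrated by solving Eq. (1) directly and searching for the smallest loading that satisfies the constraint. This is not the iterative scaled projection algorithm of Cox et al. (1987); it is a sketch of the equivalent steady-state solution, with hypothetical function names, and it assumes delta2 does not exceed the number of microphones so that the constraint is feasible.

```python
import numpy as np

def loaded_weights(R, d, loading):
    """Eq. (1) with a constant added to the main diagonal of R."""
    v = np.linalg.solve(R + loading * np.eye(len(d)), d)
    return v / (d.conj() @ v)

def norm_constrained_weights(R, d, delta2, max_loading=1e6, iters=60):
    """Smallest diagonal loading (by bisection) giving w^H w <= 1/delta2.

    delta2 is a linear power ratio (0 dB re: 1 corresponds to delta2 = 1).
    """
    limit = 1.0 / delta2
    norm2 = lambda w: np.real(w.conj() @ w)
    w = loaded_weights(R, d, 0.0)
    if norm2(w) <= limit:                 # constraint already met without loading
        return w
    lo, hi = 0.0, max_loading             # norm decreases as the loading grows
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if norm2(loaded_weights(R, d, mid)) <= limit:
            hi = mid
        else:
            lo = mid
    return loaded_weights(R, d, hi)
```

As the loading grows the solution tends to d/M, which is the delay-and-sum limit noted in the text for a tightly constrained array.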
`The superdirective processing used a variant on the
`scaled projection algorithm to produce the array weights.
`The scaled projection algorithm uses the signal-plus-noise
`correlation matrix in computing the set of array weights that
`minimizes the array output power; the processing is adaptive
because the weights are computed iteratively using a correlation matrix that can
change over time. To produce the superdirective weights, the correlation matrix for
the ideal spherically isotropic noise field, computed a priori from the array
geometry (Cheng, 1971) and unvarying in time, was substituted for the matrix measured
from the input signal.
`The weight calculation was then iterated until convergence
`was reached, and the converged weights were used for the
`superdirective array performance measurements. The super-
`directive weights used the same scaled projection constraint
`on the magnitude of the weight vector as was used for the
`adaptive processing in order to prevent any potential signal
cancellation caused by system misalignment (Cox et al., 1986).
For the composite structured correlation matrix, the scaled projection algorithm
framework was again used, but with a modified correlation matrix. The values of the
measured signal correlation matrix were first averaged along each diagonal to give a
Toeplitz structure (Godara, 1991);
`this matrix was then combined with the correlation matrix
`calculated for the spherically diffuse noise field, with the
`proportion of the diffuse noise-field matrix increasing with
`increasing input SNR.
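The diagonal averaging and the blending with the diffuse-field matrix can be sketched as follows. The paper does not give the rule relating the blend proportion to the estimated input SNR, so the `mix` parameter below is a hypothetical placeholder supplied by the caller; the function names are likewise illustrative.

```python
import numpy as np

def toeplitz_average(R):
    """Replace every diagonal of R by its mean value, giving a Toeplitz matrix."""
    M = R.shape[0]
    T = np.zeros_like(R)
    for k in range(-M + 1, M):
        mean_k = np.mean(np.diagonal(R, offset=k))   # average along the k-th diagonal
        T += mean_k * np.eye(M, k=k)                 # write the mean back on that diagonal
    return T

def composite_matrix(R_measured, R_diffuse, mix):
    """Blend the structured measured matrix with the diffuse-field matrix.
    mix = 0 uses only the structured matrix; mix = 1 uses only the diffuse
    matrix. How mix depends on input SNR is an assumption left to the caller."""
    return (1.0 - mix) * toeplitz_average(R_measured) + mix * R_diffuse
```

A matrix that is already Toeplitz is left unchanged by the averaging step.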
`
`C. Performance metric
The performance metric used in this paper is the articulation index (AI) weighted
array gain, which is similar to intelligibility-weighted gain (Greenberg et al.,
1993). Experimental results have shown a strong correlation between the array gain
and the improvement in speech intelligibility (Soede et al., 1993b), and an even
better correlation between the weighted array gain and intelligibility (Hoffman
et al., 1994). The AI-weighted array gain is calculated from the array gain computed
at each frequency of the transformed data, and the array gains are combined using
weights for each frequency band derived from the articulation index importance
function given by Kryter (1962a).
The array gain (Cox et al., 1987) for the kth FFT bin is given by
`
$$ G(k) = \frac{|w^{*}(k)\, d(k)|^{2}}{w^{*}(k)\, Q(k)\, w(k)}, \tag{4} $$

where Q(k) is the noise-alone correlation matrix normalized so that Tr[Q(k)] = M, the
number of microphones in the array. The array gain depends on the array weights and on
`
`
TABLE II. AI-weighted array gain in dB for the single noise source in the office. The data are presented as a
function of the microphone array position and the filter length L, and are averaged over source location.

Position    Delay    Over-     Optimal          Scaled projection,       Composite struct.,
and         and      steered   superdirective   input SNR (dB)           input SNR (dB)
length      sum                                 -10     0       +10      -10     0       +10

Floor
  L = 8     5.5      7.6       8.9              10.6    10.3    9.7      9.0     9.5     9.9
  L = 16    5.3      7.5       9.5              11.4    10.9    9.9      9.7     9.8     10.0
  L = 32    5.1      7.3       9.3              11.1    10.6    9.4      9.1     9.2     9.4
Desk
  L = 8     6.0      8.2       10.1             11.8    11.5    11.1     10.5    10.5    10.2
  L = 16    5.6      7.9       10.6             12.4    11.8    10.7     10.7    10.9    10.9
  L = 32    5.3      7.6       10.1             11.8    11.1    9.8      9.9     10.0    10.0
KEMAR
  L = 8     5.3      7.3       8.1              10.0    9.5     8.4      8.2     8.7     8.9
  L = 16    5.0      7.0       8.6              10.7    10.2    8.5      8.5     8.8     9.0
  L = 32    4.8      6.8       8.4              10.6    9.9     7.8      8.1     8.3     8.4
`
`the spatial distribution of the noise, but is independent of the
`actual signal and noise powers. An array consisting of a
`single omnidirectional microphone has an array gain of 1.
`The estimated noise-alone correlation matrix used in the
`array-gain calculation was smoothed using a low-pass filter
`having a time constant of 500 ms. Since both the speech and
noise were measured in reverberant rooms, this metric gives
the ratio of the power in the direct portion of the speech
signal to the total direct-plus-reverberant noise power at the
`array output, normalized by the SNR at the array input. This
`measure thus represents the directional gain of the array in
`the noise field. It is also possible, under conditions of a per-
`turbed signal wavefront, for the array gain to appear to be
`favorable even though signal cancellation is occurring. The
`output signal power in each FFT frequency bin was moni-
`tored as a check for this condition, and no measurable signal
`cancellation was observed.
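The array gain of Eq. (4), including the trace normalization of the noise-alone matrix, can be computed per bin as in this sketch; the function names are hypothetical.

```python
import numpy as np

def normalize_trace(Q):
    """Scale Q so that Tr[Q] = M, the number of microphones."""
    M = Q.shape[0]
    return Q * (M / np.real(np.trace(Q)))

def array_gain(w, d, Q):
    """Eq. (4): G = |w^H d|^2 / (w^H Q w), with Q trace-normalized."""
    num = abs(w.conj() @ d) ** 2
    den = np.real(w.conj() @ normalize_trace(Q) @ w)
    return num / den
```

Because Q is normalized, the result does not depend on the absolute noise power: a single microphone gives a gain of 1, and delay-and-sum weights in spatially uncorrelated noise give a gain of M.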
`The AI-weighted array gain is then given by
$$ G_{AI} = \sum_{k=0}^{K} a(k)\,[10 \log_{10} G(k)]\ \text{dB}, \tag{5} $$

where the set of weights {a(k)} is the AI importance function weights given by Kryter
(1962a) reinterpolated for the FFT band edges. Spread-of-masking effects are ignored
in this metric. The AI-weighted array gain G_AI is expressed in
`dB re: the array gain for a single omnidirectional micro-
`phone.
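Equation (5) is a weighted sum of the per-bin gains in dB, as in the sketch below. The Kryter importance weights are not reproduced here; the weights passed in are placeholders, and the sketch assumes they have already been interpolated to the FFT bins and sum to one. The function name is hypothetical.

```python
import numpy as np

def ai_weighted_gain_db(G, a):
    """Eq. (5): sum over bins of a(k) * 10 log10 G(k).

    G : per-bin array gains (linear power ratios)
    a : per-bin importance weights (assumed to sum to 1)
    """
    G = np.asarray(G, dtype=float)
    a = np.asarray(a, dtype=float)
    return float(np.sum(a * 10.0 * np.log10(G)))
```

Note that the averaging is done on the dB values, so a bin with a large gain cannot fully offset a heavily weighted bin with a small one.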
`The array gain given in Eq. 共4兲 differs from the ratio of
`array output SNR to input SNR used by other authors
(Greenberg and Zurek, 1992; Hoffman et al., 1994) as the
`basis of the performance metric. The array output SNR is the
`ratio of the total speech power to the total noise power at the
`output of the array. The speech and noise powers both in-
`clude the reverberated as well as the direct components. The
`array output SNR is given by
$$ \mathrm{SNR}(k) = \frac{w^{*}(k)\, S(k)\, w(k)}{w^{*}(k)\, N(k)\, w(k)}, \tag{6} $$

where S(k) is the speech-alone correlation matrix and N(k) is the noise-alone
correlation matrix. The processing benefit
`
`using this metric would then be calculated as the ratio of the
`array output SNR to the array input SNR, converted to dB
`and summed over frequency using the AI weights.
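For comparison, the SNR-based benefit in a single bin can be sketched as below. The definition of the array input SNR as the ratio of the traces of S(k) and N(k), that is, powers averaged over microphones, is an assumption made for this illustration, and the function names are hypothetical.

```python
import numpy as np

def output_snr(w, S, N):
    """Eq. (6): ratio of speech to noise power at the array output."""
    return np.real(w.conj() @ S @ w) / np.real(w.conj() @ N @ w)

def snr_improvement_db(w, S, N):
    """Output SNR relative to input SNR, in dB, for one FFT bin."""
    input_snr = np.real(np.trace(S)) / np.real(np.trace(N))  # assumed input definition
    return 10.0 * np.log10(output_snr(w, S, N) / input_snr)
```

When the speech is a single coherent wavefront, S(k) is proportional to d(k)d*(k), and this quantity coincides with the array gain of Eq. (4); the two metrics differ once S(k) includes reverberated speech.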
`The preference for array gain versus the ratio of output
`to input SNR as the basis of the performance metric depends
`on the assumptions made about the effects of reverberation
`on speech intelligibility. The SNR-based metric assumes that
`all speech power, reverberated as well as direct, contributes
`equally to speech intelligibility. Experiments in speech intel-
`ligibility in reverberation, however, indicate that reverbera-
`tion times typical of the rooms used in this paper lead to
reduced speech intelligibility, and the longer the reverberation time the greater the
reduction in intelligibility (Moncur and Dirks, 1967; Houtgast and Steeneken, 1972).
This reduction of speech intelligibility with increasing reverberation time applies
to hearing-impaired as well as to normal-hearing subjects (Duquesnoy and Plomp, 1980;
Nábělek, 1982; Nábělek, 1988). The effects of reverberation on speech intelligibility
have been accurately modeled by the speech transmission index (STI), based on the
modulation transfer function within the room for speech envelope modulation
frequencies (Houtgast and Steeneken, 1973; Steeneken and Houtgast, 1980). Even small
amounts of reverberation within
`the room will reduce the envelope modulation depth and will
`therefore reduce the speech intelligibility predicted by the
`STI. These results indicate that the effects of reverberation
`are similar to those of noise in reducing speech intelligibility
`in rooms. Thus the array gain, by excluding the reverberated
`components in the estimated speech power, may lead to a
`more valid estimate of the array benefit for speech intelligi-
`bility in rooms than an estimate that assumes that all of the
`reverberated speech is beneficial.
`
`II. RESULTS
`The data from the experiment are presented in Tables
`II–V. Five array processing approaches were considered in
`the experiment. The three fixed-coefficient approaches of
`delay-and-sum beamforming, oversteered delay-and-sum
`beamforming, and optimal superdirective processing were
`tested along with the two adaptive approaches based on the
`
`
TABLE III. AI-weighted array gain in dB for the combined noise source in the office. The data are presented
as a function of the microphone array position and the filter length L.

Position    Delay    Over-     Optimal          Scaled projection,       Composite struct.,
and         and      steered   superdirective   input SNR (dB)           input SNR (dB)
length      sum                                 -10     0       +10      -10     0       +10

Floor
  L = 8     5.3      7.5       8.6              10.0    9.7     9.3      8.4     9.4     9.7
  L = 16    5.0      7.5       9.5              10.3    10.0    9.3      8.8     9.1     9.5
  L = 32    4.6      6.9       8.5              9.8     9.4     8.5      8.3     8.5     8.7
Desk
  L = 8     5.8      8.2       10.0             11.4    11.2    10.9     10.2    10.2    10.0
  L = 16    5.4      7.8       10.3             11.6    11.1    10.3     10.2    10.5    10.5
  L = 32    4.9      7.4       9.6              10.6    10.0    9.1      9.2     9.4     9.4
KEMAR
  L = 8     5.1      7.1       7.9              9.5     9.0     8.2      8.3     8.8     8.9
  L = 16    4.8      6.7       8.3              9.6     9.1     8.0      8.6     8.4     8.6
  L = 32    4.5      6.4       7.9              9.2     8.6     7.2      7.7     7.8     8.0
`
`scaled projection algorithm and the scaled projection algo-
`rithm combined with the composite structured correlation
`matrix. The input SNR was varied for the adaptive process-
`ing approaches but not for the fixed-coefficient approaches.
`The difference in treating the SNR is a direct result of the
`performance metric based on the array gain. The array gain,
given by Eq. (4), depends only on the array weights and the
`normalized noise-only correlation matrix. The performance
`of the fixed-coefficient systems is independent of the array
`input SNR since neither the fixed coefficients nor the nor-
`malized noise-only correlation matrix will change with
`changes in the incoming signal or noise levels. The SNR
`affects the adaptive systems, however, through the array
`weights; the adaptive array weights change in response to
`changes in the relative levels of the signal and noise embed-
`ded in the measured signal-plus-noise correlation matrix
`used by the adaptive weight algorithm. The speech and noise
power levels used to compute the array input SNR were determined from the respective
steady-state sound fields at the microphone array, and thus include both the direct
and reverberant field contributions.
`Three filter lengths were used for each processing ap-
`proach, these being equivalent to time-domain filters having
`8, 16, or 32 samples duration. The microphone position in-
`cludes the three placements of floor stand, desk stand, and
`above the left ear of KEMAR. Two rooms, the office and the
`conference room, were used for the measurements, and the
`noise source was either the average of the individual loud-
`speaker results or the result for the five-loudspeaker combi-
`nation.
`Each entry in Tables II and IV is the AI-weighted array
`gain in dB re: a single omnidirectional microphone, averaged
`over the set of five noise loudspeaker azimuths. The data in
`Table II is for the office, and the data in Table IV is for the
`conference room. The entries in Tables III and V were pro-
`duced by combining the babble signals from the five separate
`
TABLE IV. AI-weighted array gain in dB for the single noise source in the conference room. The data are
presented as a function of the microphone array position and the filter length L, and are averaged over the
source location.

Position    Delay    Over-     Optimal          Scaled projection,       Composite struct.,
and         and      steered   superdirective   input SNR (dB)           input SNR (dB)
length      sum                                 -10     0       +10      -10     0       +10

Floor
  L = 8     5.8      7.8       9.1              11.0    10.9    10.5     9.3     9.4     9.4
  L = 16    5.4      7.5       9.6              11.3    11.1    10.6     9.6     9.7     9.7
  L = 32    5.0      7.1       9.1              10.7    10.5    9.8      8.9     8.9     8.9
Desk
  L = 8     6.2      8.3       10.2             12.1    11.9    11.5     10.0    10.3    10.4
  L = 16    5.9      8.0       10.8             12.8    12.4    11.6     10.5    10.6    10.6
  L = 32    5.5      7.6       10.3             12.0    11.7    10.7     9.7     9.9     10.2
KEMAR
  L = 8     5.0      7.0       7.2              9.4     9.3     8.7      7.5     7.7     7.7
  L = 16    4.8      6.8       7.6              10.0    9.8     8.9      7.8     7.9     7.9
  L = 32    4.7      6.6       7.5              10.1    9.8     8.6      7.5     7.4     7.4
`
TABLE V. AI-weighted array gain in dB for the combined noise source in the conference room. The data are
presented as a function of the microphone array position and the filter length L.

Position and   Delay     Oversteered   Optimal          Scaled projection,      Composite struct.,
length         and sum                 superdirective   input SNR (dB)          input SNR (dB)
                                                        -10     0       +10     -10     0       +10
Floor
  L=8          5.4       7.5           8.7              10.4    10.3    10.1    8.5     8.7     8.9
  L=16         5.0       7.1           9.0              10.4    10.3    10.0    9.1     9.1     9.1
  L=32         4.4       6.7           8.4              9.6     9.5     9.1     8.1     8.1     8.1
Desk
  L=8          6.0       8.0           9.9              11.5    11.5    11.2    9.4     9.7     10.3
  L=16         5.6       7.7           10.1             11.8    11.6    11.2    10.0    9.9     9.9
  L=32         5.2       7.4           9.7              11.1    10.8    10.2    9.2     9.4     9.5
KEMAR
  L=8          4.8       6.9           7.2              9.0     9.0     8.6     6.9     7.0     7.3
  L=16         4.6       6.8           7.6              9.2     9.2     8.7     7.6     7.7     7.8
  L=32         4.4       6.6           7.5              9.1     9.0     8.2     7.2     7.2     7.3
`
loudspeaker azimuths into a single normalized data file representing a
diffuse source of interference of the sort that could occur in a
meeting room or restau
