`Amazon v. Jawbone
`U.S. Patent 11,122,357
`Amazon Ex. 1003
`
`Springer
`Berlin
`Heidelberg
`New York
`Barcelona
`Hongkong
`London
`Milan
`Paris
`Singapore
`Tokyo
`
`ONLINE LIBRARY
`
`http://www.springer.de/engine/
`
`Series Editors
`Prof. Dr.-Ing. ARILD LACROIX
`Johann-Wolfgang-Goethe-Universität
`Institut für angewandte Physik
`Robert-Mayer-Str. 2-4
`D-60325 Frankfurt
`
`Prof. Dr.-Ing.
`ANASTASIOS VENETSANOPOULOS
`University of Toronto
`Dept. of Electrical and Computer Engineering
`10 King's College Road
`M5S 3G4 Toronto, Ontario
`Canada
`
`Editors
`Prof. MICHAEL BRANDSTEIN
`Harvard University,
`Div. of Eng. and Applied Sciences
`33 Oxford Street
`MA 02138 Cambridge
`USA
`e-mail: msb@hrl.harvard.edu
`Dr. DARREN WARD
`Imperial College, Dept. of Electrical Engineering
`Exhibition Road
`SW7 2AZ London
`GB
`e-mail: d.ward@ic.ac.uk
`
`ISBN 3-540-41953-5 Springer-Verlag Berlin Heidelberg New York
`
`CIP data applied for
`
`This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
`concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
`broadcasting, reproduction on microfilm or in other ways, and storage in data banks. Duplication of
`this publication or parts thereof is permitted only under the provisions of the German Copyright Law
`of September 9, 1965, in its current version, and permission for use must always be obtained from
`Springer-Verlag. Violations are liable for prosecution act under German Copyright Law.
`
`Springer-Verlag is a part of Springer Science+Business Media
`http://www.springer.de
`© Springer-Verlag Berlin Heidelberg New York 2001
`Printed in Germany
`
`The use of general descriptive names, registered names, trademarks, etc. in this publication does not
`imply, even in the absence of a specific statement, that such names are exempt from the relevant
`protective laws and regulations and therefore free for general use.
`
`Typesetting: Camera-ready copy by authors
`Cover-Design: de'blik, Berlin
`SPIN: 11559276
`62/31ll 5 4 3 2 I
`
`Printed on acid-free paper
`
`Preface
`
`The study and implementation of microphone arrays originated over 20 years
`ago. Thanks to the research and experimental developments pursued to the
`present day, the field has matured to the point that array-based technology
`now has immediate applicability to a number of current systems and a vast
`potential for the improvement of existing products and the creation of future
`devices.
`In putting this book together, our goal was to provide, for the first time,
`a single complete reference on microphone arrays. We invited the top
`researchers in the field to contribute articles addressing their specific
`topic(s) of study. The reception we received from our colleagues was quite
`enthusiastic and very encouraging. There was the general consensus that a work
`of this kind was well overdue. The results provided in this collection cover
`the current state of the art in microphone array research, development, and
`technological application.
`This text is organized into four sections which roughly follow the major
`areas of microphone array research today. Parts I and II are primarily
`theoretical in nature and emphasize the use of microphone arrays for speech
`enhancement and source localization, respectively. Part III presents a
`number of specific applications of array-based technology. Part IV addresses
`some open questions and explores the future of the field.
`Part I concerns the problem of enhancing the speech signal acquired by
`an array of microphones. For a variety of applications, including
`human-computer interaction and hands-free telephony, the goal is to allow
`users to roam unfettered in diverse environments while still providing a high
`quality speech signal and robustness against background noise, interfering
`sources, and reverberation effects. The use of microphone arrays gives one the
`opportunity to exploit the fact that the source of the desired speech signal
`and the noise sources are physically separated in space. Conventional array
`processing techniques, typically developed for applications such as radar and
`sonar, were initially applied to the hands-free speech acquisition problem.
`However, the environment in which microphone arrays are used is significantly
`different from that of conventional array applications. Firstly, the desired
`speech signal has an extremely wide bandwidth relative to its center
`frequency, meaning that conventional narrowband techniques are not suitable.
`Secondly, there
`is significant multipath interference caused by room reverberation. Finally,
`the speech source and noise signals may be located close to the array, meaning
`that the conventional far-field assumption is typically not valid. These
`differences (amongst others) have meant that new array techniques have had
`to be formulated for microphone array applications. Chapter 1 describes the
`design of an array whose spatial response does not change appreciably over
`a wide bandwidth. Such a design ensures that the spatial filtering performed
`by the array is uniform across the entire bandwidth of the speech signal. The
`main problem with many array designs is that a very large physical array is
`required to obtain reasonable spatial resolution, especially at low frequencies.
`This problem is addressed in Chapter 2, which reviews so-called
`superdirective arrays. These arrays are designed to achieve spatial directivity
`that is significantly higher than that of a standard delay-and-sum beamformer.
`Chapter 3 describes the use of a single-channel noise suppression filter on the
`output of a microphone array. The design of such a post-filter typically
`requires information about the correlation of the noise between different
`microphones. The spatial correlation functions for various directional
`microphones are investigated in Chapter 4, which also describes the use of
`these functions in adaptive noise cancellation applications. Chapter 5 reviews
`adaptive techniques for microphone arrays, focusing on algorithms that are
`robust and perform well in real environments. Chapter 6 presents optimal
`spatial filtering algorithms based on the generalized singular-value
`decomposition. These techniques require a large number of computations, so the
`chapter presents techniques to reduce the computational complexity and thereby
`permit real-time implementation. Chapter 7 advocates a new approach that
`combines explicit modeling of the speech signal (a technique which is
`well-known in single-channel speech enhancement applications) with the spatial
`filtering afforded by multi-channel array processing.
`Part II is devoted to the source localization problem. The ability to locate
`and track one or more speech sources is an essential requirement of
`microphone array systems. For speech enhancement applications, an accurate fix
`on the primary talker, as well as knowledge of any interfering talkers or
`coherent noise sources, is necessary to effectively steer the array, enhancing
`a given source while simultaneously attenuating those deemed undesirable.
`Location data may be used as a guide for discriminating individual speakers in
`a multi-source scenario. With this information available, it would then be
`possible to automatically focus upon and follow a given source on an extended
`basis. Of particular interest lately is the application of the speaker location
`estimates for aiming a camera or series of cameras in a video-conferencing
`system. In this regard, the automated localization information eliminates the
`need for a human or number of human camera operators. Several existing
`commercial products apply microphone-array technology in small-room
`environments to steer a robotic camera and frame active talkers. Chapter 8
`summarizes the various approaches which have been explored to accurately locate
`an individual in a practical acoustic environment. The emphasis is on precision
`in the face of adverse conditions, with an appropriate method presented in
`detail. Chapter 9 extends the problem to the case of multiple active sources.
`While again considering realistic environments, the issue is complicated by the
`presence of several talkers. Chapter 10 further generalizes the source
`localization scenario to include knowledge derived from non-acoustic sensor
`modalities. In this case both audio and video signals are effectively combined
`to track the motion of a talker.
`Part III of this text details some specific applications of microphone array
`technology available today. Microphone arrays have been deployed for a
`variety of practical applications thus far and their utility and presence in
`our daily lives are increasing rapidly. At one extreme are large aperture arrays
`with tens to hundreds of elements designed for large rooms, distant talkers,
`and adverse acoustic conditions. Examples include the two-dimensional, harmonic
`array installed in the main auditorium of Bell Laboratories, Murray Hill and the
`512-element Huge Microphone Array (HMA) developed at Brown University.
`While these systems provide tremendous functionality in the environments
`for which they are intended, small arrays consisting of just a handful
`(usually 2 to 8) of microphones and encompassing only a few centimeters of
`space have become far more common and affordable. These systems are intended
`for sound capture in close-talking, low to moderate noise conditions (such
`as an individual dictating at a workstation or using a hands-free telephone
`in an automobile) and have exhibited a degree of effectiveness, especially
`when compared to their single microphone counterparts. The technology has
`developed to the point that microphone arrays are now offered in off-the-shelf
`consumer electronic devices priced under $150. Because of their growing
`popularity and feasibility we have chosen to focus primarily on the
`issues associated with small-aperture devices. Chapter 11 addresses the
`incorporation of multiple microphones into hearing aid devices. The ability of
`beamforming methods to reduce background noise and interference has been
`shown to dramatically improve the speech understanding of the hearing
`impaired and to increase their overall satisfaction with the device. Chapter 12
`focuses on the case of a simple two-element array combined with postfiltering
`to achieve noise and echo reduction. The performance of this configuration
`is analyzed under realistic acoustic conditions and its utility is demonstrated
`for desktop conferencing and intercom applications. Chapter 13 is concerned
`with the problem of acoustic feedback inherent in full-duplex
`communications involving loudspeakers and microphones. Existing single-channel
`echo cancellation methods are integrated within a beamforming context to achieve
`enhanced echo suppression. These results are applied to single- and
`multi-channel conferencing scenarios. Chapter 14 explores the use of microphone
`arrays for sound capture in automobiles. The issues of noise, interference, and
`echo cancellation specifically within the car environment are addressed and a
`particularly effective approach is detailed. Chapter 15 discusses the
`application of microphone arrays to improve the performance of speech
`recognition systems in adverse conditions. Strategies for effectively coupling
`the acoustic signal enhancements afforded through beamforming with existing
`speech recognition techniques are presented. A specific adaptation of a
`recognizer to function with an array is presented. Finally, Chapter 16 presents
`an overview of the problem of separating blind mixtures of acoustic signals
`recorded at a microphone array. This represents a very new application for
`microphone arrays, and is a technique that is fundamentally different to the
`spatial filtering approaches detailed in earlier chapters.
`In the final section of the book, Part IV presents expert-summaries of
`current open problems in the field, as well as personal views of what the future
`of microphone array processing might hold. These summaries, presented in
`Chapters 17 and 18, describe both academically-oriented research problems,
`as well as industry-focused areas where microphone array research may be
`headed.
`The individual chapters that we selected for the book were designed to
`be tutorial in nature with a specific emphasis on recent important results.
`We hope the result is a text that will be of utility to a large audience, from
`the student or practicing engineer just approaching the field to the advanced
`researcher with multi-channel signal processing experience.
`
`Cambridge MA, USA
`London, UK
`January 2001
`
`Michael Brandstein
`Darren Ward
`
`Contents
`
`Part I. Speech Enhancement
`
`1 Constant Directivity Beamforming
`Darren B. Ward, Rodney A. Kennedy, Robert C. Williamson ..... 3
`1.1 Introduction ..... 3
`1.2 Problem Formulation ..... 6
`1.3 Theoretical Solution ..... 7
`1.3.1 Continuous sensor ..... 7
`1.3.2 Beam-shaping function ..... 8
`1.4 Practical Implementation ..... 9
`1.4.1 Dimension-reducing parameterization ..... 9
`1.4.2 Reference beam-shaping filter ..... 11
`1.4.3 Sensor placement ..... 12
`1.4.4 Summary of implementation ..... 12
`1.5 Examples ..... 13
`1.6 Conclusions ..... 16
`References ..... 16
`
`2 Superdirective Microphone Arrays
`Joerg Bitzer, K. Uwe Simmer ..... 19
`2.1 Introduction ..... 19
`2.2 Evaluation of Beamformers ..... 20
`2.2.1 Array-Gain ..... 21
`2.2.2 Beampattern ..... 22
`2.2.3 Directivity ..... 23
`2.2.4 Front-to-Back Ratio ..... 24
`2.2.5 White Noise Gain ..... 24
`2.3 Design of Superdirective Beamformers ..... 24
`2.3.1 Delay-and-Sum Beamformer ..... 26
`2.3.2 Design for spherical isotropic noise ..... 26
`2.3.3 Design for Cylindrical Isotropic Noise ..... 30
`2.3.4 Design for an Optimal Front-to-Back Ratio ..... 30
`2.3.5 Design for Measured Noise Fields ..... 32
`2.4 Extensions and Details ..... 33
`2.4.1 Alternative Form ..... 33
`2.4.2 Comparison with Gradient Microphones ..... 35
`2.5 Conclusion ..... 36
`References ..... 37
`
`3 Post-Filtering Techniques
`K. Uwe Simmer, Joerg Bitzer, Claude Marro ..... 39
`3.1 Introduction ..... 39
`3.2 Multi-channel Wiener Filtering in Subbands ..... 41
`3.2.1 Derivation of the Optimum Solution ..... 41
`3.2.2 Factorization of the Wiener Solution ..... 42
`3.2.3 Interpretation ..... 45
`3.3 Algorithms for Post-Filter Estimation ..... 46
`3.3.1 Analysis of Post-Filter Algorithms ..... 47
`3.3.2 Properties of Post-Filter Algorithms ..... 49
`3.3.3 A New Post-Filter Algorithm ..... 50
`3.4 Performance Evaluation ..... 51
`3.4.1 Simulation System ..... 52
`3.4.2 Objective Measures ..... 52
`3.4.3 Simulation Results ..... 54
`3.5 Conclusion ..... 57
`
`4 Spatial Coherence Functions for Differential Microphones in Isotropic Noise Fields
`Gary W. Elko ..... 61
`4.1 Introduction ..... 61
`4.2 Adaptive Noise Cancellation ..... 61
`4.3 Spherically Isotropic Coherence ..... 65
`4.4 Cylindrically Isotropic Fields ..... 73
`4.5 Conclusions ..... 77
`References ..... 84
`
`5 Robust Adaptive Beamforming
`Osamu Hoshuyama, Akihiko Sugiyama ..... 87
`5.1 Introduction ..... 87
`5.2 Adaptive Beamformers ..... 88
`5.3 Robustness Problem in the GJBF ..... 90
`5.4 Robust Adaptive Microphone Arrays - Solutions to Steering-Vector Errors ..... 92
`5.4.1 LAF-LAF Structure ..... 92
`5.4.2 CCAF-LAF Structure ..... 94
`5.4.3 CCAF-NCAF Structure ..... 95
`5.4.4 CCAF-NCAF Structure with an AMC ..... 97
`5.5 Software Evaluation of a Robust Adaptive Microphone Array ..... 99
`5.5.1 Simulated Anechoic Environment ..... 99
`5.5.2 Reverberant Environment ..... 101
`5.6 Hardware Evaluation of a Robust Adaptive Microphone Array ..... 104
`5.6.1 Implementation ..... 104
`5.6.2 Evaluation in a Real Environment ..... 104
`5.7 Conclusion ..... 106
`References ..... 106
`
`6 GSVD-Based Optimal Filtering for Multi-Microphone Speech Enhancement
`Simon Doclo, Marc Moonen ..... 111
`6.1 Introduction ..... 111
`6.2 GSVD-Based Optimal Filtering Technique ..... 113
`6.2.1 Optimal Filter Theory ..... 114
`6.2.2 General Class of Estimators ..... 116
`6.2.3 Symmetry Properties for Time-Series Filtering ..... 117
`6.3 Performance of GSVD-Based Optimal Filtering ..... 118
`6.3.1 Simulation Environment ..... 118
`6.3.2 Spatial Directivity Pattern ..... 119
`6.3.3 Noise Reduction Performance ..... 121
`6.3.4 Robustness Issues ..... 121
`6.4 Complexity Reduction ..... 122
`6.4.1 Linear Algebra Techniques for Computing GSVD ..... 122
`6.4.2 Recursive and Approximate GSVD-Updating Algorithms ..... 123
`6.4.3 Downsampling Techniques ..... 125
`6.4.4 Simulations ..... 125
`6.4.5 Computational Complexity ..... 126
`6.5 Combination with ANC Postprocessing Stage ..... 127
`6.5.1 Creation of Speech and Noise References ..... 127
`6.5.2 Noise Reduction Performance of ANC Postprocessing Stage ..... 128
`6.5.3 Comparison with Standard Beamforming Techniques ..... 129
`6.6 Conclusion ..... 129
`References ..... 130
`
`7 Explicit Speech Modeling for Microphone Array Speech Acquisition
`Michael Brandstein, Scott Griebel ..... 133
`7.1 Introduction ..... 133
`7.2 Model-Based Strategies ..... 136
`7.2.1 Example 1: A Frequency-Domain Model-Based Algorithm ..... 137
`7.2.2 Example 2: A Time-Domain Model-Based Algorithm ..... 140
`7.3 Conclusion ..... 148
`References ..... 151
`
`Part II. Source Localization
`
`8 Robust Localization in Reverberant Rooms
`Joseph H. DiBiase, Harvey F. Silverman, Michael S. Brandstein ..... 157
`8.1 Introduction ..... 157
`8.2 Source Localization Strategies ..... 158
`8.2.1 Steered-Beamformer-Based Locators ..... 159
`8.2.2 High-Resolution Spectral-Estimation-Based Locators ..... 160
`8.2.3 TDOA-Based Locators ..... 161
`8.3 A Robust Localization Algorithm ..... 164
`8.3.1 The Impulse Response Model ..... 164
`8.3.2 The GCC and PHAT Weighting Function ..... 166
`8.3.3 ML TDOA-Based Source Localization ..... 167
`8.3.4 SRP-Based Source Localization ..... 169
`8.3.5 The SRP-PHAT Algorithm ..... 170
`8.4 Experimental Comparison ..... 172
`References ..... 178
`
`9 Multi-Source Localization Strategies
`Elio D. Di Claudio, Raffaele Parisi ..... 181
`9.1 Introduction ..... 181
`9.2 Background ..... 184
`9.2.1 Array Signal Model ..... 184
`9.2.2 Incoherent Approach ..... 185
`9.2.3 Coherent Signal Subspace Method (CSSM) ..... 185
`9.2.4 Wideband Weighted Subspace Fitting (WB-WSF) ..... 186
`9.3 The Issue of Coherent Multipath in Array Processing ..... 187
`9.4 Implementation Issues ..... 188
`9.5 Linear Prediction-ROOT-MUSIC TDOA Estimation ..... 189
`9.5.1 Signal Pre-Whitening ..... 189
`9.5.2 An Approximate Model for Multiple Sources in Reverberant Environments ..... 191
`9.5.3 Robust TDOA Estimation via ROOT-MUSIC ..... 192
`9.5.4 Estimation of the Number of Relevant Reflections ..... 194
`9.5.5 Source Clustering ..... 195
`9.5.6 Experimental Results ..... 196
`References ..... 198
`
`10 Joint Audio-Video Signal Processing for Object Localization and Tracking
`Norbert Strobel, Sascha Spors, Rudolf Rabenstein ..... 203
`10.1 Introduction ..... 203
`10.2 Recursive State Estimation ..... 205
`10.2.1 Linear Kalman Filter ..... 206
`10.2.2 Extended Kalman Filter due to a Measurement Nonlinearity ..... 210
`10.2.3 Decentralized Kalman Filter ..... 212
`10.3 Implementation ..... 218
`10.3.1 System description ..... 218
`10.3.2 Results ..... 219
`10.4 Discussion and Conclusions ..... 221
`References ..... 222
`
`Part III. Applications
`
`11 Microphone-Array Hearing Aids
`Julie E. Greenberg, Patrick M. Zurek ..... 229
`11.1 Introduction ..... 229
`11.2 Implications for Design and Evaluation ..... 230
`11.2.1 Assumptions Regarding Sound Sources ..... 230
`11.2.2 Implementation Issues ..... 231
`11.2.3 Assessing Performance ..... 232
`11.3 Hearing Aids with Directional Microphones ..... 233
`11.4 Fixed-Beamforming Hearing Aids ..... 234
`11.5 Adaptive-Beamforming Hearing Aids ..... 235
`11.5.1 Generalized Sidelobe Canceler with Modifications ..... 236
`11.5.2 Scaled Projection Algorithm ..... 242
`11.5.3 Direction of Arrival Estimation ..... 243
`11.5.4 Other Adaptive Approaches and Devices ..... 243
`11.6 Physiologically-Motivated Algorithms ..... 244
`11.7 Beamformers with Binaural Outputs ..... 245
`11.8 Discussion ..... 246
`References ..... 249
`
`12 Small Microphone Arrays with Postfilters for Noise and Acoustic Echo Reduction
`Rainer Martin ..... 255
`12.1 Introduction ..... 255
`12.2 Coherence of Speech and Noise ..... 257
`12.2.1 The Magnitude Squared Coherence ..... 257
`12.2.2 The Reverberation Distance ..... 258
`12.2.3 Coherence of Noise and Speech in Reverberant Enclosures ..... 259
`12.3 Analysis of the Wiener Filter with Symmetric Input Signals ..... 263
`12.3.1 No Near End Speech ..... 265
`12.3.2 High Signal to Noise Ratio ..... 265
`12.4 A Noise Reduction Application ..... 266
`12.4.1 An Implementation Based on the NLMS Algorithm ..... 266
`12.4.2 Processing in the 800 - 3600 Hz Band ..... 268
`12.4.3 Processing in the 240 - 800 Hz Band ..... 269
`12.4.4 Evaluation ..... 269
`12.4.5 Alternative Implementations of the Coherence Based Postfilter ..... 271
`12.5 Combined Noise and Acoustic Echo Reduction ..... 271
`12.5.1 Experimental Results ..... 274
`12.6 Conclusions ..... 275
`References ..... 276
`
`13 Acoustic Echo Cancellation for Beamforming Microphone Arrays
`Walter L. Kellermann ..... 281
`13.1 Introduction ..... 281
`13.2 Acoustic Echo Cancellation ..... 282
`13.2.1 Adaptation algorithms ..... 284
`13.2.2 AEC for multi-channel sound reproduction ..... 287
`13.2.3 AEC for multi-channel acquisition ..... 287
`13.3 Beamforming ..... 288
`13.3.1 General structure ..... 288
`13.3.2 Time-invariant beamforming ..... 290
`13.3.3 Time-varying beamforming ..... 291
`13.3.4 Computational complexity ..... 292
`13.4 Generic structures for combining AEC with beamforming ..... 292
`13.4.1 Motivation ..... 292
`13.4.2 Basic options ..... 293
`13.4.3 'AEC first' ..... 293
`13.4.4 'Beamforming first' ..... 296
`13.5 Integration of AEC into time-varying beamforming ..... 297
`13.5.1 Cascading time-invariant and time-varying beamforming ..... 297
`13.5.2 AEC with GSC-type beamforming structures ..... 301
`13.6 Combined AEC and beamforming for multi-channel recording and multi-channel reproduction ..... 302
`13.7 Conclusions ..... 303
`References ..... 303
`
`14 Optimal and Adaptive Microphone Arrays for Speech Input in Automobiles
`Sven Nordholm, Ingvar Claesson, Nedelko Grbic ..... 307
`14.1 Introduction: Hands-Free Telephony in Cars ..... 307
`14.2 Optimum and Adaptive Beamforming ..... 309
`14.2.1 Common Signal Modeling ..... 309
`14.2.2 Constrained Minimum Variance Beamforming and the Generalized Sidelobe Canceler ..... 310
`14.2.3 In Situ Calibrated Microphone Array (ICMA) ..... 312
`14.2.4 Time-Domain Minimum-Mean-Square-Error Solution ..... 313
`14.2.5 Frequency-Domain Minimum-Mean-Square-Error Solution ..... 314
`14.2.6 Optimal Near-Field Signal-to-Noise plus Interference Beamformer ..... 316
`14.3 Subband Implementation of the Microphone Array ..... 317
`14.3.1 Description of LS-Subband Beamforming ..... 318
`14.4 Multi-Resolution Time-Frequency Adaptive Beamforming ..... 319
`14.4.1 Memory Saving and Improvements ..... 319
`14.5 Evaluation and Examples ..... 320
`14.5.1 Car Environment ..... 320
`14.5.2 Microphone Configurations ..... 321
`14.5.3 Performance Measures ..... 321
`14.5.4 Spectral Performance Measures ..... 322
`14.5.5 Evaluation on car data ..... 323
`14.5.6 Evaluation Results ..... 323
`14.6 Summary and Conclusions ..... 324
`References ..... 326
`
`15 Speech Recognition with Microphone Arrays
`Maurizio Omologo, Marco Matassoni, Piergiorgio Svaizer ..... 331
`15.1 Introduction ..... 331
`15.2 State of the Art ..... 332
`15.2.1 Automatic Speech Recognition ..... 332
`15.2.2 Robustness in ASR ..... 336
`15.2.3 Microphone Arrays and Related Processing for ASR ..... 337
`15.2.4 Distant-Talker Speech Recognition ..... 339
`15.3 A Microphone Array-Based ASR System ..... 342
`15.3.1 System Description ..... 342
`15.3.2 Speech Corpora and Task ..... 345
`15.3.3 Experiments and Results ..... 346
`15.4 Discussion and Future Trends .