`
`- i -
`
`Amazon v. Jawbone
`U.S. Patent 11,122,357
`Amazon Ex. 1004
`
`
`
`
`
IEEE TRANSACTIONS ON SIGNAL PROCESSING (ISSN 1053-587X) is published monthly by the Institute of Electrical and Electronics Engineers, Inc. Responsibility for the contents rests
upon the authors and not upon the IEEE, the Society/Council, or its members. IEEE Corporate Office: 3 Park Avenue, 17th Floor, New York, NY 10016-5997. IEEE Operations
Center: 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. NJ Telephone: +1 732 981 0060. Price/Publication Information: Individual copies: IEEE Members $10.00
(first copy only), nonmembers $20.00 per copy. (Note: Add $4.00 postage and handling charge to any order from $1.00 to $50.00, including prepaid orders.) Member and nonmember
subscription prices available upon request. Available in microfiche and microfilm. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries
are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance
Center, 222 Rosewood Drive, Danvers, MA 01923. For all other copying, reprint, or republication permission, write to Copyrights and Permissions Department, IEEE Publications
Administration, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. Copyright © 2001 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Periodicals Postage Paid at New York, NY and at additional mailing offices. Postmaster: Send address changes to IEEE TRANSACTIONS ON SIGNAL PROCESSING, IEEE, 445 Hoes
Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. GST Registration No. 125634188. Printed in U.S.A.
`
`
`
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 49, NO. 8, AUGUST 2001, p. 1614
`
`Signal Enhancement Using Beamforming and
`Nonstationarity with Applications to Speech
`
`Sharon Gannot, Student Member, IEEE, David Burshtein, Senior Member, IEEE, and Ehud Weinstein, Fellow, IEEE
`
Abstract—We consider a sensor array located in an enclosure, where arbitrary transfer functions (TFs) relate the source signal and the sensors. The array is used for enhancing a signal contaminated by interference. Constrained minimum power adaptive beamforming, which has been suggested by Frost, and, in particular, the generalized sidelobe canceler (GSC) version, which has been developed by Griffiths and Jim, are the most widely used beamforming techniques. These methods rely on the assumption that the received signals are simple delayed versions of the source signal. The good interference suppression attained under this assumption is severely impaired in complicated acoustic environments, where arbitrary TFs may be encountered. In this paper, we consider the arbitrary TF case. We propose a GSC solution, which is adapted to the general TF case. We derive a suboptimal algorithm that can be implemented by estimating the TF ratios instead of estimating the TFs. The TF ratios are estimated by exploiting the nonstationarity characteristics of the desired signal. The algorithm is applied to the problem of speech enhancement in a reverberating room. The discussion is supported by an experimental study using speech and noise signals recorded in an actual room acoustics environment.
`
Index Terms—Beamforming, nonstationarity, speech enhancement.
`
I. INTRODUCTION

SIGNAL quality might significantly deteriorate in the presence of interference, especially when the signal is also subject to reverberation. Multisensor-based enhancement algorithms typically incorporate both spatial and spectral information. Hence, they have the potential to improve on single-sensor solutions that utilize only spectral information. In particular, when the desired signal is speech, single-microphone solutions are known to be limited in their performance. Beamforming methods have therefore attracted a great deal of interest in the past three decades. Applications of beamforming to the speech enhancement problem have also emerged recently.
Constrained minimum power adaptive beamforming, which has been suggested by Frost [1], deals with the problem of a broadband signal received by an array, where pure delay relates each pair of source and sensor. Each sensor signal is processed by a tap delay line after applying a proper time delay
`
Manuscript received March 28, 2000; revised April 30, 2001. The associate editor coordinating the review of this paper and approving it for publication was Dr. Alex C. Kot.
S. Gannot is with the Department of Electrical Engineering (SISTA), Katholieke Universiteit Leuven, Leuven, Belgium (e-mail: Sharon.Gannot@esat.kuleuven.ac.be).
D. Burshtein and E. Weinstein are with the Department of Electrical Engineering—Systems, Tel-Aviv University, Tel-Aviv, Israel (e-mail: burstyn@eng.tau.ac.il, udi@eng.tau.ac.il).
Publisher Item Identifier S 1053-587X(01)05874-3.
`
compensation. The algorithm is capable of satisfying some desired frequency response in the look direction while minimizing the output noise power by using constrained minimization of the total output power. This minimization is realized by adjusting the taps of the filters under the desired constraint. Frost suggested a constrained LMS-type algorithm. Griffiths and Jim [2] reconsidered Frost's algorithm and introduced the generalized sidelobe canceler (GSC) solution. The GSC algorithm is comprised of three building blocks. The first is a fixed beamformer, which satisfies the desired constraint. The second is a blocking matrix, which produces noise-only reference signals by blocking the desired signal (e.g., by subtracting pairs of time-aligned signals). The third is an unconstrained LMS-type algorithm that attempts to cancel the noise in the fixed beamformer output. In [2], it is shown that Frost's algorithm can be viewed as a special case of the GSC. The main drawback of the GSC algorithm is its delay-only propagation assumption.
Van Veen and Buckley [3] summarized various methods for spatial filtering, including the GSC, and introduced a wider range of possible constraints on the beam pattern. Cox et al. [4] suggested constraining the norm of the adaptive canceler coefficients in order to solve the superdirectivity problem, i.e., its sensitivity to steering errors. In particular, they suggested updating Frost's (or the Griffiths and Jim) algorithm by applying a quadratic constraint on the norm of the noise canceler coefficients. This constraint, which can limit the superdirectivity, is added to the usual linear constraints.
Some authors have recently suggested using the GSC for speech enhancement in a reverberating environment. Hoshuyama et al. [5]-[7] used a three-block structure similar to the GSC. However, the blocking matrix has been modified to operate adaptively. In order to limit the leakage of the desired signal, which is responsible for distortion in the output signal, a quadratic constraint is imposed on the norm of the noise canceler coefficients. Alternatively, use of the leaky LMS algorithm has been suggested.
Nordholm et al. [8] used a GSC solution in which the blocking matrix is realized by spatial highpass filtering, thus yielding improved noise-only reference signals. Meyer and Sydow [9] suggested constructing the noise reference signals by steering the lobes of a multibeam beamformer toward the noise and desired signal directions separately. Widrow and Stearns [10] proposed a dual-structure beamformer. The master beamformer adapts its coefficients to minimize the output power while maintaining the beam pattern toward a predetermined pilot signal from the desired direction. Those coefficients are continuously copied to a slave beamformer that is used to enhance the speech signal. Dahl et al. [11] have extended this solution by proposing a dual
`1053-587X/01$10.00 © 2001 IEEE
`
GANNOT et al.: SIGNAL ENHANCEMENT USING BEAMFORMING AND NONSTATIONARITY
`
beamformer that attempts to cancel both noise and jammer signals (e.g., loudspeaker). The pilot signal is constructed by offline recordings of the jammer and desired signal in the actual acoustic environment during a calibration phase. Thus, both echo cancellation and noise suppression are achieved simultaneously.
Other solutions utilize a beamformer-type algorithm, followed by a postprocessor. Zelinski [12] suggested a Wiener filter, followed by further noise reduction in a postprocessing configuration. Meyer and Simmer [13] addressed the problem of high coherence between the microphone signals at low frequencies, as indicated by Dal-Degan and Prati [14]. They suggested the use of a spectral subtraction algorithm in the low-frequency band and Wiener filtering in the high-frequency band. Fischer and Kammeyer [15] suggested further splitting the microphone array into differentially equispaced subarrays. This structure has been further analyzed by Marro et al. [16]. Bitzer et al. [17] analyzed the performance of the GSC solution and showed its dependence on the noise field. They showed that the noise reduction might be infinitely large when the noise source is directional. However, in the more practical situation of a reverberant enclosure, when the noise field can be regarded as diffused, the performance degrades severely. Bitzer et al. [18] suggested a GSC with fixed Wiener filters in the noise canceling block and further postfilters at the GSC output. An improved performance in the lower frequency range is achieved. In [19], it is shown that the Wiener filters can be computed in advance by utilizing prior knowledge of the noise field.
Jan and Flanagan [20] suggested a matched filter beamforming (MFBF) instead of the conventional delay and sum beamformer (DSBF). The MFBF configuration realizes signal alignment by convolving the microphone signals with the (estimated) acoustic transfer function (TF). Rabinkin et al. [21] proved that the performance of MFBF is superior to DSBF, provided that the room acoustics TF is not too complicated. They have also suggested truncation of the estimated acoustic TFs to ensure reliable estimates.
Grenier et al. [22]-[29] have proposed GSC-based enhancement algorithms. In [29], the case where general TFs relate the source and microphones was considered. A subspace tracking solution [30] has been proposed. The resulting TFs are constrained to the array manifold under the assumption of an FIR model and small displacements of the talker. The fixed beamformer block of the GSC is realized using MFBF.
In this paper, we consider a sensor array located in an enclosure, where general TFs relate the source signal and the sensors. The array is used for enhancing a signal contaminated by interference. We propose a GSC solution, which is adapted to the general TF case. The TFs are estimated by exploiting the nonstationarity characteristics of the desired signal. The algorithm is applied to the problem of speech enhancement in a reverberating room. The discussion is supported by an experimental study using speech and noise signals recorded in an actual room acoustics environment. The outcome consists of the assessment of sound sonograms, signal-to-noise ratio (SNR) enhancement, and informal subjective listening tests. The paper is organized as follows. In Section II, we formulate the problem of beamforming in a general TF environment in the frequency domain. The constrained power minimization is presented in Section III, where both Frost's algorithm [1] and the Griffiths and Jim [2] interpretation are derived in the frequency domain. This derivation motivates the intuitive structure suggested by other authors for the beamforming problem in reverberant environments. We then show that a suboptimal algorithm can be implemented by estimating the TF ratios instead of estimating the actual TFs. In Section IV, we address the problem of estimating the TF ratios by extending the nonstationarity principle, which was suggested by Shalvi and Weinstein [31]. An application of the suggested algorithm to the speech enhancement problem is presented in Section V. Section VI concludes the paper.
`
`II. PROBLEM FORMULATION
`
Consider an array of sensors in a noisy and reverberant environment. The received signal is comprised of two components. The first is some nonstationary (e.g., speech) signal. The second is some stationary interference signal. Our goal is to reconstruct the nonstationary signal component from the received signals. We use the following notation.
$z_m(t)$: mth sensor signal;
$s(t)$: desired signal source;
$n_m(t)$: interference signal of the mth sensor, comprised of some directional noise component and some ambient noise component;
$a_m(t)$: time-varying TFs from the desired speech source to the mth sensor.

We have

$$z_m(t) = a_m(t) * s(t) + n_m(t); \quad m = 1, \ldots, M \tag{1}$$
`
where $*$ denotes convolution. Suppose that the analysis frame duration $T$ is chosen such that the signal may be considered stationary over the analysis frame. Typically, the TFs are changing slowly in time so that they may also be considered stationary over the analysis frame. Multiplying both sides of (1) by a rectangular window function $w(t)$ [$w(t) = 1$ over the analysis frame, $w(t) = 0$ otherwise] and applying the discrete-time Fourier transform (DTFT) operator yields

$$Z_m(t, e^{j\omega}) \approx A_m(e^{j\omega}) S(t, e^{j\omega}) + N_m(t, e^{j\omega}); \quad m = 1, \ldots, M. \tag{2}$$

The approximation is justified for $T$ sufficiently large. $Z_m(t, e^{j\omega})$, $S(t, e^{j\omega})$, and $N_m(t, e^{j\omega})$ are the short-time Fourier transforms (STFTs) of the respective signals. $A_m(e^{j\omega})$ is the TF of the mth sensor. Note that we have assumed that the TFs are time invariant.
The vector formulation of the equation set (2) is

$$\mathbf{Z}(t, e^{j\omega}) = \mathbf{A}(e^{j\omega}) S(t, e^{j\omega}) + \mathbf{N}(t, e^{j\omega}) \tag{3}$$

where

$$\mathbf{Z}^T(t, e^{j\omega}) = [Z_1(t, e^{j\omega}) \;\; Z_2(t, e^{j\omega}) \;\; \cdots \;\; Z_M(t, e^{j\omega})]$$
$$\mathbf{A}^T(e^{j\omega}) = [A_1(e^{j\omega}) \;\; A_2(e^{j\omega}) \;\; \cdots \;\; A_M(e^{j\omega})]$$
$$\mathbf{N}^T(t, e^{j\omega}) = [N_1(t, e^{j\omega}) \;\; N_2(t, e^{j\omega}) \;\; \cdots \;\; N_M(t, e^{j\omega})].$$
`
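Fig. 1. Constrained minimization. [Figure not reproduced. It shows the equipower contours of $\mathbf{W}^\dagger(t, e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega})$, the constraint plane $\mathbf{A}^\dagger(e^{j\omega}) \mathbf{W}(t, e^{j\omega}) = \mathcal{F}(e^{j\omega})$, the perpendicular $\mathbf{F}(e^{j\omega})$ from the origin to that plane, and the optimum $\mathbf{W}^{\mathrm{opt}}(t, e^{j\omega})$.]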
`
`III. CONSTRAINED OUTPUT POWER MINIMIZATION
`
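In [1], a beamforming algorithm was proposed under the assumption that the TF from the desired signal source to each sensor includes only gain and delay values. In this section, we consider the general case of arbitrary TFs. By following the derivation of [1] in the frequency domain, we derive a beamforming algorithm for the general TF case. First, we obtain a closed-form, linearly constrained, minimum variance beamformer. Then, we derive an adaptive solution. The outcome will be a constrained LMS-type algorithm. We proceed, following the footsteps of Griffiths and Jim [2], and formulate an unconstrained adaptive solution. We will initially assume that the TFs are known. Later, in Section IV, we deal with the problem of estimating the TFs.

A. Frequency Domain Frost Algorithm
1) Optimal Solution: Let $W_m^*(t, e^{j\omega})$; $m = 1, \ldots, M$ be a set of $M$ filters

$$\mathbf{W}^\dagger(t, e^{j\omega}) = [W_1^*(t, e^{j\omega}) \;\; W_2^*(t, e^{j\omega}) \;\; \cdots \;\; W_M^*(t, e^{j\omega})]$$

where $*$ denotes conjugation, and $\dagger$ denotes conjugation transpose. A beamformer is realized by filtering each sensor output by $W_m^*(t, e^{j\omega})$; $m = 1, \ldots, M$ and summing the outputs

$$Y(t, e^{j\omega}) = \mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{Z}(t, e^{j\omega}) = \mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{A}(e^{j\omega}) S(t, e^{j\omega}) + \mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{N}(t, e^{j\omega}) \triangleq Y_s(t, e^{j\omega}) + Y_n(t, e^{j\omega}) \tag{4}$$

where $Y_s(t, e^{j\omega})$ is the desired signal part, and $Y_n(t, e^{j\omega})$ is the noise part. The output power of the beamformer is

$$E\{Y(t, e^{j\omega}) Y^*(t, e^{j\omega})\} = E\{\mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{Z}(t, e^{j\omega}) \mathbf{Z}^\dagger(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega})\} = \mathbf{W}^\dagger(t, e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega})$$

where $\Phi_{ZZ}(t, e^{j\omega}) \triangleq E\{\mathbf{Z}(t, e^{j\omega}) \mathbf{Z}^\dagger(t, e^{j\omega})\}$. We want to minimize the output power subject to the following constraint on $Y_s(t, e^{j\omega})$:

$$Y_s(t, e^{j\omega}) = \mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{A}(e^{j\omega}) S(t, e^{j\omega}) = \mathcal{F}^*(e^{j\omega}) S(t, e^{j\omega})$$

where $\mathcal{F}^*(e^{j\omega})$ is some prespecified filter (usually a simple delay). Here the scalar filter is written $\mathcal{F}$ to distinguish it from the vector $\mathbf{F}$ used below. We thus have the following minimization problem:

$$\min_{\mathbf{W}} \{\mathbf{W}^\dagger(t, e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega})\} \quad \text{subject to} \quad \mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{A}(e^{j\omega}) = \mathcal{F}^*(e^{j\omega}). \tag{5}$$

The minimization (5) is demonstrated in Fig. 1. The point where the equipower contours are tangent to the constraint plane is the optimum vector of beamforming filters. The perpendicular $\mathbf{F}(e^{j\omega})$ from the origin to the constraint plane will be calculated in Section III-A2.
To solve (5), we first define the following complex Lagrange functional:

$$\mathcal{L}(\mathbf{W}) = \mathbf{W}^\dagger(t, e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega}) + \lambda \left[\mathbf{W}^\dagger(t, e^{j\omega}) \mathbf{A}(e^{j\omega}) - \mathcal{F}^*(e^{j\omega})\right] + \lambda^* \left[\mathbf{A}^\dagger(e^{j\omega}) \mathbf{W}(t, e^{j\omega}) - \mathcal{F}(e^{j\omega})\right]$$

where $\lambda$ is a Lagrange multiplier. Setting the derivative with respect to $\mathbf{W}^*$ to 0 (e.g., [32]) yields

$$\nabla_{\mathbf{W}^*} \mathcal{L}(\mathbf{W}) = \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega}) + \lambda \mathbf{A}(e^{j\omega}) = 0.$$

Now, recalling the constraint in (5), we obtain the following set of optimal filters:

$$\mathbf{W}^{\mathrm{opt}}(t, e^{j\omega}) = \left[\mathbf{A}^\dagger(e^{j\omega}) \Phi_{ZZ}^{-1}(t, e^{j\omega}) \mathbf{A}(e^{j\omega})\right]^{-1} \Phi_{ZZ}^{-1}(t, e^{j\omega}) \mathbf{A}(e^{j\omega}) \mathcal{F}(e^{j\omega}).$$

This closed-form solution is difficult to implement and does not have the ability to track changes in the environment. Therefore, an adaptive solution should be more useful.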
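As an illustration of the closed-form solution, the following is a minimal sketch for a single frequency bin. It is not taken from the paper: the array size, the TF values in `a_vec`, the matrix `phi_zz`, and the desired response `f_resp` are made-up numbers, and the helper names (`solve`, `inner`, `output_power`, `optimal_weights`) are hypothetical. It checks that the computed weights satisfy the constraint $\mathbf{A}^\dagger \mathbf{W} = \mathcal{F}$ and that their output power does not exceed that of another feasible weight vector.

```python
"""Closed-form constrained minimum-power weights for one frequency bin.

All numbers are illustrative assumptions, not data from the paper.
"""


def inner(x, y):
    """Hermitian inner product x^dagger y."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def solve(mat, rhs):
    """Solve mat @ x = rhs for a small complex system (Gauss-Jordan with pivoting)."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * u for v, u in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def output_power(phi_zz, w):
    """Quadratic form W^dagger Phi_ZZ W (real for Hermitian Phi_ZZ)."""
    phi_w = [sum(p * wi for p, wi in zip(row, w)) for row in phi_zz]
    return inner(w, phi_w).real


def optimal_weights(phi_zz, a, f):
    """W_opt = [A^dagger Phi^-1 A]^-1 Phi^-1 A F for one bin."""
    phi_inv_a = solve(phi_zz, a)
    denom = inner(a, phi_inv_a)
    return [v * f / denom for v in phi_inv_a]


# Hypothetical 3-sensor example: TFs, Hermitian PSD matrix, desired response.
a_vec = [1 + 0j, 0.6 - 0.3j, -0.2 + 0.7j]
phi_zz = [
    [2.0 + 0j, 0.3 - 0.1j, 0.1 + 0.2j],
    [0.3 + 0.1j, 1.5 + 0j, 0.2 - 0.4j],
    [0.1 - 0.2j, 0.2 + 0.4j, 1.8 + 0j],
]
f_resp = -1j

w_opt = optimal_weights(phi_zz, a_vec, f_resp)
constraint_value = inner(a_vec, w_opt)  # should equal f_resp

# Another feasible vector: the perpendicular F = A F / ||A||^2.
w_fixed = [v * f_resp / inner(a_vec, a_vec) for v in a_vec]
```

Solving the linear system instead of forming $\Phi_{ZZ}^{-1}$ explicitly is a design choice of this sketch; for a real array one would typically use a numerical library and a regularized estimate of $\Phi_{ZZ}$.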
`
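2) Adaptive Solution: Consider the following steepest descent, adaptive algorithm:

$$\mathbf{W}(t+1, e^{j\omega}) = \mathbf{W}(t, e^{j\omega}) - \mu \nabla_{\mathbf{W}^*} \mathcal{L}(\mathbf{W}) = \mathbf{W}(t, e^{j\omega}) - \mu \left[\Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega}) + \lambda \mathbf{A}(e^{j\omega})\right].$$

Imposing our constraint on $\mathbf{W}(t+1, e^{j\omega})$ yields

$$\mathcal{F}(e^{j\omega}) = \mathbf{A}^\dagger(e^{j\omega}) \mathbf{W}(t+1, e^{j\omega}) = \mathbf{A}^\dagger(e^{j\omega}) \mathbf{W}(t, e^{j\omega}) - \mu \mathbf{A}^\dagger(e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega}) - \mu \mathbf{A}^\dagger(e^{j\omega}) \mathbf{A}(e^{j\omega}) \lambda.$$

Solving for the Lagrange multiplier and applying further rearrangement of terms yields

$$\mathbf{W}(t+1, e^{j\omega}) = P(e^{j\omega}) \mathbf{W}(t, e^{j\omega}) - \mu P(e^{j\omega}) \Phi_{ZZ}(t, e^{j\omega}) \mathbf{W}(t, e^{j\omega}) + \mathbf{F}(e^{j\omega})$$

where

$$P(e^{j\omega}) = I - \frac{\mathbf{A}(e^{j\omega}) \mathbf{A}^\dagger(e^{j\omega})}{\|\mathbf{A}(e^{j\omega})\|^2} \tag{6}$$

and

$$\mathbf{F}(e^{j\omega}) = \frac{\mathbf{A}(e^{j\omega})}{\|\mathbf{A}(e^{j\omega})\|^2} \mathcal{F}(e^{j\omega}). \tag{7}$$

Further simplification can be achieved by replacing $\Phi_{ZZ}(t, e^{j\omega})$ by its instantaneous estimator $\mathbf{Z}(t, e^{j\omega}) \mathbf{Z}^\dagger(t, e^{j\omega})$ and recalling (4). We thus obtain

$$\mathbf{W}(t+1, e^{j\omega}) = P(e^{j\omega}) \left[\mathbf{W}(t, e^{j\omega}) - \mu \mathbf{Z}(t, e^{j\omega}) Y^*(t, e^{j\omega})\right] + \mathbf{F}(e^{j\omega}).$$

The algorithm is summarized in Fig. 2.

$$\mathbf{W}(t = 0, e^{j\omega}) = \mathbf{F}(e^{j\omega})$$
$$\mathbf{W}(t+1, e^{j\omega}) = P(e^{j\omega}) \left[\mathbf{W}(t, e^{j\omega}) - \mu \mathbf{Z}(t, e^{j\omega}) Y^*(t, e^{j\omega})\right] + \mathbf{F}(e^{j\omega})$$
($P(e^{j\omega})$ and $\mathbf{F}(e^{j\omega})$ are defined by (6) and (7).)

Fig. 2. Frequency domain Frost algorithm.

B. Generalized Sidelobe Canceler (GSC) Interpretation

In [2], Griffiths and Jim considered the case where each TF is a delay element (with some gain). Griffiths and Jim obtained an unconstrained adaptive enhancement algorithm, using the same constrained, minimum output power criterion used by Frost [1]. The unconstrained algorithm is computationally more efficient than the constrained algorithm. Furthermore, the unconstrained algorithm is based on the well-behaved NLMS scheme. In Section III-A2, we obtained an adaptive algorithm for the case where each TF is represented by an arbitrary linear time-invariant system by tracing the derivation of Frost in the frequency domain. We now repeat the arguments of Griffiths and Jim for our case (arbitrary TFs) and derive an unconstrained adaptive enhancement algorithm.
Consider the null space of $\mathbf{A}(e^{j\omega})$, which is defined by

$$\mathcal{N}(e^{j\omega}) = \{\mathbf{W} \mid \mathbf{A}^\dagger(e^{j\omega}) \mathbf{W} = 0\}.$$

The constraint hyperplane

$$\Lambda(e^{j\omega}) \triangleq \{\mathbf{W} \mid \mathbf{A}^\dagger(e^{j\omega}) \mathbf{W} = \mathcal{F}(e^{j\omega})\}$$

is parallel to $\mathcal{N}(e^{j\omega})$. In addition to that, let

$$\mathcal{R}(e^{j\omega}) \triangleq \{\kappa \mathbf{A}(e^{j\omega}) \mid \text{for any real } \kappa\}$$

be the column space. By the fundamental theorem of linear algebra (e.g., [33]), $\mathcal{R}(e^{j\omega}) \perp \mathcal{N}(e^{j\omega})$. In particular, $\mathbf{F}(e^{j\omega})$ is perpendicular to $\mathcal{N}(e^{j\omega})$ since

$$\mathbf{F}(e^{j\omega}) = \frac{\mathcal{F}(e^{j\omega})}{\|\mathbf{A}(e^{j\omega})\|^2} \mathbf{A}(e^{j\omega}) \in \mathcal{R}(e^{j\omega}).$$

Furthermore

$$\mathbf{A}^\dagger(e^{j\omega}) \mathbf{F}(e^{j\omega}) = \mathbf{A}^\dagger(e^{j\omega}) \mathbf{A}(e^{j\omega}) \left(\mathbf{A}^\dagger(e^{j\omega}) \mathbf{A}(e^{j\omega})\right)^{-1} \mathcal{F}(e^{j\omega}) = \mathcal{F}(e^{j\omega}).$$

Thus, $\mathbf{F}(e^{j\omega}) \in \Lambda(e^{j\omega})$ and $\mathbf{F}(e^{j\omega}) \perp \Lambda(e^{j\omega})$. Hence, $\mathbf{F}(e^{j\omega})$ is the perpendicular from the origin to the constraint hyperplane $\Lambda(e^{j\omega})$. The matrix $P(e^{j\omega})$, which is defined in (6), is the projection matrix to the null space of $\mathbf{A}(e^{j\omega})$, $\mathcal{N}(e^{j\omega})$.
Now, a vector in linear space can be uniquely split into a sum of two vectors in mutually orthogonal subspaces (e.g., [33]). Hence

$$\mathbf{W}(t, e^{j\omega}) = \mathbf{W}_0(t, e^{j\omega}) - \mathbf{V}(t, e^{j\omega}) \tag{8}$$

where $\mathbf{W}_0(t, e^{j\omega}) \in \mathcal{R}(e^{j\omega})$, and $-\mathbf{V}(t, e^{j\omega}) \in \mathcal{N}(e^{j\omega})$. By the definition of $\mathcal{N}(e^{j\omega})$

$$\mathbf{V}(t, e^{j\omega}) = \mathcal{H}(e^{j\omega}) \mathbf{G}(t, e^{j\omega}) \tag{9}$$

where $\mathcal{H}(e^{j\omega})$ is some $M \times (M-1)$ matrix, such that the columns of $\mathcal{H}(e^{j\omega})$ span the null space of $\mathbf{A}(e^{j\omega})$, i.e.,

$$\mathbf{A}^\dagger(e^{j\omega}) \mathcal{H}(e^{j\omega}) = 0, \qquad \operatorname{rank}\{\mathcal{H}(e^{j\omega})\} = M - 1. \tag{10}$$

The vector $\mathbf{G}(t, e^{j\omega})$ is an $(M-1) \times 1$ vector of adjustable filters. By the geometrical interpretation of Frost's algorithm

$$\mathbf{W}_0(t, e^{j\omega}) = \mathbf{F}(e^{j\omega}) = \frac{\mathbf{A}(e^{j\omega})}{\|\mathbf{A}(e^{j\omega})\|^2} \mathcal{F}(e^{j\omega}). \tag{11}$$

[Recall that $\mathbf{F}(e^{j\omega})$ is the perpendicular from the origin to the constraint hyperplane $\Lambda(e^{j\omega})$.] Now, using (4), (8), and (9) we get

$$Y(t, e^{j\omega}) = Y_{\mathrm{FBF}}(t, e^{j\omega}) - Y_{\mathrm{NC}}(t, e^{j\omega}) \tag{12}$$

where

$$Y_{\mathrm{FBF}}(t, e^{j\omega}) = \mathbf{W}_0^\dagger(t, e^{j\omega}) \mathbf{Z}(t, e^{j\omega})$$
$$Y_{\mathrm{NC}}(t, e^{j\omega}) = \mathbf{G}^\dagger(t, e^{j\omega}) \mathcal{H}^\dagger(e^{j\omega}) \mathbf{Z}(t, e^{j\omega}). \tag{13}$$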
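The constrained recursion of Section III-A2 (Fig. 2) and the split (8) can be illustrated with a small sketch for one frequency bin. It is an assumption-laden illustration rather than the authors' implementation: the TF vector `a_vec`, the frames in `frames`, the step size `mu`, and the desired response `f_resp` are made-up, and the function names are hypothetical. The sketch runs a few projected updates starting from $\mathbf{W} = \mathbf{F}$ and then checks that the constraint $\mathbf{A}^\dagger \mathbf{W} = \mathcal{F}$ still holds and that $\mathbf{V} = \mathbf{W}_0 - \mathbf{W}$ lies in the null space of $\mathbf{A}$.

```python
"""Frequency-domain Frost recursion for one bin (illustrative values only)."""
import cmath


def inner(x, y):
    """Hermitian inner product x^dagger y."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def fixed_vector(a, f):
    """F = A F / ||A||^2, the perpendicular to the constraint hyperplane, eq. (7)."""
    norm2 = inner(a, a).real
    return [ai * f / norm2 for ai in a]


def project_null(a, w):
    """Apply P = I - A A^dagger / ||A||^2 to w, eq. (6)."""
    coeff = inner(a, w) / inner(a, a).real
    return [wi - ai * coeff for wi, ai in zip(w, a)]


def frost_step(w, z, a, f, mu):
    """W(t+1) = P [W(t) - mu Z Y^*] + F, with Y = W^dagger Z."""
    y = inner(w, z)
    unconstrained = [wi - mu * zi * y.conjugate() for wi, zi in zip(w, z)]
    projected = project_null(a, unconstrained)
    return [p + fv for p, fv in zip(projected, fixed_vector(a, f))]


# Hypothetical TFs, desired response (a pure phase), and three STFT frames.
a_vec = [1 + 0j, 0.6 - 0.3j, -0.2 + 0.7j]
f_resp = cmath.exp(-0.5j)
frames = [
    [0.9 + 0.1j, 0.2 - 0.5j, -0.4 + 0.3j],
    [-0.3 + 0.7j, 0.8 + 0.2j, 0.1 - 0.6j],
    [0.5 - 0.4j, -0.7 + 0.1j, 0.3 + 0.9j],
]
mu = 0.05

w0 = fixed_vector(a_vec, f_resp)  # initial condition W(t = 0) = F
w = list(w0)
for z in frames:
    w = frost_step(w, z, a_vec, f_resp, mu)

constraint_value = inner(a_vec, w)      # should stay equal to f_resp
v = [w0i - wi for w0i, wi in zip(w0, w)]  # V in W = W0 - V, eq. (8)
null_check = inner(a_vec, v)            # should be (numerically) zero
```

Because the projection is applied at every step, the constraint is restored after each update regardless of the step size, which is why the check does not depend on `mu`.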
`
The output of the constrained beamformer is a difference of two terms, both operating on the input signal $\mathbf{Z}(t, e^{j\omega})$. The first term $Y_{\mathrm{FBF}}(t, e^{j\omega})$ utilizes only fixed components (which depend on the TFs); therefore, it can be viewed as a fixed beamformer (FBF). We now examine the second term $Y_{\mathrm{NC}}(t, e^{j\omega})$. Note that

$$\mathbf{U}(t, e^{j\omega}) = \mathcal{H}^\dagger(e^{j\omega}) \mathbf{Z}(t, e^{j\omega}) = \mathcal{H}^\dagger(e^{j\omega}) \left[\mathbf{A}(e^{j\omega}) S(t, e^{j\omega}) + \mathbf{N}(t, e^{j\omega})\right] = \mathcal{H}^\dagger(e^{j\omega}) \mathbf{N}(t, e^{j\omega}). \tag{14}$$

The last transition is due to (10). $\mathbf{U}(t, e^{j\omega})$ are reference noise signals. Hence, the signal-dependent component of $Y_{\mathrm{NC}}(t, e^{j\omega})$ is completely eliminated (blocked) by $\mathcal{H}^\dagger(e^{j\omega})$ so that $Y_{\mathrm{NC}}(t, e^{j\omega})$ is a pure noise term. The noise term of $Y_{\mathrm{FBF}}(t, e^{j\omega})$ can be reduced by properly adjusting the filters $\mathbf{G}(t, e^{j\omega})$, using the minimum output power criterion. This adjustment problem is in fact the classical multichannel noise cancellation problem. An adaptive LMS solution to the problem was proposed by Widrow [34].
The GSC solution is comprised of three components:
1) fixed beamformer (FBF);
2) blocking matrix (BM) that constructs the noise reference signals;
3) multichannel noise canceler (NC).
We now discuss each of these components in detail.
1) Fixed Beamformer (FBF): By (3), (11), and (13), we have

$$Y_{\mathrm{FBF}}(t, e^{j\omega}) = \mathcal{F}^*(e^{j\omega}) S(t, e^{j\omega}) + \frac{\mathcal{F}^*(e^{j\omega})}{\|\mathbf{A}(e^{j\omega})\|^2} \mathbf{A}^\dagger(e^{j\omega}) \mathbf{N}(t, e^{j\omega}). \tag{15}$$

The first term on the right-hand side is the signal term. The second is the noise term. Note that by setting $\mathcal{F}^*(e^{j\omega}) = e^{-j\omega\tau}$ (i.e., a delay), the signal component of $Y_{\mathrm{FBF}}(t, e^{j\omega})$ is an undistorted, delayed version of the desired signal.
Unfortunately, we usually do not have access to the actual TFs ($A_m(e^{j\omega})$; $m = 1, \ldots, M$). Later, we show how we can estimate the TFs ratio

$$H_m(e^{j\omega}) = \frac{A_m(e^{j\omega})}{A_1(e^{j\omega})}; \quad m = 1, \ldots, M.$$

Let

$$\mathbf{H}^T(e^{j\omega}) = \left[1 \;\; \frac{A_2(e^{j\omega})}{A_1(e^{j\omega})} \;\; \cdots \;\; \frac{A_M(e^{j\omega})}{A_1(e^{j\omega})}\right] = \frac{\mathbf{A}^T(e^{j\omega})}{A_1(e^{j\omega})}.$$

If in (11), the actual TFs are replaced by the TFs ratios, then

$$\mathbf{W}_0(t, e^{j\omega}) = \frac{\mathbf{H}(e^{j\omega})}{\|\mathbf{H}(e^{j\omega})\|^2} \mathcal{F}(e^{j\omega}). \tag{16}$$

By (3) and (13), we have

$$Y_{\mathrm{FBF}}(t, e^{j\omega}) = A_1(e^{j\omega}) \mathcal{F}^*(e^{j\omega}) S(t, e^{j\omega}) + \frac{\mathcal{F}^*(e^{j\omega})}{\|\mathbf{H}(e^{j\omega})\|^2} \mathbf{H}^\dagger(e^{j\omega}) \mathbf{N}(t, e^{j\omega}).$$

Thus, when $\mathbf{W}_0(t, e^{j\omega})$ is given by (16), the signal term of $Y_{\mathrm{FBF}}(t, e^{j\omega})$ is the desired signal distorted only by the first TF $A_1(e^{j\omega})$. Now, suppose that

$$\mathbf{W}_0(t, e^{j\omega}) = \mathbf{H}(e^{j\omega}) \mathcal{F}(e^{j\omega}). \tag{17}$$

In this case, $\mathbf{W}_0(t, e^{j\omega})$ is comprised of the cascade of $\mathbf{H}(e^{j\omega})$, which is a filter matched to the TFs ratio, and $\mathcal{F}(e^{j\omega})$. The new $\mathbf{W}_0(t, e^{j\omega})$ can be derived from (16) under the assumption that $\|\mathbf{H}(e^{j\omega})\|^2$ is constant. In fact, Grenier et al. [29] argue that this assumption can be verified empirically. The FBF term of the output is now given by

$$Y_{\mathrm{FBF}}(t, e^{j\omega}) = \frac{\|\mathbf{A}(e^{j\omega})\|^2}{A_1^*(e^{j\omega})} \mathcal{F}^*(e^{j\omega}) S(t, e^{j\omega}) + \mathcal{F}^*(e^{j\omega}) \mathbf{H}^\dagger(e^{j\omega}) \mathbf{N}(t, e^{j\omega}). \tag{18}$$

The signal component of $Y_{\mathrm{FBF}}(t, e^{j\omega})$ is now distorted. Hence, only a suboptimal solution is achieved. Note, however, that all the sensor outputs are added together coherently [this can be seen from the term $\|\mathbf{A}(e^{j\omega})\|^2$].
2) Blocking Matrix (BM): Consider the following $M \times (M-1)$ matrix $\mathcal{H}(e^{j\omega})$:

$$\mathcal{H}(e^{j\omega}) = \begin{bmatrix} -\dfrac{A_2^*(e^{j\omega})}{A_1^*(e^{j\omega})} & -\dfrac{A_3^*(e^{j\omega})}{A_1^*(e^{j\omega})} & \cdots & -\dfrac{A_M^*(e^{j\omega})}{A_1^*(e^{j\omega})} \\ 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{19}$$

It can be easily verified that this matrix satisfies (10) and is, hence, a proper blocking matrix that may be used for generating the reference noise signals $\mathbf{U}(t, e^{j\omega})$. By (14), we have

$$U_m(t, e^{j\omega}) = Z_m(t, e^{j\omega}) - \frac{A_m(e^{j\omega})}{A_1(e^{j\omega})} Z_1(t, e^{j\omega}); \quad m = 2, \ldots, M. \tag{20}$$

Thus, the knowledge of the TFs ratios $H_m(e^{j\omega}) = A_m(e^{j\omega})/A_1(e^{j\omega})$ is sufficient to implement the sidelobe canceler.
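The role of the TF ratios in (16) and (20) can be checked numerically with a short sketch for one frequency bin. The values of `a_vec`, `s_val`, `noise`, and `f_resp` are invented for illustration, and the function names are hypothetical. For a noise-free frame the reference signals vanish and the FBF output equals $A_1 \mathcal{F}^* S$; with noise added, each reference contains only noise.

```python
"""Fixed beamformer and noise references built from TF ratios (one bin)."""


def tf_ratios(a):
    """H_m = A_m / A_1 for m = 1..M (so H_1 = 1)."""
    return [am / a[0] for am in a]


def noise_references(z, h):
    """U_m = Z_m - H_m Z_1 for m = 2..M, eq. (20)."""
    return [z[m] - h[m] * z[0] for m in range(1, len(z))]


def fixed_beamformer(z, h, f):
    """Y_FBF = W0^dagger Z with W0 = H F / ||H||^2, eq. (16)."""
    norm2 = sum(abs(hm) ** 2 for hm in h)
    matched = sum(hm.conjugate() * zm for hm, zm in zip(h, z))
    return f.conjugate() * matched / norm2


# Hypothetical TFs, source STFT value, sensor noise, and desired response.
a_vec = [0.8 + 0.3j, 0.5 - 0.6j, -0.1 + 0.9j]
s_val = 0.7 - 0.2j
noise = [0.05 + 0.02j, -0.03 + 0.04j, 0.01 - 0.06j]
f_resp = -1j

h = tf_ratios(a_vec)
z_clean = [am * s_val for am in a_vec]
z_noisy = [zc + n for zc, n in zip(z_clean, noise)]

u_clean = noise_references(z_clean, h)      # signal is blocked: all zeros
y_clean = fixed_beamformer(z_clean, h, f_resp)  # equals A_1 F^* S
u_noisy = noise_references(z_noisy, h)      # equals N_m - H_m N_1
```

In practice the ratios would be estimated rather than computed from known TFs; the sketch only illustrates that the ratios alone are enough to form both the FBF and the references.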
3) Noise Canceler: By the GSC derivation, we have constructed two signals. The first is $Y_{\mathrm{FBF}}(t, e^{j\omega})$, which contains both a desired speech term and a residual noise term. The second signal is $Y_{\mathrm{NC}}(t, e^{j\omega})$. $Y_{\mathrm{NC}}(t, e^{j\omega})$ consists of an adaptive set of filters $\mathbf{G}(t, e^{j\omega})$ that are applied to the noise-only signals $\mathbf{U}(t, e^{j\omega})$.
Recall that our goal is to minimize the output power under a constraint on the response at the desired direction. By setting $\mathbf{W}_0(t, e^{j\omega})$ according to (11), the constraint is satisfied. Hence, minimization of the output power is achieved by adjusting the filters $\mathbf{G}(t, e^{j\omega})$. This is an unconstrained minimization, exactly as in Widrow's classical problem [34]. We can implement it by using the multichannel Wiener filter. Recalling (12), our goal is to set $\mathbf{G}(t, e^{j\omega})$ to minimize

$$E\left\{\left|Y_{\mathrm{FBF}}(t, e^{j\omega}) - \mathbf{G}^\dagger(t, e^{j\omega}) \mathbf{U}(t, e^{j\omega})\right|^2\right\}.$$

Let

$$\Phi_{UY}(t, e^{j\omega}) = E\left\{\mathbf{U}(t, e^{j\omega}) Y_{\mathrm{FBF}}^*(t, e^{j\omega})\right\}$$
$$\Phi_{UU}(t, e^{j\omega}) = E\left\{\mathbf{U}(t, e^{j\omega}) \mathbf{U}^\dagger(t, e^{j\omega})\right\}.$$

Then, the multichannel Wiener filter is given by [19], [35]

$$\mathbf{G}(t, e^{j\omega}) = \Phi_{UU}^{-1}(t, e^{j\omega}) \Phi_{UY}(t, e^{j\omega}). \tag{21}$$
`
In order to be able to track changes, we process the signals by segments. The following frequency domain LMS algorithm is used. Let the residual signal be

$$Y(t, e^{j\omega}) = Y_{\mathrm{FBF}}(t, e^{j\omega}) - \mathbf{G}^\dagger(t, e^{j\omega}) \mathbf{U}(t, e^{j\omega}).$$

Note that the residual signal is also the output of the enhancement algorithm. By the orthogonality principle, the error is orthogonal to the measurements. Thus

$$E\left\{\mathbf{U}(t, e^{j\omega}) Y^*(t, e^{j\omega})\right\} = 0. \tag{22}$$

Following the standard Widrow procedure, the solution is

$$\mathbf{G}(t+1, e^{j\omega}) = \mathbf{G}(t, e^{j\omega}) + \mu \mathbf{U}(t, e^{j\omega}) Y^*(t, e^{j\omega}).$$

Usually, a more stable solution is achieved by using the normalized LMS (NLMS) algorithm, in which each frequency is normalized separately, yielding

$$G_m(t+1, e^{j\omega}) = G_m(t, e^{j\omega}) + \mu \frac{U_m(t, e^{j\omega}) Y^*(t, e^{j\omega})}{P_{\mathrm{est}}(t, e^{j\omega})}; \quad m = 2, \ldots, M$$

where

$$P_{\mathrm{est}}(t, e^{j\omega}) = \rho P_{\mathrm{est}}(t-1, e^{j\omega}) + (1 - \rho) \sum_m |Z_m(t, e^{j\omega})|^2. \tag{23}$$

$\rho$ is a forgetting factor (typically $0.8 < \rho < 1$). Another possibility is to calculate $P_{\mathrm{est}}$ using the power of the noise reference signals. However, in that case, an energy detector is required so that $\mathbf{G}(t, e^{j\omega})$ is updated only when there is no active signal. If, on the other hand, we calculate $P_{\mathrm{est}}(t, e^{j\omega})$ using the input sensor signals, as indicated in (23), then an energy detector may be avoided. This is due to the fact that the adaptation term becomes relatively small during periods of active input signal.
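A minimal sketch of the per-frequency NLMS update with the power estimate (23) is given below for one bin. It is an illustration under stated assumptions: the frame values `z_frame`, `u_frame`, and `y_fbf` are made-up and held constant over frames (a stationary, noise-only situation), the defaults `mu=0.1` and `rho=0.9` are arbitrary choices, and the small floor on `p_est` is an added guard that is not part of the paper.

```python
"""Per-bin NLMS noise canceler update with recursive power normalization."""


def nlms_step(g, u, y_fbf, z, p_est, mu=0.1, rho=0.9):
    """Return (updated filters, residual Y, updated power estimate).

    g: M-1 canceler filters, u: M-1 noise references, z: M sensor STFT values.
    """
    # Residual, which is also the enhancer output: Y = Y_FBF - G^dagger U.
    y = y_fbf - sum(gm.conjugate() * um for gm, um in zip(g, u))
    # Recursive power estimate from the sensor signals, eq. (23).
    p_est = rho * p_est + (1.0 - rho) * sum(abs(zm) ** 2 for zm in z)
    p_est = max(p_est, 1e-12)  # guard against division by zero (assumption)
    # Normalized update: G_m <- G_m + mu U_m Y^* / P_est.
    g_next = [gm + mu * um * y.conjugate() / p_est for gm, um in zip(g, u)]
    return g_next, y, p_est


# Hypothetical stationary noise-only frame repeated 200 times.
z_frame = [1 + 0j, 0.5j, -0.5 + 0j]
u_frame = [0.4 + 0.2j, -0.3j]
y_fbf = 0.275 + 0.1j

g = [0j, 0j]
p_est = 0.0
residuals = []
for _ in range(200):
    g, y, p_est = nlms_step(g, u_frame, y_fbf, z_frame, p_est)
    residuals.append(abs(y))
```

With these constant inputs the residual magnitude shrinks at every step, since each update multiplies the error by $1 - \mu \|\mathbf{U}\|^2 / P_{\mathrm{est}}$, which stays between 0 and 1 for the chosen values.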
We assume that the noncausal TFs ratios $h_m$ and the noise canceling filters $g_m$ are both FIRs:

$$\mathbf{h}_m^T = [h_m(-q_L), \ldots, h_m(q_R)]$$
$$\mathbf{g}_m^T = [g_m(-K_L), \ldots, g_m(K_R)] \tag{24}$$

(both $h_m$ and $g_m$ are functions of time; however, for notational simplicity, we omit this dependence). Note that the TFs might have zeros outside the unit circle. Thus, to ensure stability of the TFs ratios, we do not impose them to be causal. When $A_1(e^{j\omega})$ contains zeros that are close to the unit circle, the noise reference signals $U_m(e^{j\omega})$ at the corresponding frequencies might assume very large values [recall (20)]. This may result in sharp peaks in the reconstructed spectrum. This problem is partially overcome by constraining the impulse response of $h_m$ to an FIR structure. It is also possible to constrain the maximal value of the estimated $|H_m(e^{j\omega})|$ to be lower than some threshold.
In order to fulfill the FIR structure constraint (24), the filters update is now given by

$$\tilde{G}_m(t+1, e^{j\omega}) = G_m(t, e^{j\omega}) + \mu \frac{U_m(t, e^{j\omega}) Y^*(t, e^{j\omega})}{P_{\mathrm{est}}(t, e^{j\omega})}$$
$$G_m(t+1, e^{j\omega}) \leftarrow \tilde{G}_m(t+1, e^{j\omega}) \tag{25}$$

for $m = 2, \ldots, M$. The operator $\leftarrow$ includes the following three stages. First, we transform $\tilde{G}_m(t+1, e^{j\omega})$ to the time domain. Second, we truncate