`Meta Platforms, Inc. Exhibit 1006
`Page i of 107
Digital Signal Processing
A Review Journal

Volume 12, Number 1
January 2002

Editors
Jim Schroeder
Joe Campbell

ISSN 1051-2004

Academic Press
An Elsevier Science Imprint
Digital Signal Processing
`
`A Review Journal
`
`Editors
`
`Jim Schroeder
`SPRI/CSSIP
`Adelaide, SA, Australia
`E-mail: schroeder@cssip.edu.au
`
`Joe Campbell
M.I.T. Lincoln Laboratory
Lexington, Massachusetts
`E-mail: j.campbell@ieee.org
`
`Editorial Board
`
`Maurice Bellanger
`CNAM
`Paris, France
`
`Robert E. Bogner
`University of Adelaide
`Adelaide, SA, Australia
Johann F. Böhme
Ruhr-Universität Bochum
`Bochum, Germany
`James A. Cadzow
`Vanderbilt University
`Nashville, Tennessee
`G. Clifford Carter
`NUWC
`Newport, Rhode Island
`A. G. Constantinides
Imperial College
`London, England
`Petar M. Djuric
`State University of New York
`Stony Brook, New York
`Anthony D. Fagan
`University College Dublin
Dublin, Ireland
`Sadaoki Furui
`Tokyo Institute of Technology
`Tokyo, Japan
`
`John E. Hershey
`General Electric Company
`Schenectady, New York
`B. R. Hunt
`University of Arizona
`Tucson, Arizona
James F. Kaiser
Duke University
`Durham, North Carolina
`R. Lynn Kirlin
`University of Victoria
`Victoria, British Columbia, Canada
`Ercan Kuruoglu
`Istituto di Elaborazione della Informazione
Ghezzano, Italy
`Meemong Lee
`Jet Propulsion Laboratory
`Pasadena, California
`Petre Stoica
`Uppsala University
`Uppsala, Sweden
`Mati Wax
`Wavion, Ltd
Yoqneam, Israel
`Rao Yarlagadda
`Oklahoma State University
Stillwater, Oklahoma
`
`Cover photo. Lower path directivity pattern at 5000 Hz. See the article by McCowan, Moore, and Sridharan in
`this issue
`
`Digital Signal Processing
`
`Volume 12, Number 1, January 2002
`
`© 2002 Elsevier Science (USA)
`
`All Rights Reserved
`
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy,
recording, or any information storage and retrieval system, without permission in writing from the Publisher. Exceptions: Explicit permission
from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press article in another scientific or
research publication provided that the material has not been credited to another source and that full credit to the Academic Press article is
given. In addition, authors of work contained herein need not obtain permission in the following cases only: (1) to use their original figures or
tables in their future works; (2) to make copies of their papers for use in their classroom teaching; and (3) to include their papers as part of their
dissertations.
The appearance of the code at the bottom of the first page of an article in this journal indicates the Publisher's consent that copies of the
article may be made for personal or internal use, or for the personal or internal use of specific clients. This consent is given on the condition,
however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers,
Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to
other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for
resale. Copy fees for pre-2002 articles are as shown on the article title pages; if no fee code appears on the title page, the copy fee is the same
as for current articles.
`
`1051-2004/02 $35.00
`MADE IN THE UNITED STATES OF AMERICA
This journal is printed on acid-free paper
`

DIGITAL SIGNAL PROCESSING (ISSN 1051-2004)
Published quarterly by Elsevier Science.
Editorial and Production Offices: 525 B Street, Suite 1900, San Diego, CA 92101-4495
Accounting and Circulation Offices: 6277 Sea Harbor Drive, Orlando, FL 32887-4900
2002: Volume 12. Price $343.00 U.S.A. and Canada; $374.00 all other countries
All prices include postage and handling
Information concerning personal subscription rates may be obtained by writing to the Publishers. All correspondence, permission requests, and subscription orders
should be addressed to the office of the Publishers at 6277 Sea Harbor Drive, Orlando, FL 32887-4900 (telephone: 407-345-2000). Send notices of change of address
to the office of the Publishers at least 6 to 8 weeks in advance. Please include both old and new addresses. POSTMASTER: Send changes of address to Digital Signal
Processing, 6277 Sea Harbor Drive, Orlando, FL 32887-4900.
`
`This material may be protected by Copyright law (Title 17 U.S. Code)
`
`Digital Signal Processing Vol. 12, No. 1, January 2002
`
practical array dimensions. Low frequency performance is critical for speech
processing applications, as significant speech energy is located below 1 kHz.
By explicitly maximizing the array gain, superdirective beamforming techniques
are able to achieve greater directivity than conventional techniques with
closely spaced sensor arrays [1]. This directivity generally comes at the expense
of a controlled reduction in the white noise gain of the array. Recent work has
demonstrated the suitability of superdirective beamforming for speech enhancement
and recognition tasks [2, 3]. By employing a spherical propagation model
in its formulation, rather than assuming a far-field model, near-field superdirectivity
(NFSD) succeeds in achieving high directivity at low frequencies for near-field
speech sources in diffuse noise conditions [4]. In previous work, near-field
superdirectivity has been shown to lead to good speech recognition performance
in high noise conditions for a near-field speaker [5].
`Superdirective techniques are typically formulated assuming a diffuse noise
`field. While this is a good approximation to many practical noise conditions,
`further noise reduction would result from a more accurate model of the
`actual noise conditions during operation. Adaptive array processing techniques
`continually update their parameters based on the statistics of the measured
`input noise. The generalized sidelobe canceler (GSC) [6] presents a structure
`that can be used to implement a variety of adaptive beamformers. A block
`diagram of the basic GSC system is shown in Fig. 1. The GSC separates
`the adaptive beamformer into two main processing paths—a standard fixed
`beamformer, w, with L constraints on the desired signal response, and an
`adaptive path, consisting of a blocking matrix, B, and a set of adaptive filters, a.
`As the desired signal has been constrained in the upper path, the lower path
`filters can be updated using an unconstrained adaptive algorithm, such as the
`least-mean-square (LMS) algorithm.
While the theory of adaptive techniques promises greater signal enhancement,
this is not always the case in real situations. A common problem with
the GSC system is leakage of the desired signal through the blocking matrix,
resulting in signal degradation at the beamformer output. This is particularly
problematic for broadband signals, such as speech, and especially for speech
recognition applications where signal distortion is critical.
In this paper we propose a system that is suited to speech enhancement in a
practical near-field situation, having both the good low frequency performance
of near-field superdirectivity and the adaptability of a GSC system, while taking
`
`
`FIG. 1. Generalized sidelobe canceler structure.
`McCowan, Moore, and Sridharan: Near-field Adaptive Beamformer
`care to minimize the problem of signal degradation for near-field sources.
`We begin by formulating a concise model for near-field sound propagation in
`Section 2. This model is then used in Section 3 to develop the proposed near-
`field adaptive beamforming (NFAB) technique. To demonstrate the benefit of
`the technique over existing methods, an experimental evaluation assessing
`directivity patterns, speech enhancement performance, and speech recognition
performance is detailed in Sections 4 and 5.
`
`2. NEAR-FIELD SOUND PROPAGATION MODEL
`
In sensor array applications, a succinct means of characterizing both the
array geometry and the location of a signal source is via the propagation vector.
The propagation vector concisely describes the theoretical propagation of the
signal from its source to each sensor in the array. In this section, we develop an
expression for the propagation vector of a sound source located in the near-field
of a microphone array using a spherical propagation model. This expression is
then used in the formulation of the proposed near-field adaptive beamformer in
the following sections.
Many microphone array processing techniques assume a planar signal
wavefront. This is reasonable for a far-field source, but when the desired
source is close to the array a more accurate spherical wavefront model must
be employed. For a microphone array of length $L$, a source is considered to be
in the near-field if $r < 2L^2/\lambda$, where $r$ is the distance to the source and $\lambda$ is the
wavelength.
We define the reference microphone as the origin of a 3-dimensional vector
space, as shown in Fig. 2. The position vector for a source in direction $(\theta_s, \phi_s)$,
at distance $r_s$ from the reference microphone, is denoted $\mathbf{p}_s$ and is given by

$$\mathbf{p}_s = r_s \begin{bmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \end{bmatrix} \begin{bmatrix} \cos\theta_s \sin\phi_s \\ \sin\theta_s \sin\phi_s \\ \cos\phi_s \end{bmatrix}. \tag{1}$$

The microphone position vectors, denoted as $\mathbf{p}_i$ $(i = 1, \ldots, N)$, are similarly
defined. The distance from the source to microphone $i$ is thus

$$d_i = \|\mathbf{p}_s - \mathbf{p}_i\|, \tag{2}$$

where $\|\cdot\|$ is the Euclidean vector norm.
In such a model, the differences in distance to each sensor can be significant
for a near-field source, resulting in phase misalignment across sensors. The
difference in propagation time to each microphone with respect to the reference
microphone ($i = 1$) is given by

$$\tau_i = \frac{d_i - d_1}{c}, \tag{3}$$
`
`
`FIG. 2. Near-field propagation model.
`
where $c = 340\ \mathrm{m\,s^{-1}}$ for sound. In addition, the wavefront amplitude decays at a
rate proportional to the distance traveled. The resulting amplitude differences
across sensors are negligible for far-field sources, but can be significant in
the near-field case. The microphone attenuation factors, with respect to the
amplitude on the reference microphone, are given by

$$a_i = \frac{d_1}{d_i}. \tag{4}$$

Thus, if $x_1(f)$ is the desired source at the reference microphone, the signal on
the $i$th microphone is given by

$$x_i(f) = a_i\, x_1(f)\, e^{-j2\pi f \tau_i}. \tag{5}$$

Consequently, we define the near-field propagation vector for a source at
distance $r$ and direction $(\theta, \phi)$ as

$$\mathbf{d}(f, r, \theta, \phi) = \begin{bmatrix} a_1 e^{-j2\pi f \tau_1} & \ldots & a_i e^{-j2\pi f \tau_i} & \ldots & a_N e^{-j2\pi f \tau_N} \end{bmatrix}^T. \tag{6}$$
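The chain from Eq. (1) to Eq. (6) can be sketched numerically as follows. This is only an illustrative sketch: the function names, the nine-microphone line array with 5 cm spacing, and the source position are assumptions made for the example, not the configuration used in the paper.

```python
import numpy as np

C = 340.0  # speed of sound in m/s, as quoted in the text


def position(r, theta, phi):
    """Cartesian position for range r and direction (theta, phi), Eq. (1)."""
    return r * np.array([np.cos(theta) * np.sin(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(phi)])


def is_near_field(r, array_length, f):
    """Near-field criterion r < 2 L^2 / lambda from the text."""
    wavelength = C / f
    return r < 2.0 * array_length ** 2 / wavelength


def propagation_vector(f, source_pos, mic_pos):
    """Near-field propagation vector of Eq. (6).

    mic_pos is an (N, 3) array whose first row is the reference microphone.
    """
    d = np.linalg.norm(source_pos - mic_pos, axis=1)  # distances, Eq. (2)
    tau = (d - d[0]) / C                               # relative delays, Eq. (3)
    a = d[0] / d                                       # attenuation factors, Eq. (4)
    return a * np.exp(-2j * np.pi * f * tau)           # Eq. (6)


# Hypothetical 9-microphone line array along the x axis (5 cm spacing, 40 cm
# long) with the reference microphone at the origin.
mics = np.zeros((9, 3))
mics[:, 0] = 0.05 * np.arange(9)
src = position(0.6, np.pi / 2, np.pi / 2)  # 0.6 m from the origin, along y
d_vec = propagation_vector(1000.0, src, mics)
```

By construction the first element of `d_vec` is 1, since the reference microphone has unit attenuation and zero relative delay; microphones further from the source have magnitude below 1.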
`
`3. NEAR-FIELD ADAPTIVE BEAMFORMING
`
The proposed system structure is shown in Fig. 3. The objective of the
proposed technique is to add the benefit of good low frequency directivity
to a standard adaptive beamformer, as low frequency performance is critical
in speech processing applications. The upper path consists of a fixed near-field
superdirective beamformer, while the lower path contains a near-field
compensation unit, a blocking matrix and an adaptive noise canceling filter.
The principal components of the system are discussed in the following sections.
`
`FIG. 3. Near-field adaptive beamformer.
`
`Section 3.1 gives an explanation of the near-field superdirective beamformer.
`Section 3.2 proposes the inclusion of a near-field compensation unit in the
`adaptive sidelobe canceling path and examines its effect on reducing signal
`distortion at the output. Once this near-field compensation has been performed,
`a standard generalized sidelobe canceling blocking matrix and adaptive filters
`can be applied to reduce the output noise power, as discussed in Section 3.3.
`
`3.1. Near-field Superdirective Beamformer
`Superdirective beamforming techniques are based upon the maximization of
`the array gain, or directivity index. The array gain is defined as the ratio of
`output signal-to-noise ratio to input signal-to-noise ratio and for the general
`case can be expressed in matrix notation as [1]
`
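$$G(f) = \frac{\mathbf{w}(f)^H \mathbf{P}(f)\, \mathbf{w}(f)}{\mathbf{w}(f)^H \mathbf{Q}(f)\, \mathbf{w}(f)}, \tag{7}$$

where $\mathbf{w}(f)$ is a column vector of channel gains,

$$\mathbf{w}(f) = \begin{bmatrix} w_1(f) & \ldots & w_i(f) & \ldots & w_N(f) \end{bmatrix}^T, \tag{8}$$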
`
$(\cdot)^H$ is the complex conjugate transpose operator, and $\mathbf{P}(f)$ and $\mathbf{Q}(f)$ are
the cross-spectral density matrices of the signal and noise respectively. In
practical speech processing applications the form of the signal and noise cross-spectral
density matrices is generally unknown and must be estimated, either
from mathematical models (fixed beamformers) or from the statistics of the
multichannel inputs (adaptive beamformers). Superdirective beamformers are
calculated based on assumed mathematical models for the $\mathbf{P}(f)$ and $\mathbf{Q}(f)$
matrices.
When the desired signal is known to emanate from a single source at location
$(r_s, \theta_s, \phi_s)$, the signal cross-spectral matrix $\mathbf{P}$ simplifies to the propagation vector
of the source, and the array gain can be expressed as

$$G(f) = \frac{\left|\mathbf{w}(f)^H \mathbf{d}(f, r_s, \theta_s, \phi_s)\right|^2}{\mathbf{w}(f)^H \mathbf{Q}(f)\, \mathbf{w}(f)}, \tag{9}$$
`
`
where $\mathbf{d}(f, r, \theta, \phi)$ is the propagation vector for the desired source, as defined in
Eq. (6).
A diffuse (spherically isotropic) noise field is often a good approximation for
many practical situations, particularly in reverberant closed spaces, such as in a
car or an office [7, 8]. For diffuse noise, the noise cross-spectral density matrix $\mathbf{Q}$
can be formulated as

$$\mathbf{Q}(f) = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} \mathbf{d}(f, \theta, \phi)\, \mathbf{d}(f, \theta, \phi)^H \sin\phi \; d\phi\, d\theta, \tag{10}$$

where $\mathbf{d}(f, \theta, \phi)$ is the propagation vector of a far-field noise source ($r > 2L^2/\lambda$)
in direction $(\theta, \phi)$.
The superdirectivity problem is thus formulated as:

$$\max_{\mathbf{w}(f)} \frac{\left|\mathbf{w}(f)^H \mathbf{d}(f, r_s, \theta_s, \phi_s)\right|^2}{\mathbf{w}(f)^H \mathbf{Q}(f)\, \mathbf{w}(f)}. \tag{11}$$
`
By using a spherical propagation model to formulate the propagation
vector, $\mathbf{d}$, the standard superdirective formulation can be optimized for a near-field
source [9, 4]. As such, the only difference in the calculation of the standard
and near-field superdirective channel filters is the form of the propagation
vector, $\mathbf{d}$. For a near-field source, the assumption of plane wave (far-field)
propagation leads to errors in the array response to the desired signal due
to curvature of the direct wavefront. A thorough discussion of the use of a
near-field model for superdirective microphone arrays is given by Ryan and
Goubran [9].
Cox [10] gives the general superdirective filter solution subject to

1. $L$ linear constraints, $\mathbf{C}(f)^H \mathbf{w}(f) = \mathbf{g}(f)$ (explained below); and
2. a constraint on the maximum white noise gain, $\mathbf{w}(f)^H \mathbf{w}(f) = \delta^{-2}$, where
$\delta^2$ is the desired white noise gain,

as

$$\mathbf{w} = [\mathbf{Q} + \epsilon \mathbf{I}]^{-1} \mathbf{C} \left\{ \mathbf{C}^H [\mathbf{Q} + \epsilon \mathbf{I}]^{-1} \mathbf{C} \right\}^{-1} \mathbf{g}, \tag{12}$$

where $\epsilon$ is a Lagrange multiplier that is iteratively adjusted to satisfy the white
noise gain constraint. The white noise gain is the array gain for spatially white
(incoherent) noise; that is, $\mathbf{Q}(f) = \mathbf{I}$. A constraint on the white noise gain is
necessary as an unconstrained superdirective solution will in fact result in
significant gain to any incoherent noise, particularly at low frequencies. Cox [10]
states that the technique of adding a small amount to each diagonal matrix
element prior to inversion is in fact the optimum means of solving this problem.
A study of the relationship between the multiplier $\epsilon$ and the desired white
noise gain $\delta^2$ shows that the white noise gain increases monotonically with
increasing $\epsilon$. One possible means of obtaining the desired value of $\epsilon$ is thus
an iterative technique employing a binary search algorithm between a specified
minimum and maximum value for $\epsilon$. The computational expense of the iterative
procedure is not critical, as the beamformer filters depend only on the source
location and array geometry, and thus must only be calculated once for a given
configuration.
The constraint matrix, $\mathbf{C}^H(f)$, is of order $L \times N$, where there are $L$ linear
constraints being applied, and the vector $\mathbf{g}(f)$ is a length-$L$ column vector
of constraining values. The constraints generally include one specifying unity
response for the desired signal, $\mathbf{d}^H(f)\mathbf{w}(f) = 1$, and where this is the sole
constraint the above solution can be simplified by substituting $\mathbf{C}(f) = \mathbf{d}(f)$ and
$\mathbf{g}(f) = 1$, giving

$$\mathbf{w}(f) = \frac{[\mathbf{Q}(f) + \epsilon \mathbf{I}]^{-1} \mathbf{d}(f)}{\mathbf{d}(f)^H [\mathbf{Q}(f) + \epsilon \mathbf{I}]^{-1} \mathbf{d}(f)}. \tag{13}$$

Once the optimal filters $\mathbf{w}(f)$ have been calculated, the near-field superdirective
beamformer output is calculated as

$$y_u(f) = \mathbf{w}(f)^H \mathbf{x}(f), \tag{14}$$

where $\mathbf{x}(f)$ is the $N$-channel input column vector

$$\mathbf{x}(f) = \begin{bmatrix} x_1(f) & \ldots & x_i(f) & \ldots & x_N(f) \end{bmatrix}^T. \tag{15}$$
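The single-constraint solution of Eq. (13) and the binary search on $\epsilon$ described above can be sketched as follows. Several things here are assumptions of the sketch rather than details from the paper: the diffuse noise matrix uses the well-known closed form $\sin(kd_{ij})/(kd_{ij})$ for omnidirectional microphones instead of evaluating Eq. (10) numerically, the search runs on a logarithmic scale between arbitrary bounds, and the array, source position and function names are hypothetical.

```python
import numpy as np

C = 340.0  # speed of sound in m/s


def diffuse_noise_matrix(f, mic_pos):
    """Closed-form coherence of spherically isotropic noise for
    omnidirectional microphones, sin(k d_ij) / (k d_ij). Used in place of
    numerically integrating Eq. (10) (an assumption of this sketch)."""
    dist = np.linalg.norm(mic_pos[:, None, :] - mic_pos[None, :, :], axis=2)
    return np.sinc(2.0 * f * dist / C)  # np.sinc(x) = sin(pi x) / (pi x)


def superdirective_weights(Q, d, eps):
    """Eq. (13): w = (Q + eps I)^-1 d / (d^H (Q + eps I)^-1 d)."""
    z = np.linalg.solve(Q + eps * np.eye(len(d)), d)
    return z / (d.conj() @ z)


def white_noise_gain(w, d):
    """Array gain for spatially white noise, i.e. Q = I."""
    return np.abs(w.conj() @ d) ** 2 / np.real(w.conj() @ w)


def find_eps(Q, d, target_wng, lo=1e-8, hi=1e2, iters=60):
    """Bisection on a log scale, relying on the white noise gain increasing
    monotonically with eps. The bounds lo and hi are arbitrary choices."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if white_noise_gain(superdirective_weights(Q, d, mid), d) < target_wng:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical 5-microphone line array (5 cm spacing) and near-field source.
f = 500.0
mics = np.zeros((5, 3))
mics[:, 0] = 0.05 * np.arange(5)
src = np.array([0.3, 0.5, 0.0])
dist = np.linalg.norm(src - mics, axis=1)
d = (dist[0] / dist) * np.exp(-2j * np.pi * f * (dist - dist[0]) / C)  # Eq. (6)

Q = diffuse_noise_matrix(f, mics)
eps = find_eps(Q, d, target_wng=1.0)  # 0 dB target, chosen arbitrarily
w = superdirective_weights(Q, d, eps)
```

If the requested white noise gain lies outside the range reachable between `lo` and `hi`, the search simply returns a value at one end of the bracket, so the achieved gain is worth checking with `white_noise_gain` afterwards.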
`
`3.2. Near-field Compensation Unit
The first element in the adaptive path of standard GSC is the blocking
matrix [6]. Its purpose is to block the desired signal from the adaptive noise
estimate. To ensure complete blocking, the desired signal must both be time
aligned and have equal amplitudes across all channels. If this is the case,
cancellation occurs if each row of the blocking matrix sums to zero, and all rows
are linearly independent.
For a near-field desired source, to align the desired signal on all channels,
a near-field compensation must first be applied to the input channels prior to
blocking. To ensure full cancellation we need to compensate for both phase
misalignment and amplitude scaling of the desired signal across sensors. We
define the diagonal matrix

$$\mathbf{D}(f) = [\mathrm{diag}(\mathbf{d}(f))]^{-1}, \tag{16}$$

where $\mathbf{d}(f)$ is the near-field propagation vector from Eq. (6). In this paper we
define the diagonal operator, $\mathrm{diag}(\cdot)$, to produce a diagonal matrix from a vector
parameter. Conversely, if invoked with a matrix parameter, it produces a row
vector corresponding to the matrix diagonal. The near-field compensation can
be applied as

$$\mathbf{x}'(f) = \mathbf{D}(f)\, \mathbf{x}(f). \tag{17}$$
`
Once this near-field compensation has been performed, a standard GSC blocking
matrix can be employed to block the desired signal from the adaptive path.
The inclusion of this compensation unit is critical for a near-field desired
signal. Without compensation for both phase and amplitude differences between
sensors, blocking of the desired signal will not be ensured, leading to signal
FIG. 4. Comparison of blocking matrix row beam-patterns. Axes: blocking matrix row response (dB) versus direction of arrival (deg). Curves: far-field, NF uncompensated (r = 0.6 m), NF compensated (r = 0.6 m).
`
cancellation at the output. The near-field compensation effectively ensures that
a true null exists in the beam-pattern of each blocking matrix row in the
direction and distance corresponding to the desired source. To illustrate, Fig. 4
shows the directivity pattern at 2 kHz for the first row in the blocking matrix
using the array shown in Fig. 5, with the desired source directly in front of the
center microphone at a distance of 0.6 m. The figure shows the compensated
response in the far- and near-fields, as well as the uncompensated near-field
response. It is clear that the uncompensated system will allow a high degree of
signal leakage into the adaptive path as it blocks noise sources rather than the
desired signal.
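The effect of the compensation on signal leakage can be checked numerically. The sketch below applies a Griffiths-Jim blocking matrix (the form given in Section 3.3) to a near-field desired signal with and without the compensation of Eqs. (16) and (17). The array geometry, source position and function names are hypothetical and are not those of the experiment.

```python
import numpy as np

C = 340.0  # speed of sound in m/s


def propagation_vector(f, src, mics):
    """Near-field propagation vector, Eq. (6); mics[0] is the reference."""
    dist = np.linalg.norm(src - mics, axis=1)
    return (dist[0] / dist) * np.exp(-2j * np.pi * f * (dist - dist[0]) / C)


def griffiths_jim(n):
    """N x (N-1) blocking matrix: column i has +1 in row i and -1 in row i+1."""
    B = np.zeros((n, n - 1))
    idx = np.arange(n - 1)
    B[idx, idx] = 1.0
    B[idx + 1, idx] = -1.0
    return B


# Hypothetical 5-microphone line array (5 cm spacing), source 0.6 m away.
f = 2000.0
mics = np.zeros((5, 3))
mics[:, 0] = 0.05 * np.arange(5)
src = np.array([0.15, 0.6, 0.0])

d = propagation_vector(f, src, mics)  # desired signal as seen by the array
D = np.diag(1.0 / d)                  # compensation matrix, Eq. (16)
B = griffiths_jim(len(d))

# B is real, so B^H equals B^T.
leak_uncompensated = np.abs(B.T @ d)        # blocking without compensation
leak_compensated = np.abs(B.T @ (D @ d))    # blocking after Eq. (17)
```

After compensation every channel carries the same value, so each blocking output is zero to within rounding, whereas the uncompensated outputs are clearly non-zero.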
`
`
`
FIG. 5. Experimental configuration. Diagram labels: desired source at 60 cm from the array; localised noise source at 270 cm.
`
`3.3. Blocking Matrix and Adaptive Noise Canceling Filter
The blocking matrix and adaptive noise canceling filters are taken from the
standard GSC technique [6]. The order of the blocking matrix is $N \times (N - L)$,
where there are $L$ constraints applied in the fixed upper path beamformer.
Generally only a unity constraint on the desired signal is specified, and the
standard $N \times (N - 1)$ Griffiths-Jim blocking matrix is used:

$$\mathbf{B} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ -1 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -1 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \\ 0 & 0 & 0 & \cdots & 0 & -1 \end{bmatrix}. \tag{18}$$

The output of the blocking matrix is calculated as

$$\mathbf{x}''(f) = \mathbf{B}^H \mathbf{x}'(f), \tag{19}$$

where $\mathbf{x}''(f)$ is an $(N - 1)$-length column vector. Defining the $(N - 1)$-length
adaptive filter column vector as

$$\mathbf{a}(f) = \begin{bmatrix} a_1(f) & \ldots & a_i(f) & \ldots & a_{N-1}(f) \end{bmatrix}^T, \tag{20}$$

the output of the lower path is given as

$$y_l(f) = \mathbf{a}(f)^H \mathbf{x}''(f). \tag{21}$$

The NFAB output is then calculated from the upper and lower path outputs as

$$y(f) = y_u(f) - y_l(f), \tag{22}$$

and the adaptive filters are updated using the standard unconstrained LMS
algorithm

$$\mathbf{a}_{k+1}(f) = \mathbf{a}_k(f) + \mu\, \mathbf{x}''_k(f)\, y_k(f), \tag{23}$$

where $\mu$ is the adaptation step size and $k$ denotes the current frame.
`
`3.4. Summary of Technique
`
In summary, the proposed NFAB technique is characterized by the series of
equations

$$y_u(f) = \mathbf{w}(f)^H \mathbf{x}(f) \tag{24a}$$
$$\mathbf{x}''(f) = \mathbf{B}^H \mathbf{D}(f)\, \mathbf{x}(f) \tag{24b}$$
$$y_l(f) = \mathbf{a}(f)^H \mathbf{x}''(f) \tag{24c}$$
$$y(f) = y_u(f) - y_l(f) \tag{24d}$$
$$\mathbf{a}_{k+1}(f) = \mathbf{a}_k(f) + \mu\, \mathbf{x}''_k(f)\, y_k(f), \tag{24e}$$

where all terms have been defined in the preceding discussion.
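The sequence (24a) to (24e) can be sketched as a per-frame loop for a single frequency bin. The function name and argument layout are hypothetical, and how the frames are obtained (for example from a short-time Fourier transform) is left open. One further assumption: the update uses the complex conjugate of the output, which is the usual convention for complex-valued LMS when the lower path output is formed as $\mathbf{a}^H \mathbf{x}''$.

```python
import numpy as np


def nfab_bin(X, w, D, B, mu):
    """Run Eqs. (24a)-(24e) over the frames of one frequency bin.

    X  : (K, N) complex array, one row of microphone spectra per frame
    w  : (N,) fixed upper-path weights
    D  : (N, N) near-field compensation matrix
    B  : (N, N-1) blocking matrix
    mu : LMS adaptation step size
    Returns the output per frame and the final adaptive filter vector.
    """
    a = np.zeros(B.shape[1], dtype=complex)
    y = np.empty(X.shape[0], dtype=complex)
    for k, x in enumerate(X):
        y_u = w.conj() @ x                 # upper path output, (24a)
        x2 = B.conj().T @ (D @ x)          # compensated and blocked input, (24b)
        y_l = a.conj() @ x2                # lower path output, (24c)
        y[k] = y_u - y_l                   # beamformer output, (24d)
        a = a + mu * x2 * np.conj(y[k])    # LMS update, (24e); conjugate assumed
    return y, a
```

Because the desired signal is removed from `x2` by the compensation and blocking stages, the adaptive filters can only reduce output power by cancelling noise, which is why an unconstrained update is sufficient.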
`
4. EXPERIMENTAL CONFIGURATION

For the experimental evaluation in this paper, we used the 11 element array
shown in Fig. 5. The array consists of a nine element broadside array, with an
additional two microphones situated directly behind the end microphones. The
total array is 40 cm wide and 15 cm deep in the horizontal plane. The broadside
microphones are arranged according to a standard broadband subarray design,
where different subarrays are used for different frequency ranges for the fixed
upper path beamformer. The two endfire microphones are included for use by
the near-field superdirective beamformer in the low frequency range. The four
subarrays are thus

• (f < 1 kHz): microphones 1-11;
• (1 kHz < f < 2 kHz): microphones 1, 2, 5, 8, and 9;
• (2 kHz < f < 4 kHz): microphones 2, 3, 5, 7, and 8; and
• (4 kHz < f < 8 kHz): microphones 3-7.
`
The array was situated in a computer room, with different sound source
locations, as shown in Fig. 5. The two sound sources were

1. the desired speaker situated 60 cm from the center microphone, directly
in front of the array; and
2. a localized noise source at an angle of 124° and a distance of 270 cm from
the array.

Impulse responses of the acoustic path between each source and microphone
were measured from multichannel recordings made in the room with the array
using the maximum length sequence technique detailed in Rife and
Vanderkooy [11]. As the impulse responses were calculated from real recordings
made simultaneously across all input channels, they take into account the real
acoustic properties of the room and the array. The multichannel desired speech
and localized noise microphone inputs were then generated by convolving the
original single-channel speech and noise signals with these impulse responses.
In addition, a real multichannel background noise recording of normal operating
conditions was made in the room with other workers present. This recording
is referred to in the experiments as the ambient noise signal and is approximately
diffuse in nature. It consists mainly of computer noise, a variable level of
background speech, and noise from an air-conditioning unit. The ambient noise
effectively represents a diffuse noise field, while the localized noise represents
a coherent noise source. In this paper, we specify the levels of the two different
noise sources independently, as the signal to ambient-noise ratio (SANR) and
`signal to localized-noise ratio (SLNR). These values are calculated as the aver-
`age segmental SNR from the speech and noise input, as measured at the center
microphone of the array.
`In this way, realistic multichannel input signals can be simulated for specified
`levels of ambient and localized noise. As well as facilitating the generation
`of different noise conditions, simulating the multichannel inputs using the
`impulse response method is more practical than making real recordings for
`speech recognition experiments, as existing single channel speech corpora may
`be used.
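The simulation procedure described in this section can be sketched as below: a single-channel signal is convolved with one impulse response per microphone, and a noise recording is scaled so that the average segmental SNR at a chosen microphone matches a target. The frame length, the absence of any silence exclusion, and all function names are assumptions of this sketch; the paper does not give these details.

```python
import numpy as np


def simulate_channels(signal, impulse_responses):
    """Convolve one signal with an impulse response per microphone.

    impulse_responses is an (N, M) array; the result has N rows.
    """
    return np.stack([np.convolve(signal, h) for h in impulse_responses])


def segmental_snr_db(speech, noise, frame_len=256):
    """Average of the per-frame SNR in dB (frame length is an assumption)."""
    n_frames = min(len(speech), len(noise)) // frame_len
    s = speech[:n_frames * frame_len].reshape(n_frames, frame_len)
    n = noise[:n_frames * frame_len].reshape(n_frames, frame_len)
    ps = np.sum(s ** 2, axis=1)
    pn = np.sum(n ** 2, axis=1)
    keep = (ps > 0) & (pn > 0)  # skip frames where the ratio is undefined
    return np.mean(10.0 * np.log10(ps[keep] / pn[keep]))


def scale_noise_to_snr(speech, noise, target_db, ref_channel, frame_len=256):
    """Scale all noise channels so the reference channel meets target_db."""
    current = segmental_snr_db(speech[ref_channel], noise[ref_channel], frame_len)
    gain = 10.0 ** ((current - target_db) / 20.0)
    return noise * gain
```

A single gain is applied to every noise channel, so the relative levels and phases between microphones, which carry the spatial character of the noise, are left unchanged.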
`
`5. EXPERIMENTAL RESULTS
`
This section presents the results of the experimental evaluation. The
proposed NFAB technique is compared to a conventional fixed filter-sum
beamformer, a fixed near-field superdirective beamformer, and a conventional
GSC adaptive beamformer. These beamformers are specified in Table 1.
The techniques are first assessed in terms of the directivity pattern in
order to demonstrate the advantage of the proposed NFAB over conventional
beamforming techniques, particularly at low frequencies. Following this, the
techniques are evaluated for speech enhancement in terms of the improvement
in signal to noise ratio and the log area ratio. Finally, the techniques are
compared in a hands-free speech recognition task in noisy conditions using the
TIDIGITS database [12].
`
`5.1. Directivity Analysis
As has been stated, the main objective of the proposed technique is to produce
an adaptive beamformer that exhibits good low frequency performance for near-field
speech sources. To assess the effectiveness of the proposed technique in
achieving this objective, in this section we analyze the horizontal directivity
pattern. The directivity of a filter-sum beamformer is expressed in matrix
notation as

$$h(f, r, \theta, \phi) = \mathbf{w}_o(f)^H \mathbf{d}(f, r, \theta, \phi), \tag{25}$$

where $\mathbf{w}_o$ is the length $N$ channel filter vector

$$\mathbf{w}_o(f) = \begin{bmatrix} w_{o1}(f) & \ldots & w_{oi}(f) & \ldots & w_{oN}(f) \end{bmatrix}^T. \tag{26}$$
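A horizontal directivity pattern following Eq. (25) can be evaluated by sweeping the azimuth at a fixed distance and frequency. The sketch below is illustrative only: the array geometry, the function names, the 1e-12 floor used to avoid taking the logarithm of zero, and the normalisation of the example filter-sum weights by $N$ are all assumptions.

```python
import numpy as np

C = 340.0  # speed of sound in m/s


def propagation_vector(f, src, mics):
    """Near-field propagation vector, Eq. (6); mics[0] is the reference."""
    dist = np.linalg.norm(src - mics, axis=1)
    return (dist[0] / dist) * np.exp(-2j * np.pi * f * (dist - dist[0]) / C)


def horizontal_directivity_db(w_o, f, r, mics, thetas):
    """|h(f, r, theta, phi = pi/2)| in dB from Eq. (25), over azimuths thetas."""
    h = np.empty(len(thetas), dtype=complex)
    for i, th in enumerate(thetas):
        src = r * np.array([np.cos(th), np.sin(th), 0.0])
        h[i] = w_o.conj() @ propagation_vector(f, src, mics)
    return 20.0 * np.log10(np.maximum(np.abs(h), 1e-12))


# Hypothetical 5-microphone line array and filter-sum weights steered to a
# source 0.6 m away at 90 degrees, scaled by 1/N for unity look response.
f = 300.0
mics = np.zeros((5, 3))
mics[:, 0] = 0.05 * np.arange(5)
thetas = np.linspace(0.0, np.pi, 181)
look = 0.6 * np.array([np.cos(thetas[90]), np.sin(thetas[90]), 0.0])
d_look = propagation_vector(f, look, mics)
w_fs = np.conj(1.0 / d_look) / len(d_look)
pattern = horizontal_directivity_db(w_fs, f, 0.6, mics, thetas)
```

With this scaling the response at the look position is 0 dB, and the rest of `pattern` shows how little a small filter-sum array discriminates at 300 Hz.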
`
TABLE 1
Beamforming Techniques in Evaluation

Technique   Description                                       Filters
FS          Conventional FS beamformer                        $\mathbf{w}_o(f) = [\mathrm{diag}(\mathbf{D}(f))]^H$
NFSD        Near-field superdirective beamformer              $\mathbf{w}_o(f) = \mathbf{w}(f)$
GSC         GSC system with FS fixed upper path beamformer    $\mathbf{w}_o(f) = [\mathrm{diag}(\mathbf{D}(f))]^H - \mathbf{D}(f)\mathbf{B}\mathbf{a}(f)$
NFAB        Near-field adaptive beamformer                    $\mathbf{w}_o(f) = \mathbf{w}(f) - \mathbf{D}(f)\mathbf{B}\mathbf{a}(f)$
`
FIG. 6. Upper path directivity pattern at 300 Hz: (a) FS; (b) NFSD.
`
5.1.1. Upper path directivity. First, we seek to demonstrate the directivity
improvement that NFSD achieves at low frequencies compared to a conventional
filter-sum (FS) beamformer. For the FS beamformer, a common solution is
to choose $\mathbf{w}_o(f) = [\mathrm{diag}(\mathbf{D}(f))]^H$. This effectively ensures that the desired signal
is aligned for phase and amplitude across sensors using a spherical propagation
model. For NFSD, we use the filter vector $\mathbf{w}(f)$ described in Section 3.1. Figure 6
shows the near-field directivity pattern at 300 Hz for the FS and NFSD. From
these figures, it is clear that the NFSD technique results in greater directional
discrimination at low frequencies compared to a conventional beamformer. At
higher frequencies (f > 1 kHz), conventional beamformers offer reasonable
directivity, and so the FS and NFSD techniques give comparable performance.
5.1.2. Lower path directivity. Second, we wish to demonstrate the effect
of the noise canceling path. The directivity of the noise canceling filters can be
obtained by using the channel filters $\mathbf{w}_o(f) = \mathbf{D}(f)\mathbf{B}\mathbf{a}(f)$. The blocking matrix
and adaptive filters essentially implement a conventional (nonsuperdirective)
beamformer that adaptively focuses on the major sources of noise. To examine
the directivity of the lower path filters, the beamformer was run on an input
speech signal with a white localized noise source (at the location shown in Fig. 5)
added at an SLNR of 0 dB and a low level of ambient noise (SANR = 20 dB).
The steady-state adaptive filter vector, $\mathbf{a}(f)$, was written to file for both the
proposed NFAB technique and the conventional GSC beamformer. The near-field
directivity patterns of the lower path filters are plotted in Figs. 7 and 8
for 300 and 5000 Hz, respectively. We see that the lower path adaptive filters
for both beamformers converge to similar solutions in terms of directivity,
producing a main lobe in the direction of the coherent noise source (+124° from
Fig. 5), as well as a null in the location of the desired speaker. As expected, the
directivity of the adaptive path is poor at low frequencies, as seen in Fig. 7.
5.1.3. Overall beamformer directivity. Finally, we examine the directivity
pattern of the overall beamformer for the NFAB and conventional adaptive
FIG. 7. Lower path directivity pattern at 300 Hz: (a) GSC lower path; (b) NFAB lower path.
`
systems. The near-field directivity patterns at 300 Hz are shown in Fig. 9.
We see that the directivity pattern of the NFAB system exhibits a true null
in the direction and at the distance of the noise source, while the directivity of
the conventional beamformer is too poor to significantly attenuate the noise at
this frequency. At frequencies above 1 kHz the directivity performance of both
techniques is comparable.
5.1.4. Summary of beamformer directivity. In summary we see that, in
terms of directivity, the proposed NFAB system:

• outperforms the conventional FS system in terms of low frequency
performance and the ability to attenuate coherent noise sources,
• outperforms the NFSD system due to the ability to attenuate coherent
noise sources, and
`
`
`
FIG. 8. Lower path directivity pattern at 5000 Hz: (a) GSC lower path; (b) NFAB lower path.
`
`Meta Platforms, Inc. Exhibit 1006
`Page 99 of 107
`
`Meta Platforms, Inc. Exhibit 1006
`Page 99 of 107
`
`

`

`100
`
`Digital Signal Processing Vol. 12, No. 1, January 2002
`FIG. 9. Overall beamformer directivity pattern at 300 Hz: (a) GSC; (b) NFAB.
`
`• outperforms the conventional GSC system in terms of low-frequency
`performance.
`In this way, we see that the proposed system succeeds in meeting the stated
`objectives and should therefore demonstrate improved performance in speech
`processing applications.
`
`5.2. Speech Enhancement Analysis
`The signal plots in Fig. 10 give an indication of the level of enhancement
`achieved by the NFAB technique. For the desired speech signal, we used a
`segment of speech from the TIDIGITS database corresponding to the digit
`sequence one-nine-eight-six. Ambient noise was added at an SANR level of
`10 dB, and a localized white noise signal was added at an SLNR level of 0 dB.
`The plots indicate that NFAB succeeds in reducing the noise level with negligible
`distortion to the desired signal.
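
As a rough illustration of how noise can be added at a prescribed level, the sketch below scales a noise signal so that the ratio of speech power to noise power equals a target value in dB. It assumes that the ratio is the average power ratio over the whole segment at a single reference channel, which may differ from the exact SANR and SLNR definitions used in this paper. The sine wave standing in for speech, the Gaussian noise, and all function names are hypothetical.

```python
import math
import random


def mean_power(signal):
    """Average power of a real-valued sequence."""
    return sum(v * v for v in signal) / len(signal)


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a scaled copy of noise so that speech/noise power hits the target."""
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * ratio))
    return [gain * v for v in noise]


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB from separately known components."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Stand-in signals: a sine wave instead of TIDIGITS speech (assumption).
random.seed(0)
speech = [math.sin(0.05 * n) for n in range(2000)]
ambient = scale_noise_to_snr(speech, [random.gauss(0, 1) for _ in speech], 10.0)
localized = scale_noise_to_snr(speech, [random.gauss(0, 1) for _ in speech], 0.0)
noisy_input = [s + a + l for s, a, l in zip(speech, ambient, localized)]

print(f"ambient level:   {snr_db(speech, ambient):.2f} dB")
print(f"localized level: {snr_db(speech, localized):.2f} dB")
```

In an array simulation the localized noise would additionally be propagated from its source position to each microphone before mixing; that step is omitted here.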
`To better measure the level of enhancement, objective speech measures were
`used to compare the different techniques. Two measures were used: the SNR
`improvement and the log area ratio distortion measure. The SNR improvement
`is defined as the difference between the SNR at the array output and the SNR
`at the array input. As the true SNR cannot be measured, it is estimated as the
`average segmental signal-plus-noise to noise ratio. While the signal to noise
`ratio is a useful measure for assessing noise reduction, it does not necessarily
`give a good indication of how much distortion has been introduced to the
`desired speech signal. The log area ratio (LAR) measure of speech quality is
`more highly correlated with perceptual intelligibility in humans [13]. The log
`area ratio measure for a frame of speech is calculated as
`
`\mathrm{LAR}(k) = \left[ \frac{1}{P} \sum_{p=1}^{P} \left( \log\frac{1 + r_o(p)}{1 - r_o(p)} - \log\frac{1 + r_p(p)}{1 - r_p(p)} \right)^{2} \right]^{1/2}    (27)
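
The following Python sketch illustrates a frame-level log area ratio distance in its commonly used form: the root-mean-square difference, over P coefficients, between the log area ratios of a reference frame and a processed frame. It assumes that the coefficients are LPC reflection coefficients obtained with the autocorrelation method and the Levinson-Durbin recursion, and that P = 12; neither choice is confirmed by the text above, and the function names and test signals are hypothetical.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation sums for lags 0..max_lag."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def reflection_coefficients(frame, order):
    """Reflection coefficients from the Levinson-Durbin recursion."""
    r = autocorrelation(frame, order)
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [1.0] + [0.0] * order  # prediction-error filter coefficients
    err = r[0]                 # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return ks


def log_area_ratios(ks, eps=1e-9):
    """Map reflection coefficients to log area ratios, clamped for safety."""
    clamped = [max(-1.0 + eps, min(1.0 - eps, k)) for k in ks]
    return [math.log((1.0 + k) / (1.0 - k)) for k in clamped]


def lar_distance(ref_frame, test_frame, order=12):
    """RMS difference between the log area ratios of two frames."""
    g_ref = log_area_ratios(reflection_coefficients(ref_frame, order))
    g_test = log_area_ratios(reflection_coefficients(test_frame, order))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g_ref, g_test)) / order)


# Synthetic stand-in frames (assumption): two tones plus noise.
random.seed(1)
clean = [
    math.sin(0.3 * n) + 0.5 * math.sin(0.9 * n) + 0.1 * random.gauss(0, 1)
    for n in range(256)
]
noisy = [c + 0.5 * random.gauss(0, 1) for c in clean]
print(f"LAR distance, clean vs noisy: {lar_distance(clean, noisy):.3f}")
```

Because both frames use the same sign convention for the reflection coefficients, the squared difference is unaffected by which convention is chosen. A per-utterance score would typically be obtained by averaging the frame distances, although the averaging used in this paper is not shown in the text above.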
`