Attorney Docket No. ALPH.P035
DUAL OMNIDIRECTIONAL MICROPHONE ARRAY (DOMA)

Inventor: Gregory C. Burnett

RELATED APPLICATIONS

This application claims the benefit of United States (US) Patent Application Numbers 60/934,551, filed June 13, 2007; 60/953,444, filed August 1, 2007; 60/954,712, filed August 8, 2007; and 61/045,377, filed April 16, 2008.

TECHNICAL FIELD

The disclosure herein relates generally to noise suppression. In particular, this disclosure relates to noise suppression systems, devices, and methods for use in acoustic applications.

BACKGROUND
Conventional adaptive noise suppression algorithms have been around for some time. These conventional algorithms have used two or more microphones to sample both an (unwanted) acoustic noise field and the (desired) speech of a user. The noise relationship between the microphones is then determined using an adaptive filter (such as Least-Mean-Squares as described in Haykin & Widrow, ISBN 0471215708, Wiley, 2002, but any adaptive or stationary system identification algorithm may be used) and that relationship is used to filter the noise from the desired signal.
Most conventional noise suppression systems currently in use for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in “Suppression of Acoustic Noise in Speech using Spectral Subtraction,” IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, US Patent Number 5,687,243 of McLaughlin, et al., and US Patent Number 4,811,404 of Vilmur, et al. There have also been several attempts at multi-microphone noise suppression systems, such as those outlined in US Patent Number 5,406,622 of Silverberg et al. and US Patent Number 5,463,694 of Bradley et al. Multi-microphone systems have not been very successful for a variety of reasons, the most compelling being poor noise cancellation performance and/or significant speech distortion. Primarily, conventional multi-microphone systems attempt to increase the SNR of the user's speech by “steering” the nulls of the system to the strongest noise sources. This approach is limited in the number of noise sources removed by the number of available nulls.

Amazon v. Jawbone, U.S. Patent 8,280,072, Amazon Ex. 1009
The Jawbone earpiece (referred to as the “Jawbone”), introduced in December 2006 by AliphCom of San Francisco, California, was the first known commercial product to use a pair of physical directional microphones (instead of omnidirectional microphones) to reduce environmental acoustic noise. The technology supporting the Jawbone is currently described under one or more of US Patent Number 7,246,058 by Burnett and/or US Patent Application Numbers 10/400,282, 10/667,207, and/or 10/769,302. Generally, multi-microphone techniques make use of an acoustic-based Voice Activity Detector (VAD) to determine the background noise characteristics, where “voice” is generally understood to include human voiced speech, unvoiced speech, or a combination of voiced and unvoiced speech. The Jawbone improved on this by using a microphone-based sensor to construct a VAD signal using directly detected speech vibrations in the user's cheek. This allowed the Jawbone to aggressively remove noise when the user was not producing speech. However, the Jawbone uses a directional microphone array.
`
`INCORPORATION BY REFERENCE
`
`Each patent, patent application, and/or publication mentionedin this
`specification is herein incorporated by referencein its entirety to the same extent
`as if each individual patent, patent application, and/or publication was specifically
`and individually indicated to be incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a two-microphone adaptive noise suppression system, under an embodiment.
Figure 2 is an array and speech source (S) configuration, under an embodiment. The microphones are separated by a distance approximately equal to 2d0, and the speech source is located a distance ds away from the midpoint of the array at an angle θ. The system is axially symmetric so only ds and θ need be specified.
Figure 3 is a block diagram for a first order gradient microphone using two omnidirectional elements O1 and O2, under an embodiment.

Figure 4 is a block diagram for a DOMA including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment.

Figure 5 is a block diagram for a DOMA including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment.

Figure 6 is an example of a headset or head-worn device that includes the DOMA, as described herein, under an embodiment.

Figure 7 is a flow diagram for denoising acoustic signals using the DOMA, under an embodiment.

Figure 8 is a flow diagram for forming the DOMA, under an embodiment.
Figure 9 is a plot of linear response of virtual microphone V2 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. The null is at 0 degrees, where the speech is normally located.

Figure 10 is a plot of linear response of virtual microphone V2 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and all noise sources are detected.

Figure 11 is a plot of linear response of virtual microphone V1 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. There is no null and the response for speech is greater than that shown in Figure 9.

Figure 12 is a plot of linear response of virtual microphone V1 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and the response is very similar to V2 shown in Figure 10.

Figure 13 is a plot of linear response of virtual microphone V1 to a speech source at a distance of 0.1 m for frequencies of 100, 500, 1000, 2000, 3000, and 4000 Hz, under an embodiment.
Figure 14 is a plot showing comparison of frequency responses for speech for the array of an embodiment and for a conventional cardioid microphone.

Figure 15 is a plot showing speech response for V1 (top, dashed) and V2 (bottom, solid) versus β with ds assumed to be 0.1 m, under an embodiment. The spatial null in V2 is relatively broad.

Figure 16 is a plot showing a ratio of V1/V2 speech responses shown in Figure 15 versus β, under an embodiment. The ratio is above 10 dB for all 0.8 < β < 1.1. This means that the physical β of the system need not be exactly modeled for good performance.

Figure 17 is a plot of β versus actual ds assuming that ds = 10 cm and θ = 0, under an embodiment.

Figure 18 is a plot of β versus θ with ds = 10 cm and assuming ds = 10 cm, under an embodiment.

Figure 19 is a plot of amplitude (top) and phase (bottom) response of N(s) with β = 1 and D = -7.2 μsec, under an embodiment. The resulting phase difference clearly affects high frequencies more than low.

Figure 20 is a plot of amplitude (top) and phase (bottom) response of N(s) with β = 1.2 and D = -7.2 μsec, under an embodiment. Non-unity β affects the entire frequency range.

Figure 21 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 30 degrees, under an embodiment. The cancellation remains below -10 dB for frequencies below 6 kHz.

Figure 22 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 45 degrees, under an embodiment. The cancellation is below -10 dB only for frequencies below about 2.8 kHz and a reduction in performance is expected.

Figure 23 shows experimental results for a 2d0 = 19 mm array using a linear β of 0.83 on a Bruel and Kjaer Head and Torso Simulator (HATS) in a very loud (~85 dBA) music/speech noise environment, under an embodiment. The noise has been reduced by about 25 dB and the speech hardly affected, with no noticeable distortion.
DETAILED DESCRIPTION

A dual omnidirectional microphone array (DOMA) that provides improved noise suppression is described herein. Compared to conventional arrays and algorithms, which seek to reduce noise by nulling out noise sources, the array of an embodiment is used to form two distinct virtual directional microphones which are configured to have very similar noise responses and very dissimilar speech responses. The only null formed by the DOMA is one used to remove the speech of the user from V2. The two virtual microphones of an embodiment can be paired with an adaptive filter algorithm and/or VAD algorithm to significantly reduce the noise without distorting the speech, significantly improving the SNR of the desired speech over conventional noise suppression systems. The embodiments described herein are stable in operation, flexible with respect to virtual microphone pattern choice, and have proven to be robust with respect to speech source-to-array distance and orientation as well as temperature and calibration techniques.

In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments of the DOMA. One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.
Unless otherwise specified, the following terms have the corresponding meanings in addition to any meaning or understanding they may convey to one skilled in the art.

The term “bleedthrough” means the undesired presence of noise during speech.

The term “denoising” means removing unwanted noise from Mic1, and also refers to the amount of reduction of noise energy in a signal in decibels (dB).
The term “devoicing” means removing/distorting the desired speech from Mic1.

The term “directional microphone (DM)” means a physical directional microphone that is vented on both sides of the sensing diaphragm.

The term “Mic1 (M1)” means a general designation for an adaptive noise suppression system microphone that usually contains more speech than noise.

The term “Mic2 (M2)” means a general designation for an adaptive noise suppression system microphone that usually contains more noise than speech.

The term “noise” means unwanted environmental acoustic noise.

The term “null” means a zero or minima in the spatial response of a physical or virtual directional microphone.

The term “O1” means a first physical omnidirectional microphone used to form a microphone array.

The term “O2” means a second physical omnidirectional microphone used to form a microphone array.

The term “speech” means desired speech of the user.

The term “Skin Surface Microphone (SSM)” is a microphone used in an earpiece (e.g., the Jawbone earpiece available from Aliph of San Francisco, California) to detect speech vibrations on the user's skin.

The term “V1” means the virtual directional “speech” microphone, which has no nulls.

The term “V2” means the virtual directional “noise” microphone, which has a null for the user's speech.

The term “Voice Activity Detection (VAD) signal” means a signal indicating when user speech is detected.

The term “virtual microphone (VM)” or “virtual directional microphone” means a microphone constructed using two or more omnidirectional microphones and associated signal processing.
Figure 1 is a two-microphone adaptive noise suppression system 100, under an embodiment. The two-microphone system 100, including the combination of physical microphones MIC 1 and MIC 2 along with the processing or circuitry components to which the microphones couple (described in detail below, but not shown in this figure), is referred to herein as the dual omnidirectional microphone array (DOMA) 110, but the embodiment is not so limited. Referring to Figure 1, in analyzing the single noise source 101 and the direct path to the microphones, the total acoustic information coming into MIC 1 (102, which can be a physical or virtual microphone) is denoted by m1(n). The total acoustic information coming into MIC 2 (103, which can also be a physical or virtual microphone) is similarly labeled m2(n).
In the z (digital frequency) domain, these are represented as M1(z) and M2(z). Then

    M1(z) = S(z) + N2(z)
    M2(z) = N(z) + S2(z)

with

    N2(z) = N(z)H1(z)
    S2(z) = S(z)H2(z),

so that

    M1(z) = S(z) + N(z)H1(z)
    M2(z) = N(z) + S(z)H2(z).     Eq. 1

This is the general case for all two-microphone systems. Equation 1 has four unknowns and only two known relationships and therefore cannot be solved explicitly.
However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts with an examination of the case where the speech is not being generated, that is, where a signal from the VAD subsystem 104 (optional) equals zero. In this case, s(n) = S(z) = 0, and Equation 1 reduces to

    M1N(z) = N(z)H1(z)
    M2N(z) = N(z),

where the N subscript on the M variables indicates that only noise is being received. This leads to

    M1N(z) = M2N(z)H1(z)

    H1(z) = M1N(z) / M2N(z).     Eq. 2
The function H1(z) can be calculated using any of the available system identification algorithms and the microphone outputs when the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.
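The adaptive calculation of H1(z) during noise-only periods can be sketched as follows. This is a minimal illustration, assuming an NLMS update and a short FIR model of the Mic2-to-Mic1 noise path; the filter length, step size, and the toy true_h1 values are hypothetical, not taken from the specification:

```python
import numpy as np

def estimate_h1_nlms(m1, m2, n_taps=8, mu=0.5, eps=1e-8):
    """Estimate H1 (the Mic2 -> Mic1 noise transfer function of Eq. 2)
    as an FIR filter, updated sample-by-sample with NLMS while the VAD
    reports noise only."""
    h1 = np.zeros(n_taps)
    buf = np.zeros(n_taps)               # recent Mic2 samples, newest first
    for n in range(len(m1)):
        buf = np.roll(buf, 1)
        buf[0] = m2[n]
        err = m1[n] - h1 @ buf           # a-priori error M1N - H1*M2N
        h1 += mu * err * buf / (buf @ buf + eps)   # normalized LMS update
    return h1

# toy check: Mic1's noise is Mic2's noise through a known 3-tap path
rng = np.random.default_rng(0)
noise = rng.standard_normal(20000)       # noise-only segment (VAD = 0)
true_h1 = np.array([0.6, 0.3, -0.1])     # hypothetical noise path
m2 = noise
m1 = np.convolve(noise, true_h1)[:noise.size]
h1_hat = estimate_h1_nlms(m1, m2)
```

Because the system here is noiseless and within the model space, the estimate converges to the true path; in practice the update would simply be gated by the VAD.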
A solution is now available for H1(z), one of the unknowns in Equation 1. The final unknown, H2(z), can be determined by using the instances where speech is being produced and the VAD equals one. When this is occurring, but the recent (perhaps less than 1 second) history of the microphones indicates low levels of noise, it can be assumed that n(s) = N(z) ≈ 0. Then Equation 1 reduces to

    M1S(z) = S(z)
    M2S(z) = S(z)H2(z),

which in turn leads to

    M2S(z) = M1S(z)H2(z)

    H2(z) = M2S(z) / M1S(z),

which is the inverse of the H1(z) calculation. However, it is noted that different inputs are being used (now only the speech is occurring whereas before only the noise was occurring). While calculating H2(z), the values calculated for H1(z) are held constant (and vice versa) and it is assumed that the noise level is not high enough to cause errors in the H2(z) calculation.
After calculating H1(z) and H2(z), they are used to remove the noise from the signal. If Equation 1 is rewritten as

    S(z) = M1(z) - N(z)H1(z)
    N(z) = M2(z) - S(z)H2(z)
    S(z) = M1(z) - [M2(z) - S(z)H2(z)]H1(z)
    S(z)[1 - H1(z)H2(z)] = M1(z) - M2(z)H1(z),

then N(z) may be substituted as shown to solve for S(z) as

    S(z) = [M1(z) - M2(z)H1(z)] / [1 - H1(z)H2(z)].     Eq. 3
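As a sanity check on Equation 3, the special case of frequency-flat (scalar) transfer functions can be worked in a few lines; the per-sample algebra below mirrors the per-frequency-bin division S = (M1 - M2·H1)/(1 - H1·H2). The leakage gains and the sinusoidal stand-in for speech are hypothetical, chosen only for illustration:

```python
import numpy as np

# hypothetical frequency-flat leakage gains standing in for H1(z), H2(z)
h1, h2 = 0.5, 0.1

rng = np.random.default_rng(1)
s = np.sin(2 * np.pi * 0.02 * np.arange(512))   # stand-in for speech
n = rng.standard_normal(512)                    # environmental noise

m1 = s + h1 * n          # Mic1: speech plus noise through H1
m2 = n + h2 * s          # Mic2: noise plus leaked speech through H2

# Eq. 3: exact recovery regardless of the noise level or spectrum
s_hat = (m1 - h1 * m2) / (1 - h1 * h2)
```

Substituting m1 and m2 shows the noise term cancels identically, which is the point made in the text: with accurate H1 and H2 the recovery does not depend on the amplitude or spectrum of the noise.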
If the transfer functions H1(z) and H2(z) can be described with sufficient accuracy, then the noise can be completely removed and the original signal recovered. This remains true without respect to the amplitude or spectral characteristics of the noise. If there is very little or no leakage from the speech source into M2, then H2(z) ≈ 0 and Equation 3 reduces to

    S(z) ≈ M1(z) - M2(z)H1(z).     Eq. 4

Equation 4 is much simpler to implement and is very stable, assuming H1(z) is stable. However, if significant speech energy is in M2(z), devoicing can occur. In order to construct a well-performing system and use Equation 4, consideration is given to the following conditions:
R1. Availability of a perfect (or at least very good) VAD in noisy conditions
R2. Sufficiently accurate H1(z)
R3. Very small (ideally zero) H2(z)
R4. During speech production, H1(z) cannot change substantially.
R5. During noise, H2(z) cannot change substantially.
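A time-domain sketch of the Equation 4 subtraction, assuming H2 ≈ 0 and a previously identified FIR estimate of H1 (the two-tap h1 and the sinusoidal stand-in for speech are hypothetical values, not from the specification):

```python
import numpy as np

def denoise_eq4(m1, m2, h1):
    """Eq. 4: S(z) ~= M1(z) - M2(z)H1(z), i.e. subtract Mic2's signal
    filtered through the estimated noise path from Mic1."""
    noise_est = np.convolve(m2, h1)[:m1.size]
    return m1 - noise_est

rng = np.random.default_rng(2)
fs = 8000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 200 * t)            # stand-in for desired speech
noise = rng.standard_normal(fs)
h1 = np.array([0.5, 0.2])                       # assumed Mic2 -> Mic1 noise path

m2 = noise                                      # Mic2: essentially all noise (H2 ~ 0)
m1 = speech + np.convolve(noise, h1)[:fs]       # Mic1: speech + filtered noise
cleaned = denoise_eq4(m1, m2, h1)               # recovers the speech term
```

In this idealized toy the noise path is known exactly, so the subtraction is perfect; conditions R1-R5 above describe what is needed for the estimated H1(z) to stay close enough to the true path in practice.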
Condition R1 is easy to satisfy if the SNR of the desired speech to the unwanted noise is high enough. “Enough” means different things depending on the method of VAD generation. If a VAD vibration sensor is used, as in Burnett 7,246,058, accurate VAD in very low SNRs (-10 dB or less) is possible. Acoustic-only methods using information from O1 and O2 can also return accurate VADs, but are limited to SNRs of ~3 dB or greater for adequate performance.

Condition R5 is normally simple to satisfy because for most applications the microphones will not change position with respect to the user's mouth very often or rapidly. In those applications where it may happen (such as hands-free conferencing systems) it can be satisfied by configuring Mic2 so that H2(z) ≈ 0.
Satisfying conditions R2, R3, and R4 is more difficult but is possible given the right combination of V1 and V2. Methods are examined below that have proven to be effective in satisfying the above, resulting in excellent noise suppression performance and minimal speech removal and distortion in an embodiment.

The DOMA, in various embodiments, can be used with the Pathfinder system as the adaptive filter system or noise removal. The Pathfinder system, available from AliphCom, San Francisco, CA, is described in detail in other patents and patent applications referenced herein. Alternatively, any adaptive filter or noise removal algorithm can be used with the DOMA in one or more various alternative embodiments or configurations.
When the DOMA is used with the Pathfinder system, the Pathfinder system generally provides adaptive noise cancellation by combining the two microphone signals (e.g., Mic1, Mic2) by filtering and summing in the time domain. The adaptive filter generally uses the signal received from a first microphone of the DOMA to remove noise from the speech received from at least one other microphone of the DOMA, which relies on a slowly varying linear transfer function between the two microphones for sources of noise. Following processing of the two channels of the DOMA, an output signal is generated in which the noise content is attenuated with respect to the speech content, as described in detail below.
Figure 2 is a generalized two-microphone array (DOMA) including an array 201/202 and speech source S configuration, under an embodiment. Figure 3 is a system 300 for generating or producing a first order gradient microphone V using two omnidirectional elements O1 and O2, under an embodiment. The array of an embodiment includes two physical microphones 201 and 202 (e.g., omnidirectional microphones) placed a distance 2d0 apart and a speech source 200 located a distance ds away at an angle of θ. This array is axially symmetric (at least in free space), so no other angle is needed. The output from each microphone 201 and 202 can be delayed (z1 and z2), multiplied by a gain (A1 and A2), and then summed with the other as demonstrated in Figure 3. The output of the array is or forms at least one virtual microphone, as described in detail below. This operation can be over any frequency range desired. By varying the magnitude and sign of the delays and gains, a wide variety of virtual microphones (VMs), also referred to herein as virtual directional microphones, can be realized. There are other methods known to those skilled in the art for constructing VMs but this is a common one and will be used in the enablement below.
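The delay-gain-sum operation of Figure 3 can be sketched as below. This is a simplified illustration restricted to integer-sample delays (a real implementation would use fractional-delay filters); the three-sample delay and unit gains are hypothetical values:

```python
import numpy as np

def delay(x, d):
    """Delay a signal by d whole samples (zero-padded at the start)."""
    return np.concatenate([np.zeros(d), x[:x.size - d]]) if d else x.copy()

def virtual_mic(o1, o2, d1, d2, a1, a2):
    """Form a virtual microphone from two omni outputs by delaying each,
    applying a gain, and summing (the structure of Figure 3)."""
    return a1 * delay(o1, d1) + a2 * delay(o2, d2)

# a plane wave along the array axis reaches O2 some samples after O1;
# delaying O1 by that amount and subtracting O2 then nulls that direction
gamma = 3                                      # hypothetical inter-mic delay
x = np.sin(2 * np.pi * 0.05 * np.arange(256))
o1 = x
o2 = delay(x, gamma)                           # far-field source seen by O2
v = virtual_mic(o1, o2, gamma, 0, 1.0, -1.0)   # null toward the source
```

Different choices of the two delays and gains give different directional patterns, which is the degree of freedom exploited in forming V1 and V2 below.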
As an example, Figure 4 is a block diagram for a DOMA 400 including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment. The DOMA includes two first order gradient microphones V1 and V2 formed using the outputs of two microphones or elements O1 and O2 (201 and 202), under an embodiment. The DOMA of an embodiment includes two physical microphones 201 and 202 that are omnidirectional microphones, as described above with reference to Figures 2 and 3. The output from each microphone is coupled to a processing component 402, or circuitry, and the processing component outputs signals representing or corresponding to the virtual microphones V1 and V2.

In this example system 400, the output of physical microphone 201 is coupled to processing component 402 that includes a first processing path that includes application of a first delay z11 and a first gain A11, and a second processing path that includes application of a second delay z12 and a second gain A12. The output of physical microphone 202 is coupled to a third processing path of the processing component 402 that includes application of a third delay z21 and a third gain A21, and a fourth processing path that includes application of a fourth delay z22 and a fourth gain A22. The output of the first and third processing paths is summed to form virtual microphone V1, and the output of the second and fourth processing paths is summed to form virtual microphone V2.
As described in detail below, varying the magnitude and sign of the delays and gains of the processing paths means that a wide variety of virtual microphones (VMs), also referred to herein as virtual directional microphones, can be realized. While the processing component 402 described in this example includes four processing paths generating two virtual microphones or microphone signals, the embodiment is not so limited. For example, Figure 5 is a block diagram for a DOMA 500 including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment. Thus, the DOMA can include a processing component 502 having any number of processing paths as appropriate to form a number N of virtual microphones.
The DOMA of an embodiment can be coupled or connected to one or more remote devices. In a system configuration, the DOMA outputs signals to the remote devices. The remote devices include, but are not limited to, at least one of cellular telephones, satellite telephones, portable telephones, wireline telephones, Internet telephones, wireless transceivers, wireless communication radios, personal digital assistants (PDAs), personal computers (PCs), headset devices, head-worn devices, and earpieces.

Furthermore, the DOMA of an embodiment can be a component or subsystem integrated with a host device. In this system configuration, the DOMA outputs signals to components or subsystems of the host device. The host device includes, but is not limited to, at least one of cellular telephones, satellite telephones, portable telephones, wireline telephones, Internet telephones, wireless transceivers, wireless communication radios, personal digital assistants (PDAs), personal computers (PCs), headset devices, head-worn devices, and earpieces.
As an example, Figure 6 is an example of a headset or head-worn device 600 that includes the DOMA, as described herein, under an embodiment. The headset 600 of an embodiment includes a housing having two areas or receptacles (not shown) that receive and hold two microphones (e.g., O1 and O2). The headset 600 is generally a device that can be worn by a speaker 602, for example, a headset or earpiece that positions or holds the microphones in the vicinity of the speaker's mouth. The headset 600 of an embodiment places a first physical microphone (e.g., physical microphone O1) in a vicinity of a speaker's lips. A second physical microphone (e.g., physical microphone O2) is placed a distance behind the first physical microphone. The distance of an embodiment is in a range of a few centimeters behind the first physical microphone or as described herein (e.g., described with reference to Figures 1-5). The DOMA is symmetric and is used in the same configuration or manner as a single close-talk microphone, but is not so limited.
Figure 7 is a flow diagram for denoising 700 acoustic signals using the DOMA, under an embodiment. The denoising 700 begins by receiving 702 acoustic signals at a first physical microphone and a second physical microphone. In response to the acoustic signals, a first microphone signal is output from the first physical microphone and a second microphone signal is output from the second physical microphone 704. A first virtual microphone is formed 706 by generating a first combination of the first microphone signal and the second microphone signal. A second virtual microphone is formed 708 by generating a second combination of the first microphone signal and the second microphone signal, and the second combination is different from the first combination. The first virtual microphone and the second virtual microphone are distinct virtual directional microphones with substantially similar responses to noise and substantially dissimilar responses to speech. The denoising 700 generates 710 output signals by combining signals from the first virtual microphone and the second virtual microphone, and the output signals include less acoustic noise than the acoustic signals.
Figure 8 is a flow diagram for forming 800 the DOMA, under an embodiment. Formation 800 of the DOMA includes forming 802 a physical microphone array including a first physical microphone and a second physical microphone. The first physical microphone outputs a first microphone signal and the second physical microphone outputs a second microphone signal. A virtual microphone array is formed 804 comprising a first virtual microphone and a second virtual microphone. The first virtual microphone comprises a first combination of the first microphone signal and the second microphone signal. The second virtual microphone comprises a second combination of the first microphone signal and the second microphone signal, and the second combination is different from the first combination. The virtual microphone array includes a single null oriented in a direction toward a source of speech of a human speaker.
The construction of VMs for the adaptive noise suppression system of an embodiment includes substantially similar noise response in V1 and V2. Substantially similar noise response as used herein means that H1(z) is simple to model and will not change much during speech, satisfying conditions R2 and R4 described above and allowing strong denoising and minimized bleedthrough.

The construction of VMs for the adaptive noise suppression system of an embodiment includes relatively small speech response for V2. The relatively small speech response for V2 means that H2(z) ≈ 0, which will satisfy conditions R3 and R5 described above.

The construction of VMs for the adaptive noise suppression system of an embodiment further includes sufficient speech response for V1 so that the cleaned speech will have significantly higher SNR than the original speech captured by O1.
The description that follows assumes that the responses of the omnidirectional microphones O1 and O2 to an identical acoustic source have been normalized so that they have exactly the same response (amplitude and phase) to that source. This can be accomplished using standard microphone array methods (such as frequency-based calibration) well known to those versed in the art.
Referring to the condition that construction of VMs for the adaptive noise suppression system of an embodiment includes relatively small speech response for V2, it is seen that for discrete systems V2(z) can be represented as:

    V2(z) = O2(z) - z^-γ βO1(z)

where

    β = d1/d2

    γ = (d2 - d1)·fs/c   (samples)

    d1 = sqrt(ds^2 - 2 ds d0 cos(θ) + d0^2)
    d2 = sqrt(ds^2 + 2 ds d0 cos(θ) + d0^2)

The distances d1 and d2 are the distance from O1 and O2 to the speech source (see Figure 2), respectively, and γ is their difference divided by c, the speed of sound, and multiplied by the sampling frequency fs. Thus γ is in samples, but need not be an integer. For non-integer γ, fractional-delay filters (well known to those versed in the art) may be used.
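The geometry above can be evaluated numerically; the sketch below computes d1, d2, β, and γ for the on-axis example used later in the text (d0 = 10.7 mm, speech source at 10 cm). The 8 kHz sampling rate and 343 m/s speed of sound are assumed values, not taken from the specification:

```python
import math

def array_params(ds, theta_deg, d0, fs=8000.0, c=343.0):
    """Distances from O1 and O2 to the source (Figure 2) and the resulting
    beta and gamma of V2(z) = O2(z) - z^-gamma * beta * O1(z).
    ds and d0 are in metres; gamma is in samples and is generally
    not an integer."""
    th = math.radians(theta_deg)
    d1 = math.sqrt(ds**2 - 2.0 * ds * d0 * math.cos(th) + d0**2)
    d2 = math.sqrt(ds**2 + 2.0 * ds * d0 * math.cos(th) + d0**2)
    beta = d1 / d2
    gamma = (d2 - d1) * fs / c       # path difference over c, times fs
    return d1, d2, beta, gamma

d1, d2, beta, gamma = array_params(ds=0.10, theta_deg=0.0, d0=0.0107)
# on axis the square roots collapse to d1 = ds - d0 and d2 = ds + d0,
# and beta comes out near 0.8, matching the example in the text
```

Note that γ here is a fraction of a sample at 8 kHz, which is why the text points to fractional-delay filters.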
It is important to note that the β above is not the conventional β used to denote the mixing of VMs in adaptive beamforming; it is a physical variable of the system that depends on the intra-microphone distance d0 (which is fixed) and the distance ds and angle θ, which can vary. As shown below, for properly calibrated microphones, it is not necessary for the system to be programmed with the exact β of the array. Errors of approximately 10-15% in the actual β (i.e. the β used by the algorithm is not the β of the physical array) have been used with very little degradation in quality. The algorithmic value of β may be calculated and set for a particular user or may be calculated adaptively during speech production when little or no noise is present. However, adaptation during use is not required for nominal performance.
Figure 9 is a plot of linear response of virtual microphone V2 with β = 0.8 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. The null in the linear response of virtual microphone V2 to speech is located at 0 degrees, where the speech is typically expected to be located. Figure 10 is a plot of linear response of virtual microphone V2 with β = 0.8 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. The linear response of V2 to noise is devoid of or includes no null, meaning all noise sources are detected.

The above formulation for V2(z) has a null at the speech location and will therefore exhibit minimal response to the speech. This is shown in Figure 9 for an array with d0 = 10.7 mm and a speech source on the axis of the array (θ = 0) at 10 cm (β = 0.8). Note that the speech null at zero degrees is not present for noise in the far field for the same microphone, as shown in Figure 10 with a noise source distance of approximately 1 meter. This ensures that noise in front of the user will be detected so that it can be removed. This differs from conventional systems that can have difficulty removing noise in the direction of the mouth of the user.
`
V1(z) can be formulated using the general form for V1(z):

    V1(z) = α_A·O1(z)·z^(-d_A) - α_B·O2(z)·z^(-d_B)

Since

    V2(z) = O2(z) - z^(-γ)·B·O1(z)

and, since for noise in the forward direction

    O2N(z) = O1N(z)·z^(-γ)

then

    V2N(z) = O1N(z)·z^(-γ) - z^(-γ)·B·O1N(z)
    V2N(z) = (1 - B)(O1N(z)·z^(-γ))

If this is then set equal to V1(z) above, the result is

    V1N(z) = α_A·O1N(z)·z^(-d_A) - α_B·O1N(z)·z^(-γ)·z^(-d_B) = (1 - B)(O1N(z)·z^(-γ))

thus we may set

    d_A = γ
    d_B = 0
    α_A = 1
    α_B = B

to get

    V1(z) = O1(z)·z^(-γ) - B·O2(z)
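As an illustrative sketch (not part of the original disclosure), the derived definitions V1(z) = O1(z)·z^(-γ) - B·O2(z) and V2(z) = O2(z) - z^(-γ)·B·O1(z) can be applied to sampled microphone signals, with γ rounded to an integer number of samples for simplicity; the function and variable names are assumptions.

```python
import numpy as np

def virtual_mics(o1, o2, B, gamma):
    """Return (v1, v2) from omni mic samples o1 and o2, scalar B, and an
    integer delay gamma in samples:
        v1[n] = o1[n - gamma] - B * o2[n]
        v2[n] = o2[n] - B * o1[n - gamma]"""
    o1d = np.concatenate([np.zeros(gamma), o1[:len(o1) - gamma]])  # z^-gamma O1
    v1 = o1d - B * o2
    v2 = o2 - B * o1d
    return v1, v2

# For noise in the forward direction, O2N(z) = O1N(z) z^-gamma, so both
# V1N and V2N should reduce to (1 - B) O1N(z) z^-gamma.
rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)
gamma = 3
o1 = noise
o2 = np.concatenate([np.zeros(gamma), noise[:-gamma]])  # forward-direction noise
v1, v2 = virtual_mics(o1, o2, B=0.8, gamma=gamma)
print(np.allclose(v1, v2))  # prints True: identical noise responses
```

The equality of v1 and v2 on forward-direction noise is exactly the property exploited next: their ratio H1(z) becomes trivial to model for noise.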
The definitions for V1 and V2 above mean that for noise, H1(z) is:

    H1(z) = V1(z)/V2(z) = [-B·O2(z) + O1(z)·z^(-γ)] / [O2(z) - z^(-γ)·B·O1(z)]

which, if the amplitude noise responses are about the same, has the form of an allpass filter. This has the advantage of being easily and accurately modeled, especially in magnitude response, satisfying R2.
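The allpass property can be verified numerically. In the sketch below (illustrative only, not part of the original disclosure), the two omnidirectional noise responses are given equal amplitude and arbitrary relative phase, and |H1| is confirmed to be unity at every sampled frequency; the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
B = 0.8
wg = rng.uniform(0, 2 * np.pi, 100)   # phase of z^-gamma at sampled frequencies
phi = rng.uniform(0, 2 * np.pi, 100)  # phase of O2 relative to O1

o1 = 1.0 + 0.0j                       # unit-amplitude reference response
o2 = np.exp(-1j * phi)                # equal amplitude, arbitrary phase
zg = np.exp(-1j * wg)                 # z^-gamma on the unit circle

# H1 = (O1 z^-gamma - B O2) / (O2 - z^-gamma B O1)
H1 = (o1 * zg - B * o2) / (o2 - zg * B * o1)
print(np.max(np.abs(np.abs(H1) - 1.0)))  # ~0: unit magnitude (allpass) everywhere
```

With a = z^(-γ) and b = O2/O1 both of unit modulus, |a - B·b|² = 1 + B² - 2B·cos(∠a - ∠b) = |b - B·a|², which is why the magnitude ratio is exactly one whenever the amplitude responses match.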
This formulation assures that the noise response will be as similar as possible and that the speech response will be proportional to (1 - B²). Since B is the ratio of the distances from O1 and O2 to the speech source, it is affected by the size of the array and the distance from the array to the speech source.
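As a hypothetical numeric illustration (not part of the original disclosure), for a source on the array axis B = d1/d2 = (r - d0/2)/(r + d0/2), so B approaches unity, and the speech response factor (1 - B²) shrinks, as the source moves away from the array; the function name and sampled distances are assumptions.

```python
def speech_gain(r, d0=0.0107):
    """On-axis B = d1/d2 and the resulting speech response factor (1 - B**2)
    for a source at distance r (m) from the center of an array with assumed
    inter-microphone spacing d0 (m)."""
    B = (r - d0 / 2) / (r + d0 / 2)
    return B, 1 - B ** 2

for r in (0.05, 0.10, 1.00):
    B, g = speech_gain(r)
    print(f"r = {r:.2f} m  B = {B:.3f}  1 - B^2 = {g:.3f}")
```

The printed trend shows why a close-talking speech source (small r) or a larger array (large d0) yields a stronger speech response.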
Figure 11 is a plot of linear response of virtual microphone V1 with B = 0.8 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. The linear response of virtual microphone V1 to speech is devoid of or includes no null, and the response for speech is greater than that shown in Figure 4.

Figure 12 is a plot of linear response of virtual microphone V1 with B = 0.8 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. The linear response of virtual microphone V1 to noise is devoid of or includes no null, and the response is very similar to V2 shown in Figure 5.

Figure 13 is a plot of linear response of virtual microphone V1 with B = 0.8 to a speech source at a distance of 0.1 m for frequencies of 100, 500, 1000, 2000, 3000, and 4000 Hz, under an embodiment. Figure 14 is a plot showing comparison of frequency responses for speech for the array of an embodiment and for a conventional cardioid microphone.

The response of V1 to speech is shown in Figure 11,
