`Case 6:21-cv-00984-ADA Document 19-9 Filed 12/23/21 Page 1 of 39
`
`EXHIBIT I
`
`
`
`(12) United States Patent
`Burnett
`
(10) Patent No.: US 8,503,691 B2
(45) Date of Patent: *Aug. 6, 2013
`
`US008503691 B2
`
`(54) VIRTUAL MICROPHONE ARRAYS USING
`DUAL OMNIDIRECTIONAL MICROPHONE
`ARRAY (DOMA)
`(75) Inventor: Gregory C. Burnett, Dodge Center, MN
`(US)
`(73) Assignee: AliphCom, San Francisco, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1050 days.
This patent is subject to a terminal disclaimer.
`
`(21) Appl. No.: 12/139,333
`
(22) Filed: Jun. 13, 2008

(65) Prior Publication Data
US 2009/0003623 A1    Jan. 1, 2009
`Related U.S. Application Data
`(60) Provisional application No. 60/934,551, filed on Jun.
`13, 2007, provisional application No. 60/953,444,
`filed on Aug. 1, 2007, provisional application No.
`60/954,712, filed on Aug. 8, 2007, provisional
application No. 61/045,377, filed on Apr. 16, 2008.
(51) Int. Cl.
H04R 3/00 (2006.01)
(52) U.S. Cl.
USPC ...... 381/92; 381/94.7; 704/233; 704/E21.004
(58) Field of Classification Search
USPC .................... 381/92, 94.7; 704/233, E21.004
`See application file for complete search history.
`
(56) References Cited
`
`U.S. PATENT DOCUMENTS
`5,473,701 A * 12/1995 Cezanne et al. ................ 381/92
`7,386,135 B2 * 6/2008 Fan ................................. 381/92
`* cited by examiner
`Primary Examiner — Howard Weiss
`(74) Attorney, Agent, or Firm — Kokka & Backus, PC
(57) ABSTRACT
A dual omnidirectional microphone array noise suppression is described. Compared to conventional arrays and algorithms, which seek to reduce noise by nulling out noise sources, the array of an embodiment is used to form two distinct virtual directional microphones which are configured to have very similar noise responses and very dissimilar speech responses. The only null formed is one used to remove the speech of the user from V2. The two virtual microphones may be paired with an adaptive filter algorithm and VAD algorithm to significantly reduce the noise without distorting the speech, significantly improving the SNR of the desired speech over conventional noise suppression systems.
`
`46 Claims, 17 Drawing Sheets
`
`
`
`
U.S. Patent    Aug. 6, 2013    Sheet 1 of 17    US 8,503,691 B2
[Sheet 1 of 17: figure content not recoverable from the extracted text.]
`
`
`
`
`
[Sheet 2 of 17: rotated figure labels; legible fragments read "SIGNAL s(n)", "NOISE n(n)", and reference numerals 100 and 101.]
`
`
`
[Sheets 3 to 5 of 17: figure content not recoverable from the extracted text.]
`
`
`
[Sheet 6 of 17: FIG. 6; figure content not recoverable from the extracted text.]

[Sheet 7 of 17: FIG. 7 and FIG. 8, flow diagrams; the step text is listed below.]

FIG. 7 (flow 700)
702: Receive acoustic signals at a first physical microphone and a second physical microphone.
704: Output first microphone signal from first physical microphone and second microphone signal from second physical microphone.
706: Form first virtual microphone using the first combination of first microphone signal and second microphone signal.
708: Form second virtual microphone using second combination of first microphone signal and second microphone signal.
710: Generate denoised output signals having less acoustic noise than received acoustic signals.

FIG. 8
802: Form physical microphone array including first physical microphone and second physical microphone.
804: Form virtual microphone array including first virtual microphone and second virtual microphone using signals from physical microphone array.
`
`
`
[Sheet 8 of 17: response plots; legible panel titles: "Linear response of V1 to a speech source at 0.10 meters" and "Linear response of V1 to a noise source at 1 meters".]
`
`
`
[Sheet 9 of 17: response plot; legible panel title: "Linear response of V1 to a speech source at 0.1 meters".]
`
`
`
[Sheet 10 of 17: FIG. 14, "Frequency response at 0 degrees"; curves labeled "Cardioid speech response" and "V1 speech response"; x-axis: Frequency (Hz), 0 to 8000.]
`
`
`
[Sheet 11 of 17: FIG. 15, "V1 (top, dashed) and V2 speech response vs. β assuming d_s = 0.1 m"; FIG. 16, "V1/V2 for speech versus β assuming d_s = 0.1 m"; x-axis of both panels: β.]
`
`
`
[Sheet 12 of 17: FIG. 17, "β factor vs. actual d_s assuming d_s = 0.1 m and theta = 0", x-axis: Actual d_s (meters); FIG. 18, "β versus theta assuming d_s = 0.1 m", x-axis: theta (degrees), -80 to 80.]
`
`
`
[Sheet 13 of 17: FIG. 19; two panels; x-axis: Frequency (Hz), 0 to 8000.]
`
`
`
[Sheet 14 of 17: FIG. 20; two panels; x-axis: Frequency (Hz), 0 to 8000.]
`
`
`
[Sheet 15 of 17: FIG. 21, "Cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 30"; two panels; x-axis: Frequency (Hz), 0 to 8000.]
`
`
`
[Sheet 16 of 17: FIG. 22, "Cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 45"; two panels; x-axis: Frequency (Hz), 0 to 8000.]
`
`
`
[Sheet 17 of 17: FIG. 23, "Original V1 (top) and cleaned V1 (bottom) with simplified VAD (dashed) in noise"; x-axis: Time (samples at 8 kHz/sec).]
`
`
`
VIRTUAL MICROPHONE ARRAYS USING DUAL OMNIDIRECTIONAL MICROPHONE ARRAY (DOMA)

RELATED APPLICATIONS

This application claims the benefit of U.S. Patent Application Nos. 60/934,551, filed Jun. 13, 2007, 60/953,444, filed Aug. 1, 2007, 60/954,712, filed Aug. 8, 2007, and 61/045,377, filed Apr. 16, 2008.

TECHNICAL FIELD

The disclosure herein relates generally to noise suppression. In particular, this disclosure relates to noise suppression systems, devices, and methods for use in acoustic applications.

BACKGROUND

Conventional adaptive noise suppression algorithms have been around for some time. These conventional algorithms have used two or more microphones to sample both an (unwanted) acoustic noise field and the (desired) speech of a user. The noise relationship between the microphones is then determined using an adaptive filter (such as Least-Mean-Squares as described in Haykin & Widrow, ISBN#0471215708, Wiley, 2002, but any adaptive or stationary system identification algorithm may be used) and that relationship used to filter the noise from the desired signal.
Most conventional noise suppression systems currently in use for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in “Suppression of Acoustic Noise in Speech using Spectral Subtraction,” IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, U.S. Pat. No. 5,687,243 of McLaughlin, et al., and U.S. Pat. No. 4,811,404 of Vilmur, et al. There have also been several attempts at multi-microphone noise suppression systems, such as those outlined in U.S. Pat. No. 5,406,622 of Silverberg et al. and U.S. Pat. No. 5,463,694 of Bradley et al. Multi-microphone systems have not been very successful for a variety of reasons, the most compelling being poor noise cancellation performance and/or significant speech distortion. Primarily, conventional multi-microphone systems attempt to increase the SNR of the user's speech by “steering” the nulls of the system to the strongest noise sources. This approach is limited in the number of noise sources removed by the number of available nulls.
The Jawbone earpiece (referred to as the “Jawbone”), introduced in December 2006 by AliphCom of San Francisco, Calif., was the first known commercial product to use a pair of physical directional microphones (instead of omnidirectional microphones) to reduce environmental acoustic noise. The technology supporting the Jawbone is currently described under one or more of U.S. Pat. No. 7,246,058 by Burnett and/or U.S. patent application Ser. Nos. 10/400,282, 10/667,207, and/or 10/769,302. Generally, multi-microphone techniques make use of an acoustic-based Voice Activity Detector (VAD) to determine the background noise characteristics, where “voice” is generally understood to include human voiced speech, unvoiced speech, or a combination of voiced and unvoiced speech. The Jawbone improved on this by using a microphone-based sensor to construct a VAD signal using directly detected speech vibrations in the user's cheek. This allowed the Jawbone to aggressively remove noise when the user was not producing speech. However, the Jawbone uses a directional microphone array.

INCORPORATION BY REFERENCE

Each patent, patent application, and/or publication mentioned in this specification is herein incorporated by reference in its entirety to the same extent as if each individual patent, patent application, and/or publication was specifically and individually indicated to be incorporated by reference.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a two-microphone adaptive noise suppression system, under an embodiment.
FIG. 2 is an array and speech source (S) configuration, under an embodiment. The microphones are separated by a distance approximately equal to 2d_0, and the speech source is located a distance d_s away from the midpoint of the array at an angle θ. The system is axially symmetric so only d_s and θ need be specified.
FIG. 3 is a block diagram for a first order gradient microphone using two omnidirectional elements O1 and O2, under an embodiment.
FIG. 4 is a block diagram for a DOMA including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment.
FIG. 5 is a block diagram for a DOMA including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment.
FIG. 6 is an example of a headset or head-worn device that includes the DOMA, as described herein, under an embodiment.
FIG. 7 is a flow diagram for denoising acoustic signals using the DOMA, under an embodiment.
FIG. 8 is a flow diagram for forming the DOMA, under an embodiment.
FIG. 9 is a plot of linear response of virtual microphone V2 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. The null is at 0 degrees, where the speech is normally located.
FIG. 10 is a plot of linear response of virtual microphone V2 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and all noise sources are detected.
FIG. 11 is a plot of linear response of virtual microphone V1 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. There is no null and the response for speech is greater than that shown in FIG. 9.
FIG. 12 is a plot of linear response of virtual microphone V1 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and the response is very similar to V2, shown in FIG. 10.
FIG. 13 is a plot of linear response of virtual microphone V1 to a speech source at a distance of 0.1 m for frequencies of 100, 500, 1000, 2000, 3000, and 4000 Hz, under an embodiment.
FIG. 14 is a plot showing comparison of frequency responses for speech for the array of an embodiment and for a conventional cardioid microphone.
FIG. 15 is a plot showing speech response for V1 (top, dashed) and V2 (bottom, solid) versus β with d_s assumed to be 0.1 m, under an embodiment. The spatial null in V2 is relatively broad.
`
FIG. 16 is a plot showing a ratio of V1/V2 speech responses shown in FIG. 10 versus β, under an embodiment. The ratio is above 10 dB for all 0.8 < β < 1.1. This means that the physical β of the system need not be exactly modeled for good performance.
FIG. 17 is a plot of β versus actual d_s assuming that d_s = 10 cm and theta = 0, under an embodiment.
FIG. 18 is a plot of β versus theta with d_s = 10 cm and assuming d_s = 10 cm, under an embodiment.
FIG. 19 is a plot of amplitude (top) and phase (bottom) response of N(s) with β = 1 and D = -7.2 μsec, under an embodiment. The resulting phase difference clearly affects high frequencies more than low.
FIG. 20 is a plot of amplitude (top) and phase (bottom) response of N(s) with β = 1.2 and D = -7.2 μsec, under an embodiment. Non-unity β affects the entire frequency range.
FIG. 21 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 30 degrees, under an embodiment. The cancellation remains below -10 dB for frequencies below 6 kHz.
FIG. 22 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 45 degrees, under an embodiment. The cancellation is below -10 dB only for frequencies below about 2.8 kHz and a reduction in performance is expected.
FIG. 23 shows experimental results for a 2d_0 = 19 mm array using a linear β of 0.83 on a Bruel and Kjaer Head and Torso Simulator (HATS) in very loud (~85 dBA) music/speech noise environment, under an embodiment. The noise has been reduced by about 25 dB and the speech hardly affected, with no noticeable distortion.

SUMMARY OF THE INVENTION

The present invention provides for dual omnidirectional microphone array devices, systems and methods.
In accordance with one embodiment, a microphone array is formed with a first virtual microphone that includes a first combination of a first microphone signal and a second microphone signal, wherein the first microphone signal is generated by a first physical microphone and the second microphone signal is generated by a second physical microphone; and a second virtual microphone that includes a second combination of the first microphone signal and the second microphone signal, wherein the second combination is different from the first combination. The first virtual microphone and the second virtual microphone are distinct virtual directional microphones with substantially similar responses to noise and substantially dissimilar responses to speech.
In accordance with another embodiment, a microphone array is formed with a first virtual microphone formed from a first combination of a first microphone signal and a second microphone signal, wherein the first microphone signal is generated by a first omnidirectional microphone and the second microphone signal is generated by a second omnidirectional microphone; and a second virtual microphone formed from a second combination of the first microphone signal and the second microphone signal, wherein the second combination is different from the first combination. The first virtual microphone has a first linear response to speech that is devoid of a null, and the second virtual microphone has a second linear response to speech that has a single null oriented in a direction toward a source of the speech, wherein the speech is human speech.
In accordance with another embodiment, a device includes a first microphone outputting a first microphone signal and a second microphone outputting a second microphone signal; and a processing component coupled to the first microphone signal and the second microphone signal, the processing component generating a virtual microphone array comprising a first virtual microphone and a second virtual microphone, wherein the first virtual microphone comprises a first combination of the first microphone signal and the second microphone signal, and wherein the second virtual microphone comprises a second combination of the first microphone signal and the second microphone signal. The second combination is different from the first combination. The first virtual microphone and the second virtual microphone have substantially similar responses to noise and substantially dissimilar responses to speech.
In accordance with another embodiment, a device includes a first microphone outputting a first microphone signal and a second microphone outputting a second microphone signal, wherein the first microphone and the second microphone are omnidirectional microphones; and a virtual microphone array comprising a first virtual microphone and a second virtual microphone, wherein the first virtual microphone comprises a first combination of the first microphone signal and the second microphone signal, and the second virtual microphone comprises a second combination of the first microphone signal and the second microphone signal. The second combination is different from the first combination, and the first virtual microphone and the second virtual microphone are distinct virtual directional microphones.
In accordance with another embodiment, a device includes a first physical microphone generating a first microphone signal; a second physical microphone generating a second microphone signal; and a processing component coupled to the first microphone signal and the second microphone signal, the processing component generating a virtual microphone array comprising a first virtual microphone and a second virtual microphone. The first virtual microphone comprises the second microphone signal subtracted from a delayed version of the first microphone signal, and the second virtual microphone comprises a delayed version of the first microphone signal subtracted from the second microphone signal.
In accordance with another embodiment, a sensor includes a physical microphone array including a first physical microphone and a second physical microphone, the first physical microphone outputting a first microphone signal and the second physical microphone outputting a second microphone signal; and a virtual microphone array comprising a first virtual microphone and a second virtual microphone, the first virtual microphone comprising a first combination of the first microphone signal and the second microphone signal, the second virtual microphone comprising a second combination of the first microphone signal and the second microphone signal. The second combination is different from the first combination, and the virtual microphone array includes a single null oriented in a direction toward a source of speech of a human speaker.

DETAILED DESCRIPTION

A dual omnidirectional microphone array (DOMA) that provides improved noise suppression is described herein. Compared to conventional arrays and algorithms, which seek to reduce noise by nulling out noise sources, the array of an embodiment is used to form two distinct virtual directional microphones which are configured to have very similar noise responses and very dissimilar speech responses. The only null formed by the DOMA is one used to remove the speech of the user from V2. The two virtual microphones of an embodiment can be paired with an adaptive filter algorithm and/or VAD algorithm to significantly reduce the noise without distorting the speech, significantly improving the SNR of the desired speech over conventional noise suppression systems. The embodiments described herein are stable in operation, flexible with respect to virtual microphone pattern choice, and have proven to be robust with respect to speech source-to-array distance and orientation as well as temperature and calibration techniques.
In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments of the DOMA. One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.
Unless otherwise specified, the following terms have the corresponding meanings in addition to any meaning or understanding they may convey to one skilled in the art.
The term “bleedthrough” means the undesired presence of noise during speech.
The term “denoising” means removing unwanted noise from Mic1, and also refers to the amount of reduction of noise energy in a signal in decibels (dB).
The term “devoicing” means removing/distorting the desired speech from Mic1.
The term “directional microphone (DM)” means a physical directional microphone that is vented on both sides of the sensing diaphragm.
The term “Mic1 (M1)” means a general designation for an adaptive noise suppression system microphone that usually contains more speech than noise.
The term “Mic2 (M2)” means a general designation for an adaptive noise suppression system microphone that usually contains more noise than speech.
The term “noise” means unwanted environmental acoustic noise.
The term “null” means a zero or minima in the spatial response of a physical or virtual directional microphone.
The term “O1” means a first physical omnidirectional microphone used to form a microphone array.
The term “O2” means a second physical omnidirectional microphone used to form a microphone array.
The term “speech” means desired speech of the user.
The term “Skin Surface Microphone (SSM)” is a microphone used in an earpiece (e.g., the Jawbone earpiece available from Aliph of San Francisco, Calif.) to detect speech vibrations on the user's skin.
The term “V1” means the virtual directional “speech” microphone, which has no nulls.
The term “V2” means the virtual directional “noise” microphone, which has a null for the user's speech.
The term “Voice Activity Detection (VAD) signal” means a signal indicating when user speech is detected.
The term “virtual microphones (VM)” or “virtual directional microphones” means a microphone constructed using two or more omnidirectional microphones and associated signal processing.
FIG. 1 is a two-microphone adaptive noise suppression system 100, under an embodiment. The two-microphone system 100 including the combination of physical microphones MIC 1 and MIC 2 along with the processing or circuitry components to which the microphones couple (described in detail below, but not shown in this figure) is referred to herein as the dual omnidirectional microphone array (DOMA) 110, but the embodiment is not so limited. Referring to FIG. 1, in analyzing the single noise source 101 and the direct path to the microphones, the total acoustic information coming into MIC 1 (102, which can be a physical or virtual microphone) is denoted by m1(n). The total acoustic information coming into MIC 2 (103, which can also be a physical or virtual microphone) is similarly labeled m2(n). In the z (digital frequency) domain, these are represented as M1(z) and M2(z).
Then,

$$M_1(z) = S(z) + N(z)H_1(z)$$
$$M_2(z) = N(z) + S(z)H_2(z) \qquad \text{(Eq. 1)}$$
`
This is the general case for all two-microphone systems. Equation 1 has four unknowns and only two known relationships and therefore cannot be solved explicitly.
However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts with an examination of the case where the speech is not being generated, that is, where a signal from the VAD subsystem 104 (optional) equals zero. In this case, s(n) = S(z) = 0, and Equation 1 reduces to

$$M_{1N}(z) = N(z)H_1(z)$$
$$M_{2N}(z) = N(z)$$
`
where the N subscript on the M variables indicates that only noise is being received. This leads to

$$M_{1N}(z) = M_{2N}(z)H_1(z)$$
$$H_1(z) = \frac{M_{1N}(z)}{M_{2N}(z)} \qquad \text{(Eq. 2)}$$
`
The function H1(z) can be calculated using any of the available system identification algorithms and the microphone outputs when the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.
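As an illustration only, the noise-only estimate of H1(z) can be sketched with a per-bin least-squares fit over frames that a VAD marks as noise. The function name `estimate_h1`, the frame layout, and the synthetic data are assumptions made for this sketch; the text does not prescribe a particular system identification algorithm.

```python
import numpy as np

def estimate_h1(m1_frames, m2_frames, vad):
    """Least-squares estimate of H1 per frequency bin from noise-only frames.

    m1_frames, m2_frames: complex spectra, shape (n_frames, n_bins).
    vad: shape (n_frames,), 1 where speech is present, 0 where only noise is.
    """
    noise_only = np.asarray(vad) == 0           # keep frames with VAD = 0
    m1n = m1_frames[noise_only]
    m2n = m2_frames[noise_only]
    cross = np.sum(m1n * np.conj(m2n), axis=0)  # sum of M1N * conj(M2N)
    power = np.sum(np.abs(m2n) ** 2, axis=0)    # sum of |M2N|^2
    return cross / np.maximum(power, 1e-12)     # approximates M1N / M2N

# Synthetic check (hypothetical data): noise reaches Mic1 through a known H1.
rng = np.random.default_rng(0)
n_frames, n_bins = 200, 65
true_h1 = 0.8 * np.exp(-1j * np.linspace(0.0, np.pi / 2, n_bins))
noise = rng.standard_normal((n_frames, n_bins)) + 1j * rng.standard_normal((n_frames, n_bins))
m2 = noise
m1 = noise * true_h1
# The first 50 frames also contain speech in Mic1 and are flagged by the VAD.
vad = np.zeros(n_frames, dtype=int)
vad[:50] = 1
m1[:50] += rng.standard_normal((50, n_bins))
h1_hat = estimate_h1(m1, m2, vad)
```

Because the speech frames are excluded by the VAD gate, the estimate depends only on the noise-only frames; running the same fit repeatedly over recent frames is one way to make it adaptive.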
A solution is now available for H1(z), one of the unknowns in Equation 1. The final unknown, H2(z), can be determined by using the instances where speech is being produced and the VAD equals one. When this is occurring, but the recent (perhaps less than 1 second) history of the microphones indicates low levels of noise, it can be assumed that n(s) = N(z) ~ 0. Then Equation 1 reduces to

$$M_{1S}(z) = S(z)$$
$$M_{2S}(z) = S(z)H_2(z)$$

which in turn leads to

$$M_{2S}(z) = M_{1S}(z)H_2(z)$$
$$H_2(z) = \frac{M_{2S}(z)}{M_{1S}(z)}$$
`
which is the inverse of the H1(z) calculation. However, it is noted that different inputs are being used (now only the speech is occurring whereas before only the noise was occurring). While calculating H2(z), the values calculated for H1(z) are held constant (and vice versa) and it is assumed that the noise level is not high enough to cause errors in the H2(z) calculation.
After calculating H1(z) and H2(z), they are used to remove the noise from the signal. If Equation 1 is rewritten as

$$S(z) = M_1(z) - N(z)H_1(z)$$
$$N(z) = M_2(z) - S(z)H_2(z)$$

then N(z) may be substituted as shown to solve for S(z) as

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_1(z)H_2(z)} \qquad \text{(Eq. 3)}$$

If the transfer functions H1(z) and H2(z) can be described with sufficient accuracy, then the noise can be completely removed and the original signal recovered. This remains true without respect to the amplitude or spectral characteristics of the noise. If there is very little or no leakage from the speech source into M2, then H2(z) ~ 0 and Equation 3 reduces to

$$S(z) \approx M_1(z) - M_2(z)H_1(z)$$
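A short numerical sketch of this noise-removal step, assuming the two-microphone model M1 = S + N·H1 and M2 = N + S·H2 and working on one frame of spectra, is given below. The function name `recover_speech` and the test values are hypothetical and serve only to check the algebra.

```python
import numpy as np

def recover_speech(m1, m2, h1, h2=None):
    """Recover the speech spectrum from two microphone spectra.

    With h2 given:  S = (M1 - M2*H1) / (1 - H1*H2)   (full solution)
    With h2 None:   S ~ M1 - M2*H1                   (assumes H2 ~ 0)
    """
    if h2 is None:
        return m1 - m2 * h1
    return (m1 - m2 * h1) / (1.0 - h1 * h2)

# Hypothetical single-frame spectra for a self-check.
rng = np.random.default_rng(1)
n_bins = 65
s = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)  # speech
n = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)  # noise
h1 = 0.9 * np.exp(-0.3j) * np.ones(n_bins)  # noise path, Mic2 -> Mic1
h2 = 0.1 * np.exp(0.2j) * np.ones(n_bins)   # speech leakage, Mic1 -> Mic2
m1 = s + n * h1
m2 = n + s * h2
s_full = recover_speech(m1, m2, h1, h2)
s_simple = recover_speech(m1, m2, h1)
```

In this sketch the full form returns the speech exactly, while the simplified form returns the speech scaled by (1 - H1·H2), which shows why small speech leakage into the second microphone matters for the simplified form.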
`
satisfying the above, resulting in excellent noise suppression performance and minimal speech removal and distortion in an embodiment.
The DOMA, in various embodiments, can be used with the Pathfinder system as the adaptive filter system or noise removal. The Pathfinder system, available from AliphCom, San Francisco, Calif., is described in detail in other patents and patent applications referenced herein. Alternatively, any adaptive filter or noise removal algorithm can be used with the DOMA in one or more various alternative embodiments or configurations.
When the DOMA is used with the Pathfinder system, the Pathfinder system generally provides adaptive noise cancellation by combining the two microphone signals (e.g., Mic1, Mic2) by filtering and summing in the time domain. The adaptive filter generally uses the signal received from a first microphone of the DOMA to remove noise from the speech received from at least one other microphone of the DOMA, which relies on a slowly varying linear transfer function between the two microphones for sources of noise. Following processing of the two channels of the DOMA, an output signal is generated in which the noise content is attenuated with respect to the speech content, as described in detail below.
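The Pathfinder algorithm itself is not reproduced in this text, so the following is only a generic stand-in: a normalized LMS (NLMS) filter that filters Mic2 and subtracts the result from Mic1 in the time domain, adapting only while a VAD reports no speech. The function name, tap count, step size, and synthetic signals are all assumptions, and the example assumes no speech leaks into Mic2.

```python
import numpy as np

def nlms_denoise(mic1, mic2, vad, n_taps=16, mu=0.5, eps=1e-8):
    """Time-domain noise cancellation by filtering Mic2 and subtracting from Mic1.

    The filter weights adapt only on samples where vad == 0 (noise only),
    so the speech in Mic1 is not used as an error signal.
    """
    w = np.zeros(n_taps)    # adaptive estimate of the Mic2 -> Mic1 noise path
    buf = np.zeros(n_taps)  # most recent Mic2 samples, newest first
    out = np.zeros(len(mic1))
    for i in range(len(mic1)):
        buf = np.roll(buf, 1)
        buf[0] = mic2[i]
        out[i] = mic1[i] - w @ buf  # subtract the predicted noise
        if vad[i] == 0:
            w = w + mu * out[i] * buf / (buf @ buf + eps)  # NLMS update
    return out

# Hypothetical signals: noise alone for 3000 samples, then noise plus speech.
rng = np.random.default_rng(2)
n = 4000
noise = rng.standard_normal(n)
path = np.array([0.6, 0.3, -0.1])  # assumed noise path into Mic1
speech = np.zeros(n)
speech[3000:] = 0.5 * np.sin(2 * np.pi * 0.01 * np.arange(1000))
vad = (np.arange(n) >= 3000).astype(int)
mic2 = noise
mic1 = speech + np.convolve(noise, path)[:n]
cleaned = nlms_denoise(mic1, mic2, vad)
```

Freezing the weights while the VAD equals one mirrors the reliance on a slowly varying noise transfer function: the filter learned during noise is assumed to remain valid while the user speaks.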
FIG. 2 is a generalized two-microphone array (DOMA) including an array 201/202 and speech source S configuration, under an embodiment. FIG. 3 is a system 300 for generating or producing a first order gradient microphone V using two omnidirectional elements O1 and O2, under an embodiment. The array of an embodiment includes two physical microphones 201 and 202 (e.g., omnidirectional microphones) placed a distance 2d_0 apart and a speech source 200 is located a distance d_s away at an angle of θ. This array is axially symmetric (at least in free space), so no other angle is needed. The output from each microphone 201 and 202 can be delayed (z1 and z2), multiplied by a gain (A1 and A2), and then summed with the other as demonstrated in FIG. 3. The output of the array is or forms at least one virtual microphone, as described in detail below. This operation can be over any frequency range desired. By varying the magnitude and sign of the delays and gains, a wide variety of virtual microphones (VMs), also referred to herein as virtual directional microphones, can be realized. There are other methods known to those skilled in the art for constructing VMs but this is a common one and will be used in the enablement below.
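The delay, gain, and sum operation can be sketched as follows. This is a minimal illustration that assumes whole-sample delays and a plane wave travelling along the array axis; a practical array with closely spaced microphones would likely need fractional delays. The helper names `delay` and `virtual_mic` and all numeric values are hypothetical.

```python
import numpy as np

def delay(x, k):
    """Delay x by k >= 0 whole samples, zero-padding the start."""
    if k == 0:
        return x.copy()
    return np.concatenate([np.zeros(k), x[:-k]])

def virtual_mic(o1, o2, delay1, gain1, delay2, gain2):
    """Form one virtual microphone: delay and scale each element, then sum."""
    return gain1 * delay(o1, delay1) + gain2 * delay(o2, delay2)

t = np.arange(400)
src = np.sin(2 * np.pi * 0.05 * t)
k = 2  # assumed acoustic travel time between the elements, in samples

# Source on the O1 side: the wave reaches O1 first and O2 k samples later.
front_o1, front_o2 = src, delay(src, k)
# Source on the O2 side: the wave reaches O2 first and O1 k samples later.
rear_o1, rear_o2 = delay(src, k), src

# Delaying O1 by k and subtracting O2 places a null toward the O1 side.
v_front = virtual_mic(front_o1, front_o2, k, 1.0, 0, -1.0)
v_rear = virtual_mic(rear_o1, rear_o2, k, 1.0, 0, -1.0)
```

With these settings the output is zero for the source on the O1 side and clearly nonzero for the source on the opposite side, which is the sense in which delays and gains shape a directional response from two omnidirectional elements.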
As an example, FIG. 4 is a block diagram for a DOMA 400 including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment. The DOMA includes two first order gradient microphones V1 and V2 formed using the outputs of two microphones or elements O1 and O2 (201 and 202), under an embodiment. The DOMA of an embodiment includes two physical microphones 201 and 202 that are omnidirectional microphones, as described above with reference to FIGS. 2 and 3. The output from each microphone is coupled to a processing component 402, or circuitry, and the processing component outputs signals representing or corresponding to the virtual microphones V1 and V2.
In this example system 400, the output of physical microphone 201 is coupled to processing component 402 that includes a first processing path that includes application of a first delay z11 and a first gain A11 and a second processing path that includes application of a second delay z12 and a second gain A12. The output of physical microphone 202 is coupled to a third processing path of the processing component 402 that includes application of a third delay z21 and a third gain A21 and a fourth processing path that includes application of