Case 6:21-cv-00984-ADA Document 19-4 Filed 12/23/21 Page 1 of 57

EXHIBIT D
(12) United States Patent
Petit et al.

(10) Patent No.: US 8,321,213 B2
(45) Date of Patent: *Nov. 27, 2012
(54) ACOUSTIC VOICE ACTIVITY DETECTION (AVAD) FOR ELECTRONIC SYSTEMS

(75) Inventors: Nicolas Petit, San Francisco, CA (US); Gregory Burnett, Dodge Center, MN (US); Zhinian Jing, San Francisco, CA (US)

(73) Assignee: AliphCom, Inc., San Francisco, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 540 days.

    This patent is subject to a terminal disclaimer.
(21) Appl. No.: 12/606,146

(22) Filed: Oct. 26, 2009

(65) Prior Publication Data
     US 2010/0128894 A1          May 27, 2010

Related U.S. Application Data

(63) Continuation-in-part of application No. 12/139,333, filed on Jun. 13, 2008, and a continuation-in-part of application No. 11/805,987, filed on May 25, 2007, now abandoned.

(60) Provisional application No. 61/108,426, filed on Oct. 24, 2008.
(51) Int. Cl.
     G10L 11/06          (2006.01)
(52) U.S. Cl. ........................... 704/208; 704/214
(58) Field of Classification Search ........... 704/208, 704/210, 214, 215; 381/99, 100, 46
     See application file for complete search history.

(56) References Cited

     U.S. PATENT DOCUMENTS

     5,459,814 A  *  10/1995  Gupta et al. .............. 704/233
     7,171,357 B2 *   1/2007  Boland .................... 704/231
     7,246,058 B2 *   7/2007  Burnett ................... 704/210
     7,464,029 B2 *  12/2008  Visser et al. ............. 704/296
     8,019,091 B2 *   9/2011  Burnett et al. ............ 381/71.8
  2009/0089053 A1 *   4/2009  Wang et al. ............... 704/233

     * cited by examiner

Primary Examiner — Abul Azad
(74) Attorney, Agent, or Firm — Kokka & Backus, PC
(57) ABSTRACT

Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.

42 Claims, 35 Drawing Sheets
Representative drawing: FIG. 5, flow diagram 500 of acoustic voice activity detection (steps 502-510).
Sheet 1 of 35: FIG. 2 (drawing).
Sheet 2 of 35: FIG. 3 (drawing).
Sheet 3 of 35: FIG. 5, flow diagram 500 of acoustic voice activity detection:
502  Forming first virtual microphone by combining first signal of first physical microphone and second signal of second physical microphone.
504  Forming filter that describes relationship for speech between first physical microphone and second physical microphone.
506  Forming second virtual microphone by applying filter to first signal to generate first intermediate signal, and summing first intermediate signal and second signal.
508  Generating energy ratio of energies of first virtual microphone and second virtual microphone.
510  Detecting acoustic voice activity of speaker when energy ratio is greater than threshold value.
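As a rough illustration of how the five steps shown in FIG. 5 fit together for one analysis window, the sketch below (in Python) wires them up. The scalar filter value, window handling, and threshold are illustrative assumptions, not values taken from the patent; the filter of step 504 is assumed to have been computed already.

```python
import numpy as np

def avad_window(mic1, mic2, beta=0.83, threshold=2.0, eps=1e-12):
    """One-window sketch of the FIG. 5 flow (steps 502-510).

    mic1/mic2: samples from the first and second physical microphones.
    beta: the speech-relationship filter of step 504, reduced here to a
    single precomputed scalar for illustration.
    """
    v1 = mic1 - beta * mic2                  # 502: form first virtual microphone
    v2 = mic2 - beta * mic1                  # 506: filter mic1, sum with mic2
    ratio = np.sqrt(np.sum(v1 ** 2) / (np.sum(v2 ** 2) + eps))  # 508: energy ratio
    return ratio > threshold                 # 510: detect voice activity
```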

Sheet 4 of 35: FIG. 6, V1 (top) and V2 (bottom) for fixed beta in noise (x-axis: time (sec)).
Sheet 5 of 35: FIG. 7, V1 (top) and V2 (bottom) for fixed beta for speech only (x-axis: time (sec)).
Sheet 6 of 35: FIG. 8, V1 (top) and V2 (bottom) for fixed beta for speech in noise (x-axis: time (sec)).
Sheet 7 of 35: FIG. 9, V1 (top) and V2 (bottom) for adaptive beta in noise (x-axis: time (sec)).
Sheet 8 of 35: FIG. 10, V1 (top) and V2 (bottom) for adaptive beta for speech only (x-axis: time (sec)).
Sheet 9 of 35: FIG. 11, V1 (top) and V2 (bottom) for adaptive beta for speech in noise (x-axis: time (sec)).
Sheet 10 of 35: FIG. 12, block diagram of the NAVSAD system (voicing sensors, processor, detection subsystem, denoising subsystem; reference numerals 1230, 1240, 1250); FIG. 13, block diagram of the PSAD system (processor, detection subsystem, denoising subsystem).
Sheet 11 of 35: FIG. 14, block diagram of the denoising subsystem (labels: NOISE, n(n), Noise Removal, Cleaned Speech).
Sheet 12 of 35: FIG. 15, flow diagram of the detection algorithm (1250). Constants legend: V = 0 if noise, 1 if UV, 2 if V; VTC = voiced threshold for corr; VTS = voiced threshold for std. dev.; ff = forgetting factor for std. dev.; num_ma = # of taps in m.a. filter; UV_ma = UV std dev m.a. thresh; UV_std = UV std dev threshold; UV = binary values denoting UV detected in each subband; num_begin = # win at "beginning". Variables legend: bh1 = LMS calc of MIC 1-2 TF; keep_old = 1 if last win V/UV, 0 otherwise; sd_ma_vector = last NV sd values; sd_ma = m.a. of the last NV sd. The flowchart reads data from m1, m2, and gems in 10 msec steps; calculates XCORR of m1 and gems, mean(abs(XCORR)) = MC, and STD DEV of gems = GSD; the NAVSAD branch sets V(window) = 2 when MC > VTC and GSD > VTS; the PSAD branch filters m1 and m2 into two bands (1500-2500 and 2500-3500 Hz), calculates bh1 using Pathfinder for each subband, forms new_sum = sum(abs(bh1)) and its standard deviation new_std, compares new_std against UV_std times the moving average sd_ma of past values, marks the subband as UV accordingly, and updates old_std, bh1_old, keep_old, sd_ma_vector, and sd_ma (moving-average filter); after both subbands are checked, the window is declared UV if CEIL(SUM(UV)/2) = 1.
Sheet 13 of 35: FIG. 16A, Gems and Mean Correlation; FIG. 16B, Gems and Standard Deviation (x-axis: 0 to 4).
Sheet 14 of 35: FIG. 17 (labels: Voicing, Noise, Acoustic; reference numeral 1706).
Sheet 15 of 35: FIG. 18 (labels: Linear array, midline).
Sheet 16 of 35: FIG. 19 (1900), d1 versus delta M for delta d = 1, 2, 3, 4 cm (x-axis: d1 (cm)).
Sheet 17 of 35: FIG. 20 (2000), acoustic data (solid) and gain parameter (dashed); labels: Gain Parameter (2002), Acoustic Data; x-axis: time (samples).
Sheet 18 of 35: FIG. 21 (2100), Mic 1 and V for "pop pan" in \headmic\micgemsp1.bin; labels: Voicing Signal, Audio Signal (2104), Gems Signal (2106), Voiced Level, Unvoiced Level, Not Voiced; x-axis: time (samples).
Sheet 19 of 35: FIG. 22 (2200); labels: SIGNAL s(n), NOISE n(n), Noise Removal, Cleaned Speech.
Sheets 20 through 22 of 35 (drawings).
Sheet 23 of 35: FIG. 27 (reference numeral 702).
Sheet 24 of 35: FIG. 28, flow diagram 2800 for denoising acoustic signals:
2802  Receive acoustic signals at a first physical microphone and a second physical microphone.
2804  Output first microphone signal from first physical microphone and second microphone signal from second physical microphone.
2806  Form first virtual microphone using the first combination of first microphone signal and second microphone signal.
2808  Form second virtual microphone using second combination of first microphone signal and second microphone signal.
2810  Generate denoised output signals having less acoustic noise than received acoustic signals.
FIG. 29, flow diagram 2900 for forming the DOMA:
2902  Form physical microphone array including first physical microphone and second physical microphone.
2904  Form virtual microphone array including first virtual microphone and second virtual microphone using signals from physical microphone array.
Sheet 25 of 35: FIG. 30, linear response of V2 to a speech source at 0.10 meters; FIG. 31, linear response of V2 to a noise source at 1 meters (polar plots).
Sheet 26 of 35: FIG. 32, linear response of V1 to a speech source at 0.10 meters; FIG. 33, linear response of V1 to a noise source at 1 meters (polar plots).
Sheet 27 of 35: FIG. 34, linear response of V1 to a speech source at 0.1 meters (polar plot).
Sheet 28 of 35: FIG. 35, frequency response at 0 degrees; y-axis: Response (dB); x-axis: Frequency (Hz), 0 to 8000; labeled curve: cardioid speech response.
Sheet 29 of 35: FIG. 36, V1 (top, dashed) and V2 speech response vs. B assuming ds = 0.1 m (y-axis: Response (dB)); FIG. 37, V1/V2 for speech versus B assuming ds = 0.1 m (y-axis: V1/V2 for speech (dB); x-axis: B, 0.4 to 1.1).
Sheet 30 of 35: FIG. 38, B factor vs. actual ds assuming ds = 0.1 m and theta = 0 (x-axis: actual ds (meters), 0.05 to 0.5); FIG. 39, B versus theta assuming ds = 0.1 m (x-axis: theta (degrees), -80 to 80).
Sheet 31 of 35: FIG. 40, amplitude (dB, top) and phase (degrees, bottom) versus frequency (Hz), 0 to 8000.
Sheet 32 of 35: FIG. 41, amplitude (dB, top) and phase (degrees, bottom) versus frequency (Hz), 0 to 8000.
Sheet 33 of 35: FIG. 42, cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 30; amplitude (dB, top) and phase (degrees, bottom) versus frequency (Hz), 0 to 8000.
Sheet 34 of 35: FIG. 43, cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 45; amplitude (dB, top) and phase (degrees, bottom) versus frequency (Hz), 0 to 8000.
Sheet 35 of 35: FIG. 44, original V1 (top) and cleaned V1 (bottom) with simplified VAD (dashed) in noise; labels: Noisy, Cleaned; x-axis: Time (samples at 8 kHz/sec), 0 to 2.5.
ACOUSTIC VOICE ACTIVITY DETECTION (AVAD) FOR ELECTRONIC SYSTEMS

RELATED APPLICATIONS

This application claims the benefit of U.S. Patent Application No. 61/108,426, filed Oct. 24, 2008.
This application is a continuation in part of U.S. patent application Ser. No. 11/805,987, filed May 25, 2007.
This application is a continuation in part of U.S. patent application Ser. No. 12/139,333, filed Jun. 13, 2008.
TECHNICAL FIELD

The disclosure herein relates generally to noise suppression. In particular, this disclosure relates to noise suppression systems, devices, and methods for use in acoustic applications.
BACKGROUND

The ability to correctly identify voiced and unvoiced speech is critical to many speech applications including speech recognition, speaker verification, noise suppression, and many others. In a typical acoustic application, speech from a human speaker is captured and transmitted to a receiver in a different location. In the speaker's environment there may exist one or more noise sources that pollute the speech signal, the signal of interest, with unwanted acoustic noise. This makes it difficult or impossible for the receiver, whether human or machine, to understand the user's speech.
Typical methods for classifying voiced and unvoiced speech have relied mainly on the acoustic content of single microphone data, which is plagued by problems with noise and the corresponding uncertainties in signal content. This is especially problematic with the proliferation of portable communication devices like mobile telephones. There are methods known in the art for suppressing the noise present in the speech signals, but these normally require a robust method of determining when speech is being produced. Non-acoustic methods have been employed successfully in commercial products such as the Jawbone headset produced by Aliphcom, Inc., San Francisco, Calif. (Aliph), but an acoustic-only solution is desired in some cases (e.g., for reduced cost, as a supplement to the non-acoustic sensor, etc.).
INCORPORATION BY REFERENCE

Each patent, patent application, and/or publication mentioned in this specification is herein incorporated by reference in its entirety to the same extent as if each individual patent, patent application, and/or publication was specifically and individually indicated to be incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a configuration of a two-microphone array with speech source S, under an embodiment.
FIG. 2 is a block diagram of V2 construction using a fixed β(z), under an embodiment.
FIG. 3 is a block diagram of V2 construction using an adaptive β(z), under an embodiment.
FIG. 4 is a block diagram of V1 construction, under an embodiment.
FIG. 5 is a flow diagram of acoustic voice activity detection, under an embodiment.
FIG. 6 shows experimental results of the algorithm using a fixed beta when only noise is present, under an embodiment.
FIG. 7 shows experimental results of the algorithm using a fixed beta when only speech is present, under an embodiment.
FIG. 8 shows experimental results of the algorithm using a fixed beta when speech and noise is present, under an embodiment.
FIG. 9 shows experimental results of the algorithm using an adaptive beta when only noise is present, under an embodiment.
FIG. 10 shows experimental results of the algorithm using an adaptive beta when only speech is present, under an embodiment.
FIG. 11 shows experimental results of the algorithm using an adaptive beta when speech and noise is present, under an embodiment.
FIG. 12 is a block diagram of a NAVSAD system, under an embodiment.
FIG. 13 is a block diagram of a PSAD system, under an embodiment.
FIG. 14 is a block diagram of a denoising subsystem, referred to herein as the Pathfinder system, under an embodiment.
FIG. 15 is a flow diagram of a detection algorithm for use in detecting voiced and unvoiced speech, under an embodiment.
FIGS. 16A, 16B, and 17 show data plots for an example in which a subject twice speaks the phrase "pop pan", under an embodiment.
FIG. 16A plots the received GEMS signal for this utterance along with the mean correlation between the GEMS signal and the Mic 1 signal and the threshold T1 used for voiced speech detection, under an embodiment.
FIG. 16B plots the received GEMS signal for this utterance along with the standard deviation of the GEMS signal and the threshold T2 used for voiced speech detection, under an embodiment.
FIG. 17 plots voiced speech detected from the acoustic or audio signal, along with the GEMS signal and the acoustic noise; no unvoiced speech is detected in this example because of the heavy background babble noise, under an embodiment.
FIG. 18 is a microphone array for use under an embodiment of the PSAD system.
FIG. 19 is a plot of delta M versus d1 for several delta d values, under an embodiment.
FIG. 20 shows a plot of the gain parameter as the sum of the absolute values of H1(z) and the acoustic data or audio from microphone 1, under an embodiment.
FIG. 21 is an alternative plot of acoustic data presented in FIG. 20, under an embodiment.
FIG. 22 is a two-microphone adaptive noise suppression system, under an embodiment.
FIG. 23 is a generalized two-microphone array (DOMA) including an array and speech source S configuration, under an embodiment.
FIG. 24 is a system for generating or producing a first order gradient microphone V using two omnidirectional elements O1 and O2, under an embodiment.
FIG. 25 is a block diagram for a DOMA including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment.
FIG. 26 is a block diagram for a DOMA including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment.
`

`

`Case 6:21-cv-00984-ADA Document 19-4 Filed 12/23/21 Page 39 of 57
`Case 6:21-cv-00984-ADA Document 19-4 Filed 12/23/21 Page 39 of 57
`
`3
FIG. 27 is an example of a headset or head-worn device that includes the DOMA, as described herein, under an embodiment.
FIG. 28 is a flow diagram for denoising acoustic signals using the DOMA, under an embodiment.
FIG. 29 is a flow diagram for forming the DOMA, under an embodiment.
FIG. 30 is a plot of linear response of virtual microphone V2 with β = 0.8 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment.
FIG. 31 is a plot of linear response of virtual microphone V2 with β = 0.8 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment.
FIG. 32 is a plot of linear response of virtual microphone V1 with β = 0.8 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment.
FIG. 33 is a plot of linear response of virtual microphone V1 with β = 0.8 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment.
FIG. 34 is a plot of linear response of virtual microphone V1 with β = 0.8 to a speech source at a distance of 0.1 m for frequencies of 100, 500, 1000, 2000, 3000, and 4000 Hz, under an embodiment.
FIG. 35 is a plot showing comparison of frequency responses for speech for the array of an embodiment and for a conventional cardioid microphone, under an embodiment.
FIG. 36 is a plot showing speech response for V1 (top, dashed) and V2 (bottom, solid) versus B with ds assumed to be 0.1 m, under an embodiment.
FIG. 37 is a plot showing a ratio of V1/V2 speech responses shown in FIG. 36 versus B, under an embodiment.
FIG. 38 is a plot of B versus actual ds assuming that ds = 10 cm and theta = 0, under an embodiment.
FIG. 39 is a plot of B versus theta with d = 10 cm and assuming ds = 10 cm, under an embodiment.
FIG. 40 is a plot of amplitude (top) and phase (bottom) response of N(s) with B = 1 and D = -7.2 µsec, under an embodiment.
FIG. 41 is a plot of amplitude (top) and phase (bottom) response of N(s) with B = 1.2 and D = -7.2 µsec, under an embodiment.
FIG. 42 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with q1 = 0 degrees and q2 = 30 degrees, under an embodiment.
FIG. 43 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with q1 = 0 degrees and q2 = 45 degrees, under an embodiment.
FIG. 44 shows experimental results for a 2d0 = 19 mm array using a linear B of 0.83 and B1 = B2 = 1 on a Bruel and Kjaer Head and Torso Simulator (HATS) in very loud (~85 dBA) music/speech noise environment.
DETAILED DESCRIPTION

Acoustic Voice Activity Detection (AVAD) methods and systems are described herein. The AVAD methods and systems, which include algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either a fixed or an adaptive filter. The adaptive filter generally results in a more accurate and noise-robust VAD signal but requires training. In addition, restrictions can be placed on the filter to ensure that it is training only on speech and not on environmental noise.

In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments. One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.

FIG. 1 is a configuration of a two-microphone array of the AVAD with speech source S, under an embodiment. The AVAD of an embodiment uses two physical microphones (O1 and O2) to form two virtual microphones (V1 and V2). The virtual microphones of an embodiment are directional microphones, but the embodiment is not so limited. The physical microphones of an embodiment include omnidirectional microphones, but the embodiments described herein are not limited to omnidirectional microphones. The virtual microphone (VM) V2 is configured in such a way that it has minimal response to the speech of the user, while V1 is configured so that it does respond to the user's speech but has a very similar noise magnitude response to V2, as described in detail herein. The PSAD VAD methods can then be used to determine when speech is taking place. A further refinement is the use of an adaptive filter to further minimize the speech response of V2, thereby increasing the speech energy ratio used in PSAD and resulting in better overall performance of the AVAD.

The PSAD algorithm as described herein calculates the ratio of the energies of two directional microphones M1 and M2:
    R = sqrt( Σ M1(zi)² / Σ M2(zi)² )
where the zi indicates the discrete frequency domain and ranges from the beginning of the window of interest to the end, but the same relationship holds in the time domain. The summation can occur over a window of any length; 200 samples at a sampling rate of 8 kHz has been used to good effect. Microphone M1 is assumed to have a greater speech response than microphone M2. The ratio R depends on the relative strength of the acoustic signal of interest as detected by the microphones.
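As a minimal illustration of this ratio, the sketch below computes R over one fixed-length window of two microphone signals in the time domain. The 200-sample window at 8 kHz follows the text above; the function and variable names, and the toy signals, are illustrative assumptions rather than anything specified in the patent.

```python
import numpy as np

def energy_ratio(m1: np.ndarray, m2: np.ndarray, eps: float = 1e-12) -> float:
    """Ratio of the energies of two microphone signals over one window.

    m1 is assumed to have the greater speech response, as in the text above.
    eps guards against division by zero in silent windows (an implementation
    choice, not part of the patent text).
    """
    return float(np.sqrt(np.sum(m1 ** 2) / (np.sum(m2 ** 2) + eps)))

# Example: a 200-sample window at 8 kHz, as mentioned above.
rng = np.random.default_rng(0)
noise = rng.standard_normal(200)
m1 = 1.2 * noise      # closer microphone: slightly stronger pickup
m2 = 1.0 * noise
print(energy_ratio(m1, m2))   # approximately 1.2 for this toy example
```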
For matched omnidirectional microphones (i.e. they have the same response to acoustic signals for all spatial orientations and frequencies), the size of R can be calculated for speech and noise by approximating the propagation of speech and noise waves as spherically symmetric sources. For these the energy of the propagating wave decreases as 1/r²:
    R = ||M1(z)|| / ||M2(z)|| = d2/d1 = (d1 + d)/d1

The distance d1 is the distance from the acoustic source to M1, d2 is the distance from the acoustic source to M2, and d = d2 - d1 (see FIG. 1). It is assumed that O1 is closer to the speech source (the user's mouth) so that d is always positive. If the microphones and the user's mouth are all on a line, then d = 2d0, the distance between the microphones.
For matched omnidirectional microphones, the magnitude of R depends only on the relative distance between the microphones and the acoustic source. For noise sources, the distances are typically a meter or more, and for speech sources, the distances are on the order of 10 cm, but the distances are not so limited. Therefore for a 2-cm array typical values of R are:
    R_S ≈ d2/d1 = 12 cm / 10 cm = 1.2
    R_N ≈ d2/d1 = 102 cm / 100 cm = 1.02
where the "S" subscript denotes the ratio for speech sources and "N" the ratio for noise sources. There is not a significant amount of separation between noise and speech sources in this case, and therefore it would be difficult to implement a robust solution using simple omnidirectional microphones.
A better implementation is to use directional microphones where the second microphone has minimal speech response. As described herein, such microphones can be constructed using omnidirectional microphones O1 and O2:
    V1(z) = -β(z)α(z)O2(z) + O1(z)z^(-γ)
    V2(z) = α(z)O2(z) - β(z)O1(z)z^(-γ)
where α(z) is a calibration filter used to compensate O2's response so that it is the same as O1, β(z) is a filter that describes the relationship between O1 and calibrated O2 for speech, and γ is a fixed delay that depends on the size of the array. There is no loss of generality in defining α(z) as above, as either microphone may be compensated to match the other. For this configuration V1 and V2 have very similar noise response magnitudes and very dissimilar speech response magnitudes if
    γ = d/c
where again d = 2d0 and c is the speed of sound in air, which is temperature dependent and approximately
    c = 331.3 * sqrt(1 + T/273.15) m/sec

where T is the temperature of the air in Celsius.
The filter β(z) can be calculated using wave theory to be
    β(z) = d1/d2 = d1/(d1 + d)          [2]
where again d1 is the distance from the user's mouth to O1.
FIG. 2 is a block diagram of V2 construction using a fixed β(z), under an embodiment. This fixed (or static) β works sufficiently well if the calibration filter α(z) is accurate and d1 and d2 are accurate for the user. This fixed-β algorithm, however, neglects important effects such as reflection, diffraction, poor array orientation (i.e. the microphones and the mouth of the user are not all on a line), and the possibility of different d1 and d2 values for different users.
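A minimal sketch of the fixed-filter construction described above, assuming scalar α = 1 (already matched microphones), a scalar β = d1/(d1 + d), and a delay γ rounded to a whole number of samples. The sample rate, distances, and function names are illustrative assumptions, not values from the patent, and a real implementation would use a fractional-delay or frequency-domain filter rather than the integer-sample shortcut shown here.

```python
import numpy as np

def fixed_beta_virtual_mics(o1, o2, d1=0.10, d=0.02, fs=8000, c=343.0):
    """Form V1 and V2 from omnidirectional signals o1, o2 with a fixed beta.

    Assumes alpha(z) = 1 (calibrated microphones) and approximates the delay
    gamma = d / c by an integer number of samples, a simplification of the
    filters described in the text.
    """
    beta = d1 / (d1 + d)                 # e.g. 0.10 / 0.12, roughly 0.83
    gamma = int(round(d / c * fs))       # d/c is ~58 us, under one sample at 8 kHz
    if gamma:
        o1_delayed = np.concatenate([np.zeros(gamma), o1[:len(o1) - gamma]])
    else:
        o1_delayed = o1

    v1 = -beta * o2 + o1_delayed         # V1(z) = -beta*alpha*O2(z) + O1(z)z^-gamma
    v2 = o2 - beta * o1_delayed          # V2(z) =  alpha*O2(z) - beta*O1(z)z^-gamma
    return v1, v2
```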
The filter β(z) can also be determined experimentally using an adaptive filter. FIG. 3 is a block diagram of V2 construction using an adaptive β̃(z), under an embodiment, where:

    β̃(z) = α(z)O2(z) / ( z^(-γ) O1(z) )

The adaptive process varies β̃(z) to minimize the output of V2 when only speech is being received by O1 and O2. A small amount of noise may be tolerated with little ill effect, but it is preferred that only speech is being received when the coefficients of β̃(z) are calculated. Any adaptive process may be used; a normalized least-mean squares (NLMS) algorithm was used in the examples below.
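A minimal sketch of one way such an NLMS update could look, adapting a short FIR estimate of β̃ so that the filtered, delayed O1 cancels the calibrated O2 (i.e., drives the V2 output toward zero) while speech is being received. The tap count, step size, and names are illustrative assumptions rather than parameters given in the patent.

```python
import numpy as np

def nlms_adapt_beta(o1_delayed, o2, n_taps=16, mu=0.1, eps=1e-8):
    """Adapt FIR coefficients b (an estimate of beta-tilde) during speech.

    o1_delayed is O1 already delayed by gamma (and calibrated if needed);
    o2 is the calibrated O2 signal. The update minimizes the V2 output
    e = o2 - b * o1_delayed, sample by sample.
    """
    b = np.zeros(n_taps)
    x_buf = np.zeros(n_taps)
    for n in range(len(o2)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = o1_delayed[n]
        e = o2[n] - b @ x_buf                          # instantaneous V2 output
        b += mu * e * x_buf / (x_buf @ x_buf + eps)    # NLMS step
    return b
```

In practice the update would only be run on frames already judged to contain speech, mirroring the restriction described in the text.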
The V1 can be constructed using the current value for β̃(z) or the fixed filter β(z) can be used for simplicity. FIG. 4 is a block diagram of V1 construction, under an embodiment.
Now the ratio R is
    R = ||V1(z)|| / ||V2(z)|| = ||-β(z)α(z)O2(z) + O1(z)z^(-γ)|| / ||α(z)O2(z) - β̃(z)O1(z)z^(-γ)||
where double bar indicates norm and again any size window may be used. If β̃(z) has been accurately calculated, the ratio
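A minimal sketch of turning this windowed ratio of V1 and V2 energies into a VAD decision against a simple fixed threshold, in the spirit of steps 508 and 510 of FIG. 5. The window length and threshold value are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def avad_decision(v1, v2, win=200, threshold=2.0, eps=1e-12):
    """Frame-by-frame VAD from the V1/V2 energy ratio (compare FIG. 5, 508-510)."""
    decisions = []
    for start in range(0, min(len(v1), len(v2)) - win + 1, win):
        r = np.sqrt(np.sum(v1[start:start + win] ** 2) /
                    (np.sum(v2[start:start + win] ** 2) + eps))
        decisions.append(r > threshold)   # True = voice activity detected
    return np.array(decisions)
```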
