US008838184B2

(12) United States Patent
Burnett et al.

(10) Patent No.: US 8,838,184 B2
(45) Date of Patent: *Sep. 16, 2014
(54) WIRELESS CONFERENCE CALL TELEPHONE

(75) Inventors: Gregory C. Burnett, Northfield, MN (US); Michael Goertz, Redwood City, CA (US); Nicolas Jean Petit, Mountain View, CA (US); Zhinian Jing, Belmont, CA (US); Steven Foster Forestieri, Santa Clara, CA (US); Thomas Alan Donaldson, London (GB)
(73) Assignee: AliphCom, San Francisco, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 132 days.

    This patent is subject to a terminal disclaimer.
(21) Appl. No.: 13/184,422

(22) Filed: Jul. 15, 2011
(65) Prior Publication Data

    US 2012/0288079 A1    Nov. 15, 2012

Related U.S. Application Data

(63) Continuation-in-part of application No. 12/139,333, filed on Jun. 13, 2008, now Pat. No. 8,503,691, and a continuation-in-part of application No. 10/667,207, filed on Sep. 18, 2003, now Pat. No. 8,019,091.

(60) Provisional application No. 61/364,675, filed on Jul. 15, 2010.
(51) Int. Cl.
    H04B 1/38      (2006.01)
    H04M 1/00      (2006.01)
    H04M 3/56      (2006.01)
    H04R 3/04      (2006.01)
    G10L 21/0208   (2013.01)
    H04R 3/00      (2006.01)
    H04R 1/40      (2006.01)
    H04R 1/10      (2006.01)
    G10L 21/0216   (2013.01)
(52) U.S. Cl.
    CPC: H04M 3/568 (2013.01); H04R 1/1083 (2013.01); H04M 2203/509 (2013.01); G10L 2021/02165 (2013.01); H04R 2420/07 (2013.01); H04M 3/56 (2013.01); H04R 3/04 (2013.01); G10L 21/0208 (2013.01); H04R 3/005 (2013.01); H04M 2250/62 (2013.01); H04R 1/406 (2013.01)
    USPC: 455/569.1; 455/570

(58) Field of Classification Search
    USPC: 455/569.1, 570
    See application file for complete search history.

(56) References Cited

    U.S. PATENT DOCUMENTS

    6,707,910 B1*    3/2004  Valve et al. ............ 379/388.06
    2002/0039425 A1* 4/2002  Burnett et al. .......... 381/94.7

    (Continued)
Primary Examiner — Howard Weiss
(74) Attorney, Agent, or Firm — Kokka & Backus, PC

(57) ABSTRACT

A wireless conference call telephone system uses body-worn wired or wireless audio endpoints comprising microphone arrays and, optionally, speakers. These audio endpoints, which include headsets, pendants, and clip-on microphones, to name a few, are used to capture the user’s voice, and the resulting data may be used to remove echo and environmental acoustic noise. Each audio endpoint transmits its audio to the telephony gateway, where noise and echo suppression can take place if not already performed on the audio endpoint, and where each audio endpoint’s output can be labeled, integrated with the output of other audio endpoints, and transmitted over one or more telephony channels of a telephone network. The noise and echo suppression can also be done on the audio endpoint. The labeling of each user’s output can be used by the outside caller’s phone to spatially locate each user in space, increasing intelligibility.

16 Claims, 33 Drawing Sheets
[Front-page figure: reproduction of FIG. 4, a block diagram of the Parent (network connection, telephony connections, multi-way calling subsystem 420, audio processing and management, wireless radios) connected to a wired Child and wireless Children and/or Friends. The drawing itself did not survive text extraction.]
US 8,838,184 B2
Page 2

(56) References Cited

    U.S. PATENT DOCUMENTS

    2009/0081999 A1*  3/2009  Khasawneh et al. ........ 455/416
    2009/0264114 A1* 10/2009  Virolainen et al. ....... 455/416
    2012/0184337 A1*  7/2012  Burnett et al. .......... 455/569.1

    * cited by examiner
[Drawing sheets 1 through 33 (U.S. Patent, Sep. 16, 2014, US 8,838,184 B2) are figures that did not survive text extraction; only stray labels remain. The figures are described fully in the Brief Description of the Drawings below. Recoverable sheet content:

Sheet 1, FIG. 1: clip-on Child device (clip, vents, directional mic inside, battery/radio/DSP, multi-use button 140).
Sheet 2, FIG. 2: pendant Child device (multi-use button, connector behind neck of user, battery/radio/DSP inside, vent).
Sheet 3, FIG. 3: Parent unit (message window, function buttons, Child recharging stations, four wireless headset Children, power and telephony connections).
Sheet 4, FIG. 4: block diagram (Parent with network connection, telephony connections, multi-way calling subsystem 420, audio processing, management 435, wireless radios; wired Child; wireless Children and/or Friends).
Sheet 5, FIG. 5: audio streaming between far-end users (over SIP) and near-end users (Friend A, Friend B).
Sheet 6, FIG. 6: connection flow (user participates in conference call; leaves room; sets up telephony connection to continue call (optional); breaks connection; removed from conference call; request/accept audio connection; set up audio connection; add audio to conference call).
Sheets 7-11, FIGS. 7-12: two-microphone noise suppression system (signal s(n), noise n(n)), array and speech-source configuration, first-order gradient microphone, and DOMA block diagrams.
Sheet 12, FIGS. 13-14: flow diagrams (receive acoustic signals at first and second physical microphones; output first and second microphone signals; form first and second virtual microphones from combinations of the microphone signals; generate denoised output signals having less acoustic noise than the received acoustic signals; form physical and virtual microphone arrays).
Sheets 13-18, FIGS. 15-24: response plots ("Linear response of V2 to a speech source at 0.10 meters", "Linear response of V1 to a speech source at 0.1 meters", "Linear response of V1 to a noise source at 1 meters", "Frequency response at 0 degrees" with cardioid speech response, "V1 (top, dashed) and V2 speech response vs. B assuming ds = 0.1 m", "V1/V2 for speech versus B assuming ds = 0.1 m", "B factor vs. actual ds assuming ds = 0.1 m and theta = 0", "B versus theta assuming ds = 0.1 m").
Sheets 19-22, FIGS. 25-28: amplitude/phase plots ("N(s) for B = 1.2 and D = -7.2e-006 seconds", "Cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 30", "Cancellation with d1 = 1, theta1 = 0, d2 = 1, and theta2 = 45").
Sheet 23, FIG. 29: "Original V1 (top) and cleaned V1 (bottom) with simplified VAD (dashed) in noise" (noisy vs. cleaned, time in samples at 8 kHz/sec).
Sheets 24-26, FIGS. 30-33: denoising system block diagrams (sensors, microphones, processor, noise removal subsystem, cleaned speech; signal and noise sources).
Sheet 27, FIG. 34: denoising flow (receive acoustic signals; receive voice activity detection (VAD) information; determine absence of voicing and generate first transfer function; determine presence of voicing and generate second transfer function; produce denoised acoustic data stream).
Sheet 28, FIG. 35: "Noise Removal Results for American English Female Saying 406-5562" (dirty audio vs. cleaned audio).
Sheet 29, FIGS. 36A-36B: VAD system block diagrams (VAD device, VAD algorithm, signal processing system, noise suppression system).
Sheet 30, FIG. 37: flow diagram for accelerometer-based VAD.
Sheets 31-33, FIGS. 38-40: noisy audio, sensor output, VAD, and denoised audio plots for accelerometer, SSM, and GEMS sensors (time in samples at 8 kHz).]

`

US 8,838,184 B2

WIRELESS CONFERENCE CALL TELEPHONE
RELATED APPLICATIONS

This application claims the benefit of U.S. Patent Application No. 61/364,675, filed Jul. 15, 2010.

This application is a continuation in part of U.S. patent application Ser. No. 12/139,333, filed Jun. 13, 2008.

This application is a continuation in part of U.S. patent application Ser. No. 10/667,207, filed Sep. 18, 2003.
TECHNICAL FIELD

The disclosure herein relates generally to telephones configured for conference calling, including such implementations as personal computers or servers acting as telephony devices.
BACKGROUND

Conventional conference call telephones use one or more microphones to sample acoustic sound in the environment of interest and one or more loudspeakers to broadcast the incoming communication. There are several difficulties involved in such communications systems, including strong echo paths between the loudspeaker(s) and the microphone(s), difficulty in clearly transmitting the speech of users in the room, and little or no environmental acoustic noise suppression. These problems result in the outside caller(s) having difficulty hearing and/or understanding all of the users, poor or impossible duplex communication, and noise (such as mobile phone ringers and typing on keyboards on the same table as the conference phone) being clearly transmitted through the conference call to the outside caller(s), sometimes at a higher level than the users’ speech.
INCORPORATION BY REFERENCE

Each patent, patent application, and/or publication mentioned in this specification is herein incorporated by reference in its entirety to the same extent as if each individual patent, patent application, and/or publication was specifically and individually indicated to be incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a body-worn Child device as a clip-on microphone array, under an embodiment.

FIG. 2 shows a body-worn Child device as a pendant microphone array, under an alternative embodiment.

FIG. 3 shows a wireless conference call telephone system comprising a Parent with four wireless Children and one wired Child, under an embodiment.

FIG. 4 shows a block diagram of a wireless conference call telephone system comprising a Parent and its modules and the Children/Friends (three headsets and a loudspeaker), under an embodiment.

FIG. 5 is a flow diagram showing audio streaming between two far-end users and two near-end users, under an embodiment.

FIG. 6 is a flow chart for connecting wireless Friends/Children and a Parent of the wireless conference call telephone system, under an embodiment.

FIG. 7 is a two-microphone adaptive noise suppression system, under an embodiment.

FIG. 8 is an array and speech source (S) configuration, under an embodiment. The microphones are separated by a distance approximately equal to 2d0, and the speech source is located a distance ds away from the midpoint of the array at an angle θ. The system is axially symmetric so only ds and θ need be specified.

FIG. 9 is a block diagram for a first-order gradient microphone using two omnidirectional elements O1 and O2, under an embodiment.

FIG. 10 is a block diagram for a DOMA including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment.

FIG. 11 is a block diagram for a DOMA including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment.

FIG. 12 is an example of a headset or head-worn device that includes the DOMA, as described herein, under an embodiment.

FIG. 13 is a flow diagram for denoising acoustic signals using the DOMA, under an embodiment.

FIG. 14 is a flow diagram for forming the DOMA, under an embodiment.

FIG. 15 is a plot of linear response of virtual microphone V2 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. The null is at 0 degrees, where the speech is normally located.

FIG. 16 is a plot of linear response of virtual microphone V2 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and all noise sources are detected.

FIG. 17 is a plot of linear response of virtual microphone V1 to a 1 kHz speech source at a distance of 0.1 m, under an embodiment. There is no null and the response for speech is greater than that shown in FIG. 9.

FIG. 18 is a plot of linear response of virtual microphone V1 to a 1 kHz noise source at a distance of 1.0 m, under an embodiment. There is no null and the response is very similar to V2 shown in FIG. 10.

FIG. 19 is a plot of linear response of virtual microphone V1 to a speech source at a distance of 0.1 m for frequencies of 100, 500, 1000, 2000, 3000, and 4000 Hz, under an embodiment.

FIG. 20 is a plot showing comparison of frequency responses for speech for the array of an embodiment and for a conventional cardioid microphone.

FIG. 21 is a plot showing speech response for V1 (top, dashed) and V2 (bottom, solid) versus B with ds assumed to be 0.1 m, under an embodiment. The spatial null in V2 is relatively broad.

FIG. 22 is a plot showing a ratio of V1/V2 speech responses shown in FIG. 10 versus B, under an embodiment. The ratio is above 10 dB for all 0.8 < B < 1.1. This means that the physical B of the system need not be exactly modeled for good performance.

FIG. 23 is a plot of B versus actual ds assuming that ds = 10 cm and theta = 0, under an embodiment.

FIG. 24 is a plot of B versus theta with ds = 10 cm and assuming ds = 10 cm, under an embodiment.

FIG. 25 is a plot of amplitude (top) and phase (bottom) response of N(s) with B = 1 and D = -7.2 μsec, under an embodiment. The resulting phase difference clearly affects high frequencies more than low.

FIG. 26 is a plot of amplitude (top) and phase (bottom) response of N(s) with B = 1.2 and D = -7.2 μsec, under an embodiment. Non-unity B affects the entire frequency range.
FIG. 27 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 30 degrees, under an embodiment. The cancellation remains below -10 dB for frequencies below 6 kHz.

FIG. 28 is a plot of amplitude (top) and phase (bottom) response of the effect on the speech cancellation in V2 due to a mistake in the location of the speech source with θ1 = 0 degrees and θ2 = 45 degrees, under an embodiment. The cancellation is below -10 dB only for frequencies below about 2.8 kHz and a reduction in performance is expected.

FIG. 29 shows experimental results for a 2d0 = 19 mm array using a linear β of 0.83 on a Bruel and Kjaer Head and Torso Simulator (HATS) in a very loud (~85 dBA) music/speech noise environment, under an embodiment. The noise has been reduced by about 25 dB and the speech hardly affected, with no noticeable distortion.

FIG. 30 is a block diagram of a denoising system, under an embodiment.

FIG. 31 is a block diagram including components of a noise removal algorithm, under the denoising system of an embodiment assuming a single noise source and direct paths to the microphones.

FIG. 32 is a block diagram including front-end components of a noise removal algorithm of an embodiment generalized to n distinct noise sources (these noise sources may be reflections or echoes of one another).

FIG. 33 is a block diagram including front-end components of a noise removal algorithm of an embodiment in a general case where there are n distinct noise sources and signal reflections.

FIG. 34 is a flow diagram of a denoising method, under an embodiment.

FIG. 35 shows results of a noise suppression algorithm of an embodiment for an American English female speaker in the presence of airport terminal noise that includes many other human speakers and public announcements.

FIG. 36A is a block diagram of a Voice Activity Detector (VAD) system including hardware for use in receiving and processing signals relating to VAD, under an embodiment.

FIG. 36B is a block diagram of a VAD system using hardware of a coupled noise suppression system for use in receiving VAD information, under an alternative embodiment.

FIG. 37 is a flow diagram of a method for determining voiced and unvoiced speech using an accelerometer-based VAD, under an embodiment.

FIG. 38 shows plots including a noisy audio signal (live recording) along with a corresponding accelerometer-based VAD signal, the corresponding accelerometer output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.

FIG. 39 shows plots including a noisy audio signal (live recording) along with a corresponding SSM-based VAD signal, the corresponding SSM output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.

FIG. 40 shows plots including a noisy audio signal (live recording) along with a corresponding GEMS-based VAD signal, the corresponding GEMS output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.
DETAILED DESCRIPTION

The conference-call telephone, also referred to as a speakerphone, is a vital tool in business today. A conventional speakerphone typically uses a single loudspeaker to transmit far-end speech and one or more microphones to capture near-end speech. The proximity of the loudspeaker to the microphone(s) requires effective echo cancellation and/or half-duplex operation. Also, the intelligibility of the users on both ends is often poor, and there may be very large differences in sound levels between users, depending on their distance to the speakerphone’s microphone(s). In addition, no effective noise suppression of the near-end is possible, and various noises (like mobile phones ringing) create a large nuisance during the call.

A wireless conference call telephone system is described herein that addresses many of the problems of conventional conference call telephones. Instead of using microphones on or near the conference call telephone, the embodiments described herein use body-worn wired or wireless audio endpoints (e.g., comprising microphones and, optionally, loudspeakers). These body-worn audio endpoints (for example, headsets, pendants, clip-on microphones, etc.) are used to capture the user’s voice, and the resulting data may be used to remove echo and environmental acoustic noise. Each headset or pendant transmits its audio to the conference call phone, where noise and echo suppression can take place if not already performed on the body-worn unit, and where each headset or pendant’s output can be labeled, integrated with the other headsets and/or pendants, and transmitted over a telephone network, over one or more telephony channels. The noise and echo suppression can also be done on the headset or pendant. The labeling of each user’s output can be used by the outside caller’s phone to spatially locate each user in space, increasing intelligibility.
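The specification leaves the labeling and spatial-location mechanism unspecified. As a minimal, non-authoritative sketch of the idea, assuming each endpoint’s stream arrives with a label and the receiving phone can render stereo, the fragment below assigns each labeled stream its own azimuth and renders it with constant-power panning; every function name and parameter here is illustrative, not from the patent:

```python
import numpy as np

def pan_stereo(mono: np.ndarray, azimuth_deg: float) -> np.ndarray:
    """Constant-power pan of a mono stream into a stereo pair.
    azimuth_deg runs from -90 (hard left) to +90 (hard right)."""
    theta = np.deg2rad((azimuth_deg + 90.0) / 2.0)   # map into 0..90 degrees
    return np.stack([np.cos(theta) * mono, np.sin(theta) * mono], axis=-1)

def render_labeled_streams(streams: dict) -> np.ndarray:
    """Spread labeled endpoint streams across the stereo field.
    streams: endpoint label -> mono float array (time-aligned, equal length)."""
    labels = sorted(streams)
    n = len(labels)
    azimuths = np.linspace(-60.0, 60.0, n) if n > 1 else [0.0]
    frame_len = len(next(iter(streams.values())))
    mix = np.zeros((frame_len, 2))
    for label, az in zip(labels, azimuths):
        mix += pan_stereo(streams[label], az)
    return mix / max(n, 1)             # simple headroom against clipping
```

With, say, four labeled endpoints, each talker lands at a distinct apparent direction on the far end, which is the intelligibility gain the passage describes.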
In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments of the wireless conference call telephone system and methods. One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.

Unless otherwise specified, the following terms have the corresponding meanings in addition to any meaning or understanding they may convey to one skilled in the art.
The term “conference calling” is defined as the use of a telephony device that is designed to allow one or more near-end users to connect to a phone that will then connect through an analog or digital telephony network to another telephone(s).

The term “omnidirectional microphone” means a physical microphone that is equally responsive to acoustic waves originating from any direction.

The term “near-end” refers to the side of the telephone call that is in acoustic proximity to the conference calling system.

The term “far-end” refers to the side of the telephone call that is not in acoustic proximity to the conference calling system.

The term “noise” means unwanted environmental acoustic noise in the environment of the conference call phone.

The term “virtual microphone (VM)” or “virtual directional microphone” means a microphone constructed using two or more omnidirectional microphones and associated signal processing.

The term “Children” refers to one or more body-worn audio endpoints (for example, headsets or pendants or other body-worn devices that contain microphone arrays of at least one microphone and an optional loudspeaker). They may be wired or wireless. Children are hard-coded to the Parent so that they cannot easily be used with other devices. If needed, they may be recharged on the Parent for efficiency and convenience.

The term “Friends” refers to headsets or other similar devices that can be used with the Parent but are not restricted to the Parent. They may be wired or wireless. Examples are Bluetooth devices such as Aliph’s Jawbone Icon headset (http://www.jawbone.com) and USB devices such as Logitech’s ClearChat Comfort USB headset.
The term “Parent” refers to the main body of the conference call phone, where the different wired and/or wireless streams from each Child are received, integrated, and processed. The Parent broadcasts the incoming acoustic information to the Children and the Friends, or, optionally, using a conventional loudspeaker.

The term HCI is an acronym for Host Controller Interface.

The term HFP is an acronym for the Hands-Free Profile, a wireless interface specification for Bluetooth-based communication devices.

The term PSTN is an acronym for Public Switched Telephone Network.

The term SDP is an acronym for Service Discovery Protocol.

The term SIP is an acronym for Session Initiation Protocol.

The term SPI bus is an acronym for Serial Peripheral Interface bus.

The term UART is an acronym for Universal asynchronous receiver/transmitter.

The term USART is an acronym for Universal synchronous/asynchronous receiver/transmitter.

The term USB is an acronym for Universal Serial Bus.

The term UUID is an acronym for Universally Unique Identifier.

The term VoIP is an acronym for Voice over Internet Protocol.
The wireless conference call telephone system described herein comprises wearable wired and/or wireless devices to transmit both incoming and outgoing speech, with or without a loudspeaker, to ensure that all users’ speech is properly captured. Noise and/or echo suppression can take place on the wireless devices or on the Parent device. Some of the devices may be restricted to use only on the Parent to simplify operation. Other wireless devices such as microphones and loudspeakers are also supported, and any wireless transmission protocols, alone or in combination, can be used.
The wireless conference call telephone system of an embodiment comprises a fixed or mobile conferencing unit and a multiplicity of body-worn wireless telephony units or endpoints. The fixed or mobile conferencing unit comprises a telephony terminal that acts as an endpoint for a multiplicity of telephony calls (via PSTN, VoIP, and similar). The fixed or mobile conferencing unit comprises a wireless terminal that acts as the gateway for a multiplicity of wireless audio sessions (for example, Bluetooth HFP audio sessions). The fixed or mobile conferencing unit comprises an audio signal processing unit that, inter alia, merges and optimizes a multiplicity of telephony calls into a multiplicity of wireless audio sessions and vice versa. Optionally, the fixed or mobile conferencing unit comprises a loudspeaker.
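The patent describes this merging only functionally. Purely as a hedged sketch of what merging a multiplicity of telephony calls into a multiplicity of wireless audio sessions (and vice versa) could mean per audio frame, the following Python fragment implements a standard conference-bridge mix-minus; the function and dictionary names are assumptions, not from the specification:

```python
import numpy as np

def bridge_frame(uplinks: dict, far_end: dict):
    """Route one audio frame through the conferencing unit.

    uplinks: endpoint label -> mono frame (NumPy array) from a wireless
             audio session (already noise/echo suppressed if applicable)
    far_end: call label -> mono frame from a telephony call
    Returns (to_calls, to_endpoints) dictionaries of output frames.
    """
    near_sum = sum(uplinks.values())   # all near-end speech, merged
    far_sum = sum(far_end.values())    # all far-end speech, merged

    # Each telephony call receives the merged near end plus the other calls.
    to_calls = {c: near_sum + (far_sum - f) for c, f in far_end.items()}

    # Each wireless endpoint receives the far end plus every *other*
    # near-end talker: mix-minus, so no unit hears its own voice back.
    to_endpoints = {e: far_sum + (near_sum - u) for e, u in uplinks.items()}
    return to_calls, to_endpoints
```

The mix-minus structure is one conventional way to realize the echo-free, full-duplex behavior the passage attributes to the gateway.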
The body-worn wireless telephony unit of an embodiment comprises a wireless communication system that maintains an audio session with the conferencing unit (such as a Bluetooth wireless system capable of enacting the HFP protocol). The body-worn wireless telephony unit comprises a user speech detection and transmission system (e.g., microphone system). The body-worn wireless telephony unit optionally comprises a means of presenting audio to the user. The body-worn wireless telephony unit optionally comprises a signal processor that optimizes the user speech for transmission to the conferencing unit (for example, by removing echo and/or environmental noise). The body-worn wireless telephony unit optionally comprises a signal processor that optimizes received audio for presentation to the user.
Moving the microphones from the proximity of the loudspeaker to the body of the user is a critical improvement. With the microphones on the body of the user, the speech-to-noise ratio (SNR) is significantly higher and similar for all near-end users. Using technology like the Dual Omnidirectional Microphone Array (DOMA) (described in detail herein and in U.S. patent application Ser. No. 12/139,333, filed Jun. 13, 2008), available from Aliph, Inc., San Francisco, Calif., two or more microphones can be used to capture audio that can be used to remove acoustic noise (including other users speaking) and echo (if a loudspeaker is still used to broadcast far-end speech). Under the embodiments herein, the signal processing is not required to be done on the device carried on the user, as the recorded audio from the microphones can be transmitted for processing on the Parent device. If a wireless headset device is used to house the microphones, the incoming far-end speech could also be broadcast to the headset(s) instead of using the loudspeaker. This improves echo suppression and allows true duplex, highly intelligible, private conference conversations to take place.
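The DOMA processing itself is specified in the incorporated Ser. No. 12/139,333 application rather than here. As a rough sketch of the underlying first-order idea, two omnidirectional capsules can be combined, with one delayed and scaled, into a pair of virtual directional microphones; the β value and one-sample delay below are placeholders (β = 0.83 merely echoes the linear β quoted for FIG. 29), not values taken from the patent:

```python
import numpy as np

def delay(x: np.ndarray, n: int) -> np.ndarray:
    """Delay a signal by n whole samples, zero-padding the front."""
    if n <= 0:
        return x.copy()
    out = np.zeros_like(x)
    out[n:] = x[:-n]
    return out

def virtual_mics(o1: np.ndarray, o2: np.ndarray,
                 beta: float = 0.83, d: int = 1):
    """Combine two omnidirectional capsules O1, O2 into two first-order
    virtual directional microphones:

        V1[n] = O1[n] - beta * O2[n - d]   (responds to speech and noise)
        V2[n] = O2[n] - beta * O1[n - d]   (null steered toward the speech)

    beta and the sample delay d depend on the capsule spacing (2d0) and
    the assumed source distance ds; these defaults are illustrative only.
    """
    v1 = o1 - beta * delay(o2, d)
    v2 = o2 - beta * delay(o1, d)
    return v1, v2
```

Noise arriving from directions that do not satisfy the β/delay relationship survives in V2, which is what lets an adaptive canceller use V2 as a noise reference for cleaning V1.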
The components of the wireless conference call telephone system are described in detail below. Each component, while described separately for clarity, can be combined with one or more other components to form a complete conference call system.
Wearable Devices (Children)

The term “Children” refers to one or more body-worn audio endpoints (for example, headsets or pendants or other body-worn devices that contain microphone arrays of at least one microphone and an optional loudspeaker). They may be wired or wireless. Children are hard-coded to a Parent so that they cannot easily be used with other devices. If desired, they may be recharged on the Parent for efficiency and convenience.
The wearable devices of an embodiment comprise a single microphone (e.g., omnidirectional microphone, directional microphone, etc.), an analog-to-digital converter (ADC), and a digital signal processor. The wearable devices also include a wireless communication component (e.g., Bluetooth, etc.) for transferring data or information to/from the wearable device. The wireless communication component enables fixed pairing between Parent and Child so that the Children don’t get removed from the Parent. To assist this, the Children can be made to beep and/or flash and/or turn off when removed from the proximity of the Parent. For best effect, the Children may recharge on the Parent. Any number of Children may be used; four to eight should be sufficient for most conference calls. Optionally, wired devices such as headsets, microphones, and loudspeakers can be supported as well.
The wearable devices of an alternative embodiment comprise two or more microphones that form a microphone array (e.g., the DOMA (described in detail herein and in U.S. patent application Ser. No. 12/139,333, filed Jun. 13, 2008) available from Aliph, Inc., San Francisco, Calif.). Using physical microphone arrays, virtual directional microphones are constructed that increase the SNR of the user’s speech. The speech can be processed using an adaptive noise suppression algorithm, for example, the Pathfinder available from Aliph, Inc., San Francisco, Calif., and described in detail herein and in U.S. patent application Ser. No. 10/667,207, filed Sep. 18, 2003. The processing used in support of DOMA, Pathfinder, and echo suppression can be performed on the Child or, alternatively, on the Parent. If a Parent loudspeaker is used and echo suppression is done on the Child, the Parent can route the speaker output to the Child via wireless communications to assist in the echo suppression process.
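Pathfinder itself is defined in the incorporated Ser. No. 10/667,207 application. As a minimal sketch of the class of algorithm invoked here, a VAD-gated adaptive canceller, the noise path from the noise-reference virtual microphone to the speech virtual microphone is adapted only while voicing is absent and applied continuously; this toy NLMS version is illustrative only, not the actual Pathfinder code:

```python
import numpy as np

def vad_gated_canceller(v1: np.ndarray, v2: np.ndarray, vad: np.ndarray,
                        taps: int = 32, mu: float = 0.1) -> np.ndarray:
    """Toy VAD-gated adaptive noise canceller. v1: speech-plus-noise
    virtual mic (float samples); v2: noise reference; vad: nonzero
    where voicing is present."""
    h = np.zeros(taps)                 # estimated noise path v2 -> v1
    buf = np.zeros(taps)               # recent v2 samples, newest first
    out = np.zeros(len(v1))
    for n in range(len(v1)):
        buf = np.roll(buf, 1)
        buf[0] = v2[n]
        e = v1[n] - h @ buf            # subtract the predicted noise
        out[n] = e
        if not vad[n]:                 # adapt only in the absence of voicing
            h += (mu / (buf @ buf + 1e-8)) * e * buf   # NLMS update
        # during voicing h is frozen, so speech present in v1 passes through
    return out
```

When the Parent routes its loudspeaker output to the Child as described above, the same structure can perform echo suppression with the routed speaker signal serving as the reference input.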
The Child may be head-worn (like a headset), in which case a Child loudspeaker can be used to broadcast the far-end speech into the ear of the user, or body-worn, in which case the Parent will be required to use a loudspeaker to broadcast the far-end speech. The body-worn device can clip on to the clothing of the user, or be hung from the head like a pendant. The pendant can use a hypoallergenic substance to construct the structure that goes around the neck since it may be in contact with the user’s skin. If a headset is used as a Child, an on-the-ear mount is recommended over an in-the-ear mount, due to hygienic considerations.

As an example, FIG. 1 shows a body-worn Child device as a clip-on microphone array, under an embodiment. The device attaches to a user with a gator