Case 6:21-cv-00984-ADA Document 55-2 Filed 05/25/22 Page 1 of 22

EXHIBIT 2
Case 6:21-cv-00984-ADA Document 55-2 Filed 05/25/22 Page 2 of 22

US008019091B2

(12) United States Patent                         (10) Patent No.:      US 8,019,091 B2
     Burnett et al.                               (45) Date of Patent:  *Sep. 13, 2011

(54) VOICE ACTIVITY DETECTOR (VAD)-BASED MULTIPLE-MICROPHONE ACOUSTIC NOISE SUPPRESSION

(75) Inventors: Gregory C. Burnett, Dodge Center, MN (US); Eric F. Breitfeller, Dublin, CA (US)

(73) Assignee: Aliphcom, Inc., San Francisco, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 713 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 10/667,207

(22) Filed: Sep. 18, 2003

(65) Prior Publication Data: US 2004/0133421 A1, Jul. 8, 2004

Related U.S. Application Data

(63) Continuation-in-part of application No. 09/905,361, filed on Jul. 12, 2001, now abandoned.

(60) Provisional application No. 60/219,297, filed on Jul. 19, 2000.

(51) Int. Cl. ... (2006.01)

(52) U.S. Cl. .......... 381/71.8; 704/215

(58) Field of Classification Search .......... 381/94.7, 71.8, ...; 704/200, ...
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

    3,789,166 A *   1/1974  Sebesta
    4,006,318 A *   2/1977  Sebesta et al.
    4,591,668 A *   5/1986  Iwata
    4,901,354 A *   2/1990  Gollmar et al.
    (entry illegible)       5/1993
    5,400,409 A     3/1995  Linhard
    5,406,622 A *   4/1995  Silverberg et al. ....... 381/94.7
    5,414,776 A     5/1995  Sims
    5,463,694 A *  10/1995  Bradley et al. .......... 381/92
                        (Continued)

FOREIGN PATENT DOCUMENTS

    EP  0 637 187 A *  2/1995
                        (Continued)

OTHER PUBLICATIONS

Zhao Li et al.: "Robust Speech Coding Using Microphone Arrays", Signals, Systems and Computers, 1997, Conf. Record of 31st Asilomar Conf., Nov. 2-5, 1997, IEEE Comput. Soc., Nov. 2, 1997, USA.
                        (Continued)

Primary Examiner — Davetta Goins
Assistant Examiner — Lun-See Lao
(74) Attorney, Agent, or Firm — Gregory & Sawrie LLP

(57) ABSTRACT

Acoustic noise suppression is provided in multiple-microphone systems using Voice Activity Detectors (VAD). A host system receives acoustic signals via multiple microphones. The system also receives information on the vibration of human tissue associated with human voicing activity via the VAD. In response, the system generates a transfer function representative of the received acoustic signals upon determining that voicing information is absent from the received acoustic signals during at least one specified period of time. The system removes noise from the received acoustic signals using the transfer function, thereby producing a denoised acoustic data stream.

20 Claims, 10 Drawing Sheets

[Representative figure: noise removal block 200 receives Signal s(n) (100) and Noise n(n) (101) together with Voicing Information (204) and outputs Cleaned Speech.]
Case 6:21-cv-00984-ADA Document 55-2 Filed 05/25/22 Page 3 of 22

US 8,019,091 B2
Page 2

U.S. PATENT DOCUMENTS

    5,473,701 A * 12/1995  Cezanne et al. ......... 381/92
    5,473,702 A * 12/1995  Yoshida et al. ......... 381/94.7
    5,515,865 A *  5/1996  Scanlon et al.
    5,517,435 A *  5/1996  Sugiyama ............... 708/322
    5,539,859 A    7/1996  Robbe et al.
    5,590,241 A * 12/1996  Park et al. ............ 704/227
    5,633,935 A *  5/1997  Kanamori et al. ........ 381/26
    (entry illegible)
    5,729,694 A *  3/1998  Holzrichter et al. ..... 705/17
    5,754,665 A *  5/1998  Hosoi .................. 381/94.1
    5,835,608 A * 11/1998  Warnaka et al.
    (entry illegible)
    5,966,090 A   10/1999  McEwan
    5,986,600 A   11/1999  McEwan
    6,006,175 A * 12/1999  Holzrichter ............ 704/208
    6,009,396 A   12/1999  Nagata
    (entry illegible)
    6,266,422 B1   7/2001  Ikeda
    6,430,295 B1   8/2002  Handel et al. .......... 379/388.06
    6,707,910 B1*  3/2004  Valve et al.
 2002/0039425 A1*  4/2002  Burnett et al.
 2003/0228023 A1* 12/2003  Burnett et al. ......... 381/92

FOREIGN PATENT DOCUMENTS

    EP  0 795 851 A2 *   9/1997
    EP  0 984 660 A2 *   3/2000
    JP  2000 312 395   * 11/2000
    WO  (entry illegible)

OTHER PUBLICATIONS

L. C. Ng et al.: "Denoising of Human Speech Using Combined Acoustic and EM Sensor Signal Processing", 2000 IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, Proceedings (Cat. No. 00CH37100), Istanbul, Turkey, Jun. 5-9, 2000, XP002186255, ISBN 0-7803-6293-4.

S. Affes et al.: "A Signal Subspace Tracking Algorithm for Microphone Array Processing of Speech", IEEE Transactions on Speech and Audio Processing, N.Y., USA, vol. 5, No. 5, Sep. 1, 1997, XP000774303, ISSN 1063-6676.

Gregory C. Burnett: "The Physiological Basis of Glottal Electromagnetic Micropower Sensors (GEMS) and Their Use in Defining an Excitation Function for the Human Vocal Tract", Dissertation, University of California at Davis, Jan. 1999, USA.

L. C. Ng et al.: "Speaker Verification Using Combined Acoustic and EM Sensor Signal Processing", ICASSP-2001, Salt Lake City, USA.

A. Hussain: "Intelligibility Assessment of a Multi-Band Speech Enhancement Scheme", Proceedings IEEE Intl. Conf. on Acoustics, Speech & Signal Processing (ICASSP-2000), Istanbul, Turkey, Jun. 2000.

* cited by examiner
[Drawing sheet 1 of 10 (U.S. Patent, Sep. 13, 2011, US 8,019,091 B2): FIG. 1, block diagram of the denoising system 1000 (microphones 10 and sensors 20 feeding processor 30 with denoising subsystem 40); FIG. 2, block diagram of the noise removal algorithm 200 (Signal s(n) 100 and Noise n(n) 101 into the Noise Removal element, with Voicing Information from the VAD, producing Cleaned Speech).]
[Drawing sheet 2 of 10: FIG. 3, front-end components of the noise removal algorithm generalized to n distinct noise sources, showing the signal S(z), noise sources N_1(z) through N_n(z), and their transfer functions H_i(z) and G_i(z) to MIC 1 and MIC 2.]
[Drawing sheet 3 of 10: FIG. 4, front-end components of the noise removal algorithm in the general case of n distinct noise sources and signal reflections entering both microphones.]
[Drawing sheet 4 of 10: FIG. 5, flow diagram: Receive acoustic signals (502); Receive voice activity (VAD) information (504); Determine absence of voicing and generate first transfer function (506); Determine presence of voicing and generate second transfer function (508); Produce denoised acoustic data stream (510).]
[Drawing sheet 5 of 10: FIG. 6, "Noise Removal Results for American English Female Saying 406-5562", showing the dirty audio waveform (604) and the cleaned audio waveform (602).]
[Drawing sheet 6 of 10: FIG. 7A, VAD system with a dedicated VAD device and VAD algorithm (704) coupled to a noise suppression system; FIG. 7B, VAD system in which the VAD algorithm (704) uses hardware of the coupled signal processing system / noise suppression system (764) to receive VAD information.]
[Drawing sheet 7 of 10: FIG. 8, flow diagram 800 of the method for determining voiced and unvoiced speech using an accelerometer-based VAD.]
[Drawing sheet 8 of 10: FIG. 9, plots of the noisy audio signal, the accelerometer signal, and the denoised audio signal versus time (samples at 8 kHz).]
[Drawing sheet 9 of 10: FIG. 10, plots of the noisy audio signal, the SSM signal, and the denoised audio signal versus time (samples at 8 kHz).]
[Drawing sheet 10 of 10: FIG. 11, plots of the noisy audio signal, the GEMS signal, and the denoised audio signal versus time (samples at 8 kHz).]
VOICE ACTIVITY DETECTOR (VAD)-BASED MULTIPLE-MICROPHONE ACOUSTIC NOISE SUPPRESSION
RELATED APPLICATIONS

This patent application is a continuation-in-part of U.S. patent application Ser. No. 09/905,361, filed Jul. 12, 2001, now abandoned, which claims priority from U.S. patent application Ser. No. 60/219,297, filed Jul. 19, 2000. This patent application also claims priority from U.S. patent application Ser. No. 10/383,162, filed Mar. 5, 2003.

FIELD OF THE INVENTION

The disclosed embodiments relate to systems and methods for detecting and processing a desired signal in the presence of acoustic noise.
BACKGROUND

Many noise suppression algorithms and techniques have been developed over the years. Most of the noise suppression systems in use today for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in "Suppression of Acoustic Noise in Speech using Spectral Subtraction," IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, U.S. Pat. No. 5,687,243 of McLaughlin, et al., and U.S. Pat. No. 4,811,404 of Vilmur, et al. Generally, these techniques make use of a microphone-based Voice Activity Detector (VAD) to determine the background noise characteristics, where "voice" is generally understood to include human voiced speech, unvoiced speech, or a combination of voiced and unvoiced speech.

The VAD has also been used in digital cellular systems. As an example of such a use, see U.S. Pat. No. 6,453,291 of Ashley, where a VAD configuration appropriate to the front-end of a digital cellular system is described. Further, some Code Division Multiple Access (CDMA) systems utilize a VAD to minimize the effective radio spectrum used, thereby allowing for more system capacity. Also, Global System for Mobile Communication (GSM) systems can include a VAD to reduce co-channel interference and to reduce battery consumption on the client or subscriber device.

These typical microphone-based VAD systems are significantly limited in capability as a result of the addition of environmental acoustic noise to the desired speech signal received by the single microphone, wherein the analysis is performed using typical signal processing techniques. In particular, limitations in performance of these microphone-based VAD systems are noted when processing signals having a low signal-to-noise ratio (SNR), and in settings where the background noise varies quickly. Thus, similar limitations are found in noise suppression systems using these microphone-based VADs.
BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a block diagram of a denoising system, under an embodiment.

FIG. 2 is a block diagram including components of a noise removal algorithm, under the denoising system of an embodiment assuming a single noise source and direct paths to the microphones.

FIG. 3 is a block diagram including front-end components of a noise removal algorithm of an embodiment generalized to n distinct noise sources (these noise sources may be reflections or echoes of one another).

FIG. 4 is a block diagram including front-end components of a noise removal algorithm of an embodiment in a general case where there are n distinct noise sources and signal reflections.

FIG. 5 is a flow diagram of a denoising method, under an embodiment.

FIG. 6 shows results of a noise suppression algorithm of an embodiment for an American English female speaker in the presence of airport terminal noise that includes many other human speakers and public announcements.

FIG. 7A is a block diagram of a Voice Activity Detector (VAD) system including hardware for use in receiving and processing signals relating to VAD, under an embodiment.

FIG. 7B is a block diagram of a VAD system using hardware of a coupled noise suppression system for use in receiving VAD information, under an alternative embodiment.

FIG. 8 is a flow diagram of a method for determining voiced and unvoiced speech using an accelerometer-based VAD, under an embodiment.

FIG. 9 shows plots including a noisy audio signal (live recording) along with a corresponding accelerometer-based VAD signal, the corresponding accelerometer output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.

FIG. 10 shows plots including a noisy audio signal (live recording) along with a corresponding SSM-based VAD signal, the corresponding SSM output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.

FIG. 11 shows plots including a noisy audio signal (live recording) along with a corresponding GEMS-based VAD signal, the corresponding GEMS output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.
DETAILED DESCRIPTION

The following description provides specific details for a thorough understanding of, and enabling description for, embodiments of the noise suppression system. However, one skilled in the art will understand that the invention may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the noise suppression system. In the following description, "signal" represents any acoustic signal (such as human speech) that is desired, and "noise" is any acoustic signal (which may include human speech) that is not desired. An example would be a person talking on a cellular telephone with a radio in the background. The person's speech is desired and the acoustic energy from the radio is not desired. In addition, "user" describes a person who is using the device and whose speech is desired to be captured by the system.

Also, "acoustic" is generally defined as acoustic waves propagating in air. Propagation of acoustic waves in media other than air will be noted as such. References to "speech" or "voice" generally refer to human speech including voiced speech, unvoiced speech, and/or a combination of voiced and unvoiced speech. Unvoiced speech or voiced speech is distinguished where necessary. The term "noise suppression" generally describes any method by which noise is reduced or eliminated in an electronic signal.

Moreover, the term "VAD" is generally defined as a vector or array signal, data, or information that in some manner represents the occurrence of speech in the digital or analog domain. A common representation of VAD information is a one-bit digital signal sampled at the same rate as the corresponding acoustic signals, with a zero value representing that no speech has occurred during the corresponding time sample, and a unity value indicating that speech has occurred during the corresponding time sample. While the embodiments described herein are generally described in the digital domain, the descriptions are also valid for the analog domain.
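As an illustration of this representation only (not language from the patent), the short sketch below builds a one-bit VAD stream at an assumed 8 kHz sample rate and uses it to separate noise-only samples from speech samples; all names and values are hypothetical.

    import numpy as np

    FS = 8000  # assumed sample rate (Hz), chosen to match the 8 kHz time axes in the figures

    audio = np.zeros(FS)                 # one second of acoustic data (placeholder)
    vad = np.zeros(FS, dtype=np.uint8)   # one-bit VAD sampled at the same rate as the audio
    vad[2000:6000] = 1                   # hypothetical region in which speech occurs

    noise_only_samples = audio[vad == 0]  # zero value: no speech during that time sample
    speech_samples = audio[vad == 1]      # unity value: speech during that time sample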
FIG. 1 is a block diagram of a denoising system 1000 of an embodiment that uses knowledge of when speech is occurring derived from physiological information on voicing activity. The system 1000 includes microphones 10 and sensors 20 that provide signals to at least one processor 30. The processor includes a denoising subsystem or algorithm 40.
FIG. 2 is a block diagram including components of a noise removal algorithm 200 of an embodiment. A single noise source and a direct path to the microphones are assumed. An operational description of the noise removal algorithm 200 of an embodiment is provided using a single signal source 100 and a single noise source 101, but is not so limited. This algorithm 200 uses two microphones: a "signal" microphone 1 ("MIC 1") and a "noise" microphone 2 ("MIC 2"), but is not so limited. The signal microphone MIC 1 is assumed to capture mostly signal with some noise, while MIC 2 captures mostly noise with some signal. The data from the signal source 100 to MIC 1 is denoted by s(n), where s(n) is a discrete sample of the analog signal from the source 100. The data from the signal source 100 to MIC 2 is denoted by s_2(n). The data from the noise source 101 to MIC 2 is denoted by n(n). The data from the noise source 101 to MIC 1 is denoted by n_2(n). Similarly, the data from MIC 1 to noise removal element 205 is denoted by m_1(n), and the data from MIC 2 to noise removal element 205 is denoted by m_2(n).

The noise removal element 205 also receives a signal from a voice activity detection (VAD) element 204. The VAD 204 uses physiological information to determine when a speaker is speaking. In various embodiments, the VAD can include at least one of an accelerometer, a skin surface microphone in physical contact with skin of a user, a human tissue vibration detector, a radio frequency (RF) vibration and/or motion detector/device, an electroglottograph, an ultrasound device, an acoustic microphone that is being used to detect acoustic frequency signals that correspond to the user's speech directly from the skin of the user (anywhere on the body), an airflow detector, and a laser vibration detector.

The transfer functions from the signal source 100 to MIC 1 and from the noise source 101 to MIC 2 are assumed to be unity. The transfer function from the signal source 100 to MIC 2 is denoted by H_2(z), and the transfer function from the noise source 101 to MIC 1 is denoted by H_1(z). The assumption of unity transfer functions does not inhibit the generality of this algorithm, as the actual relations between the signal, noise, and microphones are simply ratios and the ratios are redefined in this manner for simplicity.

In conventional two-microphone noise removal systems, the information from MIC 2 is used to attempt to remove noise from MIC 1. However, a (generally unspoken) assumption is that the VAD element 204 is never perfect, and thus the denoising must be performed cautiously, so as not to remove too much of the signal along with the noise. However, if the VAD 204 is assumed to be perfect such that it is equal to zero when there is no speech being produced by the user, and
equal to one when speech is produced, a substantial improvement in the noise removal can be made.

In analyzing the single noise source 101 and the direct path to the microphones, with reference to FIG. 2, the total acoustic information coming into MIC 1 is denoted by m_1(n). The total acoustic information coming into MIC 2 is similarly labeled m_2(n). In the z (digital frequency) domain, these are represented as M_1(z) and M_2(z). Then

    M_1(z) = S(z) + N_2(z)
    M_2(z) = N(z) + S_2(z)

with

    N_2(z) = N(z)H_1(z)
    S_2(z) = S(z)H_2(z),

so that

    M_1(z) = S(z) + N(z)H_1(z)
    M_2(z) = N(z) + S(z)H_2(z).          (Eq. 1)

This is the general case for all two microphone systems. In a practical system there is always going to be some leakage of noise into MIC 1, and some leakage of signal into MIC 2. Equation 1 has four unknowns and only two known relationships and therefore cannot be solved explicitly.

However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts with an examination of the case where the signal is not being generated, that is, where a signal from the VAD element 204 equals zero and speech is not being produced. In this case, s(n) = S(z) = 0, and Equation 1 reduces to

    M_1n(z) = N(z)H_1(z)
    M_2n(z) = N(z),

where the n subscript on the M variables indicates that only noise is being received. This leads to

    M_1n(z) = M_2n(z)H_1(z)

    H_1(z) = M_1n(z) / M_2n(z).          (Eq. 2)

The function H_1(z) can be calculated using any of the available system identification algorithms and the microphone outputs when the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.

A solution is now available for one of the unknowns in Equation 1. Another unknown, H_2(z), can be determined by using the instances where the VAD equals one and speech is being produced. When this is occurring, but the recent (perhaps less than 1 second) history of the microphones indicates low levels of noise, it can be assumed that n(n) = N(z) ≈ 0. Then Equation 1 reduces to

    M_1s(z) = S(z)
    M_2s(z) = S(z)H_2(z),

which in turn leads to

    M_2s(z) = M_1s(z)H_2(z)
    H_2(z) = M_2s(z) / M_1s(z),
which is the inverse of the H_1(z) calculation. However, it is noted that different inputs are being used (now only the signal is occurring whereas before only the noise was occurring). While calculating H_2(z), the values calculated for H_1(z) are held constant, and vice versa. Thus, it is assumed that while one of H_1(z) and H_2(z) is being calculated, the one not being calculated does not change substantially.
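To make the preceding derivation concrete, the sketch below estimates H_1(z) from noise-only frames (Eq. 2) and H_2(z) from low-noise speech frames (its inverse), holding each fixed while the other would be updated. The framing, FFT length, averaging, and regularization are assumptions made for this example, not details specified by the patent.

    import numpy as np

    def estimate_transfer_functions(m1_frames, m2_frames, vad, nfft=256):
        # m1_frames, m2_frames: NumPy arrays of shape (num_frames, frame_len) from MIC 1 and MIC 2
        # vad: NumPy array with one value per frame, 0 = noise only, 1 = speech (assumed perfect here)
        eps = 1e-12
        M1 = np.fft.rfft(m1_frames, n=nfft, axis=1)
        M2 = np.fft.rfft(m2_frames, n=nfft, axis=1)

        noise_frames = (vad == 0)
        speech_frames = (vad == 1)

        # Eq. 2: H1(z) = M1n(z) / M2n(z), averaged over noise-only frames (an averaging choice for this sketch).
        H1 = np.mean(M1[noise_frames] / (M2[noise_frames] + eps), axis=0)

        # Inverse relation: H2(z) = M2s(z) / M1s(z), from frames where speech dominates and noise is low.
        H2 = np.mean(M2[speech_frames] / (M1[speech_frames] + eps), axis=0)
        return H1, H2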
After calculating H_1(z) and H_2(z), they are used to remove the noise from the signal. If Equation 1 is rewritten as

    S(z) = M_1(z) - N(z)H_1(z)
    N(z) = M_2(z) - S(z)H_2(z)
    S(z) = M_1(z) - [M_2(z) - S(z)H_2(z)]H_1(z)
    S(z)[1 - H_2(z)H_1(z)] = M_1(z) - M_2(z)H_1(z),

then N(z) may be substituted as shown to solve for S(z) as

    S(z) = [M_1(z) - M_2(z)H_1(z)] / [1 - H_2(z)H_1(z)].          (Eq. 3)

If the transfer functions H_1(z) and H_2(z) can be described with sufficient accuracy, then the noise can be completely removed and the original signal recovered. This remains true without respect to the amplitude or spectral characteristics of the noise. The only assumptions made include use of a perfect VAD, sufficiently accurate H_1(z) and H_2(z), and that when one of H_1(z) and H_2(z) is being calculated the other does not change substantially. In practice these assumptions have proven reasonable.
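A minimal frequency-domain sketch of Equation 3 follows, using H_1 and H_2 estimates such as those above; the frame handling, FFT size, and absence of overlap-add are simplifications assumed for the example rather than details from the patent.

    import numpy as np

    def remove_noise(m1_frame, m2_frame, H1, H2, nfft=256):
        # Eq. 3: S(z) = [M1(z) - M2(z)H1(z)] / [1 - H2(z)H1(z)], evaluated per frequency bin.
        M1 = np.fft.rfft(m1_frame, n=nfft)
        M2 = np.fft.rfft(m2_frame, n=nfft)
        S = (M1 - M2 * H1) / (1.0 - H2 * H1)
        return np.fft.irfft(S, n=nfft)   # denoised time-domain frame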
The noise removal algorithm described herein is easily generalized to include any number of noise sources. FIG. 3 is a block diagram including front-end components 300 of a noise removal algorithm of an embodiment, generalized to n distinct noise sources. These distinct noise sources may be reflections or echoes of one another, but are not so limited. There are several noise sources shown, each with a transfer function, or path, to each microphone. The previously named path H_2 has been relabeled as H_0, so that labeling noise source 2's path to MIC 1 is more convenient. The outputs of each microphone, when transformed to the z domain, are:

    M_1(z) = S(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + ... + N_n(z)H_n(z)
    M_2(z) = S(z)H_0(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + ... + N_n(z)G_n(z).          (Eq. 4)

When there is no signal (VAD = 0), then (suppressing z for clarity)

    M_1n = N_1 H_1 + N_2 H_2 + ... + N_n H_n
    M_2n = N_1 G_1 + N_2 G_2 + ... + N_n G_n.          (Eq. 5)

A new transfer function can now be defined as

    H̃_1 = M_1n / M_2n = (N_1 H_1 + N_2 H_2 + ... + N_n H_n) / (N_1 G_1 + N_2 G_2 + ... + N_n G_n),          (Eq. 6)

where H̃_1 is analogous to H_1(z) above. Thus H̃_1 depends only on the noise sources and their respective transfer functions and can be calculated any time there is no signal being transmitted. Once again, the "n" subscripts on the microphone inputs denote only that noise is being detected, while an "s" subscript denotes that only signal is being received by the microphones.

Examining Equation 4 while assuming an absence of noise produces

    M_1s = S
    M_2s = S H_0.

Thus, H_0 can be solved for as before, using any available transfer function calculating algorithm. Mathematically, then,

    H_0 = M_2s / M_1s.

Rewriting Equation 4, using H̃_1 defined in Equation 6, provides

    H̃_1 = (M_1 - S) / (M_2 - S H_0).          (Eq. 7)

Solving for S yields

    S = (M_1 - M_2 H̃_1) / (1 - H_0 H̃_1),          (Eq. 8)

which is the same as Equation 3, with H_0 taking the place of H_2, and H̃_1 taking the place of H_1. Thus the noise removal algorithm still is mathematically valid for any number of noise sources, including multiple echoes of noise sources. Again, if H_0 and H̃_1 can be estimated to a high enough accuracy, and the above assumption of only one path from the signal to the microphones holds, the noise may be removed completely.
The most general case involves multiple noise sources and multiple signal sources. FIG. 4 is a block diagram including front-end components 400 of a noise removal algorithm of an embodiment in the most general case where there are n distinct noise sources and signal reflections. Here, signal reflections enter both microphones MIC 1 and MIC 2. This is the most general case, as reflections of the noise source into the microphones MIC 1 and MIC 2 can be modeled accurately as simple additional noise sources. For clarity, the direct path from the signal to MIC 2 is changed from H_0(z) to H_00(z), and the reflected paths to MIC 1 and MIC 2 are denoted by H_01(z) and H_02(z), respectively.

The input into the microphones now becomes

    M_1(z) = S(z) + S(z)H_01(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + ... + N_n(z)H_n(z)
    M_2(z) = S(z)H_00(z) + S(z)H_02(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + ... + N_n(z)G_n(z).          (Eq. 9)

When the VAD = 0, the inputs become (suppressing z again)

    M_1n = N_1 H_1 + N_2 H_2 + ... + N_n H_n
    M_2n = N_1 G_1 + N_2 G_2 + ... + N_n G_n,

which is the same as Equation 5. Thus, the calculation of H̃_1 in Equation 6 is unchanged, as expected. In examining the situation where there is no noise, Equation 9 reduces to
    M_1s = S + S H_01
    M_2s = S H_00 + S H_02.

This leads to the definition of H̃_2 as

    H̃_2 = (H_00 + H_02) / (1 + H_01).          (Eq. 10)

Rewriting Equation 9 again using the definition for H̃_1 (as in Equation 7) provides

    H̃_1 = [M_1 - S(1 + H_01)] / [M_2 - S(H_00 + H_02)].          (Eq. 11)

Some algebraic manipulation yields

    S(1 + H_01 - H̃_1(H_00 + H_02)) = M_1 - M_2 H̃_1
    S(1 + H_01)[1 - H̃_1(H_00 + H_02)/(1 + H_01)] = M_1 - M_2 H̃_1
    S(1 + H_01)[1 - H̃_1 H̃_2] = M_1 - M_2 H̃_1,

and finally

    S(1 + H_01) = (M_1 - M_2 H̃_1) / (1 - H̃_1 H̃_2).          (Eq. 12)

Equation 12 is the same as Equation 8, with the replacement of H_0 by H̃_2, and the addition of the (1 + H_01) factor on the left side. This extra factor (1 + H_01) means that S cannot be solved for directly in this situation, but a solution can be generated for the signal plus the addition of all of its echoes. This is not such a bad situation, as there are many conventional methods for dealing with echo suppression, and even if the echoes are not suppressed, it is unlikely that they will affect the comprehensibility of the speech to any meaningful extent. The more complex calculation of H̃_2 is needed to account for the signal echoes in MIC 2, which act as noise sources.

FIG. 5 is a flow diagram 500 of a denoising algorithm, under an embodiment. In operation, the acoustic signals are received, at block 502. Further, physiological information associated with human voicing activity is received, at block 504. A first transfer function representative of the acoustic signal is calculated upon determining that voicing information is absent from the acoustic signal for at least one specified period of time, at block 506. A second transfer function representative of the acoustic signal is calculated upon determining that voicing information is present in the acoustic signal for at least one specified period of time, at block 508. Noise is removed from the acoustic signal using at least one combination of the first transfer function and the second transfer function, producing denoised acoustic data streams, at block 510.

An algorithm for noise removal, or denoising algorithm, is described herein, from the simplest case of a single noise source with a direct path to multiple noise sources with reflections and echoes. The algorithm has been shown herein to be viable under any environmental conditions. The type and amount of noise are inconsequential if a good estimate has been made of H̃_1 and H̃_2, and if one does not change substantially while the other is calculated. If the user environment is such that echoes are present, they can be compensated for if coming from a noise source. If signal echoes are also present, they will affect the cleaned signal, but the effect should be negligible in most environments.

In operation, the algorithm of an embodiment has shown excellent results in dealing with a variety of noise types, amplitudes, and orientations. However, there are always approximations and adjustments that have to be made when moving from mathematical concepts to engineering applications. One assumption is made in Equation 3, where H_2(z) is assumed small and therefore H_2(z)H_1(z) ≈ 0, so that Equation 3 reduces to
    S(z) = M_1(z) - M_2(z)H_1(z).

This means that only H_1(z) has to be calculated, speeding up the process and reducing the number of computations required considerably. With the proper selection of microphones, this approximation is easily realized.
Another approximation involves the filter used in an embodiment. The actual H_1(z) will undoubtedly have both poles and zeros, but for stability and simplicity an all-zero Finite Impulse Response (FIR) filter is used. With enough taps the approximation to the actual H_1(z) can be very good.
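The two approximations just described (H_2(z) assumed negligible and an all-zero FIR model for H_1(z)) can be sketched in the time domain as follows; the tap count and the use of scipy.signal.lfilter are choices made for this example, not details from the patent.

    import numpy as np
    from scipy.signal import lfilter

    def denoise_simplified(m1, m2, h1_taps):
        # h1_taps: FIR (all-zero) approximation of H1(z), e.g. adapted during noise-only periods
        noise_estimate = lfilter(h1_taps, [1.0], m2)   # filter MIC 2 through the H1 estimate
        return m1 - noise_estimate                     # time-domain form of S(z) = M1(z) - M2(z)H1(z)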
To further increase the performance of the noise suppression system, the spectrum of interest (generally about 125 to 3700 Hz) is divided into subbands. The wider the range of frequencies over which a transfer function must be calculated, the more difficult it is to calculate it accurately. Therefore the acoustic data was divided into 16 subbands, and the denoising algorithm was then applied to each subband in turn. Finally, the 16 denoised data streams were recombined to yield the denoised acoustic data. This works very well, but any combinations of subbands (i.e., 4, 6, 8, 32, equally spaced, perceptually spaced, etc.) can be used and all have been found to work better than a single subband.
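One way to realize such a subband structure is sketched below with an equally spaced Butterworth filterbank; the filter order, exact band edges, and recombination by summation are assumptions for this example, since the text only specifies that a roughly 125 to 3700 Hz range is split into subbands (16 in the described embodiment).

    import numpy as np
    from scipy.signal import butter, sosfilt

    def split_into_subbands(x, fs=8000, f_lo=125.0, f_hi=3700.0, n_bands=16):
        # Return a list of band-limited copies of x, one per equally spaced subband.
        edges = np.linspace(f_lo, f_hi, n_bands + 1)
        bands = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            bands.append(sosfilt(sos, x))
        return bands

    # Each subband of MIC 1 and MIC 2 would be denoised in turn, and the resulting
    # streams summed to recombine the denoised acoustic data.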
The amplitude of the noise was constrained in an embodiment so that the microphones used did not saturate (that is, operate outside a linear response region). It is important that the microphones operate linearly to ensure the best performance. Even with this restriction, very low signal-to-noise ratio (SNR) signals can be denoised (down to -10 dB or less).
The calculation of H_1(z) is accomplished every 10 milliseconds using the Least-Mean Squares (LMS) method, a common adaptive transfer function. An explanation may be found in "Adaptive Signal Processing" (1985), by Widrow and Stearns, published by Prentice-Hall, ISBN 0-13-004029-0. The LMS was used for demonstration purposes, but many other system identification techniques can be used to identify H_1(z) and H_2(z) in FIG. 2.
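The sketch below shows one block of (normalized) LMS adaptation of the FIR estimate of H_1(z), run only while the VAD indicates noise; the step size, normalization term, and block handling are choices made for this example rather than values taken from the patent.

    import numpy as np

    def lms_update_h1(h1_taps, m2_block, m1_block, mu=0.05):
        # Adapt h1_taps so that the FIR-filtered MIC 2 block predicts the MIC 1 block
        # during noise-only periods (e.g. once per 10 ms block).
        taps = len(h1_taps)
        for n in range(taps - 1, len(m2_block)):
            x = m2_block[n - taps + 1:n + 1][::-1]      # m2[n], m2[n-1], ..., m2[n-taps+1]
            err = m1_block[n] - np.dot(h1_taps, x)      # prediction error at MIC 1
            h1_taps = h1_taps + mu * err * x / (np.dot(x, x) + 1e-12)
        return h1_taps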
The VAD for an embodiment is derived from a radio frequency sensor and the two microphones, yielding very high accuracy (>99%) for both voiced and unvoiced speech. The VAD of an embodiment uses a radio frequency (RF) vibration detector interferometer to detect tissue motion associated
with human speech production, but is not so limited. The signal from the RF device is completely acoustic-noise free, and is able to function in any acoustic noise environment. A simple energy measurement of the RF signal can be used to determine if voiced speech is occurring. Unvoiced speech can be determined using conventional acoustic-based methods, by proximity to voiced sections determined using the RF sensor or similar voicing sensors, or through a combination of the above. Since there is much less energy in unvoiced speech, its detection accuracy is not as critical to good noise suppression performance as is voiced speech.
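A simple energy-based voicing test of the kind described can be sketched as below, operating on 10 ms frames of the vibration-sensor signal; the frame length and fixed threshold are assumptions for the example and are not values from the patent.

    import numpy as np

    def voiced_frames_from_sensor(sensor, frame_len=80, threshold=1e-4):
        # 80 samples = 10 ms at the assumed 8 kHz rate; the threshold is hypothetical.
        n_frames = len(sensor) // frame_len
        frames = sensor[:n_frames * frame_len].reshape(n_frames, frame_len)
        energy = np.mean(frames ** 2, axis=1)
        return (energy > threshold).astype(np.uint8)   # 1 = voiced frame, 0 = not voiced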
With voiced and unvoiced speech detected reliably, the algorithm of an embodiment can be implemented. Once again, it is useful to repeat that the noise removal algorithm does not depend on how the VAD is obtained, only that it is accurate, especially for voiced speech. If speech is not detected and training occurs on the speech, the subsequent denoised acoust
