`Marash et al.
`
`US006363345B1
`(10) Patent No.:
`US 6,363,345 B1
`(45) Date of Patent:
`Mar. 26, 2002
`
`(54) SYSTEM, METHOD AND APPARATUS FOR
`CANCELLING NOISE
(75) Inventors: Joseph Marash, Haifa; Baruch Berdugo, Kiriat-Ata,
both of (IL)
`(73) Assignee: Andrea Electronics Corporation,
Melville, NY (US)
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
(21) Appl. No.: 09/252,874
(22) Filed: Feb. 18, 1999
`
(51) Int. Cl.7 ........................ G10L 21/02
(52) U.S. Cl. ........................ 704/226; 704/233; 704/205
(58) Field of Search ................. 704/270, 500, 704/233, 200, 201,
205, 226, 227, 228, 211, 216; 379/22.08, 392.01, 3, 406.01,
406.12, 406.13, 406.14, 406.05
(56) References Cited

U.S. PATENT DOCUMENTS

2,379,514 A   7/1945   Fisher
2,972,018 A   2/1961   Hawley et al.
3,098,121 A   7/1963   Wadsworth
3,101,744 A   8/1963   Warnaka

(List continued on next page.)
`FOREIGN PATENT DOCUMENTS
`
`DE
`DE
`DE
`5;
`EP
`EP
`EP
`EP
`
`2640324
`3719963
`4008595
`
`Z;
`
`3/1978
`3/1988
`9/1991
`
`10/1990
`0 390 386
`2/1991
`0 411 360 B1
`0 509 742 A2 10/1992
`0 483 845
`1/1993
`
`OTHER PUBLICATIONS
B.D. Van Veen and K.M. Buckley, “Beamforming: A Versatile Approach
to Spatial Filtering,” IEEE ASSP Magazine, vol. 5, No. 2, Apr. 1988,
pp. 4-24.
Beranek, Acoustics (American Institute of Physics, 1986) pp. 116-135.
Boll, IEEE Trans. on Acous., vol. ASSP-27, No. 2, Apr. 1979,
pp. 113-120.
Daniel Sweeney, “Sound Conditioning Through DSP”, The Equipment
Authority, 1994.
`(List continued on next page.)
Primary Examiner: Richemond Dorvil
(74) Attorney, Agent, or Firm: Frommer Lawrence & Haug; Thomas J.
Kowalski
(57) ABSTRACT

A threshold detector precisely detects the positions of the noise
elements, even within continuous speech segments, by determining
whether frequency spectrum elements, or bins, of the input signal are
within a threshold set according to current and future minimum values
of the frequency spectrum elements. In addition, the threshold is
continuously set and initiated within a predetermined period of time.
The estimated magnitude of the input audio signal is obtained using a
multiplying combination of the real and imaginary part of the input in
accordance with the higher and lower values between the real and
imaginary part of the signal. In order to further reduce instability
of the spectral estimation, a two-dimensional smoothing is applied to
the signal estimate using neighboring frequency bins and an
exponential average over time. A filter multiplication effects the
subtraction thereby avoiding phase calculation difficulties and
effecting full-wave rectification which further reduces artifacts.
Since the noise elements are determined within continuous speech
segments, the noise is canceled from the audio signal nearly
continuously thereby providing excellent noise cancellation
characteristics. Residual noise reduction reduces the residual noise
remaining after noise cancellation. Implementation may be effected in
various noise canceling schemes including adaptive beamforming and
noise cancellation using computer program applications installed as
software or hardware.
`
`
`47 Claims, 10 Drawing Sheets
`
[Representative drawing: Noise Processing 200 (114). Blocks: inputs
R(n), I(n) (202); Y(n) = Max[R(n), I(n)] + 0.4*Min[R(n), I(n)] (204);
Y(n) = 1/3[Y(n-1) + Y(n) + Y(n+1)] (206);
Y(n)_t = Y(n)_t*0.3 + Y(n)_(t-1)*0.7 (208); Subtraction Process;
Noise Estimation; Time Domain Input Signal; Output To IFFT.
Other reference numerals shown: 210, 212 (300), 214, 216, 218.]
`
`RTL345-2_1001-0001
`
`
`
`
U.S. PATENT DOCUMENTS
`
`3,170,046
`3,247,925
`3,262,521
`3,298,457
`3,330,376
`3,394,226
`3,416,782
`3,422,921
`3,562,089
`3,702,644
`3,830,988
`3,889,059
`3,890,474
`4,068,092
`4,122,303
`4,153,815
`4,169,257
`4,239,936
`4,241,805
`4,243,117
`4,261,708
`4,321,970
`4,334,740
`4,339,018
`4,363,007
`4,409,435
`4,417,098
`4,433,435
`4,442,546
`4,453,600
`4,455,675
`4,459,851
`4,461,025
`4,463,222
`4,473,906
`4,477,505
`4,489,441
`4,490,841
`4,494,074
`4,495,643
`4,517,415
`4,527,282
`4,530,304
`4,539,708
`4,559,642
`4,562,589
`4,566,118
`4,570,155
`4,581,758
`4,589,136
`4,589,137
`4,600,863
`4,622,692
`4,628,529
`4,630,302
`4,630,304
`4,636,586
`4,649,505
`4,653,102
`4,653,606
`4,654,871
`4,658,426
`4,672,674
`4,683,010
`4,696,043
`4,718,096
`4,731,850
`4,736,432
`4,741,038
`4,750,207
`
`2/1965
`4/1966
`7/1966
`1/1967
`7/1967
`7/1968
`12/1968
`1/1969
`2/1971
`11/1972
`8/1974
`6/1975
`6/1975
`1/1978
`10/1978
`5/1979
`9/1979
`12/1980
`12/1980
`1/1981
`4/1981
`3/1982
`6/1982
`7/1982
`12/1982
`10/1983
`11/1983
`2/1984
`4/1984
`6/1984
`6/1984
`7/1984
`7/1984
`7/1984
`9/1984
`10/1984
`12/1984
`12/1984
`1/1985
`1/1985
`5/1985
`7/1985
`7/1985
`9/1985
`12/1985
`12/1985
`1/1986
`2/1986
`4/1986
`5/1986
`5/1986
`7/1986
`11/1986
`12/1986
`12/1986
`12/1986
`1/1987
`3/1987
`3/1987
`3/1987
`3/1987
`4/1987
`6/1987
`7/1987
`9/1987
`1/1988
`3/1988
`4/1988
`4/1988
`6/1988
`
`Leale
`Warnaka
`Warnaka
`Warnaka
`Warnaka
`Andrews, Jr.
`Warnaka
`Warnaka
`Warnaka et al.
`Fowler et al.
Mol et al.
`Thompson et al.
`Glicksberg
`Ikoma et al.
`Chaplin et al.
`Chaplin et al.
`Smith
`Sakoe
`Chance, Jr.
`Warnaka
`Gallagher
`Thigpen
`Wray
`Warnaka
`Haramoto et al.
`Ono
`Chaplin et al.
`David
`Ishigaki
`Thigpen
`Bose et al.
`Crostack
`Franklin
`Poradowski
`Warnaka et al.
`Warnaka
`Chaplin et al.
`Chaplin et al.
`Bose
`Orban
`Laurence
`Chaplin et al.
`Gardos
`Norris
`Miyaji et al.
`Warnaka et al.
`Chaplin et al.
`Skarman et al.
`Coker et al.
`Poldy et al.
`Miller
`Chaplin et al.
`Cole
`Borth et al.
`Kryter
`Borth et al.
`Schiff
`Zinser, Jr. et al.
`Hansen
`Flanagan
`Chaplin et al.
`Chabries et al.
`Clough et al.
`Hartmann
Iwahara et al.
`Meisel
`Levitt et al.
`Cantrell
`Elko et al.
`Gebert et al.
`
`4,752,961
`4,769,847
`4,771,472
`4,783,798
`4,783,817
`4,783,818
`4,791,672
`4,802,227
`4,811,404
`4,833,719
`4,837,832
`4,847,897
`4,862,506
`4,878,188
`4,908,855
`4,910,718
`4,910,719
`4,928,307
`4,930,156
`4,932,063
`4,937,871
`4,947,356
`4,951,954
`4,955,055
`4,956,867
`4,959,865
`4,963,071
`4,965,834
`4,977,600
`4,985,925
`4,991,433
`5,001,763
`5,010,576
`5,018,202
`5,023,002
`5,029,218
`5,046,103
`5,052,510
`5,070,527
`5,075,694
`5,086,385
`5,086,415
`5,091,954
`5,097,923
`5,105,377
`5,117,461
`5,121,426
`5,125,032
`5,126,681
`5,133,017
`5,134,659
`5,138,663
`5,138,664
`5,142,585
`5,192,918
`5,208,864
`5,209,326
`5,212,764
`5,219,037
`5,226,077
`5,226,087
`5,241,692
`5,251,263
`5,251,863
`5,260,997
`5,272,286
`5,276,740
`5,311,446
`5,311,453
`5,313,555
`5,313,945
`
`6/1988
`9/1988
`9/1988
`11/1988
`11/1988
`11/1988
`12/1988
`1/1989
`3/1989
`5/1989
`6/1989
`7/1989
`8/1989
`10/1989
`3/1990
`3/1990
`3/1990
`5/1990
`5/1990
`6/1990
`6/1990
`8/1990
`8/1990
`9/1990
`9/1990
`9/1990
`10/1990
`10/1990
`12/1990
`1/1991
`2/1991
`3/1991
`4/1991
`5/1991
`6/1991
`7/1991
`9/1991
`10/1991
`12/1991
`12/1991
`2/1992
`2/1992
`2/1992
`3/1992
`4/1992
`5/1992
`6/1992
`6/1992
`6/1992
`7/1992
`7/1992
`8/1992
`8/1992
`8/1992
`3/1993
`5/1993
`5/1993
`5/1993
`6/1993
`7/1993
`7/1993
`8/1993
`10/1993
`10/1993
`11/1993
`12/1993
`1/1994
`5/1994
`5/1994
`5/1994
`5/1994
`
`Kahn
`Taguchi
`Williams, III et al.
Leibholz et al.
`Hamada et al.
`Graupe et al.
`Nunley et al.
`Elko et al.
`Vilmur et al.
`Carme et al.
`Fanshel
`Means
`Landgarten et al.
`Ziegler et al.
`Ohga et al.
`Horn
`Thubert
`Lynn
`Norris
`Nakamura
`Hattori
`Elliott et al.
`MacNeill
`Fujisaki et al.
`Zarek et al.
`Stettiner et al.
`Larwin et al.
`Miller
`Ziegler
`Langberg et al.
`Warnaka et al.
`Moseley
`Hill
`Takahashi et al.
Schweizer et al.
`Nagayasu
`Warnaka et al.
`Gossman
`Lynn
`Donnangelo et al.
`Launey et al.
`Takahashi et al.
`Sasaki et al.
`Ziegler et al.
`Ziegler, Jr.
`Moseley
Baumhauer
`Meister et al.
`Ziegler, Jr. et al.
`Cain et al.
`Moseley
`Moseley
`Kimura et al.
`Taylor
`Sugiyama
`Kaneda
`Harper
`Ariyoshi
`Smith et al.
`Lynn et al.
`Ono
`Harrison et al.
`Andrea et al.
`Gossman et al.
`Gattey et al.
`Cain et al.
`Inanaga et al.
`Ross et al.
`Denenberg et al.
`Kamiya
`Friedlander
`
`
5,315,661 A    5/1994   Gossman et al.
5,319,736 A    6/1994   Hunt
5,327,506 A    7/1994   Stites, III
5,332,203 A    7/1994   Gossman et al.
5,335,011 A    8/1994   Addeo et al.
5,348,124 A    9/1994   Harper
5,353,347 A   10/1994   Irissou et al.
5,353,376 A   10/1994   Oh et al.
5,361,303 A   11/1994   Eatwell
5,365,594 A   11/1994   Ross et al.
5,375,174 A   12/1994   Denenberg
5,381,473 A    1/1995   Andrea et al.
5,381,481 A    1/1995   Gammie et al.
5,384,843 A    1/1995   Masuda et al.
5,402,497 A    3/1995   Nishimoto et al.
5,412,735 A    5/1995   Engebretson et al.
5,414,769 A    5/1995   Gattey et al.
5,414,775 A    5/1995   Scribner et al.
5,416,845 A    5/1995   Shen
5,416,847 A    5/1995   Boze
5,416,887 A    5/1995   Shimada
5,418,857 A    5/1995   Eatwell
5,423,523 A    6/1995   Gossman et al.
5,431,008 A    7/1995   Ross et al.
5,432,859 A    7/1995   Yang et al.
5,434,925 A    7/1995   Nadim
5,440,642 A    8/1995   Denenberg et al.
5,448,637 A    9/1995   Yamaguchi et al.
5,452,361 A    9/1995   Jones
5,457,749 A   10/1995   Cain et al.
5,469,087 A   11/1995   Eatwell
5,471,106 A   11/1995   Curtis et al.
5,471,538 A   11/1995   Sasaki et al.
5,473,214 A   12/1995   Hildebrand
5,473,701 A   12/1995   Cezanne et al.
5,473,702 A   12/1995   Yoshida et al.
5,475,761 A   12/1995   Eatwell
5,479,562 A * 12/1995   Fielder et al. ............ 704/229
5,481,615 A    1/1996   Eatwell et al.
5,485,515 A    1/1996   Allen et al.
5,493,615 A    2/1996   Burke et al.
5,502,869 A    4/1996   Smith et al.
5,511,127 A    4/1996   Warnaka
5,511,128 A    4/1996   Lindeman
5,515,378 A    5/1996   Roy, III et al.
5,524,056 A    6/1996   Killion et al.
5,524,057 A    6/1996   Akiho et al.
5,526,432 A    6/1996   Denenberg
5,546,090 A    8/1996   Roy, III et al.
5,546,467 A    8/1996   Denenberg
5,550,334 A    8/1996   Langley
5,553,153 A    9/1996   Eatwell
5,563,817 A   10/1996   Ziegler et al.
5,568,557 A   10/1996   Ross et al.
5,581,620 A   12/1996   Brandstein et al.
5,592,181 A    1/1997   Cai et al.
5,592,490 A    1/1997   Barratt et al.
5,600,106 A    2/1997   Langley
5,604,813 A    2/1997   Evans et al.
5,615,175 A    3/1997   Cater et al.
5,617,479 A    4/1997   Hildebrand et al.
5,619,020 A    4/1997   Jones et al.
5,621,656 A    4/1997   Langley
5,625,697 A    4/1997   Bowen et al.
5,625,880 A    4/1997   Goldburg et al.
5,627,746 A    5/1997   Ziegler, Jr. et al.
5,627,799 A    5/1997   Hoshuyama
5,638,022 A    6/1997   Eatwell
5,638,454 A    6/1997   Jones et al.
5,638,456 A    6/1997   Conley et al.
5,642,353 A    6/1997   Roy, III et al.
`
5,644,641 A    7/1997   Ikeda
5,649,018 A    7/1997   Gifford et al.
5,652,770 A    7/1997   Eatwell
5,652,799 A    7/1997   Ross et al.
5,657,393 A    8/1997   Crow
5,664,021 A    9/1997   Chu et al.
5,668,747 A    9/1997   Obashi
5,668,927 A *  9/1997   Chan et al. ............ 704/240
5,673,325 A    9/1997   Andrea et al.
5,676,353 A   10/1997   Jones et al.
5,689,572 A   11/1997   Ohki et al.
5,692,053 A   11/1997   Fuller et al.
5,692,054 A   11/1997   Parrella et al.
5,699,436 A   12/1997   Claybaugh et al.
5,701,344 A   12/1997   Wakui
5,706,394 A *  1/1998   Wynn ............ 704/219
5,715,319 A    2/1998   Chu
5,715,321 A    2/1998   Andrea et al.
5,719,945 A    2/1998   Fuller et al.
5,724,270 A    3/1998   Posch
5,727,073 A    3/1998   Ikeda
5,732,143 A    3/1998   Andrea et al.
5,745,581 A    4/1998   Eatwell et al.
5,748,749 A    5/1998   Miller et al.
5,768,473 A    6/1998   Eatwell et al.
5,774,859 A    6/1998   Houser et al.
5,787,259 A *  7/1998   Haroun et al. ............ 709/253
5,798,983 A    8/1998   Kuhn et al.
5,812,682 A    9/1998   Ross et al.
5,815,582 A    9/1998   Claybaugh et al.
5,818,948 A * 10/1998   Gulick ............ 381/77
5,825,897 A   10/1998   Andrea et al.
5,825,898 A   10/1998   Marash
5,828,768 A   10/1998   Eatwell et al.
5,835,608 A   11/1998   Warnaka et al.
5,838,805 A   11/1998   Warnaka et al.
5,874,918 A    3/1999   Czarnecki et al.
5,909,495 A    6/1999   Andrea
5,914,877 A *  6/1999   Gulick ............ 364/400.01
5,914,912 A    6/1999   Yang
5,995,150 A * 11/1999   Hsieh et al. ............ 348/409
`
`FOREIGN PATENT DOCUMENTS
`
EP   0 583 900 A1    2/1994
EP   0 595 457 A1    5/1994
EP   0 721 251       7/1996
EP   0 724 415      11/1996
FR   2305909        10/1976
GB   1 160 431       8/1969
GB   1 289 993       9/1972
GB   1 378 294      12/1974
GB   2 172 769 A     9/1986
GB   2 239 971 B     7/1991
GB   2 289 593 A    11/1995
JP   56-89194        7/1981
JP   59-64994        4/1984
JP   62-189898       8/1987
JP   1-149695        6/1989
JP   1-314098       12/1989
JP   2-070152        3/1990
JP   3-169199        7/1991
JP   3-231599       10/1991
JP   4-16900         1/1992
WO   WO 88/09512    12/1988
WO   WO 92/05538     4/1992
WO   WO 92/17019    10/1992
WO   WO 94/16517     7/1994
WO   WO 95/08906     3/1995
WO   WO 96/15541     5/1996
WO   WO 97/23068     6/1997
`
`
`OTHER PUBLICATIONS
`
Edward J. Foster, “Switched on Silence”, Popular Science, 1994, p. 33.
Kuo, Automatic Control of Systems, pp. 504-585.
Luenberger, Optimization by Vector Space Method, pp. 134-138.
Ogata, Modern Control Engineering, pp. 474-508.
Oppenheim and Schafer, Digital Signal Processing (Prentice Hall)
pp. 542-545.
P.P. Vaidyanathan, “Multirate Digital Filters, Filter Banks,
Polyphase Networks, and Applications; A Tutorial,” IEEE Proc.,
vol. 78, No. 1, Jan. 1990.
P.P. Vaidyanathan, “Quadrature Mirror Filter Banks, M-band Extensions
and Perfect-Reconstruction Techniques,” IEEE ASSP Magazine,
Jul. 1987, pp. 4-20.
Rabiner et al., IEEE Trans. on Acous., vol. ASSP-24, No. 5,
Oct. 1976, pp. 399-418.
Rabiner et al., Digital Processing of Speech Signals (Prentice Hall,
1978) pp. 130-135.
Sapontis, Probability, Lambda Variables and Structural Processes,
pp. 467-474.
Scott C. Douglas, “A Family of Normalized LMS Algorithms,” IEEE
Signal Proc. Letters, vol. 1, No. 3, Mar. 1994.
Sewald et al., “Application of ... Beamforming to Reject Turbulence
Noise in Airducts,” IEEE ICASSP vol. 5, No. CONF-21, May 7, 1996,
pp. 2734-2737.
White, Moving-Coil Earphone Design, 1963, pp. 188-194.
Widrow et al., “Adaptive Noise Canceling: Principles and
Applications,” Proc. IEEE, vol. 63, No. 12, Dec. 1975, pp. 1692-1716.
Youla et al., IEEE Trans. on Acous., vol. MI-1, No. 2, Oct. 1982,
pp. 81-101.

* cited by examiner
`
[Sheet 1 of 10: FIG. 1]
[Sheet 2 of 10: FIG. 2, Noise Processing]
[Sheet 3 of 10: FIG. 3, Noise Estimation Process]
[Sheet 4 of 10: FIG. 4, Subtraction Process]
[Sheet 5 of 10: FIG. 5, Residual Noise Process]
[Sheet 6 of 10: FIG. 5A, Residual Noise Process]
`
`
`
[Sheet 7 of 10: FIG. 6, flow diagram, reference numerals 600 to 614.
Box labels: Read Input Samples; Store Data in Buffer; Are 256 New
Points Accumulated? (No / Yes); Stored Inputs; Move 512 Last Points to
Processing Buffer; Perform 512 Points FFT; Store 256 Significant
Complex Points in Buffer; Stored Data [R(0-255); I(0-255)].]
`
`
`
[Sheet 8 of 10: FIG. 7, flow diagram, reference numerals 700 to 726.
Box labels: Stored Data [R(0-255); I(0-255)]; Stored Y(0-255); Future
Minimum; Current Minimum; comparison of Y(n) with Future Minimum
(Yes: Replace Future Minimum With Y(n)); comparison of Y(n) with
Current Minimum (Yes: Replace Current Minimum With Y(n)); Did 5
Seconds ...; Init Current Minimum With Future Minimum; Init Future
Minimum With Current Y(n).]
`
`
`
[Sheet 9 of 10: FIG. 8, flow diagram, reference numerals 800 to 820.
Box labels: Stored Y(0-255); Y(n) < 4*Current Minimum(n)? (No / Yes);
N(n)_t = N(n)_(t-1)*0.095 + Y(n)*0.05; N(0-255) Buffer;
H(n) = | |Y(n)| - N(n) | / |Y(n)|; Stored Data;
Stored Out [R(0-255), I(0-255)].]
`
`
`
[Sheet 10 of 10: FIG. 9, flow diagram, reference numerals 900 to 906.
Box labels: Stored Out [R(0-255), I(0-255)]; Perform IFFT; Stored IFFT
Results; Sum First 256 Points With Previous Last 256 Points (906);
Out.]
`
`
`
`SYSTEM, METHOD AND APPARATUS FOR
`CANCELLING NOISE
`
`RELATED APPLICATIONS INCORPORATED
`BY REFERENCE
`
The following applications and patent(s) are cited and hereby
incorporated herein by reference: U.S. patent Ser. No. 09/130,923
filed Aug. 6, 1998, U.S. patent Ser. No. 09/055,709 filed Apr. 7,
1998, U.S. patent Ser. No. 09/059,503 filed Apr. 13, 1998, U.S. patent
Ser. No. 08/840,159 filed Apr. 14, 1997, U.S. patent Ser. No.
09/130,923 filed Aug. 6, 1998, U.S. patent Ser. No. 08/672,899 now
issued U.S. Pat. No. 5,825,898 issued Oct. 20, 1998. And, all
documents cited herein are incorporated herein by reference, as are
documents cited or referenced in documents cited herein.
`
`FIELD OF THE INVENTION
`
The present invention relates to noise cancellation and reduction
and, more specifically, to noise cancellation and reduction using
spectral subtraction.
`
`BACKGROUND OF THE INVENTION
`
Ambient noise added to speech degrades the performance of speech
processing algorithms. Such processing algorithms may include
dictation, voice activation, voice compression and other systems. In
such systems, it is desired to reduce the noise and improve the signal
to noise ratio (S/N ratio) without affecting the speech and its
characteristics.

Near field noise canceling microphones provide a satisfactory solution
but require that the microphone be in the proximity of the voice
source (e.g., mouth). In many cases, this is achieved by mounting the
microphone on a boom of a headset which situates the microphone at the
end of a boom proximate the mouth of the wearer. However, the headset
has proven to be either uncomfortable to wear or too restricting for
operation in, for example, an automobile.

Microphone array technology in general, and adaptive beamforming
arrays in particular, handle severe directional noises in the most
efficient way. These systems map the noise field and create nulls
towards the noise sources. The number of nulls is limited by the
number of microphone elements and processing power. Such arrays have
the benefit of hands-free operation without the necessity of a
headset.

However, when the noise sources are diffused, the performance of the
adaptive system will be reduced to the performance of a regular delay
and sum microphone array, which is not always satisfactory. This is
the case where the environment is quite reverberant, such as when the
noises are strongly reflected from the walls of a room and reach the
array from an infinite number of directions. Such is also the case in
a car environment for some of the noises radiated from the car
chassis.
`
`OBJECTS AND SUMMARY OF THE
`INVENTION
`
The spectral subtraction technique provides a solution to further
reduce the noise by estimating the noise magnitude spectrum of the
polluted signal. The technique estimates the magnitude spectral level
of the noise by measuring it during non-speech time intervals detected
by a voice switch, and then subtracting the noise magnitude spectrum
from the signal. This method, described in detail in Suppression of
Acoustic Noise in Speech Using Spectral Subtraction (Steven F. Boll,
IEEE ASSP-27 No. 2, April 1979), achieves good results for stationary
diffused noises that are not correlated with the speech signal. The
spectral subtraction method, however, creates artifacts, sometimes
described as musical noise, that may reduce the performance of the
speech algorithm (such as vocoders or voice activation) if the
spectral subtraction is uncontrolled. In addition, the spectral
subtraction method assumes erroneously that the voice switch
accurately detects the presence of speech and locates the non-speech
time intervals. This assumption is reasonable for off-line systems but
difficult to achieve or obtain in real time systems.
More particularly, the noise magnitude spectrum is estimated by
performing an FFT of 256 points of the non-speech time intervals and
computing the energy of each frequency bin. The FFT is performed after
the time domain signal is multiplied by a shading window (Hanning or
other) with an overlap of 50%. The energy of each frequency bin is
averaged with neighboring FFT time frames. The number of frames is not
determined but depends on the stability of the noise. For a stationary
noise, it is preferred that many frames are averaged to obtain better
noise estimation. For a non-stationary noise, a long averaging may be
harmful. Problematically, there is no means to know a priori whether
the noise is stationary or non-stationary.

Assuming the noise magnitude spectrum estimation is calculated, the
input signal is multiplied by a shading window (Hanning or other), an
FFT is performed (256 points or other) with an overlap of 50% and the
magnitude of each bin is averaged over 2-3 FFT frames. The noise
magnitude spectrum is then subtracted from the signal magnitude. If
the result is negative, the value is replaced by a zero (Half Wave
Rectification). It is recommended, however, to further reduce the
residual noise present during non-speech intervals by replacing low
values with a minimum value (or zero) or by attenuating the residual
noise by 30 dB. The resulting output is the noise free magnitude
spectrum.

The spectral complex data is reconstructed by applying the phase
information of the relevant bin of the signal's FFT with the noise
free magnitude. An IFFT process is then performed on the complex data
to obtain the noise free time domain data. The time domain results are
overlapped and summed with the previous frame's results to compensate
for the overlap process of the FFT.
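
The conventional per-frame steps described above (magnitude subtraction, half wave rectification, phase re-attachment, overlap and sum) can be sketched as follows. This is only an illustration under stated assumptions, not code from the cited Boll paper: the names `classic_subtract` and `overlap_add` and the `floor` parameter are hypothetical, and the FFT and IFFT themselves are left to whatever transform routine is available.

```python
import cmath

def classic_subtract(spectrum, noise_mag, floor=0.0):
    """Subtract a noise magnitude from each complex bin and reattach the phase."""
    out = []
    for x, n in zip(spectrum, noise_mag):
        mag = abs(x)                    # costs a square root per bin
        phase = cmath.phase(x)          # costs an arctangent per bin
        clean = max(mag - n, floor)     # half wave rectification (floor is an assumption)
        out.append(cmath.rect(clean, phase))
    return out

def overlap_add(frame, tail):
    """Sum the first half of a time-domain frame with the previous frame's tail.

    Returns (output_samples, new_tail) for 50% overlapped frames.
    """
    half = len(frame) // 2
    out = [a + b for a, b in zip(frame[:half], tail)]
    return out, frame[half:]
```

The sketch makes the cost visible: every bin needs an explicit magnitude and phase, which is the overhead the following paragraphs criticize.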
There are several problems associated with the system described.
First, the system assumes that there is a prior knowledge of the
speech and non-speech time intervals. A voice switch is not practical
to detect those periods. Theoretically, a voice switch detects the
presence of the speech by measuring the energy level and comparing it
to a threshold. If the threshold is too high, there is a risk that
some voice time intervals might be regarded as a non-speech time
interval and the system will regard voice information as noise. The
result is voice distortion, especially in poor signal to noise ratio
cases. If, on the other hand, the threshold is too low, there is a
risk that the non-speech intervals will be too short especially in
poor signal to noise ratio cases and in cases where the voice is
continuous with little intermission.

Another problem is that the magnitude calculation of the FFT result is
quite complex. This involves square and square root calculations which
are very expensive in terms of computation load. Yet another problem
is the association of the phase information to the noise free
magnitude spectrum in order to obtain the information for the IFFT.
This process requires the calculation of the phase, the storage of the
information, and applying the information to the magnitude data, all
of which are expensive in terms of computation and memory
requirements. Another problem is the estimation of the noise spectral
magnitude. The FFT process is a poor and unstable estimator of energy.
The averaging-over-time of frames contributes insufficiently to the
stability. Shortening the length of the FFT results in a wider
bandwidth of each bin and better stability but reduces the performance
of the system. Averaging-over-time, moreover, smears the data and, for
this reason, cannot be extended to more than a few frames. This means
that the noise estimation process proposed is not sufficiently stable.
It is therefore an object of this invention to provide a spectral
subtraction system that has a simple, yet efficient mechanism, to
estimate the noise magnitude spectrum even in poor signal-to-noise
ratio situations and in continuous fast speech cases.

It is another object of this invention to provide an efficient
mechanism that can perform the magnitude estimation with little cost,
and will overcome the problem of phase association.

It is yet another object of this invention to provide a stable
mechanism to estimate the noise spectral magnitude without the
smearing of the data.
In accordance with the foregoing objectives, the present invention
provides a system that correctly determines the non-speech segments of
the audio signal thereby preventing erroneous processing of the noise
canceling signal during the speech segments. In the preferred
embodiment, the present invention obviates the need for a voice switch
by precisely determining the non-speech segments using a separate
threshold detector for each frequency bin. The threshold detector
precisely detects the positions of the noise elements, even within
continuous speech segments, by determining whether frequency spectrum
elements, or bins, of the input signal are within a threshold set
according to a minimum value of the frequency spectrum elements over a
preset period of time. More precisely, the threshold is set according
to current and future minimum values of the frequency spectrum
elements. Thus, for each syllable, the energy of the noise elements is
determined by a separate threshold determination without examination
of the overall signal energy thereby providing good and stable
estimation of the noise. In addition, the system preferably sets the
threshold continuously and resets the threshold within a predetermined
period of time of, for example, five seconds.
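
One way to picture the per-bin threshold detector is the sketch below, which tracks a current and a future minimum for a single frequency bin and restarts them once per period. It is a hedged illustration only: the class name `BinNoiseTracker`, the `frames_per_period` parameter and the order of the checks are assumptions made here, and the factor of 4 is the multiplier legible in the flow diagram on drawing sheet 9.

```python
class BinNoiseTracker:
    """Tracks running minima of one frequency bin and flags noise-like frames."""

    def __init__(self, first_value, frames_per_period, factor=4.0):
        self.current_min = first_value
        self.future_min = first_value
        self.frames_per_period = frames_per_period  # frames in, e.g., five seconds
        self.factor = factor                        # threshold = factor * current_min
        self.count = 0

    def update(self, y):
        """Feed one smoothed magnitude; return True if the bin looks like noise."""
        self.current_min = min(self.current_min, y)
        self.future_min = min(self.future_min, y)
        self.count += 1
        if self.count >= self.frames_per_period:
            # Period elapsed: promote the future minimum, restart it from y.
            self.current_min = self.future_min
            self.future_min = y
            self.count = 0
        return y < self.factor * self.current_min
```

One tracker would be kept per frequency bin, so that a bin can be classified as noise even while other bins of the same frame carry speech. Because the current minimum is periodically replaced by the future minimum, the threshold can rise again when the noise floor rises.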
In order to reduce complex calculations, it is preferred in the
present invention to obtain an estimate of the magnitude of the input
audio signal using a multiplying combination of the real and imaginary
parts of the input in accordance with, for example, the higher and the
lower values of the real and imaginary parts of the signal. In order
to further reduce instability of the spectral estimation, a
two-dimensional (2D) smoothing process is applied to the signal
estimation. A two-step smoothing function, which first averages
neighboring frequency bins in each time frame and then applies an
exponential time average to each frequency bin, produces excellent
results.
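
The magnitude estimate and the two smoothing steps can be sketched as below. The coefficients 0.4, 1/3, 0.3 and 0.7 are the ones legible in the representative drawing on the front page; the function names, the use of absolute values before comparing, and the handling of the first and last bins are assumptions made for this illustration.

```python
def estimate_magnitude(re, im):
    """Approximate |re + j*im| as max + 0.4*min, with no square or square root."""
    a, b = abs(re), abs(im)   # assumption: absolute values are compared
    return max(a, b) + 0.4 * min(a, b)

def smooth_over_frequency(mags):
    """Average each bin with its two neighbours (edge bins reuse their own value)."""
    n = len(mags)
    return [(mags[max(i - 1, 0)] + mags[i] + mags[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

def smooth_over_time(current, previous, alpha=0.3):
    """Exponential average per bin: alpha*current + (1 - alpha)*previous."""
    return [alpha * c + (1.0 - alpha) * p for c, p in zip(current, previous)]
```

For the input (3, 4) the sketch returns 5.2 where the exact magnitude is 5, which gives a feel for the accuracy traded for the cheaper arithmetic.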
In order to reduce the complexity of determining the phase of the
frequency bins during subtraction to thereby align the phases of the
subtracting elements, the present invention applies a filter
multiplication to effect the subtraction. The filter function, a
Wiener filter function for example, or an approximation of the Wiener
filter, is multiplied by the complex data of the frequency domain
audio signal. The filter function may effect a full-wave
rectification, or a half-wave rectification for otherwise negative
results of the subtraction process or simple subtraction. It will be
appreciated that, since the noise elements are determined within
continuous speech segments, the noise estimation is accurate and it
may be canceled from the audio signal continuously providing excellent
noise cancellation characteristics.
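
A minimal sketch of subtraction by filter multiplication follows. The gain formula H = | |Y| - N | / |Y| is the one legible in the flow diagram on drawing sheet 9 (the full-wave rectified form); the function names and the small `eps` guard against division by zero are assumptions added here.

```python
def subtraction_gain(y_mag, noise_mag, eps=1e-12):
    """Full-wave rectified gain H = | |Y| - N | / |Y| for one bin."""
    return abs(y_mag - noise_mag) / max(y_mag, eps)

def apply_gain(spectrum, y_mags, noise_mags):
    """Scale each complex bin by its real gain; the phase is left untouched."""
    return [x * subtraction_gain(y, n)
            for x, y, n in zip(spectrum, y_mags, noise_mags)]
```

Because each bin is multiplied by a real, non-negative number, no phase has to be computed, stored or re-applied. A half-wave variant would use `max(y_mag - noise_mag, 0.0)` in the numerator instead of the absolute value.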
The present invention also provides a residual noise reduction process
for reducing the residual noise remaining after noise cancellation.
The residual noise is reduced by zeroing the non-speech segments,
e.g., within the continuous speech, or decaying the non-speech
segments. A voice switch may be used or another threshold detector
which detects the non-speech segments in the time-domain.
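
As a rough illustration of zeroing or decaying non-speech segments in the time domain, the sketch below attenuates a frame whose average level falls under a threshold. The text does not specify the detector, so the mean-absolute-level test, the function name `reduce_residual` and both parameters are hypothetical.

```python
def reduce_residual(frame, threshold, decay=0.0):
    """Attenuate a time-domain frame whose mean absolute level is below threshold.

    decay=0.0 zeroes the frame; 0 < decay < 1 decays it instead.
    """
    level = sum(abs(s) for s in frame) / len(frame)
    if level < threshold:
        return [s * decay for s in frame]
    return list(frame)
```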
The present invention is applicable with various noise canceling
systems including, but not limited to, those systems described in the
U.S. patent applications incorporated herein by reference. The present
invention, for example, is applicable with the adaptive beamforming
array. In addition, the present invention may be embodied as a
computer program for driving a computer processor either installed as
application software or as hardware.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Other objects, features and advantages according to the present
invention will become apparent from the following detailed description
of the illustrated embodiments when read in conjunction with the
accompanying drawings in which corresponding components are identified
by the same reference numerals.
FIG. 1 illustrates the present invention;
FIG. 2 illustrates the noise processing of the present invention;
FIG. 3 illustrates the noise estimation processing of the present
invention;
FIG. 4 illustrates the subtraction processing of the present
invention;
FIG. 5 illustrates the residual noise processing of the present
invention;
FIG. 5A illustrates a variant of the residual noise processing of the
present invention;
FIG. 6 illustrates a flow diagram of the present invention;
FIG. 7 illustrates a flow diagram of the present invention;
FIG. 8 illustrates a flow diagram of the present invention; and
FIG. 9 illustrates a flow diagram of the present invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
FIG. 1 illustrates an embodiment of the present invention 100. The
system receives a digital audio signal at input 102 sampled at a
frequency which is at least twice the bandwidth of the audio signal.
In one embodiment, the signal is derived from a microphone signal that
has been processed through an analog front end, A/D converter and a
decimation filter to obtain the required sampling frequency. In
another embodiment, the input is taken from the output of a beamformer
or even an adaptive beamformer. In that case the signal has been
processed to eliminate noises arriving from directions other than the
desired one, leaving mainly noises originating from the same direction
as the desired one. In yet another embodiment, the input signal can be
obtained from a sound board when the processing is implemented on a PC
processor or similar computer processor.
The input samples are stored in a temporary buffer 104 of 256 points.
When the buffer is full, the new 256 points are combined in a combiner
106 with the previous 256 points to provide 512 input points. The 512
input points are multiplied by multiplier 108 with a shading window
with the length of 512 points. The shading window contains
coefficients that are multiplied with the input data accordingly. The
shading window can be Hanning or other and it serves two goals: the
first is to smooth the transients between two processed blocks
(together with the overlap process); the second is to reduce the side
lobes in the frequency domain and hence prevent the masking of low
energy tonals by high energy side lobes. The shaded results are
converted to the frequency domain through an FFT (Fast Fourier
Transform) processor 110. Other lengths of the FFT samples (and
accordingly input buffers) are possible including 256 points or 1024
points.
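
The buffering, combining and shading steps can be sketched as follows. The frame and hop sizes come from the text; the function names, the periodic form of the Hanning window and the zero padding of the very first frame are assumptions made for this illustration, and each yielded frame would then be passed to an FFT routine of the reader's choice.

```python
import math

FRAME = 512   # FFT length named in the text
HOP = 256     # new points per frame, i.e. 50% overlap

def hann(n_points):
    """Periodic Hann ("Hanning") window coefficients."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n_points)
            for i in range(n_points)]

def windowed_frames(samples):
    """Yield 512-point shaded frames, each sharing 256 points with the previous one."""
    window = hann(FRAME)
    previous = [0.0] * HOP                 # assumption: first frame is zero padded
    for start in range(0, len(samples) - HOP + 1, HOP):
        new = samples[start:start + HOP]   # temporary buffer 104
        block = previous + new             # combiner 106: previous 256 + new 256
        yield [x * w for x, w in zip(block, window)]   # multiplier 108
        previous = new
```

With this window the two overlapping halves of consecutive frames sum to one, which is what lets the later overlap-and-sum step reconstruct the signal without amplitude ripple.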
The FFT output is a complex vector of 256 significant points (the
other 256 points are an anti-symmetric replica of the first 256
points). The points are processed in the noise processing block
112(200) which includes the noise magnitude estimation for each
frequency bin, the subtraction process that estimates the noise-free
complex value for each frequency bin and