`
6363345
`
U.S. UTILITY PATENT APPLICATION

FILED WITH: DISK (CRF) / FICHE
(Attached in pocket on right inside flap)
`
`PREPARED AND APPROVED FOR ISSUE
`
ORIGINAL CLASSIFICATION: CLASS / SUBCLASS
INTERNATIONAL CLASSIFICATION
ISSUING CLASSIFICATION, CROSS REFERENCE(S): CLASS / SUBCLASS (ONE SUBCLASS PER BLOCK)
Continued on Issue Slip Inside File Jacket

TERMINAL DISCLAIMER
a) The term of this patent subsequent to ____ (date) has been disclaimed.
b) The term of this patent shall not extend beyond the expiration date of U.S. Patent No. ____.
c) The terminal ____ months of this patent have been disclaimed.

DRAWINGS: Sheets Drwg. / Figs. Drwg. / Print Fig.
(Assistant Examiner) / Primary Examiner
CLAIMS ALLOWED: Total Claims / Print Claim for O.G.
NOTICE OF ALLOWANCE MAILED
ISSUE FEE: Amount Due / Date Paid
ISSUE BATCH NUMBER
(Handwritten entries in these boxes are illegible in the scan.)
`
WARNING:
The information disclosed herein may be restricted. Unauthorized disclosure may be prohibited by the United States Code Title 35, Sections 122, 181 and 368.
Possession outside the U.S. Patent & Trademark Office is restricted to authorized employees and contractors only.

Form PTO-436A
(Rev. 6/98)
`
(LABEL AREA)

(FACE)
`
`RTL345-2_1020-0001
`
`
`
PATENT APPLICATION
09/252,874

CONTENTS
Date received (incl. C. of M.) or Date Mailed
(Numbered entry lines follow on the form; the handwritten entries are illegible in the scan.)

(LEFT OUTSIDE)
`
`RTL345-2_1020-0002
`
`
`
ISSUE SLIP STAPLE AREA (for additional cross references)

INDEX OF CLAIMS
Legend: Rejected; Allowed; (Through numeral) Canceled; Restricted; N Non-elected; I Interference; A Appeal; O Objected
Columns: Claim, Date

If more than 150 claims or 10 actions
staple additional sheet here
`
`(LEFT INSIDE)
`
`RTL345-2_1020-0003
`
`
`
SEARCHED

SEARCH NOTES
(INCLUDING SEARCH STRATEGY)

(Handwritten class, subclass and search entries are illegible in the scan.)
`
`RTL345-2_1020-0004
`
`
`
`
`RTL345-2_1020-0005
`
`
`
SERIAL NUMBER: 09/252,874
FILING DATE: 02/18/99
CLASS: 381
GROUP ART UNIT: 2743
ATTORNEY DOCKET NO.: 670025-2800

APPLICANTS: JOSEPH MARASH, HAIFA, ISRAEL; BARUCH BERDUGO, KIRIAT-ATA 28000, ISRAEL.

**CONTINUING DOMESTIC DATA**
VERIFIED

**371 (NAT'L STAGE) DATA**
VERIFIED

**FOREIGN APPLICATIONS**
VERIFIED

IF REQUIRED, FOREIGN FILING LICENSE GRANTED 04/09/99   ** SMALL ENTITY **

Foreign Priority claimed: yes / no
35 USC 119 (a-d) conditions met: yes / no; Met after Allowance
Verified and Acknowledged

STATE OR COUNTRY: ILX
SHEETS DRAWING: 10
TOTAL CLAIMS: 49
INDEPENDENT CLAIMS: 3
FILING FEE RECEIVED: $654

ADDRESS:
THOMAS J KOWALSKI
FROMMER LAWRENCE & HAUG
745 FIFTH AVENUE
NEW YORK NY 10151

TITLE: SYSTEM, METHOD AND APPARATUS FOR CANCELLING NOISE

FEES: Authority has been given in Paper No. ____ to charge/credit
DEPOSIT ACCOUNT NO. ____ for the following:
All Fees
1.16 Fees (Filing)
1.17 Fees (Processing Ext. of time)
1.18 Fees (Issue)
Other
Credit
`
`RTL345-2_1020-0006
`
`
`
FROMMER LAWRENCE & HAUG LLP
745 Fifth Avenue
New York, New York 10151
Tel. (212) 588-0800
Fax (212) 588-0500

PATENT APPLICATION TRANSMITTAL

Date: February 19, 1999
Re: 670025-2800

TO: THE COMMISSIONER OF PATENTS AND TRADEMARKS
Box PATENT APPLICATION
Washington, D.C. 20231

Sir:
With reference to the filing in the United States Patent and
Trademark Office of an application for patent in the name of:
JOSEPH MARASH and BARUCH BERDUGO
Entitled: SYSTEM, METHOD AND APPARATUS FOR CANCELLING NOISE
The following are enclosed:
X  Specification (22 pages) and one page of Abstract (page i)
X  49 Claims (including 3 independent claims; pp. 23-31)
X  10 Sheets of Drawings (Figs. 1, 2, 3, 4, 5, 5A, 6, 7, 8, 9)
   Unsigned Declaration and Power of Attorney (2 pages)
The filing fee will be paid later, in response to a Notice to File
Missing Parts. Kindly accord the application a February 18, 1999
filing date and address all communications to the undersigned at the
address above.

Respectfully submitted,

Thomas J. Kowalski, Reg. No. 32,147

EXPRESS MAIL
Mailing Label Number Etn2g9O913lUS
Date of Deposit ____
I hereby certify that this paper or fee is being deposited with the
United States Postal Service "Express Mail Post Office to Addressee"
Service under 37 CFR 1.10 on the date indicated above and
(signature of person mailing)
`
`RTL345-2_1020-0007
`
`
`
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

APPLICATION FOR LETTERS PATENT

Title: SYSTEM, METHOD AND APPARATUS FOR CANCELLING NOISE
Inventors: Joseph Marash, Baruch Berdugo

22 pages specification and one page of Abstract (page i)
49 Claims (3 Independent; on pages 23-31)
10 sheets of Figs. (Figs. 1-5, 5A, 6-9)

EXPRESS MAIL
is addressed to the Assistant Commissioner of Patents, Washington, D.C. 20231

Thomas J. Kowalski
Registration No. 32,147
I. Marc Asperas
Registration No. 37,274
FROMMER LAWRENCE & HAUG LLP
745 Fifth Avenue
New York, New York 10151
(212) 588-0800

MARCA\2800.APP (IMA:car)
`
`RTL345-2_1020-0008
`
`
`
PATENT
670025-2800

RELATED APPLICATIONS INCORPORATED BY REFERENCE.
The following applications and patent(s) are cited and hereby
incorporated herein by reference: U.S. Patent Serial No. 09/130,923
filed August 6, 1998, U.S. Patent Serial No. 09/055,709 filed April 7,
1998, U.S. Patent Serial No. 09/059,503 filed April 13, 1998, U.S.
Patent Serial No. 08/840,159 filed April 14, 1997, U.S. Patent Serial
No. 09/130,923 filed August 6, 1998, U.S. Patent Serial No. 08/672,899
now issued U.S. Patent No. 5,825,898 issued October 20, 1998. And, all
documents cited herein are incorporated herein by reference, as are
documents cited or referenced in documents cited herein.
FIELD OF THE INVENTION.
The present invention relates to noise cancellation and reduction
and, more specifically, to noise cancellation and reduction using
spectral subtraction.
`
BACKGROUND OF THE INVENTION.
Ambient noise added to speech degrades the performance of speech
processing algorithms. Such processing algorithms may include
dictation, voice activation, voice compression and other systems. In
such systems, it is desired to reduce the noise and improve the signal
to noise ratio (S/N ratio) without affecting the speech and its
characteristics.

Near field noise canceling microphones provide a satisfactory solution
but require that the microphone be in the proximity of the voice source
(e.g., mouth). In many cases, this is achieved by mounting the
microphone on a boom of a headset which situates the microphone at the
end of a boom proximate the mouth of the wearer. However, the headset
has proven to be either uncomfortable to wear or too restricting for
operation in, for example, an automobile.

Microphone array technology in general, and adaptive beamforming
arrays in particular, handle severe directional noises in the most
efficient way. These systems map the noise field and create nulls
towards the noise sources. The number of nulls is limited by the
number of microphone elements and processing power. Such arrays have
the benefit of hands-free operation without the necessity of a headset.

However, when the noise sources are diffused, the performance of the
adaptive system will be reduced to the performance of a regular delay
and sum microphone array, which is not always satisfactory. This is
the case where the environment is quite reverberant, such as when the
noises are strongly reflected from the walls of a room and reach the
array from an infinite number of directions. Such is also the case in
a car environment for some of the noises radiated from the car chassis.
`
OBJECTS AND SUMMARY OF THE INVENTION
The spectral subtraction technique provides a solution to further
reduce the noise by estimating the noise magnitude spectrum of the
polluted signal. The technique estimates the magnitude spectral level
of the noise by measuring it during non-speech time intervals detected
by a voice switch, and then subtracting the noise magnitude spectrum
from the signal. This method, described in detail in Suppression of
Acoustic Noise in Speech Using Spectral Subtraction (Steven F. Boll,
IEEE ASSP-27, No. 2, April 1979), achieves good results for stationary
diffused noises that are not correlated with the speech signal. The
spectral subtraction method, however, creates artifacts, sometimes
described as musical noise, that may reduce the performance of the
speech algorithm (such as vocoders or voice activation) if the spectral
subtraction is uncontrolled. In addition, the spectral subtraction
method assumes erroneously that the voice switch accurately detects the
presence of speech and locates the non-speech time intervals. This
assumption is reasonable for off-line systems but difficult to achieve
or obtain in real time systems.

More particularly, the noise magnitude spectrum is estimated by
performing an FFT of 256 points of the non-speech time intervals and
computing the energy of each frequency bin. The FFT is performed after
the time domain signal is multiplied by a shading window (Hanning or
other) with an overlap of 50%. The energy of each frequency bin is
averaged with neighboring FFT time frames. The number of frames is not
determined but depends on the stability of the noise. For a stationary
noise, it is preferred that many frames are averaged to obtain better
noise estimation. For a non-stationary noise, a long averaging may be
harmful. Problematically, there is no means to know a priori whether
the noise is stationary or non-stationary.

Assuming the noise magnitude spectrum estimation is calculated, the
input signal is multiplied by a shading window (Hanning or other), an
FFT is performed (256 points or other) with an overlap of 50% and the
magnitude of each bin is averaged over 2-3 FFT frames. The noise
magnitude spectrum is then subtracted from the signal magnitude. If
the result is negative, the value is replaced by a zero (Half Wave
Rectification). It is recommended, however, to further reduce the
residual noise present during non-speech intervals by replacing low
values with a minimum value (or zero) or by attenuating the residual
noise by 30 dB. The resulting output is the noise free magnitude
spectrum.

The spectral complex data is reconstructed by applying the phase
information of the relevant bin of the signal's FFT with the noise
free magnitude. An IFFT process is then performed on the complex data
to obtain the noise free time domain data. The time domain results
are overlapped and summed with the previous frame's results to
compensate for the overlap process of the FFT.
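
As a rough illustration of the prior-art procedure just described, the
following Python sketch applies magnitude subtraction with half-wave
rectification to a single FFT bin and then reattaches the original
phase. The function name subtract_bin and its arguments are
hypothetical and chosen only for this example; windowing, the FFT and
the overlap-add step are assumed to happen elsewhere.

```python
import cmath

def subtract_bin(signal_bin: complex, noise_mag: float) -> complex:
    """Classical spectral subtraction for one frequency bin (illustrative)."""
    mag = abs(signal_bin)              # square and square-root calculation
    phase = cmath.phase(signal_bin)    # phase must be computed and stored
    clean_mag = mag - noise_mag        # subtract the estimated noise magnitude
    if clean_mag < 0.0:
        clean_mag = 0.0                # half-wave rectification
    # Recombine the noise free magnitude with the original phase.
    return cmath.rect(clean_mag, phase)

cleaned = subtract_bin(3 + 4j, 2.0)      # magnitude 5 reduced to 3
silenced = subtract_bin(0.3 + 0.4j, 2.0) # noise exceeds signal, clipped to 0
```

The explicit magnitude and phase computations in this sketch are the
operations the specification later identifies as computationally
expensive.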
`
There are several problems associated with the system described.
First, the system assumes that there is a prior knowledge of the speech
and non-speech time intervals. A voice switch is not a practical means
of detecting those periods. Theoretically, a voice switch detects the
presence of the speech by measuring the energy level and comparing it
to a threshold. If the threshold is too high, there is a risk that
some voice time intervals might be regarded as a non-speech time
interval and the system will regard voice information as noise. The
result is voice distortion, especially in poor signal to noise ratio
cases. If, on the other hand, the threshold is too low, there is a
risk that the non-speech intervals will be too short, especially in
poor signal to noise ratio cases and in cases where the voice is
continuous with little intermission.
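
The threshold trade-off described above can be seen in a minimal
Python sketch of an energy-based voice switch. The frame contents and
the three threshold values are invented for illustration and are not
taken from the specification.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def voice_switch(frames, threshold):
    """Flag a frame as speech when its energy exceeds the threshold."""
    return [frame_energy(f) > threshold for f in frames]

# Hypothetical frames: background noise, soft speech, loud speech.
noise = [0.1] * 4
soft_speech = [0.3] * 4
loud_speech = [1.0] * 4
frames = [noise, soft_speech, loud_speech]

high = voice_switch(frames, threshold=0.5)        # soft speech is missed
low = voice_switch(frames, threshold=0.05)        # all three classified correctly
very_low = voice_switch(frames, threshold=0.005)  # noise is taken for speech
```

With the high threshold the soft speech frame would be fed to the noise
estimator; with the very low threshold no frame is left for noise
estimation at all.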

Another problem is that the magnitude calculation of the FFT result is
quite complex. This involves square and square root calculations
which are very expensive in terms of computation load. Yet another
problem is the association of the phase information to the noise free
magnitude spectrum in order to obtain the information for the IFFT.
This process requires the calculation of the phase, the storage of the
information, and applying the information to the magnitude data - all
are expensive in terms of computation and memory requirements.

Another problem is the estimation of the noise spectral magnitude.
The FFT process is a poor and unstable estimator of energy. The
averaging-over-time of frames contributes insufficiently to the
stability. Shortening the length of the FFT results in a wider
bandwidth of each bin and better stability but reduces the performance
of the system. Averaging-over-time, moreover, smears the data and,
for this reason, cannot be extended to more than a few frames. This
means that the noise estimation process proposed is not sufficiently
stable.

It is therefore an object of this invention to provide a spectral
subtraction system that has a simple, yet efficient mechanism, to
estimate the noise magnitude spectrum even in poor signal-to-noise
ratio situations and in continuous fast speech cases.

It is another object of this invention to provide an efficient
mechanism that can perform the magnitude estimation with little cost,
and will overcome the problem of phase association.

It is yet another object of this invention to provide a stable
mechanism to estimate the noise spectral magnitude without the
smearing of the data.

In accordance with the foregoing objectives, the present invention
provides a system that correctly determines the non-speech segments of
the audio signal thereby preventing erroneous processing of the noise
canceling signal during the speech segments. In the preferred
embodiment, the present invention obviates the need for a voice switch
by precisely determining the non-speech segments using a separate
threshold detector for each frequency bin. The threshold detector
precisely detects the positions of the noise elements, even within
continuous speech segments, by determining whether frequency spectrum
elements, or bins, of the input signal are within a threshold set
according to a minimum value of the frequency spectrum elements over a
preset period of time. More precisely, the threshold is set according
to current and future minimum values of the frequency spectrum
elements. Thus, for each syllable, the energy of the noise elements
is determined by a separate threshold determination without
examination of the overall signal energy thereby providing good and
stable estimation of the noise. In addition, the system preferably
sets the threshold continuously and resets the threshold within a
predetermined period of time of, for example, five seconds.

In order to reduce complex calculations, it is preferred in the
present invention to obtain an estimate of the magnitude of the input
audio signal using a multiplying combination of the real and imaginary
parts of the input in accordance with, for example, the higher and the
lower values of the real and imaginary parts of the signal.

In order to further reduce instability of the spectral estimation, a
two-dimensional (2D) smoothing process is applied to the signal
estimation. A two-step smoothing function, which first averages
neighboring frequency bins in each time frame and then applies an
exponential time average to each frequency bin, produces excellent
results.

In order to reduce the complexity of determining the phase of the
frequency bins during subtraction to thereby align the phases of the
subtracting elements, the present invention applies a filter
multiplication to effect the subtraction. The filter function, a
Wiener filter function for example, or an approximation of the Wiener
filter, is multiplied by the complex data of the frequency domain
audio signal. The filter function may effect a full-wave
rectification, or a half-wave rectification for otherwise negative
results of the subtraction process or simple subtraction. It will be
appreciated that, since the noise elements are determined within
continuous speech segments, the noise estimation is accurate and it
may be canceled from the audio signal continuously, providing
excellent noise cancellation characteristics.

The present invention also provides a residual noise reduction process
for reducing the residual noise remaining after noise cancellation.
The residual noise is reduced by zeroing the non-speech segments,
e.g., within the continuous speech, or decaying the non-speech
segments. A voice switch may be used or another threshold detector
which detects the non-speech segments in the time-domain.

The present invention is applicable with various noise canceling
systems including, but not limited to, those systems described in the
U.S. patent applications incorporated herein by reference. The
present invention, for example, is applicable with the adaptive
beamforming array. In addition, the present invention may be embodied
as a computer program for driving a computer processor either
installed as application software or as hardware.
`
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects, features and advantages according to the present
invention will become apparent from the following detailed description
of the illustrated embodiments when read in conjunction with the
accompanying drawings in which corresponding components are identified
by the same reference numerals.
Fig. 1 illustrates the present invention;
Fig. 2 illustrates the noise processing of the present invention;
Fig. 3 illustrates the noise estimation processing of the present invention;
Fig. 4 illustrates the subtraction processing of the present invention;
Fig. 5 illustrates the residual noise processing of the present invention;
Fig. 5A illustrates a variant of the residual noise processing of the present invention;
Fig. 6 illustrates a flow diagram of the present invention;
Fig. 7 illustrates a flow diagram of the present invention;
Fig. 8 illustrates a flow diagram of the present invention; and
Fig. 9 illustrates a flow diagram of the present invention.
`
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 illustrates an embodiment of the present invention 100. The
system receives a digital audio signal at input 102 sampled at a
frequency which is at least twice the bandwidth of the audio signal.
In one embodiment, the signal is derived from a microphone signal that
has been processed through an analog front end, A/D converter and a
decimation filter to obtain the required sampling frequency. In
another embodiment, the input is taken from the output of a beamformer
or even an adaptive beamformer. In that case the signal has been
processed to eliminate noises arriving from directions other than the
desired one, leaving mainly noises originating from the same direction
as the desired one. In yet another embodiment, the input signal can
be obtained from a sound board when the processing is implemented on a
PC processor or similar computer processor.
`

The input samples are stored in a temporary buffer 104 of 256 points.
When the buffer is full, the new 256 points are combined in a combiner
106 with the previous 256 points to provide 512 input points. The 512
input points are multiplied by multiplier 108 with a shading window
with the length of 512 points. The shading window contains
coefficients that are multiplied with the input data accordingly. The
shading window can be Hanning or other and it serves two goals: the
first is to smooth the transients between two processed blocks
(together with the overlap process); the second is to reduce the side
lobes in the frequency domain and hence prevent the masking of low
energy tonals by high energy side lobes. The shaded results are
converted to the frequency domain through an FFT (Fast Fourier
Transform) processor 110. Other lengths of the FFT samples (and
accordingly input buffers) are possible including 256 points or 1024
points.

The FFT output is a complex vector of 256 significant points (the
other 256 points are a conjugate-symmetric replica of the first 256
points). The points are processed in the noise processing block 112
(200), which includes the noise magnitude estimation for each
frequency bin, the subtraction process that estimates the noise-free
complex value for each frequency bin, and the residual noise reduction
process. An IFFT (Inverse Fast Fourier Transform) processor 114
performs the Inverse Fourier Transform on the complex noise free data
to provide 512 time domain points. The first 256 time domain points
are summed by the summer 116 with the previous last 256 data points to
compensate for the input overlap and shading process and output at
output terminal 118. The remaining 256 points are saved for the next
iteration.
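
The buffering, shading and overlap-and-sum steps described for Figure 1
can be sketched in Python as follows. This is only an illustration
under stated assumptions: a periodic Hanning window is assumed, the
names hann and overlap_add are hypothetical, and the FFT, noise
processing and IFFT stages are stood in for by a placeholder process
argument that by default leaves each frame unchanged.

```python
import math

FRAME = 512   # points per shaded block
HOP = 256     # new points per iteration (50% overlap)

def hann(n_points):
    """Periodic Hanning window; halves 256 points apart sum to one."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n_points)
            for i in range(n_points)]

def overlap_add(samples, process=lambda frame: frame):
    """Window 512-point blocks, process them, then overlap and sum."""
    window = hann(FRAME)
    previous = [0.0] * HOP   # previous 256 input points
    tail = [0.0] * HOP       # last 256 output points saved from prior block
    out = []
    for start in range(0, len(samples) - HOP + 1, HOP):
        new = samples[start:start + HOP]
        frame = [w * x for w, x in zip(window, previous + new)]
        frame = process(frame)   # placeholder for FFT, noise processing, IFFT
        out.extend(a + b for a, b in zip(frame[:HOP], tail))
        tail = frame[HOP:]       # saved for the next iteration
        previous = new
    return out

samples = [math.sin(0.01 * i) for i in range(1024)]
restored = overlap_add(samples)
```

With the identity placeholder, the output reproduces the input delayed
by one 256-point block, which is how the overlap and sum compensates
for the shading window.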

It will be appreciated that, while specific transforms are utilized in
the preferred embodiments, it is of course understood that other
transforms may be applied to the present invention to obtain the
spectral noise signal.

Figure 2 is a detailed description of the noise processing block 200
(112). First, the magnitude of each frequency bin (n) 202 is
estimated. The straightforward approach is to estimate the magnitude
by calculating:

Y(n) = ( (Real(n))^2 + (Imag(n))^2 )^(1/2)

In order to save processing time and complexity, the signal magnitude
(Y) is estimated by an estimator 204 using an approximation formula
instead:

Y(n) = Max[ |Real(n)|, |Imag(n)| ] + 0.4 * Min[ |Real(n)|, |Imag(n)| ]
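
A short Python sketch of this approximation is given below for
illustration; the function name approx_magnitude and the beta argument
are hypothetical. With the 0.4 coefficient, the estimate stays within
roughly -1% to +8% of the exact magnitude for any phase angle, which
the loop at the end checks numerically.

```python
import math

def approx_magnitude(re, im, beta=0.4):
    """Estimate |re + j*im| without squaring or taking a square root."""
    a, b = abs(re), abs(im)
    return max(a, b) + beta * min(a, b)

estimate = approx_magnitude(3.0, 4.0)   # exact magnitude is 5.0

# Ratio of the estimate to the exact magnitude over a sweep of angles.
ratios = []
for k in range(360):
    angle = math.radians(k)
    re, im = math.cos(angle), math.sin(angle)   # exact magnitude is 1.0
    ratios.append(approx_magnitude(re, im))
worst_low, worst_high = min(ratios), max(ratios)
```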
`

In order to reduce the instability of the spectral estimation, which
typically plagues the FFT process (ref [2] Digital Signal Processing,
Oppenheim Schafer, Prentice Hall, pp. 542-545), the present invention
implements a 2D smoothing process. Each bin is replaced with the
average of its value and the two neighboring bins' values (of the same
time frame) by a first averager 206. In addition, the smoothed value
of each smoothed bin is further smoothed by a second averager 208
using a time exponential average with a time constant of 0.7 (which is
the equivalent of averaging over 3 time frames). The 2D-smoothed
value is then used by two processes - the noise estimation process by
noise estimation processor 212 (300) and the subtraction process by
subtractor 210. The noise estimation process estimates the noise at
each frequency bin and the result is used by the noise subtraction
process. The output of the noise subtraction is fed into a residual
noise reduction processor 216 to further reduce the noise. In one
embodiment, the time domain signal is also used by the residual noise
process 216 to determine the speech free segments. The noise free
signal is moved to the IFFT process to obtain the time domain output
218.
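
The two smoothing steps can be sketched in Python as shown below. The
function names are hypothetical, and the treatment of the first and
last bins (averaging over the neighbors that exist) is an assumption,
since the text does not say how edge bins are handled.

```python
def smooth_frequency(mags):
    """Average each bin with its neighboring bins in the same time frame."""
    out = []
    for i in range(len(mags)):
        neighbours = mags[max(0, i - 1): i + 2]   # edge bins use fewer terms
        out.append(sum(neighbours) / len(neighbours))
    return out

def smooth_time(previous, current, alpha=0.7):
    """Exponential time average per bin; alpha is the 0.7 time constant."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]

across_bins = smooth_frequency([0.0, 3.0, 0.0, 0.0])
across_time = smooth_time([1.0, 1.0], [0.0, 2.0])
```

In a full system, the output of smooth_frequency for the current frame
would be passed as current, and the previous 2D-smoothed frame as
previous.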
`

Figure 3 is a detailed description of the noise estimation processor
300 (212). Theoretically, the noise should be estimated by taking a
long time average of the signal magnitude (Y) of non-speech time
intervals. This requires that a voice switch be used to detect the
speech/non-speech intervals. However, a too-sensitive switch may
result in the use of a speech signal for the noise estimation, which
will degrade the voice signal. A less sensitive switch, on the other
hand, may dramatically reduce the length of the noise time intervals
(especially in continuous speech cases) and degrade the validity of
the noise estimation.

In the present invention, a separate adaptive threshold is implemented
for each frequency bin 302. This allows the location of noise
elements for each bin separately without the examination of the
overall signal energy. The logic behind this method is that, for each
syllable, the energy may appear at different frequency bands. At the
same time, other frequency bands may contain noise elements. It is
therefore possible to apply a non-sensitive threshold for the noise
and yet locate many non-speech data points for each bin, even within a
continuous speech case. The advantage of this method is that it
allows the collection of many noise segments for a good and stable
estimation of the noise, even within continuous speech segments.

In the threshold determination process, for each frequency bin, two
minimum values are calculated. A future minimum value is initiated
every 5 seconds at 304 with the value of the current magnitude (Y(n))
and replaced with a smaller minimal value over the next 5 seconds
through the following process. The future minimum value of each bin
is compared with the current magnitude value of the signal. If the
current magnitude is smaller than the future minimum, the future
minimum is replaced with the magnitude, which becomes the new future
minimum.

At the same time, a current minimum value is calculated at 306. The
current minimum is initiated every 5 seconds with the value of the
future minimum that was determined over the previous 5 seconds and
follows the minimum value of the signal for the next 5 seconds by
comparing its value with the current magnitude value. The current
minimum value is used by the subtraction process, while the future
minimum is used for the initiation and refreshing of the current
minimum.

The noise estimation mechanism of the present invention ensures a
tight and quick estimation of the noise value, with limited memory of
the process (5 seconds), while preventing too high an estimation of
the noise.
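
One way to read the current-minimum and future-minimum bookkeeping for
a single frequency bin is sketched below in Python. The class name
MinimumTracker is hypothetical, and the refresh period is expressed in
frames (frames_per_period) as an assumption; the number of frames that
make up 5 seconds depends on the sampling rate, which the text leaves
open.

```python
class MinimumTracker:
    """Tracks the current and future minimum magnitude of one bin."""

    def __init__(self, first_magnitude, frames_per_period):
        self.frames_per_period = frames_per_period
        self.count = 0
        self.current_min = first_magnitude
        self.future_min = first_magnitude

    def update(self, magnitude):
        if self.count == self.frames_per_period:
            # Period boundary: the current minimum restarts from the future
            # minimum found over the previous period, and the future
            # minimum restarts from the present magnitude.
            self.current_min = self.future_min
            self.future_min = magnitude
            self.count = 0
        # Both values follow the smallest magnitude seen since their restart.
        self.current_min = min(self.current_min, magnitude)
        self.future_min = min(self.future_min, magnitude)
        self.count += 1
        return self.current_min

tracker = MinimumTracker(first_magnitude=5.0, frames_per_period=3)
history = [tracker.update(m) for m in [4.0, 6.0, 5.0, 9.0, 8.0, 7.0, 9.5]]
```

In this toy run the noise floor rises after the third frame; the
current minimum holds the old floor for one more period and then moves
up to the minimum observed during that period, which illustrates the
limited memory of the process.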

Each bin's magnitude (Y(n)) is compared with four times the current
minimum value of that bin by comparator 308, which serves as the
adaptive threshold for that bin. If the magnitude is within the range
(hence below the threshold), it is allowed as noise and used by an
exponential averaging unit 310 that determines the level of the noise
312 of that frequency. If the magnitude is above the threshold it is
rejected for the noise estimation. The time constant for the
exponential averaging is typically 0.95 which may be interpreted as
taking the average of the last 20 frames. The threshold of 4*minimum
value may be changed for some applications.
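
The comparator and exponential averaging unit can be sketched for one
bin as follows. The function name update_noise and its argument names
are hypothetical; the factor of 4 and the 0.95 time constant are the
typical values quoted above.

```python
def update_noise(noise, magnitude, current_min, factor=4.0, alpha=0.95):
    """Update one bin's noise level only when the magnitude looks like noise."""
    if magnitude < factor * current_min:
        # Below the adaptive threshold: accept as noise and average it in.
        return alpha * noise + (1.0 - alpha) * magnitude
    # Above the threshold: treated as speech, noise level left unchanged.
    return noise

accepted = update_noise(noise=1.0, magnitude=2.0, current_min=1.0)
rejected = update_noise(noise=1.0, magnitude=10.0, current_min=1.0)
```

Because the decision is made per bin, a frame that contains speech in
some bands can still update the noise level in the other bands.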

Figure 4 is a detailed description of the subtraction processor 400
(210). In a straightforward approach, the value of the estimated bin
noise magnitude is subtracted from the current bin magnitude. The
phase of the current bin is calculated and used in conjunction with
the result of the subtraction to obtain the Real and Imaginary parts
of the result. This approach is very expensive in terms of processing
and memory because it requires the calculation of the Sine and Cosine
arguments of the complex vector with consideration of the four
quadrants where the complex vector may be positioned. An alternative
approach used in this present invention is to use a Filter approach.
The subtraction is interpreted as a filter multiplication performed by
filter 402 where H (the filter coefficient) is:

H(n) = | |Y(n)| - |N(n)| | / |Y(n)|

Where Y(n) is the magnitude of the current bin and N(n) is the noise
estimation of that bin. The value H of the filter coefficient (of
each bin separately) is multiplied by the Real and Imaginary parts of
the current bin at 404:

E(Real) = Y(Real) * H ;  E(Imag) = Y(Imag) * H

Where E is the noise free complex value. In the straightforward
approach the subtraction may result in a negative value of magnitude.
This value can be either replaced with zero (half-wave rectification)
or replaced with a positive value equal to the negative one (full-wave
rectification). The filter approach, as expressed here, results in
the full-wave rectification directly. The full-wave rectification
provides a little less noise reduction but introduces far fewer
artifacts to the signal. It will be appreciated that this filter can
be modified to effect a half-wave rectification by taking the
non-absolute value of the numerator and replacing negative values with
zeros.
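
The filter approach can be sketched in Python as follows. The names
filter_gain and apply_gain are hypothetical, and returning a zero gain
for a bin of zero magnitude is an assumption added here to avoid a
division by zero; the text does not address that case.

```python
def filter_gain(y_mag, noise_mag, half_wave=False):
    """Filter coefficient H for one bin from signal and noise magnitudes."""
    if y_mag == 0.0:
        return 0.0                    # assumed guard, not from the text
    diff = y_mag - noise_mag
    if half_wave:
        diff = max(diff, 0.0)         # negative results replaced with zero
    else:
        diff = abs(diff)              # full-wave rectification
    return diff / y_mag

def apply_gain(bin_value, gain):
    """Scale the Real and Imaginary parts; no phase calculation is needed."""
    return complex(bin_value.real * gain, bin_value.imag * gain)

gain = filter_gain(4.0, 1.0)                        # speech well above noise
full = filter_gain(1.0, 4.0)                        # noise above signal
half = filter_gain(1.0, 4.0, half_wave=True)
cleaned = apply_gain(3 + 4j, 0.5)
```

Scaling both parts by the same real coefficient leaves the phase of the
bin unchanged, which is why no sine or cosine evaluation is required.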

Note also that the values of Y in the figures are the smoothed values
of Y after averaging over neighboring spectral bins and over time
frames (2D smoothing). Another approach is to use the smoothed Y only
for the noise estimation (N), and to use the unsmoothed Y for the
calculation of H.

Figure 5 illustrates the residual noise reduction processor 500 (216).
The residual noise is defined as the remaining noise during non-speech
intervals. The noise in these intervals is first reduced by the
subtraction process which does not differentiate between speech and
non-speech time intervals. The remaining residual noise can be
reduced further by using a voice switch 502 and either multiplying the
residual noise by a decaying factor or replacing it with zeros.
Another alternative to the zeroing is replacing the residual noise
with a minimum value of noise at 504.

Yet another approach, which avoids the voice switch, is illustrated in
Figure 5A. The residual noise reduction processor 506 applies a
threshold similar to that used by the noise estimator at 508 on the
noise free output bin and replaces or decays the result when it is
lower than the threshold at 510.

The result of the residual noise processing of the present invention
is a quieter sound in the non-speech intervals. However, artifacts
such as a pumping noise, heard when the noise level is switched
between the speech interval and the non-speech interval, may occur in
some applications.
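
A per-bin reading of the Figure 5A variant is sketched below in Python.
The function name reduce_residual_bins is hypothetical, and reusing the
noise estimator's threshold of four times the current minimum is an
assumption, since the text says only that a similar threshold is
applied. A decay of 0.0 corresponds to replacing the bin with zero.

```python
def reduce_residual_bins(bins, current_mins, factor=4.0, decay=0.1):
    """Decay noise free output bins that fall below a per-bin threshold."""
    out = []
    for value, cur_min in zip(bins, current_mins):
        if abs(value) < factor * cur_min:
            out.append(value * decay)   # residual noise: attenuate or zero
        else:
            out.append(value)           # speech: pass through unchanged
    return out

bins = [0.5 + 0j, 10 + 0j]
zeroed = reduce_residual_bins(bins, current_mins=[1.0, 1.0], decay=0.0)
decayed = reduce_residual_bins(bins, current_mins=[1.0, 1.0], decay=0.5)
```

A small decay factor rather than outright zeroing is one way to soften
the switching between attenuated and unattenuated bins.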

The spectral subtraction technique of the present invention can be
utilized in conjunction with the array techniques, close talk
microphone technique or as a stand alone system. The spectral
subtraction of the present invention can be implemented on an embedded
hardware (DSP) as a stand alone system, as part of other embedded
algorithms such as adaptive beamforming, or as a software application
running on a PC using data obtained from a sound port.

As illustrated in Figures 6-9, for example, the present invention may
be implemented as a software application. In step 600, the input
samples are read. At step 602, the read