Lindemann et al.

[54] NOISE REDUCTION SYSTEM FOR BINAURAL HEARING AID

[75] Inventors: Eric Lindemann; John Laurence Melanson, both of Boulder, Colo.

[73] Assignee: AudioLogic, Inc., Boulder, Colo.

[21] Appl. No.: 123,503

[22] Filed: Sep. 17, 1993

[51] Int. Cl.6: H04R 25/00

[52] U.S. Cl.: 381/68.2; 381/68.4; 395/2.35

[58] Field of Search: 381/68.2, 68, 68.4; 381/60, 26, 74, 94, 46, 47; 395/2.35, 2.12, 2.37, 2.42

[56] References Cited

U.S. PATENT DOCUMENTS

4,628,529   12/1986   Borth et al.
4,630,305   12/1986   Borth et al.
4,868,880    9/1989   Bennett, Jr.
4,887,299   12/1989   Cummins et al.
5,029,217    7/1991   Chabries et al.
5,341,452    8/1994   Hall, II et al.   395/2.35
OTHER PUBLICATIONS

“Multimicrophone Signal-Processing Technique to Remove Reverberation from Speech Signals” by J. Allen et al., vol. 62, No. 4, Oct. 1977, pp. 912-915.

“An Alternative Approach to Linearly Constrained Adaptive Beamforming” by L. J. Griffiths et al., IEEE Transactions, vol. AP-30, No. 1, Jan. 1982, pp. 27-34.

“Speech Enhancement Using A Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator” by Y. Ephraim et al., IEEE Transactions, Dec. 1984, No. 6.

Article entitled “Extension of a Binaural Cross-Correlation Model by Contralateral Inhibition” by W. Lindemann, J. Acoust. Soc. Am. 80(6), Dec. 1986, pp. 1608-1622.

“Multimicrophone Adaptive Beamforming for Interference Reduction In Hearing Aids” by P. Peterson et al., Journal of Rehabilitation Research and Development, vol. 24, No. 4, pp. 103-110.

US005651071A
[11] Patent Number: 5,651,071
[45] Date of Patent: Jul. 22, 1997
`
“Evaluation of Two Voice-Separation Algorithms Using Normal-Hearing and Hearing-Impaired Listeners” by R. Stubbs et al., J. Acoust. Soc., Oct. 1988.

“Improvement of Speech Intelligibility In Noise Development and Evaluation of a New Directional Hearing Instrument Based On Array Technology” by W. Soede, Delft Univ. of Technology.

Article entitled “Evaluation of An Adaptive Beamforming Method for Hearing Aids” by J. Greenberg et al., J. Acoust. Soc. Am. 91(3), Mar. 1992, pp. 1662-1676.

“Digital Signal Processing for Binaural Hearing Aids” by Kollmeier et al., Proceedings International Congress on Acoustics, 1992, Beijing, China.

Article entitled “Cocktail-Party-Processing: Concept and Results” by M. Bodden, Bodden Proceedings, 1992, Beijing, China.

(List continued on next page.)
`
Primary Examiner: Curtis Kuntz
Assistant Examiner: Huyen D. Le
Attorney, Agent, or Firm: Homer L. Knearl; Holland & Hart
[57] ABSTRACT

In this invention, noise in a binaural hearing aid is reduced by analyzing the left and right digital audio signals to produce left and right signal frequency domain vectors and thereafter using digital signal encoding techniques to produce a noise reduction gain vector. The gain vector can then be multiplied against the left and right signal vectors to produce a noise reduced left and right signal vector. The cues used in the digital encoding techniques include directionality, short term amplitude deviation from long term average, and pitch. In addition, a multidimensional gain function based on directionality estimate and amplitude deviation estimate is used that is more effective in noise reduction than simply summing the noise reduction results of directionality alone and amplitude deviations alone. As further features of the invention, the noise reduction is scaled based on pitch-estimates and based on voice detection.

14 Claims, 5 Drawing Sheets
`
[Front-page drawing: block diagram of the noise reduction stage. Legible block labels: WINDOW, FFT, INNER PRODUCT, BAND SMOOTH, BEAM ... GAIN, PITCH GAIN, VOICE DETECT, GAIN ADJUST, DE-EMPHASIS, OUTPUT LEFT, OUTPUT RIGHT.]
`
`RTL345-2_1026-0001
`
`
`
OTHER PUBLICATIONS

“Microphone Array Speech Enhancement In Overdetermined Signal Scenarios” by R. Slyh et al., Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, pp. II-347 to II-350.

“Separation of Speech from Interfering Speech By Means of Harmonic Selection” by T. Parsons, J. Acoust. Soc. Am., vol. 60, No. 4, Oct. 1976, pp. 911-918.

“Suppression of Acoustic Noise In Speech Using Spectral Subtraction” by S. Boll, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-27, No. 2, Apr. 1979, pp. 113-120.
`
`
`
`
[Drawing sheet 1 of 5. Block labels are illegible.]
`
`
`
`
[Drawing sheet 2 of 5, including FIG. 6. Legible labels: INNER PRODUCT; “NOTE: THIS CIRCUIT IS REPEATED FOR EVERY FREQUENCY OF THE FFT”; DOT PRODUCT; SELECT MAXIMUM; MAXIMUM DOT PRODUCT; INTEGRATE TOTAL POWER; PITCH CONFIDENCE; HARMONIC GRID TABLE; SELECT GRID.]
`
`
`
[Drawing sheet 3 of 5: FIGS. 3A and 3B, band smoothing filters 157. Legible input labels: INNER PRODUCT AVERAGE 128 POINT VECTOR; MAG SQ SUM 128 POINT VECTOR. Output bins and filters shown in both panels: 0-7, no smoothing; 8-15, 4 point cosine kernel smoothing filter; 16-23, 6 point; 24-31, 8 point; 32-47, 12 point; 48-72, 20 point; 73-127, 32 point.]
`
`
`
[Drawing sheet 4 of 5, including FIG. 7. Legible labels: INNER PRODUCT; 1-POLE LOWPASS; a frequency-dependent selection (“IF F<1000HZ THEN d4, IF F<2500HZ THEN d2, ELSE d”, preceded by a branch ending “THEN d8” whose condition is illegible); 2D GAIN FUNCTION TABLE; GAIN; “NOTE: THIS CIRCUIT IS REPEATED FOR EVERY FREQUENCY F OF THE FFT”; L+R FREQ VECTOR; TOTAL MAG SQ; ATTACK/RELEASE; LIMIT TO 1; ADJUSTED NOISE REDUCTION GAIN.]
`
`
`
[Drawing sheet 5 of 5: FIG. 5A, “2D GAIN FOR SERIAL CONNECTION”, and FIG. 5B, “2D GENERALIZED GAIN”. Legible axis label on both panels: “SPAD”.]
`
`
`
`NOISE REDUCTION SYSTEM FOR
`BINAURAL HEARING AID
`
`CROSS REFERENCE TO RELATED
`APPLICATIONS
`
The present invention relates to the patent application entitled “Binaural Hearing Aid,” Ser. No. 08/123,499, filed Sep. 17, 1993, which describes the system architecture of a hearing aid that uses the noise reduction system of the present invention.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention:
This invention relates to binaural hearing aids, and more particularly, to a noise reduction system for use in a binaural hearing aid.
`2. Description of Prior Art:
Noise reduction, as applied to hearing aids, means the attenuation of undesired signals and the amplification of desired signals. Desired signals are usually speech that the hearing aid user is trying to understand. Undesired signals can be any sounds in the environment which interfere with the principal speaker. These undesired sounds can be other speakers, restaurant clatter, music, traffic noise, etc. There have been three main areas of research in noise reduction as applied to hearing aids: directional beamforming, spectral subtraction, and pitch-based speech enhancement.
The purpose of beamforming in a hearing aid is to create an illusion of “tunnel hearing” in which the listener hears what he is looking at but does not hear sounds which are coming from other directions. If he looks in the direction of a desired sound (e.g., someone he is speaking to), then other distracting sounds (e.g., other speakers) will be attenuated. A beamformer then separates the desired “on-axis” (line of sight) target signal from the undesired “off-axis” jammer signals so that the target can be amplified while the jammer is attenuated.
Researchers have attempted to use beamforming to improve signal-to-noise ratio for hearing aids for a number of years {References 1, 2, 3, 7, 8, 9}. Three main approaches have been proposed. The simplest approach is to use purely analog delay and sum techniques {2}. A more sophisticated approach uses adaptive FIR filter techniques using algorithms, such as the Griffiths-Jim beamformer {1, 3}. These adaptive filter techniques require digital signal processing and were originally developed in the context of antenna array beamforming for radar applications {5}. Still another approach is motivated from a model of the human binaural hearing system {14, 15}. While the first two approaches are time domain approaches, this last approach is a frequency domain approach.
There have been a number of problems associated with all of these approaches to beamforming. The delay-and-sum and adaptive filter approaches have tended to break down in non-anechoic, reverberant listening situations: any real room will have so many acoustic reflections coming off walls and ceilings that the adaptive filters will be largely unable to distinguish between desired sounds coming from the front and undesired sounds coming from other directions. The delay-and-sum and adaptive filter techniques have also required a large number (>=8) of microphone sensors to be effective. This has made it difficult to incorporate these systems into practical hearing aid packages. One package that has been proposed consists of a microphone array across the top of eyeglasses {2}.
`
The frequency domain approaches which have been proposed {7, 8, 9} have performed better than delay-and-sum or adaptive filter approaches in reverberant listening environments and function with only two microphones. The problems related to the previously-published frequency domain approaches have included unacceptably long input-to-output time delay, distortion of the desired signal, spatial aliasing at high frequencies, and some difficulty in reverberant environments (although less than for the adaptive filter case).
While beamforming uses directionality to separate desired signal from undesired signal, spectral subtraction makes assumptions about the differences in statistics of the undesired signal and the desired signal, and uses these differences to separate and attenuate the undesired signal. The undesired signal is assumed to be lower in amplitude than the desired signal and/or to have a less time-varying spectrum. If the spectrum is static compared to the desired signal (speech), then a long-term estimation of the spectrum will approximate the spectrum of the undesired signal. This spectrum can be attenuated. If the desired speech spectrum is most often greater in amplitude and/or uncorrelated with the undesired spectrum, then it will pass through the system relatively undistorted despite attenuation of the undesired spectrum. Examples of work in spectral subtraction include references {11, 12, 13}.
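
The spectral subtraction idea described above can be illustrated with a minimal sketch. It assumes a per-bin running average as the long-term spectrum estimate and a simple subtractive gain rule; the function name, the smoothing constant `alpha`, and the gain `floor` are illustrative choices and are not taken from the cited references.

```python
def spectral_subtraction_gains(frames, alpha=0.95, floor=0.05):
    """frames: list of magnitude spectra (lists of floats), one per time segment.
    Returns one per-bin gain vector for each frame."""
    noise = list(frames[0])  # long-term average, seeded with the first frame
    gains = []
    for mags in frames:
        # A slow running average approximates the quasi-static (undesired) spectrum.
        noise = [alpha * n + (1.0 - alpha) * m for n, m in zip(noise, mags)]
        # Subtract the estimate; bins that barely exceed it receive the floor gain.
        gains.append([max(floor, (m - n) / m) if m > 0 else floor
                      for m, n in zip(mags, noise)])
    return gains

# Bin 0 holds steady noise; bin 1 is quiet except for a burst in the last frame.
frames = [[1.0, 0.1]] * 20 + [[1.0, 2.0]]
last = spectral_subtraction_gains(frames)[-1]
print(last)
```

In this toy input the steady bin is held at the floor gain while the sudden burst passes nearly unattenuated, which is the behavior the paragraph attributes to a static undesired spectrum and a time-varying desired one.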
Pitch-based speech enhancement algorithms use the pitched nature of voiced speech to attempt to extract a voice which is embedded in noise. A pitch analysis is made on the noisy signal. If a strong pitch is detected, indicating strong voiced speech superimposed on the noise, then the pitch can be used to extract harmonics of the voiced speech, removing most of the uncorrelated noise components. Examples of work in pitch-based enhancement are references {17, 18}.
`
SUMMARY OF THE INVENTION
`
In accordance with this invention, the above problems are solved by analyzing the left and right digital audio signals to produce left and right signal frequency domain vectors and, thereafter, using digital signal encoding techniques to produce a noise reduction gain vector. The gain vector can then be multiplied against the left and right signal vectors to produce a noise reduced left and right signal vector. The cues used in the digital encoding techniques include directionality, short-term amplitude deviation from long-term average, and pitch. In addition, a multidimensional gain function, based on directionality estimate and amplitude deviation estimate, is used that is more effective in noise reduction than simply summing the noise reduction results of directionality alone and amplitude deviations alone. As further features of the invention, the noise reduction is scaled based on pitch-estimates and based on voice detection.
Other advantages and features of the invention will be understood by those of ordinary skill in the art after referring to the complete written description of the preferred embodiments in conjunction with the following drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 illustrates the preferred embodiment of the noise
`reduction system for a binaural hearing aid.
`FIG. 2 shows the details of the inner product operation
`and the sum of magnitudes squared operation referred to in
`FIG. 1.
FIGS. 3A and 3B show the band smoothing filters 157 of band smoothing operation 156 in FIG. 1.
`
`FIG. 4 shows the details of the beam spectral subtract gain
`operation 158 in FIG. 1.
`FIG. 5A is a graph of noise reduction gains as a serial
`function of directionality and spectral subtraction.
`FIG. 5B is a graph of the noise reduction gain as a
`function of directionality estimate and spectral subtraction
`excursion estimate in accordance with the process in FIG. 4.
FIG. 6 shows the details of the pitch-estimate gain operation 180 in FIG. 1.
`FIG. 7 shows the details of the voice detect gain scaling
`operation 208 in FIG. 1.
`
`DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
`Theory of Operation:
In the noise-reduction system described in this invention, all three noise reduction techniques (beamforming, spectral subtraction, and pitch enhancement) are used. Innovations will be described relevant to the individual techniques, especially beamforming. In addition, it will be demonstrated that a synergy exists between these techniques such that the whole is greater than the sum of the parts.
`
`Multidimensional Noise Reduction:
`
We call a multidimensional noise reduction system any system which uses two or more distinct cues generated from signal analysis to attempt to separate desired from undesired signal. In our case, we use three cues: directionality (D), short term amplitude deviation from long term average (STAD), and pitch (f0). Each of these cues has been used separately to design noise reduction systems, but the cooperative use of the cues taken together in a single system has not been done.
To see the interactions between the cues, assume a system which uses D and STAD separately, i.e., the use of D alone as a beamformer and STAD alone as a spectral subtractor. In the case of the beamformer, we estimate D and then specify a gain function of D which is unity for high D and tends to zero for low D. Similarly, for the spectral subtractor we estimate STAD and provide a gain function of STAD which is unity for high STAD and tends to zero for low STAD.
The two noise reduction systems can be connected back to back in serial fashion (e.g., beamformer followed by spectral subtractor). In this case, we can think in terms of a two-dimensional gain function of (D, STAD) with the function having a shape similar to that shown in FIG. 5A. With the serial connection, the gain function in FIG. 5A is rectangular. Values of (D, STAD) inside the rectangle generate a gain near unity which tends toward zero near the boundaries of the rectangle.
If we abandon the notion of a serial connection (beamformer followed by spectral subtractor) and instead think in terms of a general two-dimensional function of (D, STAD), then we can define non-rectangular gain contours, such as that shown in FIG. 5B, Generalized Gain. Here we see that there is more interaction between the D and STAD values. A region which may have been included in the rectangular gain contour is now excluded because we are better able to take into consideration both D and STAD.
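
The difference between the serial (rectangular) gain and a generalized two-dimensional gain can be sketched as follows. This is only an illustration: it assumes D and STAD are normalized to the range 0 to 1, and the ramp thresholds and the diagonal contour in `generalized_gain` are hypothetical shapes, not the contours of FIG. 5A or FIG. 5B.

```python
def ramp(x, lo, hi):
    """Piecewise-linear ramp: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def serial_gain(d, stad):
    # Beamformer followed by spectral subtractor: the product of two
    # one-dimensional gains, so the pass region is a rectangle in (D, STAD).
    return ramp(d, 0.3, 0.5) * ramp(stad, 0.3, 0.5)

def generalized_gain(d, stad):
    # One joint function of both cues: a marginal value of one cue must be
    # offset by a strong value of the other (hypothetical diagonal contour).
    return ramp(d + stad, 1.0, 1.4) * serial_gain(d, stad)

for point in [(0.5, 0.5), (0.5, 0.9), (0.9, 0.9)]:
    print(point, serial_gain(*point), generalized_gain(*point))
```

The point (0.5, 0.5) sits at the corner of the rectangle and passes the serial gain at unity, yet the joint function rejects it because both cues are only marginal. That is the kind of region the text describes as excluded by the generalized gain.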
A common problem in spectral subtraction noise reduction systems is “musical noise.” This is isolated bits of spectrum which manage to rise above the STAD threshold in discrete bursts. This can turn a steady state noise, such as a fan noise, into a fluttering random musical note generator. By using the combination of (D, STAD) we are able to make a better decision about a spectral component by insisting that not only must it rise above the STAD threshold, but it must also be reasonably on-line. There is a continuous give and take between these two parameters.
Including f0, pitch, as a third cue gives rise to a three-dimensional noise reduction system. We found it advantageous to estimate D and STAD in parallel and then use the two parameters in a single two-dimensional function for gain. We do not want to estimate f0 in parallel with D and STAD, though, because we can do a better estimate of f0 if we first noise reduce the signal somewhat using D and STAD. Therefore, based on the partially noise-reduced signal, we estimate f0 and then calculate the final gain using D, STAD, and f0 in a general three-dimensional function, or we can use f0 to adjust the gain produced from D, STAD estimates. When f0 is included, we see that not only is the system more efficient because we can use arbitrary gain functions of three parameters, but also the presence of a first stage of noise reduction makes the subsequent f0 estimation more robust than it would be in an f0-only based system.
The D estimate is based on values of phase angle and magnitude for the current input segment. The STAD estimate is based on the sum of magnitudes over many past segments. A more general approach would make a single unified estimate based on current and past values of both phase angle and magnitude. More information would be used, the function would be more general, and so a better result would be had.
`
`Frequency Domain Beamforming:
A frequency domain beamformer is a kind of analysis/synthesis system. The incoming signals are analyzed by transforming to the frequency (or frequency-like) domain. Operations are carried out on the signals in the frequency domain, and then the signals are resynthesized by transforming them back to the time domain. In the case of two microphone beamformers, the two signals are the left and right ear signals. Once transformed to the frequency domain, a directionality estimate can be made at each frequency point by comparing left and right values at each frequency. The directionality estimate is then used to generate a gain which is applied to the corresponding left and right frequency points, and then the signals are resynthesized.
There are several key issues involved in the design of the basic analysis/synthesis system. In general, the analysis/synthesis system will treat the incoming signals as consecutive (possibly time overlapped) time segments of N sample points. Each N sample point segment will be transformed to produce a fixed length block of frequency domain coefficients. An optimum transform concentrates the most signal power in the smallest percentage of frequency domain coefficients. Optimum and near optimum transforms have been widely studied in signal coding applications {reference 19} where the desire is to transmit a signal using the fewest coefficients to achieve the lowest data rate. If most of the signal power is concentrated in a few coefficients, then only those coefficients need to be coded with high accuracy, and the others can be crudely coded or not coded at all.
The optimum transform is also extremely important for the beamformer. Assume that a signal consists of desired signal plus undesired noise signal. When the signal is transformed, some of the frequency domain coefficients will correspond largely to desired signal, some to undesired signal, and some to both. For the frequency coefficients with substantial contributions from both desired signal and noise, it is difficult to determine an appropriate gain. For frequency coefficients corresponding largely to desired signals, the gain is near unity. For frequency coefficients corresponding largely to noise, the gain is near zero. For dynamic signals, such as speech, the distribution of energy across frequency coefficients from input segment to input segment can be regarded as random except for possibly a long-term global spectral envelope. Two signals, desired signal and noise, generate two random distributions across frequency coefficients. The value of a particular frequency coefficient is the sum of the contribution from both signals. Since the total number of frequency coefficients is fixed, the probability of two signals making substantial contributions to the same frequency coefficient increases as the number of frequency coefficients with substantial energy used to code each signal increases. Therefore, an optimum transform, which concentrates energy in the smallest percentage of the total coefficients, will result in the smallest probability of overlap between coefficients of the desired signal and noise signal. This, in turn, results in the highest probability of correct answers in the beamformer gain estimation.
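
The overlap argument above can be made concrete with a small calculation. The sketch below assumes a simplified model that is not stated in the text: each signal places substantial energy in a uniformly random subset of the coefficients, independently of the other signal. Under that assumption the expected number of shared coefficients is the product of the two subset sizes divided by the total. The function name and the example sizes are illustrative.

```python
from fractions import Fraction

def expected_overlap(total, k_signal, k_noise):
    """Expected number of coefficients carrying substantial energy from both
    signals, assuming each occupies an independent uniformly random subset."""
    return Fraction(k_signal * k_noise, total)

spread = expected_overlap(128, 64, 64)        # energy spread over half the coefficients
concentrated = expected_overlap(128, 16, 16)  # energy concentrated in one eighth of them
print(spread, concentrated)
```

With 128 coefficients, spreading each signal over 64 of them gives an expected 32 contested coefficients, while concentrating each in 16 gives an expected 2, which is why energy concentration eases the gain decision.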
A different view of the analysis/synthesis system is as a multiband filter bank {20}. In this case, each frequency coefficient, as it varies in time from input segment to input segment, is seen as the output of a bandpass filter. There are as many bandpass filters, adjacent in frequency, as there are frequency coefficients. To achieve high energy concentration in frequency coefficients we want sharp transition bands between bandpass filters. For speech signals, optimum transforms correspond to filter banks with relatively sharp transition bands to minimize overlap between bands.
In general, to achieve good discrimination between desired signal and noise, we want many frequency coefficients (or many bands of filtering) with energy concentrated in as few coefficients as possible (sharp transition bands between bandpass filters). Unfortunately, this kind of high frequency resolution implies large input sample segments which, in turn, implies long input to output delays in the system. In a hearing aid application, time delay through the system is an important parameter to optimize. If the time delay from input to output becomes too large (e.g., greater than about 40 ms), the lips of speakers are no longer synchronized with sound. It also becomes difficult to speak since the sound of one's own voice is not synchronized with muscle movements. The impression is unnatural and fatiguing. A compromise must be made between input-output delay and frequency resolution. A good choice of analysis/synthesis architecture can ease the constraints on this compromise.
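
The compromise between delay and frequency resolution can be put in numbers. The sketch below assumes a sample rate of 11.025 kHz, the nominal rate given in the preferred embodiment, and treats the segment duration as a lower bound on the delay contributed by buffering one segment; the function name is illustrative.

```python
def segment_tradeoff(n_samples, fs_hz):
    """Return (segment duration in ms, bin spacing in Hz) for an N-point segment."""
    duration_ms = 1000.0 * n_samples / fs_hz
    bin_spacing_hz = fs_hz / n_samples
    return duration_ms, bin_spacing_hz

for n in (128, 256, 512, 1024):
    ms, hz = segment_tradeoff(n, 11025.0)
    print(f"N={n:5d}: segment {ms:6.1f} ms, bin spacing {hz:6.1f} Hz")
```

At this rate a 256-point segment lasts about 23 ms with bins about 43 Hz apart, while a 512-point segment already exceeds the roughly 40 ms limit mentioned above before any compute time is added.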
Another important consideration in the design of analysis/synthesis systems is edge effects. These are discontinuities that occur between adjacent output segments. These edge effects can be due to the circular convolution nature of Fourier transforms and inverse transforms, or they can be due to abrupt changes in frequency domain filtering (noise reduction gain, for example) from one segment to the next. Edge effects can sound like fluttering at the input segment rate. A well-designed analysis/synthesis system will eliminate these edge effects or reduce them to the point where they are inaudible.
The theoretical optimum transform for a signal of known statistics is the Karhunen-Loeve Transform or KLT {19}. The KLT does not generally lend itself to practical implementation, but serves as a basis for measuring the effectiveness of other transforms. It has been shown that, for speech signals, various transforms approach the KLT in effectiveness. These include the DCT {19} and ELT {21}. A large body of literature also exists for designing efficient filter banks {22, 23}. This literature also proposes techniques for eliminating or reducing edge effects.
One common design for analysis/synthesis systems is based on a technique called overlap-add {16}. In the overlap-add scheme, the incoming time domain signals are segmented into N point non-overlapping, adjacent time segments. Each N point segment is “padded” with an additional L zero values. Then each N+L point “augmented” segment is transformed using the FFT. A frequency domain gain, which can be viewed as the FFT of another N+L point sequence consisting of an M point time domain finite impulse response padded with N+L-M zeros, is multiplied with the transformed “augmented” input segment, and the product is inverse transformed to generate an N+L point time domain sequence. As long as M<L, then the resulting N+L point time domain sequence will have no circular convolution components. Since an N+L point segment is generated for each incoming N point segment, the resulting segments will overlap in time. If the overlapping regions of consecutive segments are summed, then the result is equivalent to a linear convolution of the input signal with the gain impulse response.
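
The overlap-add procedure above can be checked numerically. The following sketch uses a plain O(n^2) DFT from the standard library in place of an FFT so that it is self-contained; the function names, the block sizes (N=4, L=4, M=3), and the test signal are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (stands in for the FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def overlap_add(x, h, n_block, n_pad):
    size = n_block + n_pad
    gain = dft(list(h) + [0.0] * (size - len(h)))       # transform of zero-padded impulse response
    y = [0.0] * (len(x) + n_pad)
    for start in range(0, len(x), n_block):
        seg = list(x[start:start + n_block])
        seg += [0.0] * (size - len(seg))                 # pad the N point segment with L zeros
        spec = [a * b for a, b in zip(dft(seg), gain)]   # apply the frequency domain gain
        out = dft(spec, inverse=True)
        for i, v in enumerate(out):                      # sum the overlapping regions
            if start + i < len(y):
                y[start + i] += v.real
    return y

def direct_convolution(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 1.0, 4.0, -1.5, 2.5, 0.25]
h = [0.5, 0.25, -0.125]
y_ola = overlap_add(x, h, n_block=4, n_pad=4)
y_ref = direct_convolution(x, h)
print(max(abs(a - b) for a, b in zip(y_ola, y_ref)))
```

With M=3 and L=4 the condition M<L holds, so the summed segments agree with the direct linear convolution to within rounding error.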
There are a number of problems associated with the overlap-add scheme. Viewed from the point of view of filter bank analysis, an overlap-add scheme uses bandpass filters whose frequency response is the transform of a rectangular window. This results in a poor quality bandpass response with considerable leakage between bands so the coefficient energy concentration is poor. While an overlap-add scheme will guarantee smooth reconstruction in the case of convolution with a stationary finite impulse response of constrained length, when the impulse response is changing every block time, as is the case when we generate adaptive gains for a beamformer, then discontinuities will be generated in the output. It is as if we were to abruptly change all the coefficients in an FIR filter every block time. In an overlap-add system, the input to output minimum delay is:
`
Where:
N=input segment length.
Z=number of zeros added to each block for zero padding.
A minimum value for Z is N, but this can easily be greater if the gain function is not sufficiently smooth over frequency. The frequency resolution of this system is N/2 frequency bins given conjugate symmetry of the transforms of the real input signal, and the fact that zero padding results in an interpolation of the frequency points with no new information added.
In the system design described in the preferred embodiments section of this patent, we use a windowed analysis/synthesis architecture. In a windowed FFT analysis/synthesis system, the input and output time domain sample segments are multiplied by a window function which in the preferred embodiment is a sine window for both the input and output segments. The frequency response of the bandpass filters (the transform of the sine window) is more sharply bandpass than in the case of the rectangular windows of the overlap-add scheme so there is better coefficient energy concentration. The presence of the synthesis window results in an effective interpolation of the adaptive gain coefficients from one segment to the next and so reduces edge effects. The input to output delay for a windowed system is:

Dwindow = 1 * N + (compute time for N FFT)

Where:
N=input segment length.
It is clear that the sine windowed system is preferable to the overlap-add system from the point of view of coefficient energy concentration, output smoothness, and input-output delay. Other analysis/synthesis architectures, such as ELT, Paraunitary Filter Banks, QMF Filter Banks, Wavelets, and DCT, should provide similar performance in terms of input-output delay but can be superior to the sine window architecture in terms of energy concentration and reduction of edge effects.
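
The claim about coefficient energy concentration can be illustrated by measuring spectral leakage for a tone that falls between bin centers. The sketch below is an assumption-laden example: the sine window is taken to be sin(pi*(t+0.5)/N), the segment length is shortened to 64 points to keep the naive DFT fast, and leakage is defined here as the fraction of energy more than three bins from the spectral peak.

```python
import cmath
import math

def leakage(window, tone_bin, keep=3):
    """Fraction of spectral energy falling more than `keep` bins from the peak."""
    n = len(window)
    x = [w * math.cos(2 * math.pi * tone_bin * t / n) for t, w in enumerate(window)]
    power = [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) ** 2
             for k in range(n // 2 + 1)]
    peak = power.index(max(power))
    near = sum(p for k, p in enumerate(power) if abs(k - peak) <= keep)
    return 1.0 - near / sum(power)

N = 64
rectangular = [1.0] * N
sine = [math.sin(math.pi * (t + 0.5) / N) for t in range(N)]
leak_rect = leakage(rectangular, tone_bin=10.3)
leak_sine = leakage(sine, tone_bin=10.3)
print(f"rectangular: {leak_rect:.4f}  sine: {leak_sine:.4f}")
```

The sine window leaves a smaller share of the tone's energy in distant bins than the rectangular window does, which is the sense in which its bandpass response is sharper.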
`
`Preferred Embodiment:
`
signal is processed in 256 point consecutive sample blocks. After each block is processed, the block origin is advanced by 128 points. So, if the first block spans samples 0..255 of both the left and right channels, then the second block spans samples 128..383, the third spans samples 256..511, etc. The processing of each consecutive block is identical.
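
The block schedule above, combined with a sine window on both input and output, can be sketched as follows. The window formula sin(pi*(t+0.5)/256) is an assumption, and the FFT, gain, and inverse FFT are replaced by a unity gain so that only the framing and windowing are shown; the function names are illustrative.

```python
import math

BLOCK, HOP = 256, 128
WINDOW = [math.sin(math.pi * (t + 0.5) / BLOCK) for t in range(BLOCK)]

def block_spans(n_samples):
    """(first, last) sample index of each 256 point block, advancing by 128."""
    return [(s, s + BLOCK - 1) for s in range(0, n_samples - BLOCK + 1, HOP)]

def analyze_synthesize(x):
    """Window each block on input and on output, then sum the overlapping blocks."""
    y = [0.0] * len(x)
    for first, _ in block_spans(len(x)):
        for t in range(BLOCK):
            # Input window, then output window; the FFT, the noise reduction
            # gain, and the inverse FFT would sit between the two multiplications.
            y[first + t] += x[first + t] * WINDOW[t] * WINDOW[t]
    return y

x = [math.sin(0.05 * t) + 0.3 * math.cos(0.4 * t) for t in range(1024)]
y = analyze_synthesize(x)
spans = block_spans(1024)
print(spans[:3])
```

Because the squared sine window and its copy shifted by 128 samples sum to one, every sample covered by two blocks is reconstructed exactly when the gain is unity; only the first and last 128 samples, which are covered by a single block, are tapered.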
The noise reduction processing begins by multiplying the left and right 256 point sample blocks by a sine window in operations 148, 149. A fast Fourier transform (FFT) operation 150, 151 is then performed on the left and right blocks. Since the signals are real, this yields a 128 point complex frequency vector for both the left and right audio channels. The elements of the complex frequency vectors will be referred to as bin values. So there are 128 frequency bins from F=0 (DC) to F=FSAMP/2 kHz.
The inner product of, and the sum of magnitude squares of, each frequency bin for the left and right channel complex frequency vectors are calculated by operations 152 and 154, respectively. The expression for the inner product is:

Inner Product(k) = Real(Left(k))*Real(Right(k)) + Imag(Left(k))*Imag(Right(k))

and is implemented as shown in FIG. 2. The operation flow in FIG. 2 is repeated for each frequency bin. On the same FIG. 2, the sum of magnitude squares is calculated as:

Magnitude Squared Sum(k) = Real(Left(k))^2 + Real(Right(k))^2 + Imag(Left(k))^2 + Imag(Right(k))^2
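
A minimal sketch of the two per-bin quantities follows. It assumes one plausible reading of the text: each complex bin value is treated as a two-dimensional vector of its real and imaginary parts. The function names and the three example bins are illustrative.

```python
def inner_product(left, right):
    """Per-bin inner product of the left and right complex frequency vectors."""
    return [l.real * r.real + l.imag * r.imag for l, r in zip(left, right)]

def magnitude_squared_sum(left, right):
    """Per-bin sum of the squared magnitudes of the left and right bin values."""
    return [l.real ** 2 + l.imag ** 2 + r.real ** 2 + r.imag ** 2
            for l, r in zip(left, right)]

# Three bins: identical left/right, opposite phase, and 90 degrees apart.
left = [1 + 1j, 2 + 0j, 0 + 1j]
right = [1 + 1j, -2 + 0j, 1 + 0j]
ip = inner_product(left, right)
ms = magnitude_squared_sum(left, right)
print(ip, ms)
```

For identical left and right bins the inner product equals half the magnitude squared sum; for opposite-phase bins it is the negative of that; for bins in quadrature it is zero. The comparison is given only to show how the two quantities relate and is not presented as the directionality estimate of the invention.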
`
In FIG. 1, the noise reduction stage, which is implemented as a DSP software program, is shown as an operations flow diagram. The left and right ear microphone signals have been digitized at the system sample rate, which is generally adjustable in a range from FSAMP = 8-48 kHz but has a nominal value of FSAMP = 11.025 kHz. The left and right audio signals have little, or no, phase or magnitude distortion. A hearing aid system for providing such low distortion left and right audio signals is described in the above-identified cross-referenced patent application entitled “Binaural Hearing Aid.” The time domain digital input signal from each ear is passed to one-zero pre-emphasis filters 139, 141. Pre-emphasis of the left and right ear signals using a simple one-zero high-pass differentiator pre-whitens the signals before they are transformed to the frequency domain. This results in reduced variance between frequency coefficients so that there are fewer problems with numerical error in the Fourier transformation process. The effects of the pre-emphasis filters 139, 141 are removed after inverse Fourier transformation by using one-pole integrator de-emphasis filters 242 and 244 on the left and right signals at the end of noise reduction processing. Of course, if binaural compression follows the noise reduction stage of processing, the inverse transformation and de-emphasis would be at the end of binaural compression.
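
The pairing of a one-zero pre-emphasis filter with a one-pole de-emphasis filter can be sketched as below. The coefficient value 0.95 is an assumption chosen for illustration, since the text does not state one, and the function names are hypothetical.

```python
def pre_emphasis(x, a=0.95):
    """One-zero high-pass: y[n] = x[n] - a * x[n-1]."""
    prev, out = 0.0, []
    for v in x:
        out.append(v - a * prev)
        prev = v
    return out

def de_emphasis(y, a=0.95):
    """One-pole integrator: z[n] = y[n] + a * z[n-1], the inverse of pre_emphasis."""
    prev, out = 0.0, []
    for v in y:
        prev = v + a * prev
        out.append(prev)
    return out

x = [0.0, 1.0, 0.5, -0.25, 2.0, 2.0, 2.0, -1.0]
restored = de_emphasis(pre_emphasis(x))
print(restored)
```

With the same coefficient in both filters the de-emphasis stage undoes the pre-emphasis stage exactly (up to rounding), so whatever happens between them, such as the noise reduction gain, is the only net change to the signal.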
In FIG. 1, after pre-emphasis, if used, the left and right time domain audio signals are passed through allpass filters 144, 145 to gain multipliers 146, 147. The allpass filter serves as a variable delay. The combination of variable delay and gain allows the direction of the beam in beamforming to be steered to any angle if desired. Thus, the on-axis direction of beamforming may be steered to something other than straight in front of the user, or may be tuned to compensate for microphone or other mechanical mismatches.
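
How an allpass filter can act as a variable delay is sketched below. The text does not give the order or structure of filters 144, 145, so a first-order allpass H(z) = (a + z^-1) / (1 + a z^-1) is assumed here purely for illustration; its low-frequency delay is (1 - a) / (1 + a) samples. The function names and the 0.4 sample target are hypothetical.

```python
import cmath

def allpass_coefficient(delay_samples):
    """Coefficient of a first-order allpass whose low-frequency delay is delay_samples."""
    return (1.0 - delay_samples) / (1.0 + delay_samples)

def allpass_response(a, omega):
    """Frequency response of H(z) = (a + z^-1) / (1 + a z^-1) at omega rad/sample."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1.0 + a * z_inv)

a = allpass_coefficient(0.4)        # hypothetical 0.4 sample steering delay
h = allpass_response(a, 0.01)
delay = -cmath.phase(h) / 0.01      # phase delay in samples at a low frequency
print(abs(h), delay)
```

The magnitude stays at one at every frequency, so the filter changes only the relative timing between the two ears, and the separate gain multiplier handles level differences.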
At times, it may be desirable to provide maximum gain for signals appearing to be off-axis, as determined from analysis of left and right ear signals. This may be necessary to calibrate a system which has imbalances in the left and right audio chain, such as imbalances between the two microphones. It may also be desirable to focus a beam in another direction than straight ahead. This may be true when a listener is riding in a car and wants to liste