United States Patent [19]
Lindemann et al.

[54] NOISE REDUCTION SYSTEM FOR BINAURAL HEARING AID

[75] Inventors: Eric Lindemann; John Laurence Melanson, both of Boulder, Colo.
[73] Assignee: AudioLogic, Inc., Boulder, Colo.

[21] Appl. No.: 123,503
[22] Filed: Sep. 17, 1993
[51] Int. Cl. ... H04R 25/00
[52] U.S. Cl. ... 381/68.2; 381/68.4; 395/2.35
[58] Field of Search ... 381/68.2, 68, 68.4, 381/60, 26, 74, 94, 46, 47; 395/2.35, 2.12, 2.37, 2.42
`
[56] References Cited

U.S. PATENT DOCUMENTS
4,628,529  12/1986  Borth et al.
4,630,305  12/1986  Borth et al.
4,868,880   9/1989  Bennett, Jr.
4,887,299  12/1989  Cummins et al.
5,029,217   7/1991  Chabries et al.
5,341,452   8/1994  Hall, II et al. ... 395/2.35

OTHER PUBLICATIONS
"Multimicrophone Signal-Processing Technique to Remove Reverberation from Speech Signals" by J. Allen et al., vol. 62, No. 4, Oct. 1977, pp. 912-915.
"An Alternative Approach to Linearly Constrained Adaptive Beamforming" by L. J. Griffiths et al., IEEE Transactions, vol. AP-30, No. 1, Jan. 1982, pp. 27-34.
"Speech Enhancement Using A Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator" by Y. Ephraim et al., IEEE Transactions, Dec. 1984, No. 6.
"Extension of a Binaural Cross-Correlation Model by Contralateral Inhibition" by W. Lindemann, J. Acoust. Soc. Am. 80(6), Dec. 1986, pp. 1608-1622.
"Multimicrophone Adaptive Beamforming for Interference Reduction in Hearing Aids" by P. Peterson et al., Journal of Rehabilitation Research and Development, vol. 24, No. 4, pp. 103-110.
`
US005651071A
[11] Patent Number: 5,651,071
[45] Date of Patent: Jul. 22, 1997
`
"Evaluation of Two Voice-Separation Algorithms Using Normal-Hearing and Hearing-Impaired Listeners" by R. Stubbs et al., J. Acoust. Soc., Oct. 1988.
"Improvement of Speech Intelligibility In Noise: Development and Evaluation of a New Directional Hearing Instrument Based On Array Technology" by W. Soede, Delft Univ. of Technology.
"Evaluation of An Adaptive Beamforming Method for Hearing Aids" by J. Greenberg et al., J. Acoust. Soc. Am. 91(3), Mar. 1992, pp. 1662-1676.
"Digital Signal Processing for Binaural Hearing Aids" by Kollmeier et al., Proceedings International Congress on Acoustics, 1992, Beijing, China.
"Cocktail-Party-Processing: Concept and Results" by M. Bodden, Bodden Proceedings, 1992, Beijing, China.
(List continued on next page.)
`
Primary Examiner: Curtis Kuntz
Assistant Examiner: Huyen D. Le
Attorney, Agent, or Firm: Homer L. Knearl; Holland & Hart

[57] ABSTRACT

In this invention noise in a binaural hearing aid is reduced by analyzing the left and right digital audio signals to produce left and right signal frequency domain vectors and thereafter using digital signal encoding techniques to produce a noise reduction gain vector. The gain vector can then be multiplied against the left and right signal vectors to produce a noise reduced left and right signal vector. The cues used in the digital encoding techniques include directionality, short term amplitude deviation from long term average, and pitch. In addition, a multidimensional gain function based on directionality estimate and amplitude deviation estimate is used that is more effective in noise reduction than simply summing the noise reduction results of directionality alone and amplitude deviations alone. As further features of the invention, the noise reduction is scaled based on pitch-estimates and based on voice detection.
`
14 Claims, 5 Drawing Sheets

[Front-page drawing, an excerpt of FIG. 1. Legible block labels: LEFT IN, RIGHT IN, PRE-EMPHASIS, STEERING ALLPASS, STEERING GAIN, WINDOW, PITCH GAIN, GAIN ADJUST.]
`
Amazon v. Jawbone
U.S. Patent 11,122,357
Amazon Ex. 1007
OTHER PUBLICATIONS

"Microphone Array Speech Enhancement In Overdetermined Signal Scenarios" by R. Slyh et al., Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, II-347-II-350.
"Separation of Speech from Interfering Speech by Means of Harmonic Selection" by T. Parsons, J. Acoust. Soc. Am., vol. 60, No. 4, Oct. 1976, pp. 911-918.
"Suppression of Acoustic Noise In Speech Using Spectral Subtraction" by S. Boll, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-27, No. 2, Apr. 1979, pp. 113-120.
U.S. Patent    Jul. 22, 1997    Sheet 1 of 5    5,651,071

[Sheet 1 of 5: rotated block diagram, presumably FIG. 1. Apart from the word EMPHASIS, no labels are recoverable from the extracted text.]

U.S. Patent    Jul. 22, 1997    Sheet 2 of 5    5,651,071

[FIG. 2: circuit diagram. Legible note: THIS CIRCUIT IS REPEATED FOR EVERY FREQUENCY F OF THE FFT.]

[FIG. 6: legible block labels: INTEGRATE TOTAL POWER, PITCH CONFIDENCE, MAXIMUM DOT PRODUCT, HARMONIC GRID TABLE, SELECT GRID. Legible reference numerals: 186, 190, 192.]

U.S. Patent    Jul. 22, 1997    Sheet 3 of 5    5,651,071

[FIGS. 3A and 3B: band smoothing filters 157. Legible vector labels: INNER PRODUCT 128 POINT VECTOR, MAG SQ SUM 128 POINT VECTOR, INNER PRODUCT AVERAGE 128 POINT VECTOR, MAG SQ AVERAGE 128 POINT VECTOR. Both figures list the same filters, each flanked by two bin ranges:]

Left range    Filter                                      Right range
(illegible)   NO SMOOTHING                                0-7
6-17          4 POINT COSINE KERNEL SMOOTHING FILTER      8-15
13-26         6 POINT COSINE KERNEL SMOOTHING FILTER      16-23
20-35         8 POINT COSINE KERNEL SMOOTHING FILTER      24-31
26-53         12 POINT COSINE KERNEL SMOOTHING FILTER     32-47
38-82         20 POINT COSINE KERNEL SMOOTHING FILTER     48-72
57-127        32 POINT COSINE KERNEL SMOOTHING FILTER     73-127

U.S. Patent    Jul. 22, 1997    Sheet 4 of 5    5,651,071

[FIG. 4: legible block labels: ONE-POLE LOWPASS; a frequency-dependent selection of the form IF F < ... Hz THEN d ... ELSE d (thresholds only partly legible, one of them 2500 Hz); 2D GAIN FUNCTION TABLE; LONG TERM AVERAGE ONE-POLE LOWPASS. Legible note: THIS CIRCUIT IS REPEATED FOR EVERY FREQUENCY F OF THE FFT.]

[Further labels on this sheet: ATTACK/RELEASE TWO POLE, ADJUSTED NOISE REDUCTION GAIN. Legible reference numeral: 216.]

U.S. Patent    Jul. 22, 1997    Sheet 5 of 5    5,651,071

[FIG. 5A: 2D gain for serial connection.]

[FIG. 5B: 2D generalized gain.]

NOISE REDUCTION SYSTEM FOR BINAURAL HEARING AID

CROSS REFERENCE TO RELATED APPLICATIONS

The present invention relates to patent application entitled "Binaural Hearing Aid," Ser. No. 08/123,499, filed Sep. 17, 1993, which describes the system architecture of a hearing aid that uses the noise reduction system of the present invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention:

This invention relates to binaural hearing aids, and more particularly, to a noise reduction system for use in a binaural hearing aid.

2. Description of Prior Art:

Noise reduction, as applied to hearing aids, means the attenuation of undesired signals and the amplification of desired signals. Desired signals are usually speech that the hearing aid user is trying to understand. Undesired signals can be any sounds in the environment which interfere with the principal speaker. These undesired sounds can be other speakers, restaurant clatter, music, traffic noise, etc. There have been three main areas of research in noise reduction as applied to hearing aids: directional beamforming, spectral subtraction, and pitch-based speech enhancement.

The purpose of beamforming in a hearing aid is to create an illusion of "tunnel hearing" in which the listener hears what he is looking at but does not hear sounds which are coming from other directions. If he looks in the direction of a desired sound (e.g., someone he is speaking to), then other distracting sounds (e.g., other speakers) will be attenuated. A beamformer then separates the desired "on-axis" (line of sight) target signal from the undesired "off-axis" jammer signals so that the target can be amplified while the jammer is attenuated.

Researchers have attempted to use beamforming to improve signal-to-noise ratio for hearing aids for a number of years {References 1, 2, 3, 7, 8, 9}. Three main approaches have been proposed. The simplest approach is to use purely analog delay-and-sum techniques {2}. A more sophisticated approach uses adaptive FIR filter techniques using algorithms, such as the Griffiths-Jim beamformer {1, 3}. These adaptive filter techniques require digital signal processing and were originally developed in the context of antenna array beamforming for radar applications {5}. Still another approach is motivated from a model of the human binaural hearing system {14, 15}. While the first two approaches are time domain approaches, this last approach is a frequency domain approach.

There have been a number of problems associated with all of these approaches to beamforming. The delay-and-sum and adaptive filter approaches have tended to break down in non-anechoic, reverberant listening situations: any real room will have so many acoustic reflections coming off walls and ceilings that the adaptive filters will be largely unable to distinguish between desired sounds coming from the front and undesired sounds coming from other directions. The delay-and-sum and adaptive filter techniques have also required a large (>=8) number of microphone sensors to be effective. This has made it difficult to incorporate these systems into practical hearing aid packages. One package that has been proposed consists of a microphone array across the top of eyeglasses {2}.

The frequency domain approaches which have been proposed {7, 8, 9} have performed better than delay-and-sum or adaptive filter approaches in reverberant listening environments and function with only two microphones. The problems related to the previously-published frequency domain approaches have included unacceptably long input-to-output time delay, distortion of the desired signal, spatial aliasing at high frequencies, and some difficulty in reverberant environments (although less than for the adaptive filter case).

While beamforming uses directionality to separate desired signal from undesired signal, spectral subtraction makes assumptions about the differences in statistics of the undesired signal and the desired signal, and uses these differences to separate and attenuate the undesired signal. The undesired signal is assumed to be lower in amplitude than the desired signal and/or to have a less time-varying spectrum. If the spectrum is static compared to the desired signal (speech), then a long-term estimation of the spectrum will approximate the spectrum of the undesired signal. This spectrum can be attenuated. If the desired speech spectrum is most often greater in amplitude and/or uncorrelated with the undesired spectrum, then it will pass through the system relatively undistorted despite attenuation of the undesired spectrum. Examples of work in spectral subtraction include references {11, 12, 13}.

Pitch-based speech enhancement algorithms use the pitched nature of voiced speech to attempt to extract a voice which is embedded in noise. A pitch analysis is made on the noisy signal. If a strong pitch is detected, indicating strong voiced speech superimposed on the noise, then the pitch can be used to extract harmonics of the voiced speech, removing most of the uncorrelated noise components. Examples of work in pitch-based enhancement are references {17, 18}.

SUMMARY OF THE INVENTION

In accordance with this invention, the above problems are solved by analyzing the left and right digital audio signals to produce left and right signal frequency domain vectors and, thereafter, using digital signal encoding techniques to produce a noise reduction gain vector. The gain vector can then be multiplied against the left and right signal vectors to produce a noise reduced left and right signal vector. The cues used in the digital encoding techniques include directionality, short-term amplitude deviation from long-term average, and pitch. In addition, a multidimensional gain function, based on directionality estimate and amplitude deviation estimate, is used that is more effective in noise reduction than simply summing the noise reduction results of directionality alone and amplitude deviations alone. As further features of the invention, the noise reduction is scaled based on pitch-estimates and based on voice detection.

Other advantages and features of the invention will be understood by those of ordinary skill in the art after referring to the complete written description of the preferred embodiments in conjunction with the following drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the preferred embodiment of the noise reduction system for a binaural hearing aid.

FIG. 2 shows the details of the inner product operation and the sum of magnitudes squared operation referred to in FIG. 1.

FIGS. 3A and 3B show the band smoothing filters 157 of band smoothing operation 156 in FIG. 1.

FIG. 4 shows the details of the beam spectral subtract gain operation 158 in FIG. 1.

FIG. 5A is a graph of noise reduction gains as a serial function of directionality and spectral subtraction.

FIG. 5B is a graph of the noise reduction gain as a function of directionality estimate and spectral subtraction excursion estimate in accordance with the process in FIG. 4.

FIG. 6 shows the details of the pitch-estimate gain operation 180 in FIG. 1.

FIG. 7 shows the details of the voice detect gain scaling operation 208 in FIG. 1.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Theory of Operation:

In the noise-reduction system described in this invention, all three noise reduction techniques, beamforming, spectral subtraction and pitch enhancement, are used. Innovations will be described relevant to the individual techniques, especially beamforming. In addition, it will be demonstrated that a synergy exists between these techniques such that the whole is greater than the sum of the parts.

Multidimensional Noise Reduction:

We call a multidimensional noise reduction system any system which uses two or more distinct cues generated from signal analysis to attempt to separate desired from undesired signal. In our case, we use three cues: directionality (D), short term amplitude deviation from long term average (STAD), and pitch (f0). Each of these cues has been used separately to design noise reduction systems, but the cooperative use of the cues taken together in a single system has not been done.

To see the interactions between the cues, assume a system which uses D and STAD separately, i.e., the use of D alone as a beamformer and STAD alone as a spectral subtractor. In the case of the beamformer, we estimate D and then specify a gain function of D which is unity for high D and tends to zero for low D. Similarly, for the spectral subtractor we estimate STAD and provide a gain function of STAD which is unity for high STAD and tends to zero for low STAD.

The two noise reduction systems can be connected back to back in serial fashion (e.g., beamformer followed by spectral subtractor). In this case, we can think in terms of a two-dimensional gain function of (D, STAD) with the function having a shape similar to that shown in FIG. 5A. With the serial connection, the gain function in FIG. 5A is rectangular. Values of (D, STAD) inside the rectangle generate a gain near unity which tends toward zero near the boundaries of the rectangle.

If we abandon the notion of a serial connection (beamformer followed by spectral subtractor) and instead think in terms of a general two-dimensional function of (D, STAD), then we can define non-rectangular gain contours, such as that shown in FIG. 5B, Generalized Gain. Here we see that there is more interaction between the D and STAD values. A region which may have been included in the rectangular gain contour is now excluded because we are better able to take into consideration both D and STAD.
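
The difference between the serial (rectangular) gain and a generalized two-dimensional gain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: D and STAD are taken to be normalized to the range 0 to 1, and the helper `ramp`, the thresholds, and the particular non-rectangular contour (a cut on the sum D + STAD) are hypothetical choices, not values from the text.

```python
def ramp(x, lo, hi):
    """Soft threshold (hypothetical shape): 0 below lo, 1 above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def serial_gain(d, stad):
    # Beamformer followed by spectral subtractor: the two one-dimensional
    # gains multiply, so the near-unity region is a rectangle in (D, STAD).
    return ramp(d, 0.3, 0.5) * ramp(stad, 0.3, 0.5)


def generalized_gain(d, stad):
    # Assumed non-rectangular contour: in addition to the individual
    # thresholds, the two cues must be jointly strong enough, which cuts
    # off the corner of the rectangle where both are only marginal.
    return min(ramp(d, 0.3, 0.5), ramp(stad, 0.3, 0.5), ramp(d + stad, 1.0, 1.4))


# A bin that is only marginally directional and only marginally above the
# long-term average passes the serial system but is attenuated here.
marginal_serial = serial_gain(0.55, 0.55)
marginal_general = generalized_gain(0.55, 0.55)
```

In this sketch a strong value of one cue can partly compensate for a weaker value of the other, which is the kind of interaction between D and STAD that a rectangular contour cannot express.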

A common problem in spectral subtraction noise reduction systems is "musical noise". This is isolated bits of spectrum which manage to rise above the STAD threshold in discrete bursts. This can turn a steady state noise, such as a fan noise, into a fluttering random musical note generator. By using the combination of (D, STAD) we are able to make a better decision about a spectral component by insisting that not only must it rise above the STAD threshold, but it must also be reasonably on-line. There is a continuous give and take between these two parameters.

Including f0, pitch, as a third cue gives rise to a three-dimensional noise reduction system. We found it advantageous to estimate D and STAD in parallel and then use the two parameters in a single two-dimensional function for gain. We do not want to estimate f0 in parallel with D and STAD, though, because we can do a better estimate of f0 if we first noise reduce the signal somewhat using D and STAD. Therefore, based on the partially noise-reduced signal, we estimate f0 and then calculate the final gain using D, STAD and f0 in a general three-dimensional function, or we can use f0 to adjust the gain produced from the D and STAD estimates. When f0 is included, we see that not only is the system more efficient because we can use arbitrary gain functions of three parameters, but also the presence of a first stage of noise reduction makes the subsequent f0 estimation more robust than it would be in an f0-only based system.

The D estimate is based on values of phase angle and magnitude for the current input segment. The STAD estimate is based on the sum of magnitudes over many past segments. A more general approach would make a single unified estimate based on current and past values of both phase angle and magnitude. More information would be used, the function would be more general, and so a better result would be had.

Frequency Domain Beamforming:

A frequency domain beamformer is a kind of analysis/synthesis system. The incoming signals are analyzed by transforming to the frequency (or frequency-like) domain. Operations are carried out on the signals in the frequency domain, and then the signals are resynthesized by transforming them back to the time domain. In the case of two microphone beamformers, the two signals are the left and right ear signals. Once transformed to the frequency domain, a directionality estimate can be made at each frequency point by comparing left and right values at each frequency. The directionality estimate is then used to generate a gain which is applied to the corresponding left and right frequency points and then the signals are resynthesized.
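
The per-frequency comparison of left and right values can be sketched as follows. The passage above does not give the formula for the directionality estimate, so the normalized inner product used here is an assumption, suggested only by the inner product and sum of magnitudes squared operations named for FIG. 2. The function names and the `sharpness` exponent are hypothetical.

```python
def directionality(left, right, eps=1e-12):
    """Assumed estimate: 1 when the left and right bins are identical
    (on-axis source), falling toward -1 as they differ in phase or level."""
    inner = (complex(left) * complex(right).conjugate()).real
    power = abs(left) ** 2 + abs(right) ** 2
    return 2.0 * inner / (power + eps)


def beamform_bins(left_bins, right_bins, sharpness=2.0):
    """Apply one gain per frequency bin to both channels."""
    out_left, out_right = [], []
    for l, r in zip(left_bins, right_bins):
        d = directionality(l, r)
        gain = max(d, 0.0) ** sharpness  # hypothetical gain law
        out_left.append(gain * l)
        out_right.append(gain * r)
    return out_left, out_right


# Bin 0 is identical in both ears (passes); bin 1 is in antiphase (removed).
left_demo = [1 + 1j, 1 + 0j]
right_demo = [1 + 1j, -1 + 0j]
out_left_demo, out_right_demo = beamform_bins(left_demo, right_demo)
```

Because the same gain multiplies the left and right bins, the interaural phase and level relationships of whatever passes through are left unchanged in this sketch.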

There are several key issues involved in the design of the basic analysis/synthesis system. In general, the analysis/synthesis system will treat the incoming signals as consecutive (possibly time overlapped) time segments of N sample points. Each N sample point segment will be transformed to produce a fixed length block of frequency domain coefficients. An optimum transform concentrates the most signal power in the smallest percentage of frequency domain coefficients. Optimum and near optimum transforms have been widely studied in signal coding applications {reference 19} where the desire is to transmit a signal using the fewest coefficients to achieve the lowest data rate. If most of the signal power is concentrated in a few coefficients, then only those coefficients need to be coded with high accuracy, and the others can be crudely coded or not coded at all.

The optimum transform is also extremely important for the beamformer. Assume that a signal consists of desired signal plus undesired noise signal. When the signal is transformed, some of the frequency domain coefficients will correspond largely to desired signal, some to undesired signal, and some to both. For the frequency coefficients with substantial contributions from both desired signal and noise, it is difficult to determine an appropriate gain. For frequency coefficients corresponding largely to desired signals the gain is near unity. For frequency coefficients corresponding largely to noise, the gain is near zero. For dynamic signals, such as speech, the distribution of energy across frequency coefficients from input segment to input segment can be regarded as random except for possibly a long-term global spectral envelope. Two signals, desired signal and noise, generate two random distributions across frequency coefficients. The value of a particular frequency coefficient is the sum of the contribution from both signals. Since the total number of frequency coefficients is fixed, the probability of two signals making substantial contributions to the same frequency coefficient increases as the number of frequency coefficients with substantial energy used to code each signal increases. Therefore, an optimum transform, which concentrates energy in the smallest percentage of the total coefficients, will result in the smallest probability of overlap between coefficients of the desired signal and noise signal. This, in turn, results in the highest probability of correct answers in the beamformer gain estimation.

A different view of the analysis/synthesis system is as a multiband filter bank {20}. In this case, each frequency coefficient, as it varies in time from input segment to input segment, is seen as the output of a bandpass filter. There are as many bandpass filters, adjacent in frequency, as there are frequency coefficients. To achieve high energy concentration in frequency coefficients we want sharp transition bands between bandpass filters. For speech signals, optimum transforms correspond to filter banks with relatively sharp transition bands to minimize overlap between bands.

In general, to achieve good discrimination between desired signal and noise, we want many frequency coefficients (or many bands of filtering) with energy concentrated in as few coefficients as possible (sharp transition bands between bandpass filters). Unfortunately, this kind of high frequency resolution implies large input sample segments which, in turn, implies long input to output delays in the system. In a hearing aid application, time delay through the system is an important parameter to optimize. If the time delay from input to output becomes too large (e.g., greater than about 40 ms), the lips of speakers are no longer synchronized with sound. It also becomes difficult to speak since the sound of one's own voice is not synchronized with muscle movements. The impression is unnatural and fatiguing. A compromise must be made between input-output delay and frequency resolution. A good choice of analysis/synthesis architecture can ease the constraints on this compromise.

Another important consideration in the design of analysis/synthesis systems is edge effects. These are discontinuities that occur between adjacent output segments. These edge effects can be due to the circular convolution nature of Fourier transforms and inverse transforms, or they can be due to abrupt changes in frequency domain filtering (noise reduction gain, for example) from one segment to the next. Edge effects can sound like fluttering at the input segment rate. A well-designed analysis/synthesis system will eliminate these edge effects or reduce them to the point where they are inaudible.

The theoretical optimum transform for a signal of known statistics is the Karhunen-Loeve Transform or KLT {19}. The KLT does not generally lend itself to practical implementation, but serves as a basis for measuring the effectiveness of other transforms. It has been shown that, for speech signals, various transforms approach the KLT in effectiveness. These include the DCT {19} and ELT {21}. A large body of literature also exists for designing efficient filter banks {22, 23}. This literature also proposes techniques for eliminating or reducing edge effects.

One common design for analysis/synthesis systems is based on a technique called overlap-add {16}. In the overlap-add scheme, the incoming time domain signals are segmented into N point non-overlapping, adjacent time segments. Each N point segment is "padded" with an additional L zero values. Then each N+L point "augmented" segment is transformed using the FFT. A frequency domain gain, which can be viewed as the FFT of another N+L point sequence consisting of an M point time domain finite impulse response padded with N+L-M zeros, is multiplied with the transformed "augmented" input segment, and the product is inverse transformed to generate an N+L point time domain sequence. As long as M<L, then the resulting N+L point time domain sequence will have no circular convolution components. Since an N+L point segment is generated for each incoming N point segment, the resulting segments will overlap in time. If the overlapping regions of consecutive segments are summed, then the result is equivalent to a linear convolution of the input signal with the gain impulse response.
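
The equivalence between overlap-add and linear convolution can be checked numerically. The sketch below follows the steps just described with small illustrative sizes (N=8, L=4, M=3, all chosen here, not taken from the text) and uses a plain O(n^2) DFT from the standard library in place of an FFT so that it stays self-contained. The function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def overlap_add(signal, h, n, l):
    """Filter `signal` with FIR `h` using N point segments padded with L zeros."""
    assert len(h) < l, "M < L keeps circular convolution components out"
    size = n + l
    gain = dft(list(h) + [0.0] * (size - len(h)))  # FFT of zero-padded FIR
    out = [0.0] * (len(signal) + l)
    for start in range(0, len(signal), n):
        seg = list(signal[start:start + n])
        seg += [0.0] * (size - len(seg))            # zero padding
        spec = dft(seg)
        y = dft([s * g for s, g in zip(spec, gain)], inverse=True)
        for i, v in enumerate(y):                   # sum overlapping regions
            if start + i < len(out):
                out[start + i] += v.real
    return out


def direct_convolve(signal, h):
    """Reference time domain linear convolution."""
    out = [0.0] * (len(signal) + len(h) - 1)
    for i, s in enumerate(signal):
        for j, c in enumerate(h):
            out[i + j] += s * c
    return out


demo_signal = [float(i + 1) for i in range(16)]
demo_h = [0.5, 0.25, -0.25]
ola = overlap_add(demo_signal, demo_h, n=8, l=4)
ref = direct_convolve(demo_signal, demo_h)
max_error = max(abs(ola[i] - ref[i]) for i in range(len(ref)))
```

The match holds only because the impulse response is the same for every segment. Changing `demo_h` from one segment to the next would make the summed tails inconsistent.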

There are a number of problems associated with the overlap-add scheme. Viewed from the point of view of filter bank analysis, an overlap-add scheme uses bandpass filters whose frequency response is the transform of a rectangular window. This results in a poor quality bandpass response with considerable leakage between bands so the coefficient energy concentration is poor. While an overlap-add scheme will guarantee smooth reconstruction in the case of convolution with a stationary finite impulse response of constrained length, when the impulse response is changing every block time, as is the case when we generate adaptive gains for a beamformer, then discontinuities will be generated in the output. It is as if we were to abruptly change all the coefficients in an FIR filter every block time. In an overlap-add system, the input to output minimum delay is:

D = (1 + Z/2) * N + (compute time for 2*N FFT)

Where:
N = input segment length,
Z = number of zeros added to each block for zero padding.

A minimum value for Z is N, but this can easily be greater if the gain function is not sufficiently smooth over frequency. The frequency resolution of this system is N/2 frequency bins given conjugate symmetry of the transforms of the real input signal, and the fact that zero padding results in an interpolation of the frequency points with no new information added.

In the system design described in the preferred embodiments section of this patent, we use a windowed analysis/synthesis architecture. In a windowed FFT analysis/synthesis system, the input and output time domain sample segments are multiplied by a window function which in the preferred embodiment is a sine window for both the input and output segments. The frequency response of the bandpass filters (the transform of the sine window) is more sharply bandpass than in the case of the rectangular windows of the overlap-add scheme so there is better coefficient energy concentration. The presence of the synthesis window results in an effective interpolation of the adaptive gain coefficients from one segment to the next and so reduces edge effects. The input to output delay for a windowed system is:

D = 1 * N + (compute time for N FFT)

Where:
N = input segment length.
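
A short sketch can show why sine windows on both input and output reconstruct the signal and interpolate gains between segments. It assumes a hop of N/2 (the preferred embodiment advances 256 point blocks by 128 points) and a window defined as sin(pi*(i+0.5)/N), which is an assumption since the passage does not give the window formula. The transform pair is omitted here because, for a gain that is constant across frequency, FFT followed by inverse FFT is the identity. The function names are hypothetical.

```python
import math


def sine_window(n):
    # Assumed definition; the squares of two windows offset by N/2 sum to 1.
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]


def analysis_synthesis(signal, n, block_gains=None):
    """Window, apply one broadband gain per block, window again, overlap-add."""
    hop = n // 2
    w = sine_window(n)
    out = [0.0] * len(signal)
    for b, start in enumerate(range(0, len(signal) - n + 1, hop)):
        g = 1.0 if block_gains is None else block_gains[b]
        for i in range(n):
            # analysis window -> (transform, gain, inverse) -> synthesis window
            out[start + i] += w[i] * g * w[i] * signal[start + i]
    return out


demo = [float(i % 5) for i in range(32)]
rebuilt = analysis_synthesis(demo, 8)
# Gains that switch abruptly from 1 to 0 between blocks produce a smooth
# sin^2 crossfade in the output instead of a step.
faded = analysis_synthesis([1.0] * 32, 8, block_gains=[1, 1, 1, 0, 0, 0, 0])
```

Only samples covered by two blocks are reconstructed exactly in this sketch. The first and last N/2 samples see a single window and are tapered.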

It is clear that the sine windowed system is preferable to the overlap-add system from the point of view of coefficient energy concentration, output smoothness, and input-output delay. Other analysis/synthesis architectures, such as ELT, Paraunitary Filter Banks, QMF Filter Banks, Wavelets, and DCT, should provide similar performance in terms of input-output delay but can be superior to the sine window architecture in terms of energy concentration and reduction of edge effects.

Preferred Embodiment:

In FIG. 1, the noise reduction stage, which is implemented as a DSP software program, is shown as an operations flow diagram. The left and right ear microphone signals have been digitized at the system sample rate, which is generally adjustable in a range of Fsamp = 8-48 kHz, but has a nominal value of Fsamp = 11.025 kHz. The left and right audio signals have little, or no, phase or magnitude distortion. A hearing aid system for providing such low distortion left and right audio signals is described in the above-identified cross-referenced patent application entitled "Binaural Hearing Aid." The time domain digital input signal from each ear is passed to one-zero pre-emphasis filters 139, 141. Pre-emphasis of the left and right ear signals using a simple one-zero high-pass differentiator pre-whitens the signals before they are transformed to the frequency domain. This results in reduced variance between frequency coefficients so that there are fewer problems with numerical error in the Fourier transformation process. The effects of the pre-emphasis filters 139, 141 are removed after inverse Fourier transformation by using one-pole integrator de-emphasis filters 242 and 244 on the left and right signals at the end of noise reduction processing. Of course, if binaural compression follows the noise reduction stage of processing, the inverse transformation and de-emphasis would be at the end of binaural compression.
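
The pairing of a one-zero pre-emphasis filter with a one-pole de-emphasis filter can be sketched as below. The coefficient `a = 0.95` is an assumed illustrative value; the text does not state the coefficient used in filters 139, 141, 242 and 244. The function names are hypothetical.

```python
def pre_emphasis(x, a=0.95):
    """One-zero high-pass differentiator: y[n] = x[n] - a * x[n-1]."""
    y, prev = [], 0.0
    for s in x:
        y.append(s - a * prev)
        prev = s
    return y


def de_emphasis(x, a=0.95):
    """One-pole integrator, the exact inverse: y[n] = x[n] + a * y[n-1]."""
    y, prev = [], 0.0
    for s in x:
        prev = s + a * prev
        y.append(prev)
    return y


samples = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
round_trip = de_emphasis(pre_emphasis(samples))
# A constant (DC) input is strongly attenuated after the first sample.
flattened = pre_emphasis([1.0, 1.0, 1.0])
```

With no processing in between, the two filters cancel exactly. In the system described above, the frequency domain gain sits between them, so only its effect remains after de-emphasis.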

In FIG. 1, after pre-emphasis, if used, the left and right time domain audio signals are passed through allpass filters 144, 145 to gain multipliers 146, 147. The allpass filter serves as a variable delay. The combination of variable delay and gain allows the direction of the beam in beamforming to be steered to any angle if desired. Thus, the on-axis direction of beamforming may be steered to a direction other than straight in front of the user, or may be tuned to compensate for microphone or other mechanical mismatches.

At times, it may be desirable to provide maximum gain for signals appearing to be off-axis, as determined from analysis of left and right ear signals. This may be necessary to calibrate a system which has imbalances in the left and right audio chain, such as imbalances between the two microphones. It may also be desirable to focus a beam in another direction than straight ahead. This may be true when a listener is riding in a car and wants to listen to someone sitting next to him without turning in that direction. It may also be desirable for non-hearing aid applications, such as speaker phones or hands-free car phones. To accomplish this beam steering, a delay and gain are inserted in one of the time domain input signal paths. This tunes the beam for a particular direction.
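
One way to picture an allpass filter acting as a variable delay is the first-order section below, followed by a gain multiplier. The text does not specify the order or coefficients of filters 144, 145, so the first-order structure, the coefficient `c`, and the function names are all assumptions made for illustration. For this structure the low-frequency delay is approximately (1 - c) / (1 + c) samples, while the magnitude response is unity at every frequency.

```python
import cmath


def allpass(x, c):
    """First-order allpass: y[n] = c*x[n] + x[n-1] - c*y[n-1]."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for s in x:
        out = c * s + x_prev - c * y_prev
        y.append(out)
        x_prev, y_prev = s, out
    return y


def allpass_response(c, omega):
    """Frequency response H(e^{j*omega}) = (c + z^-1) / (1 + c*z^-1)."""
    z1 = cmath.exp(-1j * omega)
    return (c + z1) / (1 + c * z1)


def steer(x, c, gain):
    """Delay then gain in one input path, as in the steering description."""
    return [gain * v for v in allpass(x, c)]


impulse = [1.0] + [0.0] * 199
impulse_energy = sum(v * v for v in allpass(impulse, 0.3))
steered = steer(impulse, c=0.3, gain=0.5)
```

Because the section changes only phase, steering in this sketch shifts the apparent arrival time in one ear without coloring the spectrum. Any level mismatch is handled separately by the gain.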

The noise reduction operation in FIG. 1 is performed on N point blocks. The choice of N is a trade-off between frequency resolution and delay in the system. It is also a function of the selected sample rate. For the nominal 11.025 kHz sample rate, a value of N=256 has been used. Therefore, the signal is processed in 256 point consecutive sample blocks. After each block is processed, the block origin is advanced by 128 points. So, if the first block spans samples 0 to 255 of both the left and right channels, then the second block spans samples 128 to 383, the third spans samples 256 to 511, etc. The processing of each consecutive block is identical.

The noise reduction processing begins by multiplying the left and right 256 point sample blocks by a sine window in operations 148, 149. A fast Fourier transform (FFT) operation 150, 151 is then performed on the left and right blocks. Since the signals are real, this yields a 128 point complex frequency vector for both the left and right audio channels. The elements of the complex frequency vectors will be referred to as bin values. So there are 128 frequency bins from F=0 (DC) to F=Fsamp/2 kHz.
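
The block arithmetic above can be checked with a few lines of Python. The sketch assumes that the 128 bins are indices k = 0 to 127 of a 256 point transform, so that the highest bin sits one bin spacing below Fsamp/2. That indexing, and the function names, are assumptions for illustration.

```python
FSAMP_HZ = 11025.0   # nominal sample rate from the text
N = 256              # block length from the text
HOP = 128            # block advance from the text


def block_spans(num_samples, n=N, hop=HOP):
    """Inclusive (first, last) sample indices of each processing block."""
    return [(s, s + n - 1) for s in range(0, num_samples - n + 1, hop)]


def bin_frequencies(n=N, fsamp=FSAMP_HZ):
    """Center frequency in Hz of each of the N/2 bins (assumed k = 0..N/2-1)."""
    return [k * fsamp / n for k in range(n // 2)]


spans = block_spans(512)
freqs = bin_frequencies()
bin_spacing_hz = FSAMP_HZ / N          # about 43 Hz
block_ms = 1000.0 * N / FSAMP_HZ       # about 23 ms of signal per block
```

At the nominal rate one block lasts roughly 23 ms, so the N-sample term of the windowed-system delay stays below the roughly 40 ms limit mentioned earlier, before FFT compute time is added.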
`The
