ACOUSTIC ECHO AND NOISE CONTROL
A LONG LASTING CHALLENGE

Pia Dreiseitel, Eberhard Hänsler, Henning Puder
Signaltheorie, Darmstadt University of Technology, Darmstadt, Germany
{dreiseit, haensler, hpuder}@nesi.tu-darmstadt.de

ABSTRACT

Hands-free operation of telephones incorporating echo cancellation and noise reduction has been discussed for over a decade. This paper presents an overview of the wide range of algorithms which are applicable to echo cancellers and noise reduction. Practical problems associated with implementation and overall system control are also discussed.

INTRODUCTION

When telecommunications started about a century ago, users had both hands busy. They had to hold a microphone close to their mouth and a loudspeaker close to one ear. It did not take long to get one hand free: microphone and loudspeaker were assembled in a handset. However, the aim of hands-free operation has not yet been attained.

In the early years of telecommunication, the lack of efficient electro-acoustic devices and amplifiers justified the inconvenience to the customer. At the same time two problems were solved:

- acoustic echoes transmitted back to the remote user were reduced by providing sufficient attenuation;
- operation in a noisy environment was possible by an improved signal-to-noise ratio.

For non-experts it is still difficult to understand that it takes all the signal processing capabilities available today to achieve at least some solution of these easily explained problems of hands-free operation. A large number of papers on the topic under consideration have been published within the last few years, including bibliographies and reports on the state of the art. Adaptive algorithms for acoustic echo compensation and noise control have gained special attention.

BASICS

At the most general level there are two sources that make the solution of the hands-free problem difficult: first, the physical properties of loudspeaker-enclosure-microphone systems (LEMSs) and of speech signals, and secondly, the fulfilment of the regulations of the International Telecommunication Union (ITU). Although the latter may seem arbitrary, it is essential for the equipment to be licensed by telecommunication authorities.

Physics

Audio communication systems include at least one loudspeaker and one microphone housed within the same enclosure. Consequently, the microphone picks up not only locally generated signals like speech and environmental noise, but also the signal radiated by the loudspeaker as well as its echoes caused by reflections at the boundaries of the enclosure. Assuming linearity, the audio characteristics of the LEMS may be modeled by an impulse response. The duration of the response depends on the reverberation time of the enclosure. In the case of an office room this time is in the order of several hundred milliseconds; in the case of a passenger car it is in the order of fifty to one hundred milliseconds. Furthermore, the response of the LEMS is extremely sensitive to any movements within the enclosure. Finally, the system is driven by audio signals, typically a mixture of speech and noise, where speech itself consists of periodic and aperiodic components with highly fluctuating magnitudes and pauses. Briefly, from a signal processing point of view, the system and the signals involved are extremely unpleasant.

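The linear LEMS model described above can be illustrated with a minimal Python sketch. It does not use a measured room response: the 8 kHz sampling rate, the exponentially decaying noise model and the names `synthetic_lems_response` and `microphone_signal` are assumptions made purely for illustration.

```python
import numpy as np

def synthetic_lems_response(duration_s, sample_rate_hz, rng):
    """Toy impulse response: white noise under an exponential decay.

    The amplitude envelope drops by 60 dB over duration_s. Real responses
    have to be measured; this model is only an illustrative assumption.
    """
    n_taps = int(round(duration_s * sample_rate_hz))
    t = np.arange(n_taps) / sample_rate_hz
    envelope = 10.0 ** (-3.0 * t / duration_s)  # reaches -60 dB at t = duration_s
    return rng.standard_normal(n_taps) * envelope

def microphone_signal(far_end, local, g):
    """Linear LEMS model: echo (far_end convolved with g) plus the local signal."""
    echo = np.convolve(far_end, g)[: len(far_end)]
    return echo + local

rng = np.random.default_rng(0)
fs = 8000                                          # assumed telephone-band sampling rate
g_car = synthetic_lems_response(0.1, fs, rng)      # about 100 ms, passenger car
g_office = synthetic_lems_response(0.3, fs, rng)   # about 300 ms, office room
x = rng.standard_normal(fs)                        # one second of far-end excitation
y = microphone_signal(x, np.zeros(fs), g_car)      # echo only, no local speech
print(len(g_car), len(g_office))
```

Under the assumed 8 kHz rate, a 100 ms response already spans 800 samples and a 300 ms response spans 2400, which hints at why the adaptive filters for this application need a large number of coefficients.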
Regulations

The ITU-T Recommendations put very stringent conditions on hands-free telephone systems. For ordinary telephones the echo attenuation has to be at least  dB in the case of single talk. In double talk situations this value can be reduced by  dB. Beyond that, only a negligible delay may be introduced into the signal path by the hands-free facility.

`WAVES607_1008-0001
`
`Petitioner Waves Audio Ltd. 607 - Ex. 1008

Systems for Acoustic Echo and Noise Control

A number of systems have been proposed for acoustic echo and noise control. They all use three units or a subset of them (see the figure of the basic structure). The first unit is a loss control that attenuates the incoming and/or the outgoing signal. Early hands-free communication systems used this unit only, reducing conversations to half duplex. Because of the ITU regulations, loss control still remains the most important function because it has to guarantee the required attenuation. The second unit consists of an adaptive filter functioning as a replica of the LEMS. If perfect adaptation could be achieved, loudspeaker and microphone would be decoupled entirely without any impact on locally generated signals. The third, most modern, ingredient of an echo and noise control system consists of a Wiener filter within the outgoing signal path. In contrast to the loss control unit, this filter provides a frequency-dependent attenuation of the outgoing signal. Its aim is twofold: to suppress residual echoes not covered by the adaptive filter and to enhance speech quality by suppressing noise components.

Design considerations and results achieved by these three units will be given within the following sections.

Figure: Basic structure for acoustic echo and noise control (blocks: loudspeaker-enclosure-microphone system, adaptive filter, loss control, postfilter).

ADAPTIVE ALGORITHMS

Several adaptation algorithms have been applied to acoustic echo cancellers. Each of these algorithms minimises the mean square error of the signal $e(k)$ (see the figure of the adaptive system). The algorithms discussed in this section are dealt with in order of increasing complexity.

NLMS

The normalised least mean square (NLMS) algorithm is the most easily and frequently implemented algorithm. It is described by the following relations:

$$e(k) = d(k) - \mathbf{c}^T(k)\,\mathbf{x}(k)$$

$$\mathbf{c}(k+1) = \mathbf{c}(k) + \alpha(k)\,\frac{e(k)\,\mathbf{x}(k)}{\|\mathbf{x}(k)\|^2}$$

where $e(k)$ denotes the adaptation error, $d(k)$ the desired signal, $\mathbf{c}(k)$ the coefficient vector of the adaptive filter, $\mathbf{x}(k)$ the excitation vector and, finally, $\alpha(k)$ a variable step size factor.

Figure: Adaptive system (signals x(k), y(k), e(k); impulse responses g(k), c(k); paths from and to the far-end speaker).

The computational requirements of the NLMS algorithm are low. This is important since the application considered here requires a large number of coefficients. Its disadvantage is its slow speed of convergence, especially in the case of correlated inputs.

NLMS algorithm with prewhitening filters

A simple approach to overcome this problem is prewhitening of both the input and the reference signal. This can be achieved by a linear prediction error filter. Restricting oneself to a time-invariant filter, a filter of order one to four proved to be sufficient. Prewhitening filters of higher order have to be adaptive. They can be designed using the Levinson-Durbin algorithm.

Step size Control

As mentioned before, the NLMS algorithm uses an adaptation factor called step size, which is responsible for both stability and speed of convergence. Controlling the step size becomes especially important in the case of noisy microphone signals, like those in car environments, or in double talk situations. It can be shown that an optimal step size exists for adaptation in a noisy environment which is also suitable for non-stationary excitation signals:

$$\alpha_{\mathrm{opt}}(k) = \frac{E\{\varepsilon^2(k)\}}{E\{e^2(k)\}}$$

where $E\{\cdot\}$ denotes expectation, $e(k)$ the current error value and $\varepsilon(k)$ the residual echo signal. The application of this optimal step size, however, requires a reliable double talk detector.

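The NLMS recursion described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a production echo canceller: the fixed step size, the small regularisation constant `eps` in the denominator, the 16-tap toy impulse response and the white excitation are all assumptions chosen so that the example converges quickly.

```python
import numpy as np

def nlms(x, d, num_taps, step_size=0.5, eps=1e-8):
    """Identify a FIR system with the NLMS algorithm.

    x: excitation signal, d: desired (microphone) signal.
    eps guards against division by zero and is an added assumption.
    Returns the final coefficient vector and the error signal.
    """
    c = np.zeros(num_taps)        # adaptive filter coefficients
    x_vec = np.zeros(num_taps)    # most recent excitation samples, newest first
    e = np.zeros(len(x))
    for k in range(len(x)):
        x_vec = np.roll(x_vec, 1)
        x_vec[0] = x[k]
        e[k] = d[k] - c @ x_vec                               # adaptation error
        c = c + step_size * e[k] * x_vec / (x_vec @ x_vec + eps)
    return c, e

rng = np.random.default_rng(1)
g = rng.standard_normal(16) * 0.5 ** np.arange(16)   # toy impulse response
x = rng.standard_normal(4000)                        # white excitation
d = np.convolve(x, g)[: len(x)]                      # noise-free echo
c, e = nlms(x, d, num_taps=16)
misalignment_db = 10 * np.log10(np.sum((c - g) ** 2) / np.sum(g ** 2))
print(misalignment_db)
```

With correlated excitation such as speech, the same loop converges far more slowly, which is the motivation for the prewhitening and step size control measures discussed in the text.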
Affine Projection Algorithm

The affine projection algorithm (APA) can be considered as an extension of the NLMS algorithm that takes into account the last $P$ excitation vectors:

$$e(k) = d(k) - \mathbf{c}^T(k)\,\mathbf{x}(k)$$

$$\mathbf{e}(k) = [\,e(k),\; e(k-1),\; \ldots,\; e(k-P+1)\,]^T$$

$$\mathbf{X}(k) = [\,\mathbf{x}(k),\; \mathbf{x}(k-1),\; \ldots,\; \mathbf{x}(k-P+1)\,]$$

$$\mathbf{c}(k+1) = \mathbf{c}(k) + \alpha\,\mathbf{X}(k)\,\bigl(\mathbf{X}^T(k)\,\mathbf{X}(k)\bigr)^{-1}\,\mathbf{e}(k)$$

Usually $P$ is small compared to the total number of filter coefficients. In contrast to the NLMS algorithm, the matrix $\mathbf{X}^T(k)\,\mathbf{X}(k)$ has to be inverted. This can be carried out recursively. A fast version of the APA, called FAP (fast affine projection), has been developed for an efficient implementation. This algorithm is therefore suitable for acoustic echo cancellers. However, numerical instabilities occur because of recursively calculated correlation matrices. One can overcome these problems by regularising the correlation matrix, i.e. by adding a constant value to the elements of the main diagonal. Furthermore, the algorithm has to be reinitialised whenever divergence is detected.

If an affine projection of second order is applied, the inverse of the matrix can be calculated directly, requiring only a small computational load. Compared to the NLMS algorithm, even a second-order APA increases the speed of convergence remarkably.

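A second-order affine projection update with diagonal regularisation, as described above, might look as follows in Python. This is a hedged sketch, not the FAP algorithm: the error vector is computed with the current coefficients for all stored excitation vectors (a common textbook form, assumed here), and the regularisation constant `delta`, the first-order autoregressive test signal and the toy impulse response are illustrative assumptions.

```python
import numpy as np

def apa(x, d, num_taps, order=2, step_size=0.5, delta=1e-6):
    """Affine projection algorithm with a regularised P x P matrix."""
    c = np.zeros(num_taps)
    X = np.zeros((num_taps, order))   # columns: x(k), x(k-1), ...
    d_vec = np.zeros(order)           # d(k), d(k-1), ...
    x_vec = np.zeros(num_taps)
    e = np.zeros(len(x))
    for k in range(len(x)):
        x_vec = np.roll(x_vec, 1)
        x_vec[0] = x[k]
        X = np.roll(X, 1, axis=1)     # shift older excitation vectors to the right
        X[:, 0] = x_vec
        d_vec = np.roll(d_vec, 1)
        d_vec[0] = d[k]
        e_vec = d_vec - X.T @ c       # errors for the last `order` vectors
        e[k] = e_vec[0]
        gram = X.T @ X + delta * np.eye(order)   # constant added to the main diagonal
        c = c + step_size * X @ np.linalg.solve(gram, e_vec)
    return c, e

rng = np.random.default_rng(2)
white = rng.standard_normal(4000)
x = np.zeros_like(white)
for k in range(1, len(white)):        # strongly correlated excitation
    x[k] = 0.9 * x[k - 1] + white[k]
g = rng.standard_normal(16) * 0.5 ** np.arange(16)
d = np.convolve(x, g)[: len(x)]
c, e = apa(x, d, num_taps=16)
misalignment_db = 10 * np.log10(np.sum((c - g) ** 2) / np.sum(g ** 2))
print(misalignment_db)
```

For `order=2` the regularised matrix is only 2 x 2, so solving the linear system costs very little compared with the filtering itself.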
RLS and FTF

The recursive least squares (RLS) algorithm is known as a very fast converging recursive algorithm. A straightforward notation of this algorithm is given here:

$$\mathbf{w}(k) = \mathbf{R}_{xx}^{-1}(k)\,\mathbf{x}(k)$$

$$\mathbf{R}_{xx}^{-1}(k+1) = \frac{1}{\lambda}\left[\mathbf{R}_{xx}^{-1}(k) - \frac{\mathbf{w}(k)\,\mathbf{w}^T(k)}{\lambda + \mathbf{w}^T(k)\,\mathbf{x}(k)}\right]$$

$$e(k) = d(k) - \mathbf{c}^T(k)\,\mathbf{x}(k)$$

$$\mathbf{c}(k+1) = \mathbf{c}(k) + e(k)\,\mathbf{w}(k)$$

where $\mathbf{R}_{xx}(k)$ denotes an estimate of the autocorrelation matrix of the excitation signal, $\lambda$ an exponential forgetting factor and $\mathbf{w}(k)$ the gain vector. The convergence of the RLS algorithm is superior to that of the NLMS algorithm. However, there is the problem of locking when $\lambda$ is chosen close to one. The tracking performance of the RLS algorithm is therefore not as satisfying as its initial convergence.

If the algorithm is implemented with finite precision, it can become unstable because the numerical round-off error increases. A QR-decomposition-based inversion of the autocorrelation matrix does not show this behaviour.

If one has to deal with a large number of coefficients, the direct implementation of the RLS algorithm is not feasible, since its computational complexity is of order $M^2$. Several approaches for a fast version of the RLS algorithm are known, principally based on prewindowing techniques, which reduce the computational load to order $M$. A fast implementation of the RLS algorithm, called the Fast Transversal Filter (FTF) algorithm, is organised in four steps:

1. Recursive forward linear prediction.
2. Recursive backward linear prediction.
3. Recursive computation of the gain vector.
4. Recursive estimation of the desired response.

Unfortunately, the FTF algorithm is numerically unstable and tends to diverge. In fact, stabilising the RLS algorithms is a topic in its own right. One basic idea is to extend the algorithm by accumulating the round-off errors and to perform corrections when the numerical error becomes significant.

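For comparison with the NLMS and APA sketches, a direct (order $M^2$) RLS recursion can be written as below. The code follows the common textbook form in which the gain is normalised by the forgetting factor plus a quadratic form in the excitation vector; this indexing convention, the initial value `init` of the inverse matrix and the toy test signal are assumptions, not a transcription of the relations above.

```python
import numpy as np

def rls(x, d, num_taps, forgetting=0.999, init=1000.0):
    """Direct RLS recursion with exponential forgetting."""
    c = np.zeros(num_taps)
    P = init * np.eye(num_taps)       # estimate of the inverse autocorrelation matrix
    x_vec = np.zeros(num_taps)
    e = np.zeros(len(x))
    for k in range(len(x)):
        x_vec = np.roll(x_vec, 1)
        x_vec[0] = x[k]
        w = P @ x_vec                          # unnormalised gain vector
        denom = forgetting + x_vec @ w
        e[k] = d[k] - c @ x_vec                # a priori error
        c = c + (e[k] / denom) * w
        P = (P - np.outer(w, w) / denom) / forgetting
    return c, e

rng = np.random.default_rng(4)
white = rng.standard_normal(1000)
x = np.zeros_like(white)
for k in range(1, len(white)):        # correlated excitation
    x[k] = 0.9 * x[k - 1] + white[k]
g = rng.standard_normal(8) * 0.5 ** np.arange(8)
d = np.convolve(x, g)[: len(x)]
c, e = rls(x, d, num_taps=8)
misalignment_db = 10 * np.log10(np.sum((c - g) ** 2) / np.sum(g ** 2))
print(misalignment_db)
```

The matrix update inside the loop is what makes the cost grow with the square of the filter length, which is why this form is not practical for the long filters needed in acoustic echo cancellation.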
Fast Newton

Whereas the APA may be considered as an extended version of the NLMS algorithm, the Fast Newton algorithm can be seen as a simplified version of the fast RLS algorithm. In fast implementations of the RLS algorithm, linear predictions of order $M$ are required, where $M$ is the size of the coefficient vector. When the order of correlation of the excitation signal is small, there is actually no need to calculate the full prediction vector of order $M$. Reducing the size of the prediction vector to a size $P$ appropriate to the excitation signal leads to the Fast Newton algorithm. The convergence performance is comparable to that of the RLS algorithm, whereas the numerical complexity is of order M P.

Fullband, Subband, Block Processing

Until now our discussion of adaptive filters has dealt only with fullband signals, since this is the most straightforward method of implementation. However, straightforward does not necessarily mean most efficient. Both subband and block processing enable implementations resulting in less computational cost.

If a signal is split up into subbands, one can subsample the resulting signals, leading to shorter adaptive echo cancellers. All of the adaptive algorithms mentioned above are suitable for subband implementation. The processing power saved may be used for more complex adaptation. However, subband realisations do have one substantial disadvantage that may prohibit their application: they introduce delay into the system. This delay is caused by the filter banks for analysis (decomposition) and synthesis of the excitation and error signals. These filter banks have to be designed with respect to the special demands of an adaptive echo canceller. The aliasing terms, for example, have to be minimised. There is a substantial body of literature concerned with the design of polyphase filter banks used for echo cancellation.

In block processing the impulse response of the adaptive filter is split up into blocks. Using fast convolution techniques, the calculation of the output signal can be carried out very efficiently. Again there is a trade-off between efficiency of processing and delay. However, block processing offers the advantage of optimising delay versus processing power. Small block sizes keep the delay low but increase the processing power required.

STEREOPHONIC ECHO CANCELLATION

Recently, stereophonic acoustic echo cancellation has become more and more important for applications such as teleconferencing or video games.

Figure: Stereophonic echo cancellation.

As the excitation signals of the two channels are correlated (see the figure), there is no unique solution for identifying the two impulse responses. Furthermore, an extended correlation matrix of the two input signals has to be inverted. In the case of high correlation this causes numerical instabilities due to ill-conditioning, which in turn leads to divergence. However, there are a number of approaches to overcome the correlation of the two excitation signals. One technique applies a nonlinear function to one of the excitation signals. In a second approach the correlation matrix is regularised by introducing leakage into the update of the coefficient vector.

NOISE REDUCTION

With the increasing number of mobile telephones, more and more people use them in cars. This generates a demand for hands-free telephone sets for cars that not only increase the comfort of the user but also allow the driver to keep his hands on the steering wheel. To enhance the speech signal outgoing to the far-end user, noise reduction methods are desirable.

We describe one-channel methods for two reasons: first, the cost of installing a second channel may be prohibitive, and secondly, single-channel procedures can also be extended to multi-channel methods.

Basic architecture

Most noise reduction procedures are based on the Wiener solution:

$$G_{\mathrm{opt}}(kB, n) = 1 - \left(\beta\,\frac{NPSD(kB, n)}{XPSD(kB, n)}\right)^{p}$$

$$G(kB, n) = \begin{cases} G_{\mathrm{opt}}(kB, n) & \text{if } G_{\mathrm{opt}}(kB, n) > f \\ f & \text{otherwise} \end{cases}$$

where $NPSD(kB, n)$ and $XPSD(kB, n)$ denote the PSD of the noise and of the distorted input signal, respectively, and $B$ is equal to the block size. The frequency index is given by $n$. Compared to the well-known Wiener filter, an overestimation factor $\beta$, a variable power $p$ and a spectral floor $f$ are introduced.

Unfortunately, there is a conflict between the amount of noise reduction and the quality of the resulting speech signal. The parameters suggested above have to be chosen such that a subjective optimum is achieved.

To preserve natural sounding speech, the spectral floor is introduced, which in turn limits the SNR improvement to $-20 \log f$ dB. The imprecision associated with the estimation of the time-varying PSDs causes an unpleasant tonal noise. These so-called musical tones can be attenuated by tailoring the transfer function adequately with the additional parameters.

Modifications of the filter are given by the MMSE-STSA estimator (Minimum Mean Square Error Short-Time Spectral Amplitude) and its derivative, the MMSE-LSA estimator (Minimum Mean Square Error Logarithmic Spectral Amplitude). To derive the algorithms, the time-varying property of the distorted input signal has been taken into account. For these algorithms an "a priori" and an "a posteriori" signal-to-noise ratio (SNR) are estimated:

$$SNR_{\mathrm{post}}(kB, n) = \frac{|X(kB, n)|^2}{|NPSD(kB, n)|} - 1$$

$$SNR_{\mathrm{prio}}(kB, n) = (1 - \eta)\,\max\bigl(SNR_{\mathrm{post}}(kB, n),\, 0\bigr) + \eta\,\frac{|G_{\mathrm{opt}}((k-1)B, n)\,X((k-1)B, n)|^2}{|NPSD(kB, n)|}$$

$X(kB, n)$ describes the STFT of the input signal $x(k)$ at time $kB$, and $\eta$ is a weighting factor. The weighting rules for the algorithms are given by

MMSE-STSA:

$$G_{\mathrm{opt}}(kB, n) = \frac{\sqrt{\pi}}{2}\,\sqrt{\frac{1}{1 + SNR_{\mathrm{post}}(kB, n)}\cdot\frac{SNR_{\mathrm{prio}}(kB, n)}{1 + SNR_{\mathrm{prio}}(kB, n)}}\;\; M\!\left[\bigl(1 + SNR_{\mathrm{post}}(kB, n)\bigr)\,\frac{SNR_{\mathrm{prio}}(kB, n)}{1 + SNR_{\mathrm{prio}}(kB, n)}\right]$$

with $M[u] = \exp\!\left(-\frac{u}{2}\right)\left[(1 + u)\,I_0\!\left(\frac{u}{2}\right) + u\,I_1\!\left(\frac{u}{2}\right)\right]$ and $I_0$, $I_1$ the modified Bessel functions of zeroth and first order.

MMSE-LSA:

$$G_{\mathrm{opt}}(kB, n) = \frac{SNR_{\mathrm{prio}}(kB, n)}{1 + SNR_{\mathrm{prio}}(kB, n)}\;\; M\!\left[\bigl(1 + SNR_{\mathrm{post}}(kB, n)\bigr)\,\frac{SNR_{\mathrm{prio}}(kB, n)}{1 + SNR_{\mathrm{prio}}(kB, n)}\right]$$

with $M[u] = \exp\!\left\{\frac{1}{2}\int_u^{\infty}\frac{e^{-t}}{t}\,dt\right\}$.

Frequency Decomposition

As shown above, the noise reduction filter is defined in the frequency domain. Therefore a frequency analysis of the non-stationary input signal is required. One method of achieving this is to use the STFT (Short Time Fourier Transform), which requires the multiplication of the input signal by a time window $h(m)$:

$$X(kB, n) = \sum_{m} x(m)\,h(m - kB)\,e^{-j\frac{2\pi}{N} n m}$$

Subband decomposition provides a second class of methods. The sample values of the subband signals can provide a set of spectral coefficients for the noise reduction algorithm (see the filter bank figure).

Figure: Filter bank (analysis filters, noise reduction processing, synthesis filters and summation).

After noise reduction, the subband signals are upsampled, passed through anti-aliasing filters and synthesised to obtain the enhanced output signal. The filter banks shown in the figure split the input signal into uniformly spaced frequency bands, comparable to the STFT. Modifications include non-uniformly spaced frequency resolutions, offering the possibility of modeling the human perception system (ear and brain). Alternatively, a non-uniformly distributed resolution can be obtained by cascaded filter banks, including the special case of the discrete wavelet transform.

Figure: Cascaded filter bank.

Figure: Wavelet filter bank (high-pass (HP) and low-pass (TP) filters with subsampling and upsampling by 2).

With these cascaded structures, different time resolutions are also obtained, as subsampling is performed after each filter stage. Fast-varying high-frequency components can be treated with a higher resolution in time, whereas low-frequency components show a more detailed frequency resolution.

Estimation of the Power Spectral Densities

The time-frequency analysed input signal can be used to estimate $XPSD(kB, n)$ and $NPSD(kB, n)$. To determine $XPSD(kB, n)$, a recursively smoothed periodogram is sufficient. However, only slight smoothing is tolerable in order to avoid echo or reverberation effects in the enhanced signal.

The estimation of $NPSD(kB, n)$ has to be based on $X(kB, n)$ as well. To distinguish between noise components and speech components, either voice activity detection or the so-called Minimum Statistics are necessary. In the latter, the minima of the smoothed input spectrum are observed for each frequency bin over a time window. The length of this time window is chosen according to the duration of speech components.

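The pieces of the noise reduction chain described in this section (a recursively smoothed periodogram, minimum tracking over a time window, and a Wiener-type gain with overestimation factor, power and spectral floor) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the smoothing constant and the plain sliding minimum (without the bias compensation that a full Minimum Statistics estimator would need) are not taken from the paper.

```python
import numpy as np

def smoothed_periodogram(frames_fft, smoothing=0.5):
    """First-order recursive smoothing of |X|^2 over frames (frames x bins)."""
    psd = np.empty(frames_fft.shape)
    state = np.abs(frames_fft[0]) ** 2
    for k, frame in enumerate(frames_fft):
        state = smoothing * state + (1.0 - smoothing) * np.abs(frame) ** 2
        psd[k] = state
    return psd

def minimum_tracking(psd, window_frames):
    """Per-bin minimum of the smoothed PSD over the last window_frames frames.

    Simplified stand-in for Minimum Statistics (no bias compensation).
    """
    noise = np.empty_like(psd)
    for k in range(len(psd)):
        start = max(0, k - window_frames + 1)
        noise[k] = psd[start : k + 1].min(axis=0)
    return noise

def wiener_gain(xpsd, npsd, overestimation=1.0, power=1.0, floor=0.1):
    """Wiener-type weighting with overestimation, power and spectral floor."""
    ratio = overestimation * npsd / np.maximum(xpsd, 1e-12)
    gain = 1.0 - ratio ** power
    return np.maximum(gain, floor)    # never attenuate below the floor

# Usage on hand-made numbers (three frequency bins of one frame).
xpsd = np.array([4.0, 1.0, 0.5])
npsd = np.array([1.0, 1.0, 1.0])
gain = wiener_gain(xpsd, npsd, floor=0.1)
tracked = minimum_tracking(np.array([[3.0], [1.0], [2.0], [5.0]]), window_frames=2)
print(gain, tracked.ravel())
```

Bins where the estimated noise exceeds the input PSD would yield a negative gain; the floor replaces these values, which is one way the rule limits both the attenuation and the musical tones mentioned in the text.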
Psycho Acoustics

Recent studies use psychoacoustics to improve noise reduction algorithms. Two approaches are followed: the first is to adapt the signal analysis to the human ear; the second is to use the so-called masking thresholds.

It is known that the human ear performs a non-uniform frequency analysis on a logarithmic scale (Bark scale). The methods presented in the section on frequency decomposition allow a frequency analysis adapted to human perception.

Masking means that weak tones are covered by stronger neighbouring tones in time or frequency. Noise reduction filter design takes advantage of these properties.

Combinations

When acoustic echo control is applied in a noisy environment, such as in cars, a combination of noise reduction and echo control is desirable. As far as the order of processing is concerned, echo control should precede noise reduction, so that parts of the echo not compensated by the adaptive filter may be considered as additional noise.

The stationarity assumption for the background noise does not hold for the residual echo. Therefore different estimation methods have to be applied. The power spectral density (PSD) of the residual echo, $EPSD(kB, n)$, is estimated by a power transfer factor $\gamma(kB)$,

$$EPSD(kB, n) = \gamma(kB)\,FPSD(kB, n),$$

or by a transfer function $\gamma(kB, n)$,

$$EPSD(kB, n) = \gamma(kB, n)\,FPSD(kB, n),$$

specifying the ratio between the far-end signal and the residual echo, where $FPSD(kB, n)$ denotes the PSD of the far-end signal.

The distortion signal is given by the sum of the estimated noise and the residual echo:

$$\tilde{N}PSD(kB, n) = NPSD(kB, n) + EPSD(kB, n)$$

Consequently, $\tilde{N}PSD(kB, n)$ replaces $NPSD(kB, n)$ in the noise reduction methods presented above.

Separating the two problems of cancelling the residual echo and suppressing noise by applying two filters in series offers additional degrees of freedom.

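The combination rule above, in which the residual echo PSD is obtained by scaling the far-end PSD and then added to the noise PSD, can be written down directly. The sketch below is only an illustration: the function names and the numerical values of the transfer factors are hypothetical, and how such factors are estimated in practice is not covered here.

```python
import numpy as np

def residual_echo_psd(far_end_psd, transfer):
    """Scale the far-end PSD by a power transfer factor.

    transfer may be a scalar (one factor per block) or an array with one
    value per frequency bin (a transfer function).
    """
    return np.asarray(transfer) * np.asarray(far_end_psd)

def combined_distortion_psd(noise_psd, far_end_psd, transfer):
    """Sum of the estimated noise PSD and the estimated residual echo PSD."""
    return np.asarray(noise_psd) + residual_echo_psd(far_end_psd, transfer)

noise_psd = np.array([1.0, 1.0, 1.0])
far_end_psd = np.array([100.0, 10.0, 0.0])
broadband = combined_distortion_psd(noise_psd, far_end_psd, 0.01)
per_bin = combined_distortion_psd(noise_psd, far_end_psd, np.array([0.02, 0.01, 0.0]))
print(broadband)
print(per_bin)
```

The resulting array can be passed to a spectral weighting rule in place of the pure noise PSD, so that bins with strong far-end activity are attenuated more.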
Multi-Microphone Solutions

Microphone arrays offer further improvements in noise reduction. A simple approach consists of delay-and-sum beamforming, where a control system adapts the direction of maximum sensitivity towards the near-end speaker.

Assuming that the different microphone signals consist of correlated speech and uncorrelated noise, one obtains an improved estimate of the noise power spectral density. The following formulas illustrate the procedure:

$$X_1(kB, n) = S(kB, n) + N_1(kB, n)$$

$$X_2(kB, n) = S(kB, n) + N_2(kB, n)$$

$$C(n) = \frac{E\{X_1(kB, n)\,X_2^{*}(kB, n)\}}{\sqrt{E\{|X_1(kB, n)|^2\}\;E\{|X_2(kB, n)|^2\}}} = \frac{\sigma_s^2(n)}{\sqrt{\bigl(\sigma_s^2(n) + \sigma_{n_1}^2(n)\bigr)\bigl(\sigma_s^2(n) + \sigma_{n_2}^2(n)\bigr)}}$$

with $S(kB, n)$ being the STFT of the speech signal $s(k)$, $N_1(kB, n)$ and $N_2(kB, n)$ the STFTs of the uncorrelated noise signals, and $\sigma_s^2(n)$, $\sigma_{n_1}^2(n)$ and $\sigma_{n_2}^2(n)$ the corresponding power spectra.

The assumption that the noise signals are uncorrelated is more valid in higher frequency bands and for microphones located further apart. The correlation coefficients $C(n)$ may also influence the transfer function $G_{\mathrm{opt}}(n)$ of the noise reduction filter.

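The relation between the correlation coefficient of two microphone spectra and the underlying speech and noise powers can be checked numerically. The following sketch is an assumption-laden toy experiment: the spectra are drawn as independent complex Gaussian values per frame and bin, the powers 2.0, 1.0 and 0.5 are arbitrary, and expectations are replaced by averages over frames.

```python
import numpy as np

def correlation_coefficient(X1, X2):
    """Magnitude of the normalised cross-spectrum, averaged over frames.

    X1, X2: complex arrays of shape (frames, bins).
    """
    cross = np.mean(X1 * np.conj(X2), axis=0)
    p1 = np.mean(np.abs(X1) ** 2, axis=0)
    p2 = np.mean(np.abs(X2) ** 2, axis=0)
    return np.abs(cross) / np.sqrt(p1 * p2)

rng = np.random.default_rng(3)
frames, bins = 20000, 4

def complex_gaussian(power):
    """Circular complex Gaussian samples with the given mean power."""
    real = rng.standard_normal((frames, bins))
    imag = rng.standard_normal((frames, bins))
    return np.sqrt(power / 2.0) * (real + 1j * imag)

S = complex_gaussian(2.0)     # speech, common to both microphones
N1 = complex_gaussian(1.0)    # noise at microphone 1
N2 = complex_gaussian(0.5)    # noise at microphone 2, independent of N1
C = correlation_coefficient(S + N1, S + N2)
expected = 2.0 / np.sqrt((2.0 + 1.0) * (2.0 + 0.5))
print(C, expected)
```

When the noise powers rise relative to the speech power, the measured coefficient falls towards zero, which is the property a multi-microphone noise estimator can exploit.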
LOSS CONTROL

Loss control is required to guarantee a prescribed echo suppression level, e.g. as specified by the ITU-T. The total attenuation introduced by the loss control is distributed between the loudspeaker and microphone paths in such a way that the communication is disturbed as little as possible.

In combination with acoustic echo cancellation, loss control has to insert only the difference between the attenuation reached by the acoustic echo canceller and the level required by the ITU-T. This requires means for estimating the performance of the echo canceller.

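The bookkeeping described above, where loss control only supplies the attenuation that the echo canceller does not already provide, can be sketched as follows. The switching policy in `distribute_loss`, the function names and the example values of 40 dB and 25 dB are hypothetical choices for illustration; the text does not prescribe a particular distribution rule.

```python
def required_loss_db(target_attenuation_db, canceller_attenuation_db):
    """Attenuation that loss control still has to insert (never negative)."""
    return max(0.0, target_attenuation_db - canceller_attenuation_db)

def distribute_loss(total_loss_db, far_end_active, near_end_active):
    """Split the loss between loudspeaker path and microphone path.

    Hypothetical policy: attenuate the path of the inactive talker and
    share the loss equally when both or neither talker is active.
    Returns (loudspeaker_path_db, microphone_path_db).
    """
    if far_end_active and not near_end_active:
        return 0.0, total_loss_db
    if near_end_active and not far_end_active:
        return total_loss_db, 0.0
    return total_loss_db / 2.0, total_loss_db / 2.0

missing = required_loss_db(40.0, 25.0)   # example values, not from a standard
print(missing)
print(distribute_loss(missing, far_end_active=True, near_end_active=False))
print(distribute_loss(missing, far_end_active=True, near_end_active=True))
```

The better the estimate of the canceller's current attenuation, the less loss has to be inserted, and the closer the conversation comes to full duplex.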
SYSTEM CONTROL

So far we have not discussed the importance of controlling hands-free telephone systems. In a realistic scenario the adaptive filter does not achieve more than  dB ERLE (echo return loss enhancement), and may achieve less if the processing power does not allow a large number of coefficients. A loss control is therefore required. Since the LEM system is time-variant, the adaptation has to be performed whenever possible to track system changes. Situations may occur, however (e.g. double talk, low SNR), where only small step sizes for the adaptation are permissible. An exact observation of the total system is therefore important for the overall performance.

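ERLE is commonly computed as the ratio, in dB, of the microphone signal power to the power of the error signal after echo cancellation. A minimal sketch is given below; the function name, the small constant `eps` and the synthetic test signals are assumptions for illustration, and the estimate is only meaningful during far-end single talk.

```python
import numpy as np

def erle_db(microphone, error, eps=1e-12):
    """Echo return loss enhancement in dB from two equally long segments.

    eps avoids division by zero and the logarithm of zero.
    """
    mic_power = np.mean(np.asarray(microphone) ** 2)
    err_power = np.mean(np.asarray(error) ** 2)
    return 10.0 * np.log10((mic_power + eps) / (err_power + eps))

rng = np.random.default_rng(5)
y = rng.standard_normal(8000)   # microphone signal containing only echo
e = 0.1 * y                     # residual echo with a tenth of the amplitude
print(erle_db(y, e))
```

Tracking this quantity over short segments gives the control logic one indicator of how far the adaptive filter has converged, and hence of how much loss still has to be inserted.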
IMPLEMENTATIONS

At this time, low-cost real-time processing means that the algorithms have to be implemented on fixed-point signal processors with only  bit wordlength. The step from mathematical notation to implementation is therefore not trivial. Restrictions on computational cost still limit the performance of the system. However, this problem may be resolved in the future.

QUALITY TESTING

Although international test standards are desirable, none has yet been defined. Besides transmission quality, in the case of echo cancellation the system should be judged with regard to conversation ability. For noise reduction, the naturalness of the outgoing speech signal is most important.

As an example of objective testing of echo cancellers, the composite source signal should be mentioned. It consists of three sections with different signal characteristics (tonal, random noise, silence), which enables both signal detection and adaptation. For testing the double talk ability, composite source signals with increasing and decreasing envelopes, respectively, are superimposed. However, it is not sufficient to judge echo cancelling and noise reduction merely on objective measures, since these cannot be brought into line with human perception. Subjective judgments by system users are therefore still necessary. These tests are based on the mean opinion score (MOS), which rates the signal on a scale from very bad to very good.

OUTLOOK

Voice communication systems with hands-free facilities are on the market. Double talk capability and noise reduction are provided, at least to a certain extent. Further improvements, however, are necessary. These may result from a better understanding of the problem and from more powerful yet affordable hardware. As far as the understanding of the problem is concerned, an improved joint control of the various algorithms comprising an echo and noise control procedure is most promising. Therefore, even after more than one decade of intensive research and development, the challenge still remains.

References

The new Bell telephone. Scientific American.

E. Hänsler: The hands-free telephone problem: An annotated bibliography update. Annales des Télécommunications.

E. Hänsler: The hands-free telephone problem: A second annotated bibliography update. Proc. of the Internat. Workshop on Acoustic Echo and Noise Control, Røros, Norway.

A. Gilloire, P. Scalart et al.: Innovative speech processing for mobile terminals: An annotated bibliography. COST workshop, Toulouse, France, July.

A. Gilloire: State of the art in acoustic echo cancellation. In A.R. Figueiras and D. Docampo (Eds.), Adaptive Algorithms: Applications and Non Classical Schemes. Universidad de Vigo.

A. Gilloire, E. Moulines, D. Slock and P. Duhamel: State of the art in acoustic echo cancellation. In A.R. Figueiras-Vidal (Ed.), Digital Signal Processing in Telecommunications. Springer, London, U.K.

T. Pétillon, A. Gilloire and S. Théodoridis: A comparative study of efficient transversal algorithms for acoustic echo cancellation. Proc. EUSIPCO, Brussels, Belgium.

R. Martin: Algorithms for hands-free voice communication in noisy environments. Proc. Aachener Kolloquium Signaltheorie, Institut für Technische Elektronik der RWTH Aachen.

International Telecommunication Union: General characteristics of international telephone connections and international telephone circuits: acoustic echo controllers. ITU-T Recommendation G.

S. Gay and S. Tavathia: The fast affine projection algorithm. Proc. ICASSP.

S. Haykin: Adaptive Filter Theory, 2nd ed. Prentice-Hall International, New Jersey.

J. Cioffi and T. Kailath: Fast RLS Transversal Filter for adaptive filtering. IEEE Trans. on ASSP.

G.V. Moustakides and S. Theodoridis: Fast Newton Transversal Filters: A New Class of Adaptive Estimation Algorithms. IEEE Trans. on Signal Processing.