`
`
FIGURE 11-1-3  Closed-loop control system representation of the recursive equation in (11-1-20).
`
where I is the identity matrix, Γ is the autocorrelation matrix of the received signal, C_k is the (2K + 1)-dimensional vector of equalizer tap gains, and ξ is the vector of cross-correlations given by (10-2-45). The recursive relation in (11-1-20) can be represented as a closed-loop control system, as shown in Fig. 11-1-3. Unfortunately, the set of 2K + 1 first-order difference equations in (11-1-20) is coupled through the autocorrelation matrix Γ. In order to solve these equations and, thus, establish the convergence properties of the recursive algorithm, it is mathematically convenient to decouple the equations by performing a linear transformation. The appropriate transformation is obtained by noting that the matrix Γ is Hermitian and, hence, can be represented as
`
Γ = U Λ U^H   (11-1-21)

where U is the normalized modal matrix of Γ and Λ is a diagonal matrix with diagonal elements equal to the eigenvalues of Γ.
When (11-1-21) is substituted into (11-1-20) and if we define the transformed (orthogonalized) vectors C'_k = U^H C_k and ξ' = U^H ξ, we obtain

C'_{k+1} = (I - ΔΛ) C'_k + Δξ'   (11-1-22)
This set of first-order difference equations is now decoupled. Their convergence is determined from the homogeneous equation

C'_{k+1} = (I - ΔΛ) C'_k   (11-1-23)
We see that the recursive relation will converge provided that all the poles lie inside the unit circle, i.e.,

|1 - Δλ_k| < 1,   k = -K, ..., -1, 0, 1, ..., K   (11-1-24)

where {λ_k} is the set of 2K + 1 (possibly nondistinct) eigenvalues of Γ. Since Γ is an autocorrelation matrix, it is positive-definite and, hence, λ_k > 0 for all k. Consequently, convergence of the recursive relation in (11-1-22) is ensured if Δ satisfies the inequality
`
0 < Δ < 2/λ_max   (11-1-25)

where λ_max is the largest eigenvalue of Γ.
`Since the largest eigenvalue of a positive-definite matrix is less than the sum
`
`
of all the eigenvalues of the matrix and, furthermore, since the sum of the eigenvalues of a matrix is equal to its trace, we have the following simple upper bound on λ_max:

λ_max < Σ_{k=-K}^{K} λ_k = tr Γ = (2K + 1)(x_0 + N_0)   (11-1-26)
`
From (11-1-23) and (11-1-24), we observe that rapid convergence occurs when |1 - Δλ_k| is small, i.e., when the pole positions are far from the unit circle. But we cannot achieve this desirable condition and still satisfy (11-1-25) if there is a large difference between the largest and smallest eigenvalues of Γ. In other words, even if we select Δ to be near the upper bound given in (11-1-25), the convergence rate of the recursive MSE algorithm is determined by the smallest eigenvalue λ_min. Consequently, the ratio λ_max/λ_min ultimately determines the convergence rate. If λ_max/λ_min is small, Δ can be selected so as to achieve rapid convergence. However, if the ratio λ_max/λ_min is large, as is the case when the channel frequency response has deep spectral nulls, the convergence rate of the algorithm will be slow.
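The dependence of these bounds on the channel can be examined numerically. The short Python sketch below is only an illustration, not material from the text: the three-tap channel, the noise variance, and the assumption that Γ has elements of the form Γ_lk = x_{l-k} + N_0 δ_lk (with x_n the channel autocorrelation) are all stand-in choices. It forms the (2K + 1) × (2K + 1) matrix Γ and compares the eigenvalue spread λ_max/λ_min with the step-size bounds implied by (11-1-25) and (11-1-26).

```python
import numpy as np

# Illustrative (hypothetical) channel and noise level; not taken from the text.
f = np.array([0.26, 0.93, 0.26])   # equivalent discrete-time channel taps
N0 = 0.01                          # additive noise variance
K = 5                              # equalizer has 2K + 1 = 11 taps

# Channel autocorrelation x_n = sum_k f_k f_{k+n}; zero beyond the channel span.
L = len(f) - 1
x = np.array([np.dot(f[:len(f) - n], f[n:]) for n in range(L + 1)])

# Assumed form Gamma_{lk} = x_{l-k} + N0*delta_{lk} (Hermitian Toeplitz).
N = 2 * K + 1
Gamma = N0 * np.eye(N)
for l in range(N):
    for k in range(N):
        if abs(l - k) <= L:
            Gamma[l, k] += x[abs(l - k)]

lam = np.linalg.eigvalsh(Gamma)
print("eigenvalue spread       :", lam.max() / lam.min())   # roughly 11 for these numbers
print("bound 2/lambda_max      :", 2.0 / lam.max())          # eq. (11-1-25)
print("looser bound 2/tr(Gamma):", 2.0 / np.trace(Gamma))    # from eq. (11-1-26), about 0.18
```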
`
11-1-4 Excess MSE Due to Noisy Gradient Estimates

The recursive algorithm in (11-1-11) for adjusting the coefficients of the linear equalizer employs unbiased noisy estimates of the gradient vector. The noise in these estimates causes random fluctuations in the coefficients about their optimal values and, thus, leads to an increase in the MSE at the output of the equalizer. That is, the final MSE is J_min + J_Δ, where J_Δ is the variance of the measurement noise. The term J_Δ due to the estimation noise has been termed the excess mean-square error by Widrow (1966).
The total MSE at the output of the equalizer for any set of coefficients C can be expressed as

J = J_min + (C - C_opt)^H Γ (C - C_opt)   (11-1-27)
where C_opt represents the optimum coefficients, which satisfy (11-1-6). This expression for the MSE can be simplified by performing the linear orthogonal transformation used above to establish convergence. The result of this transformation applied to (11-1-27) is

J = J_min + Σ_{k=-K}^{K} λ_k |c'_k - c'_{k,opt}|²   (11-1-28)

where the {c'_k} are the set of transformed equalizer coefficients. The excess MSE is the expected value of the second term in (11-1-28), i.e.,

J_Δ = Σ_{k=-K}^{K} λ_k E|c'_k - c'_{k,opt}|²   (11-1-29)
`
`
It has been shown by Widrow (1970, 1975) that the excess MSE is

J_Δ = Δ² J_min Σ_{k=-K}^{K} λ_k² / [1 - (1 - Δλ_k)²]   (11-1-30)
`
The expression in (11-1-30) can be simplified when Δ is selected such that Δλ_k ≪ 1 for all k. Then

J_Δ ≈ ½ Δ J_min Σ_{k=-K}^{K} λ_k = ½ Δ J_min tr Γ
    ≈ ½ Δ (2K + 1) J_min (x_0 + N_0)   (11-1-31)
`
Note that x_0 + N_0 represents the received signal plus noise power.
It is desirable to have J_Δ < J_min. That is, Δ should be selected such that

J_Δ/J_min = ½ Δ (2K + 1)(x_0 + N_0) < 1

or, equivalently,

Δ < 2/[(2K + 1)(x_0 + N_0)]   (11-1-32)

For example, if Δ is selected as

Δ = 0.2/[(2K + 1)(x_0 + N_0)]   (11-1-33)

the degradation in the output SNR of the equalizer due to the excess MSE is less than 1 dB.
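As a quick numerical check of this statement: with Δ chosen as in (11-1-33), the approximation (11-1-31) gives J_Δ = 0.1 J_min, and the corresponding SNR degradation is 10 log10(1 + J_Δ/J_min). In Python:

```python
import math

# With Delta = 0.2 / ((2K + 1)(x0 + N0)), eq. (11-1-31) gives J_Delta / J_min = 0.1,
# so the output SNR of the equalizer is degraded by 10*log10(1 + 0.1).
print(10 * math.log10(1 + 0.1))   # about 0.41 dB, i.e., less than 1 dB
```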
The analysis given above on the excess mean square error is based on the assumption that the mean value of the equalizer coefficients has converged to the optimum value C_opt. Under this condition, the step size Δ should satisfy the bound in (11-1-32). On the other hand, we have determined that convergence of the mean coefficient vector requires that Δ < 2/λ_max. While a choice of Δ near the upper bound 2/λ_max may lead to initial convergence of the deterministic (known) steepest-descent gradient algorithm, such a large value of Δ will usually result in instability of the LMS stochastic gradient algorithm.
The initial convergence or transient behavior of the LMS algorithm has been investigated by several researchers. Their results clearly indicate that the step size must be reduced in direct proportion to the length of the equalizer, as specified by (11-1-32). Hence, the upper bound given by (11-1-32) is also necessary to ensure the initial convergence of the LMS algorithm. The papers by Gitlin and Weinstein (1979) and Ungerboeck (1972) contain analyses of the transient behavior and the convergence properties of the LMS algorithm.
`
`
FIGURE 11-1-4  Initial convergence characteristics of the LMS algorithm with different step sizes. [From Digital Signal Processing, by J. G. Proakis and D. G. Manolakis, 1988, Macmillan Publishing Company. Reprinted with permission of the publisher.]
`
The following example serves to reinforce the important points made above regarding the initial convergence of the LMS algorithm.
`
Example 11-1-1

The LMS algorithm was used to adaptively equalize a communication channel for which the autocorrelation matrix Γ has an eigenvalue spread of λ_max/λ_min = 11. The number of taps selected for the equalizer was 2K + 1 = 11. The input signal plus noise power x_0 + N_0 was normalized to unity. Hence, the upper bound on Δ given by (11-1-32) is 0.18. Figure 11-1-4 illustrates the initial convergence characteristics of the LMS algorithm for Δ = 0.045, 0.09, and 0.115, obtained by averaging the (estimated) MSE over 200 simulations. We observe that by selecting Δ = 0.09 (one-half of the upper bound) we obtain relatively fast initial convergence. If we divide Δ by a factor of 2 to Δ = 0.045, the convergence rate is reduced but the excess mean square error is also reduced, so that the LMS algorithm performs better in steady state (in a time-invariant signal environment). Finally, we note that a choice of Δ = 0.115, which is still far below the upper bound, causes large undesirable fluctuations in the output MSE of the algorithm.
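A simulation in the spirit of this example can be sketched in a few lines. The Python fragment below is illustrative only: it assumes BPSK training symbols, a hypothetical three-tap channel whose eigenvalue spread is roughly 11, a signal-plus-noise power close to unity, and a hand-chosen delay alignment; the channel actually used to generate Fig. 11-1-4 is not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def lms_learning_curve(delta, n_iter=500, n_runs=200, K=5):
    """Estimated MSE of an LMS-adapted linear equalizer, averaged over n_runs trials."""
    f = np.array([0.26, 0.93, 0.26])       # hypothetical channel (eigenvalue spread ~ 11)
    N0 = 0.01                              # x0 + N0 is then approximately unity
    mse = np.zeros(n_iter)
    for _ in range(n_runs):
        I = rng.choice([-1.0, 1.0], size=n_iter + 2 * K + len(f))    # BPSK training symbols
        v = np.convolve(I, f) + np.sqrt(N0) * rng.standard_normal(len(I) + len(f) - 1)
        c = np.zeros(2 * K + 1)            # equalizer taps, initialized to zero
        for k in range(n_iter):
            Vk = v[k:k + 2 * K + 1][::-1]  # contents of the tap-delay line at time k
            Ik = I[k + K - 1]              # desired symbol (allows for the one-sample channel delay)
            e = Ik - c @ Vk                # error signal
            c += delta * e * Vk            # LMS coefficient update
            mse[k] += e ** 2
    return mse / n_runs

for delta in (0.045, 0.09, 0.115):
    print(delta, lms_learning_curve(delta)[-50:].mean())   # rough steady-state MSE
```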
`
In a digital implementation of the LMS algorithm, the choice of the step-size parameter becomes even more critical. In an attempt to reduce the excess mean square error, it is possible to reduce the step-size parameter to the point where the total mean square error actually increases. This condition occurs when the estimated gradient components of the vector ε_k V*_k, after multiplication by the small step-size parameter Δ, are smaller than one-half of the least significant bit in the fixed-point representation of the equalizer coefficients. In such a case, adaptation ceases. Consequently, it is important for the step size to be large enough to bring the equalizer coefficients into the vicinity of C_opt. If it is desired to decrease the step size significantly, it is necessary to increase the precision in the equalizer coefficients. Typically, 16
`
`
bits of precision may be used for the coefficients, with about 10-12 of the most significant bits used for arithmetic operations in the equalization of the data. The remaining least-significant bits are required to provide the necessary precision for the adaptation process. Thus, the scaled, estimated gradient components Δε_k V*_k usually affect only the least-significant bits in any one iteration. In effect, the added precision also allows for the noise to be averaged out, since many incremental changes in the least-significant bits are required before any change occurs in the upper, more significant bits used in arithmetic operations for equalizing the data. For an analysis of roundoff errors in a digital implementation of the LMS algorithm, the reader is referred to the papers by Gitlin and Weinstein (1979), Gitlin et al. (1982), and Caraiscos and Liu (1984).
As a final point, we should indicate that the LMS algorithm is appropriate for tracking slowly time-variant signal statistics. In such a case, the minimum MSE and the optimum coefficient vector will be time-variant. In other words, J_min(t) is a function of time and the (2K + 1)-dimensional error surface is moving with the time index n. The LMS algorithm attempts to follow the moving minimum J_min(t) in the (2K + 1)-dimensional space, but it is always lagging behind due to its use of (estimated) gradient vectors. As a consequence, the LMS algorithm incurs another form of error, called the lag error, whose mean square value decreases with an increase in the step size Δ. The total MSE can now be expressed as

J_total = J_min(t) + J_Δ + J_l

where J_l denotes the mean square error due to the lag.
In any given nonstationary adaptive equalization problem, if we plot the errors J_Δ and J_l as a function of Δ, we expect these errors to behave as illustrated in Fig. 11-1-5. We observe that J_Δ increases with an increase in Δ, while J_l decreases with an increase in Δ. The total error will exhibit a minimum, which will determine the optimum choice of the step-size parameter. When the statistical time variations of the signal occur rapidly, the lag error
`
FIGURE 11-1-5  Excess mean square error J_Δ and lag error J_l as a function of the step size. [From Digital Signal Processing, by J. G. Proakis and D. G. Manolakis, 1988, Macmillan Publishing Company. Reprinted with permission of the publisher.]
`
`
FIGURE 11-1-6  QAM signal demodulation.
`
will dominate the performance of the adaptive equalizer. In such a case, J_l ≫ J_min + J_Δ, even when the largest possible value of Δ is used. When this condition occurs, the LMS algorithm is inappropriate for the application, and one must rely on the more complex recursive least-squares algorithms described in Section 11-4 to obtain faster convergence and tracking.
`
11-1-5 Baseband and Passband Linear Equalizers

Our treatment of adaptive linear equalizers has been in terms of equivalent lowpass signals. However, in a practical implementation, the linear adaptive equalizer shown in Fig. 11-1-2 can be realized either at baseband or at bandpass. For example, Fig. 11-1-6 illustrates the demodulation of QAM (or multiphase PSK) by first translating the signal to baseband and equalizing the baseband signal with an equalizer having complex-valued coefficients. In effect, the complex equalizer with complex-valued (in-phase and quadrature components) input is equivalent to four parallel equalizers with real-valued tap coefficients, as shown in Fig. 11-1-7.
FIGURE 11-1-7  Complex-valued baseband equalizer for QAM signals.

FIGURE 11-1-8  QAM or PSK signal equalization at passband.

As an alternative, we may equalize the signal at passband. This is accomplished as shown in Fig. 11-1-8 for a two-dimensional signal constellation such as QAM and PSK. The received signal is filtered and, in parallel, it is passed through a Hilbert transformer, called a phase-splitting filter. Thus, we have the equivalent of in-phase and quadrature components at passband, which are fed to a passband complex equalizer. Following the equalization, the signal is down-converted to baseband and detected. The error signal generated for the purpose of adjusting the equalizer coefficients is formed at baseband and frequency-translated to passband, as illustrated in Fig. 11-1-8.
`
11-2 ADAPTIVE DECISION-FEEDBACK EQUALIZER

As in the case of the linear adaptive equalizer, the coefficients of the feedforward filter and the feedback filter in a decision-feedback equalizer may be adjusted recursively, instead of inverting a matrix as implied by (10-3-3). Based on the minimization of the MSE at the output of the DFE, the steepest-descent algorithm takes the form

C_{k+1} = C_k + Δ E(ε_k V*_k)   (11-2-1)
`
where C_k is the vector of equalizer coefficients in the kth signal interval, E(ε_k V*_k) is the cross-correlation of the error signal ε_k = I_k - Î_k with V_k, and V_k = [v_{k+K_1} ... v_k Ĩ_{k-1} ... Ĩ_{k-K_2}]' represents the signal values in the feedforward and feedback filters at time t = kT. The MSE is minimized when the cross-correlation vector E(ε_k V*_k) = 0 as k → ∞.
Since the exact cross-correlation vector is unknown at any time instant, we use as an estimate the vector ε_k V*_k and average out the noise in the estimate through the recursive equation

C_{k+1} = C_k + Δ ε_k V*_k   (11-2-2)

This is the LMS algorithm for the DFE.
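As a concrete illustration, one iteration of the recursion in (11-2-2) might be coded as follows. This is a minimal Python sketch with illustrative names; the tap convention (output formed as C'V, conjugated data vector in the update) is an assumption chosen to mirror the equations above.

```python
import numpy as np

def dfe_lms_step(c, v_ff, I_fb, I_k, delta):
    """One LMS update of a decision-feedback equalizer, in the spirit of (11-2-2).

    c     : coefficient vector C_k (feedforward taps followed by feedback taps)
    v_ff  : received samples in the feedforward filter, [v_{k+K1}, ..., v_k]
    I_fb  : previously detected symbols in the feedback filter, [I_{k-1}, ..., I_{k-K2}]
    I_k   : desired symbol (training symbol, or detector decision in adaptive mode)
    """
    Vk = np.concatenate([v_ff, I_fb])       # composite data vector V_k
    I_hat = c @ Vk                          # equalizer output
    eps = I_k - I_hat                       # error signal
    c_new = c + delta * eps * np.conj(Vk)   # C_{k+1} = C_k + Delta * eps_k * V_k^*
    return c_new, I_hat, eps
```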
`
`
FIGURE 11-2-1  Decision-feedback equalizer.
`
As in the case of a linear equalizer, we may use a training sequence to adjust the coefficients of the DFE initially. Upon convergence to the (near-)optimum coefficients (minimum MSE), we may switch to a decision-directed mode in which the decisions at the output of the detector are used in forming the error signal and are fed to the feedback filter. This is the adaptive mode of the DFE, which is illustrated in Fig. 11-2-1. In this case, the recursive equation for adjusting the equalizer coefficients is

C_{k+1} = C_k + Δ ε̃_k Ṽ*_k   (11-2-3)

where ε̃_k = Ĩ_k - Î_k and Ṽ_k = [v_{k+K_1} ... v_k Ĩ_{k-1} ... Ĩ_{k-K_2}]'.
`The performance characteristics of the LMS algorithm for the DFE are
`basically the same as the development given in Sections 11-1-3 and 11-1-4 for
`the linear adaptive equalizer.
`
11-2-1 Adaptive Equalization of Trellis-Coded Signals

Bandwidth-efficient trellis-coded modulation, described in Section 8-3, is frequently used in digital communications over telephone channels to reduce the required SNR per bit for achieving a specified error rate. Channel distortion of the trellis-coded signal forces us to use adaptive equalization in order to reduce the intersymbol interference. The output of the equalizer is then fed to the Viterbi decoder, which performs soft-decision decoding of the trellis-coded signal.
`
`
FIGURE 11-2-2  Adjustment of equalizer based on tentative decisions.
`
The question that arises regarding such a receiver is: how do we adapt the equalizer in a data transmission mode? One possibility is to have the equalizer make its own decisions at its output solely for the purpose of generating an error signal for adjusting its tap coefficients, as shown in the block diagram in Fig. 11-2-2. The problem with this approach is that such decisions are generally unreliable, since the pre-decoding coded symbol SNR is relatively low. A high error rate would cause a significant degradation in the operation of the equalizer, which would ultimately affect the reliability of the decisions at the output of the decoder. The more desirable alternative is to use the post-decoding decisions from the Viterbi decoder, which are much more reliable, to continuously adapt the equalizer. This approach is certainly preferable and viable when a linear equalizer is used prior to the Viterbi decoder. The decoding delay inherent in the Viterbi decoder can be overcome by introducing an identical delay in the tap weight adjustment of the equalizer coefficients, as shown in Fig. 11-2-3. The major price that must be paid for the added delay is that the step-size parameter in the LMS algorithm must be reduced, as described by Long et al. (1987, 1989), in order to achieve stability in the algorithm.
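One way to picture this delayed adaptation is the fragment below (Python, illustrative names only, and not a complete receiver): the equalizer buffers its data vectors, and when the Viterbi decoder releases its decision for the symbol transmitted D intervals earlier, the tap update is made with that delayed pair. As noted above, the step size generally has to be smaller than in the undelayed case.

```python
import numpy as np

def delayed_lms_step(c, V_delayed, I_decoded, delta):
    """LMS tap update driven by a post-decoding decision that arrives D symbols late.

    V_delayed : equalizer data vector that was in the delay line D intervals ago
    I_decoded : Viterbi-decoder decision for that same (delayed) symbol interval
    """
    eps = I_decoded - c @ V_delayed              # error formed from the reliable, delayed decision
    return c + delta * eps * np.conj(V_delayed)  # same LMS form, applied D intervals late
```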
In channels with one or more in-band spectral nulls, the linear equalizer is no longer adequate for compensating the channel intersymbol interference.
FIGURE 11-2-3  Adjustment of equalizer based on decisions from the Viterbi decoder.
`
`
FIGURE 11-2-4  Use of predictive DFE with interleaving and trellis-coded modulation.
`
Instead, we should like to use a DFE. But the DFE requires reliable decisions in its feedback filter in order to cancel out the intersymbol interference from previously detected symbols. Tentative decisions prior to decoding would be highly unreliable and, hence, inappropriate. Unfortunately, the conventional DFE cannot be cascaded with the Viterbi algorithm in which post-decoding decisions from the decoder are fed back to the DFE.
One alternative is to use the predictive DFE described in Section 10-3-3. In order to accommodate the decoding delay as it affects the linear predictor, we introduce a periodic interleaver/deinterleaver pair that has the same delay as the Viterbi decoder and, thus, makes it possible to generate the appropriate error signal for the predictor, as illustrated in the block diagram of Fig. 11-2-4. The novel way in which a predictive DFE can be combined with Viterbi decoding to equalize trellis-coded signals is described and analyzed by Eyuboglu (1988). This same idea has been carried over to the equalization of fading multipath channels by Zhou et al. (1988, 1990), but the structure of the DFE was modified to use recursive least-squares lattice-type filters, which provide faster adaptation to the time variations encountered in the channel.
`
11-3 AN ADAPTIVE CHANNEL ESTIMATOR FOR ML SEQUENCE DETECTION

The ML sequence detection criterion implemented via the Viterbi algorithm, as embodied in the metric computation given by (10-1-23), and the probabilistic symbol-by-symbol detection algorithm described in Section 5-1-5 require knowledge of the equivalent discrete-time channel coefficients {f_k}. To accommodate a channel that is unknown or slowly time-varying, one may include a
`
FIGURE 11-3-1  Block diagram of method for estimating the channel characteristics for the Viterbi algorithm.
`
channel estimator connected in parallel with the detection algorithm, as shown in Fig. 11-3-1. The channel estimator, which is shown in Fig. 11-3-2, is identical in structure to the linear transversal equalizer discussed previously in Section 11-1. In fact, the channel estimator is a replica of the equivalent discrete-time channel filter that models the intersymbol interference. The estimated tap coefficients, denoted by {f̂_k}, are adjusted recursively to minimize the MSE between the actual received sequence and the output of the estimator. For example, the steepest-descent algorithm in a decision-directed mode of operation is

f̂_{k+1} = f̂_k + Δ ε_k Ĩ*_k   (11-3-1)

where f̂_k is the vector of tap gain coefficients at the kth iteration, Δ is the step size, ε_k = v_k - v̂_k is the error signal, and Ĩ_k denotes the vector of detected information symbols in the channel estimator at the kth iteration.
We now show that when the MSE between v_k and v̂_k is minimized, the resulting values of the tap gain coefficients of the channel estimator are the values of the discrete-time channel model. For mathematical tractability, we assume that the detected information sequence {Ĩ_k} is correct, i.e., that {Ĩ_k} is identical to the transmitted sequence {I_k}.

FIGURE 11-3-2  Adaptive transversal filter for estimating the channel dispersion.
`
`
This is a reasonable assumption when the system is operating at a low probability of error. Thus, the MSE between the received signal v_k and the estimate v̂_k is

J(f̂) = E |v_k - Σ_{j=0}^{N-1} f̂_j I_{k-j}|²   (11-3-2)
The tap coefficients {f̂_j} that minimize J(f̂) in (11-3-2) satisfy the set of N linear equations

Σ_{j=0}^{N-1} f̂_j φ_{kj} = d_k,   k = 0, 1, ..., N - 1   (11-3-3)

where

φ_{kj} = E(I_{n-k} I*_{n-j}),   d_k = Σ_{j=0}^{N-1} f_j φ_{kj}   (11-3-4)
`
From (11-3-3) and (11-3-4), we conclude that, as long as the information sequence {I_n} is uncorrelated, the optimum coefficients are exactly equal to the respective values of the equivalent discrete-time channel. It is also apparent that when the number of taps N in the channel estimator is greater than or equal to L + 1, the optimum tap gain coefficients {f̂_k} are equal to the respective values of the {f_k}, even when the information sequence is correlated. Subject to the above conditions, the minimum MSE is simply equal to the noise variance N_0.
In the above discussion, the estimated information sequence at the output of the Viterbi algorithm or the probabilistic symbol-by-symbol algorithm was used in making adjustments of the channel estimator. For start-up operation, one may send a short training sequence to perform the initial adjustment of the tap coefficients, as is usually done in the case of the linear transversal equalizer. In an adaptive mode of operation, the receiver simply uses its own decisions to form an error signal.
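A minimal sketch of the decision-directed update in (11-3-1) is shown below (Python; the function and variable names are illustrative assumptions). The estimator convolves the N most recent detected symbols with the current tap estimates, forms the error against the received sample, and applies the correction.

```python
import numpy as np

def channel_estimator_step(f_hat, I_recent, v_k, delta):
    """One update of the adaptive channel estimator, in the spirit of (11-3-1).

    f_hat    : current estimates of the discrete-time channel taps (length N)
    I_recent : N most recent detected (or training) symbols, [I_k, I_{k-1}, ..., I_{k-N+1}]
    v_k      : received sample at time k
    """
    v_hat = f_hat @ I_recent                          # estimator output (channel replica)
    eps = v_k - v_hat                                 # error signal
    f_hat = f_hat + delta * eps * np.conj(I_recent)   # tap update with conjugated data vector
    return f_hat, eps
```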
`
11-4 RECURSIVE LEAST-SQUARES ALGORITHMS FOR ADAPTIVE EQUALIZATION

The LMS algorithm that we described in Sections 11-1 and 11-2 for adaptively adjusting the tap coefficients of a linear equalizer or a DFE is basically a (stochastic) steepest-descent algorithm in which the true gradient vector is approximated by an estimate obtained directly from the data.
The major advantage of the steepest-descent algorithm lies in its computational simplicity. However, the price paid for the simplicity is slow convergence, especially when the channel characteristics result in an autocorrelation matrix Γ whose eigenvalues have a large spread, i.e., λ_max/λ_min ≫ 1. Viewed in another way, the gradient algorithm has only a single adjustable parameter for
`
controlling the convergence rate, namely, the parameter Δ. Consequently, the slow convergence is due to this fundamental limitation.
In order to obtain faster convergence, it is necessary to devise more complex algorithms involving additional parameters. In particular, if the matrix Γ is N × N and has eigenvalues λ_1, λ_2, ..., λ_N, we may use an algorithm that contains N parameters, one for each of the eigenvalues. The optimum selection of these parameters to achieve rapid convergence is a topic of this section.
In deriving faster converging algorithms, we shall adopt a least-squares approach. Thus, we shall deal directly with the received data in minimizing the quadratic performance index, whereas previously we minimized the expected value of the squared error. Put simply, this means that the performance index is expressed in terms of a time average instead of a statistical average.
It is convenient to express the recursive least-squares algorithms in matrix form. Hence, we shall define a number of vectors and matrices that are needed in this development. In so doing, we shall change the notation slightly. Specifically, the estimate of the information symbol at time t, where t is an integer, from a linear equalizer is now expressed as

Î(t) = Σ_{j=-K}^{K} c_j(t - 1) v_{t-j}

By changing the index j on c_j(t - 1) to run from j = 0 to j = N - 1 and simultaneously defining

y(t) ≡ v_{t+K}

the estimate Î(t) becomes

Î(t) = Σ_{j=0}^{N-1} c_j(t - 1) y(t - j) = C'_N(t - 1) Y_N(t)   (11-4-1)

where C_N(t - 1) and Y_N(t) are, respectively, the column vectors of the equalizer coefficients c_j(t - 1), j = 0, 1, ..., N - 1, and the input signals y(t - j), j = 0, 1, 2, ..., N - 1.
Similarly, in the decision-feedback equalizer, we have tap coefficients c_j(t), j = 0, 1, ..., N - 1, where the first K_1 + 1 are the coefficients of the feedforward filter and the remaining K_2 = N - K_1 - 1 are the coefficients of the feedback filter. The data in the estimate Î(t) are v_{t+K_1}, ..., v_t, Ĩ_{t-1}, ..., Ĩ_{t-K_2}, where Ĩ_{t-j}, 1 ≤ j ≤ K_2, denote the decisions on previously detected symbols. In this development, we neglect the effect of decision errors in the algorithms. Hence, we assume that Ĩ_{t-j} = I_{t-j}, 1 ≤ j ≤ K_2. For notational convenience, we also define

y(t - j) = v_{t+K_1-j}   (0 ≤ j ≤ K_1)
y(t - j) = I_{t+K_1-j}   (K_1 < j ≤ N - 1)   (11-4-2)
`
`
Thus,

Y_N(t) = [y(t)  y(t - 1)  ...  y(t - N + 1)]'
       = [v_{t+K_1}  ...  v_t  I_{t-1}  ...  I_{t-K_2}]'   (11-4-3)
`
11-4-1 Recursive Least-Squares (Kalman) Algorithm

The recursive least-squares (RLS) estimation of I(t) may be formulated as follows. Suppose we have observed the vectors Y_N(n), n = 0, 1, ..., t, and we wish to determine the coefficient vector C_N(t) of the equalizer (linear or decision-feedback) that minimizes the time-average weighted squared error

ℰ_N = Σ_{n=0}^{t} w^{t-n} |e_N(n, t)|²   (11-4-4)
`
where the error is defined as

e_N(n, t) = I(n) - C'_N(t) Y_N(n)   (11-4-5)

and w represents a weighting factor 0 < w < 1. Thus, we introduce exponential weighting into past data, which is appropriate when the channel characteristics are time-variant. Minimization of ℰ_N with respect to the coefficient vector C_N(t) yields the set of linear equations
`
R_N(t) C_N(t) = D_N(t)   (11-4-6)

where R_N(t) is the signal correlation matrix defined as

R_N(t) = Σ_{n=0}^{t} w^{t-n} Y*_N(n) Y'_N(n)   (11-4-7)

and D_N(t) is the cross-correlation vector

D_N(t) = Σ_{n=0}^{t} w^{t-n} I(n) Y*_N(n)   (11-4-8)

The solution of (11-4-6) is

C_N(t) = R_N^{-1}(t) D_N(t)   (11-4-9)
`
The matrix R_N(t) is akin to the statistical autocorrelation matrix Γ, while the vector D_N(t) is akin to the cross-correlation vector ξ, defined previously. We emphasize, however, that R_N(t) is not a Toeplitz matrix. We also should mention that, for small values of t, R_N(t) may be ill-conditioned; hence, it is customary initially to add the matrix δI_N to R_N(t), where δ is a small positive
`
`
constant and I_N is the identity matrix. With exponential weighting into the past, the effect of adding δI_N dissipates with time.
Now suppose we have the solution (11-4-9) for time t - 1, i.e., C_N(t - 1), and we wish to compute C_N(t). It is inefficient and, hence, impractical to solve the set of N linear equations for each new signal component that is received. To avoid this, we proceed as follows. First, R_N(t) may be computed recursively as

R_N(t) = w R_N(t - 1) + Y*_N(t) Y'_N(t)   (11-4-10)
`
We call (11-4-10) the time-update equation for R_N(t).
Since the inverse of R_N(t) is needed in (11-4-9), we use the matrix-inverse identity

R_N^{-1}(t) = (1/w) [ R_N^{-1}(t - 1) - R_N^{-1}(t - 1) Y*_N(t) Y'_N(t) R_N^{-1}(t - 1) / (w + Y'_N(t) R_N^{-1}(t - 1) Y*_N(t)) ]   (11-4-11)

Thus, R_N^{-1}(t) may be computed recursively according to (11-4-11).
For convenience, we define P_N(t) = R_N^{-1}(t). It is also convenient to define an N-dimensional vector, called the Kalman gain vector, as

K_N(t) = [1/(w + μ_N(t))] P_N(t - 1) Y*_N(t)   (11-4-12)

where μ_N(t) is a scalar defined as

μ_N(t) = Y'_N(t) P_N(t - 1) Y*_N(t)   (11-4-13)
`
With these definitions, (11-4-11) becomes

P_N(t) = (1/w) [P_N(t - 1) - K_N(t) Y'_N(t) P_N(t - 1)]   (11-4-14)
`
Suppose we postmultiply both sides of (11-4-14) by Y*_N(t). Then

P_N(t) Y*_N(t) = (1/w) [P_N(t - 1) Y*_N(t) - K_N(t) Y'_N(t) P_N(t - 1) Y*_N(t)]
             = (1/w) {[w + μ_N(t)] K_N(t) - K_N(t) μ_N(t)}
             = K_N(t)   (11-4-15)

Therefore, the Kalman gain vector may also be defined as P_N(t) Y*_N(t).
Now we use the matrix inversion identity to derive an equation for obtaining C_N(t) from C_N(t - 1). Since

C_N(t) = P_N(t) D_N(t)

and

D_N(t) = w D_N(t - 1) + I(t) Y*_N(t)   (11-4-16)
`
`
we have

C_N(t) = (1/w) [P_N(t - 1) - K_N(t) Y'_N(t) P_N(t - 1)] [w D_N(t - 1) + I(t) Y*_N(t)]
       = P_N(t - 1) D_N(t - 1) + (1/w) I(t) P_N(t - 1) Y*_N(t)
         - K_N(t) Y'_N(t) P_N(t - 1) D_N(t - 1) - (1/w) I(t) K_N(t) Y'_N(t) P_N(t - 1) Y*_N(t)
       = C_N(t - 1) + K_N(t) [I(t) - Y'_N(t) C_N(t - 1)]   (11-4-17)
`
Note that Y'_N(t) C_N(t - 1) is the output of the equalizer at time t, i.e.,

Î(t) = Y'_N(t) C_N(t - 1)   (11-4-18)

and

e_N(t, t - 1) ≡ I(t) - Î(t) = e_N(t)   (11-4-19)

is the error between the desired symbol and the estimate. Hence, C_N(t) is updated recursively according to the relation

C_N(t) = C_N(t - 1) + K_N(t) e_N(t)   (11-4-20)
`
The residual MSE resulting from this optimization is

ℰ_{N,min} = Σ_{n=0}^{t} w^{t-n} |I(n)|² - C'_N(t) D*_N(t)   (11-4-21)
`
To summarize, suppose we have C_N(t - 1) and P_N(t - 1). When a new signal component is received, we have Y_N(t). Then the recursive computation for the time update of C_N(t) and P_N(t) proceeds as follows:

* compute output:
  Î(t) = Y'_N(t) C_N(t - 1)
* compute error:
  e_N(t) = I(t) - Î(t)
* compute Kalman gain vector:
  K_N(t) = P_N(t - 1) Y*_N(t) / [w + Y'_N(t) P_N(t - 1) Y*_N(t)]
* update inverse of the correlation matrix:
  P_N(t) = (1/w) [P_N(t - 1) - K_N(t) Y'_N(t) P_N(t - 1)]
* update coefficients:
  C_N(t) = C_N(t - 1) + K_N(t) e_N(t)
         = C_N(t - 1) + P_N(t) Y*_N(t) e_N(t)   (11-4-22)
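The five steps in (11-4-22) translate almost line for line into code. The sketch below (Python) carries P_N(t) explicitly and initializes it as (1/δ)I_N, consistent with the regularization δI_N mentioned earlier; the names, the training interface, and the data alignment are assumptions made for the illustration.

```python
import numpy as np

def rls_equalizer(y, d, N, w=0.99, delta_reg=1e-3):
    """Direct-form RLS (Kalman) adaptation of an N-tap equalizer, following (11-4-22).

    y : received samples (assumed aligned so that y[t-N+1:t+1] fills the delay line)
    d : desired symbols I(t), e.g. a training sequence
    w : exponential weighting factor, 0 < w <= 1
    """
    C = np.zeros(N, dtype=complex)              # equalizer coefficients C_N
    P = np.eye(N, dtype=complex) / delta_reg    # P_N = R_N^{-1}, initialized as (1/delta) I_N
    errors = []
    for t in range(N - 1, len(y)):
        Y = y[t - N + 1:t + 1][::-1]            # Y_N(t) = [y(t), y(t-1), ..., y(t-N+1)]'
        I_hat = C @ Y                           # output:  I_hat(t) = Y_N'(t) C_N(t-1)
        e = d[t] - I_hat                        # error:   e_N(t) = I(t) - I_hat(t)
        mu = Y @ (P @ np.conj(Y))               # scalar   mu_N(t) = Y_N'(t) P_N(t-1) Y_N*(t)
        K = (P @ np.conj(Y)) / (w + mu)         # Kalman gain vector K_N(t)
        P = (P - np.outer(K, Y) @ P) / w        # time update of P_N(t), eq. (11-4-14)
        C = C + K * e                           # C_N(t) = C_N(t-1) + K_N(t) e_N(t)
        errors.append(e)
    return C, np.array(errors)
```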
`
`
FIGURE 11-4-1  Comparison of convergence rate for the Kalman and gradient algorithms.
`
The algorithm described by (11-4-22) is called the RLS direct-form or Kalman algorithm. It is appropriate when the equalizer has a transversal (direct-form) structure.
Note that the equalizer coefficients change with time by an amount equal to the error e_N(t) multiplied by the Kalman gain vector K_N(t). Since K_N(t) is N-dimensional, each tap coefficient in effect is controlled by one of the elements of K_N(t). Consequently, rapid convergence is obtained. In contrast, the steepest-descent algorithm, expressed in our present notation, is

C_N(t) = C_N(t - 1) + Δ Y*_N(t) e_N(t)   (11-4-23)
and the only variable parameter is the step size Δ.
Figure 11-4-1 illustrates the initial convergence rate of these two algorithms for a channel with fixed parameters f_0 = 0.26, f_1 = 0.93, f_2 = 0.26, and a linear equalizer with 11 taps. The eigenvalue ratio for this channel is λ_max/λ_min = 11. All the equalizer coefficients were initialized to zero. The steepest-descent algorithm was implemented with Δ = 0.020. The superiority of the Kalman algorithm is clearly evident. This is especially important in tracking a time-variant channel. For example, the time variations in the characteristics of an (ionospheric) high-frequency (HF) radio channel are too rapid to be equalized by the gradient algorithm, but the Kalman algorithm adapts sufficiently rapidly to track such variations.
In spite of its superior tracking performance, the Kalman algorithm described above has two disadvantages. One is its complexity. The second is its sensitivity to roundoff noise that accumulates due to the recursive computations. The latter may cause instabilities in the algorithm.
The number of computations or operations (multiplications, divisions, and subtractions) in computing the variables in (11-4-22) is proportional to N². Most of these operations are involved in the updating of P_N(t). This part of the computation is also susceptible to roundoff noise. To remedy that problem, algorithms have been developed that avoid the computation of P_N(t) according to (11-4-14). The basis of these algorithms lies in the decomposition of P_N(t) in the form

P_N(t) = S_N(t) Λ_N(t) S'_N(t)   (11-4-24)
`
where S_N(t) is a lower-triangular matrix whose diagonal elements are unity, and Λ_N(t) is a diagonal matrix. Such a decomposition is called a square-root factorization (see Bierman, 1977). This factorization is described in Appendix D. In a square-root algorithm, P_N(t) is not updated as in (11-4-14), nor is it computed. Instead, the time updating is performed on S_N(t) and Λ_N(t).
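The factorization itself is easy to state in code. The short Python sketch below is not the Appendix D time-update algorithm; it only illustrates, for a Hermitian positive-definite P, the decomposition into a unit-lower-triangular factor and a diagonal factor that the square-root algorithms propagate (written here with a conjugate transpose, the natural form for complex-valued data).

```python
import numpy as np

def square_root_factor(P):
    """Factor a Hermitian positive-definite P as P = S @ diag(lam) @ S^H,
    with S unit-lower-triangular (an LDL-type square-root factorization)."""
    N = P.shape[0]
    S = np.eye(N, dtype=complex)
    lam = np.zeros(N, dtype=complex)
    for j in range(N):
        lam[j] = P[j, j] - (S[j, :j] * lam[:j]) @ np.conj(S[j, :j])
        for i in range(j + 1, N):
            S[i, j] = (P[i, j] - (S[i, :j] * lam[:j]) @ np.conj(S[j, :j])) / lam[j]
    return S, np.diag(lam)

# Check on a random Hermitian positive-definite matrix.
A = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)
P = A @ A.conj().T + 4 * np.eye(4)
S, Lam = square_root_factor(P)
print(np.allclose(S @ Lam @ S.conj().T, P))   # True
```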
Square-root algorithms are frequently used in control systems applications in which Kalman filtering is involved. In digital communications, the square-root Kalman algorithm has been implemented in a decision-feedback-equalized PSK modem designed to transmit at high speed over HF radio channels with a nominal 3 kHz bandwidth. This algorithm is described in the paper by Hsu (1982). It has a computational complexity of 1.5N² + 6.5N (complex-valued multiplications and divisions per output symbol). It is also numerically stable and exhi