`
`
`
`On Fast FIR Filters Implemented as Tail-Canceling
`IIR Filters
`
`Avery Wang, Julius O. Smith III
`
`Abstract
`
`We have developed an algorithm based on synthetic division for deriving the transfer function which cancels the tail of a
`given arbitrary rational (IIR) transfer function after a desired number of time steps. Our method applies to transfer functions
`with repeated poles, whereas previous methods of tail-subtraction cannot [ ]. We use a parallel state-variable technique with
`periodic refreshing to induce (cid:12)nite memory in order to prevent accumulation of quantization error in cases where the given transfer
`function has unstable modes. We present two methods for designing linear-phase Truncated IIR (TIIR) (cid:12)lters: one based on the
`addition and the other on the cascading of anti-phase (cid:12)lters. We explore (cid:12)nite-register e(cid:11)ects for unstable modes and provide
`bounds on the maximum TIIR (cid:12)lter length. In particular, we show that for unstable systems the available dynamic range of the
`registers must be three times that of the data. Considerable computational savings over conventional FIR (cid:12)lters are attainable for
`a given speci(cid:12)cation of linear-phase (cid:12)lter. We provide examples of (cid:12)lter design. We show how to generate (cid:12)nite-length polynomial
`impulse responses using TIIR (cid:12)lters. We list some applications of TIIR (cid:12)lters, including uses in digital audio and an algorithm
`for e(cid:14)ciently implementing Kay’s optimal high-resolution frequency estimator [].
`
`I. Introduction
`
`In(cid:12)nite impulse response (IIR) recursive linear digital (cid:12)lters are widely used because of their low computational cost
`and low storage overhead requirements. Finite impulse response (FIR) (cid:12)lters, on the other hand, allow the possibility
`of implementing linear-phase linear digital (cid:12)lters which have constant group delay across all frequencies. The tradeo(cid:11)
`is that to achieve similar magnitude transfer functions, FIR (cid:12)lters usually require much larger (cid:12)lter orders than their
`IIR counterparts. For example, a general N -th order FIR (cid:12)lter requires N + multiplies and N adds.
`In certain
`cases, however, FIR (cid:12)lters may be designed which have an operation count comparable to that of an IIR (cid:12)lter while
`maintaining the linear phase property.
`The set of (cid:12)nite impulse responses which may be e(cid:14)ciently implemented includes those which are truncated IIR
`(TIIR) sequences of low order. Although the following results were derived independently of Saram(cid:127)aki and Fam [ , ],
`the idea of using truncated IIR (cid:12)lters to generate linear-phase (cid:12)lters was originally introduced by Fam [ ]. Fam and
`Saram(cid:127)aki [ , ] deal with unstable hidden modes due to pole-zero cancellations outside of the unit circle by employing
`a switching and resetting algorithm to reduce the e(cid:11)ects of quantization error buildup. We introduce a slightly more
`e(cid:14)cient version of this idea, as well as an error analysis.
`In this paper, we describe an algorithm for the e(cid:14)cient implementation of certain classes of FIR (cid:12)lters. We introduce
`an extension of the TIIR algorithm which allows the truncation of arbitrary IIR (cid:12)lter tails, and provides a way to
`implement polynomial impulse responses. Additionally, we present an analysis of the e(cid:11)ects of limited numerical precision
`and provide design guidelines for designing systems with acceptable noise tolerance.
`
`A general causal N -th order FIR (cid:12)lter consists of a tapped delay line N elements long and a table of N + impulse
`response coe(cid:14)cients fh ; : : : ; hNg such that at each time step n the output
`
`II. Definitions
`
`( )
`
`()
`
`( )
`
`hkx[n (cid:0) k]
`
`NXk
`
`=
`
`y[n] =
`
`is formed. The transfer function has the simple form
`
`=
`
`h + h z(cid:0) + : : : + hN z(cid:0)N
`= z(cid:0)N C(z);
`
`HFIR(z)
`
`where C(z) is the N -th degree polynomial formed by the hk. On the other hand, a causal P -th order IIR (cid:12)lter has the
`relation
`
`()
`
`b‘x[n (cid:0) ‘]:
`
`PX‘
`
`=
`
`aky[n (cid:0) k] +
`
`PXk
`
`=
`
`y[n] =(cid:0)
`
`This work was supported in part by a National Science Foundation Graduate Fellowship
`
`Ex. 1033 / Page 1 of 17
`Apple v. Saint Lawrence
`
`
`
`
`
`IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. , NO. , PP. { , JUNE
`
`The corresponding transfer function is
`
`()
`
`()
`
`()
`
`()
`
`( )
`
`b + b z(cid:0) + : : : + bP z(cid:0)P
` +a z(cid:0) + : : : + aP z(cid:0)P
`B(z)
`A(z)
`
`h + h z(cid:0) + hz(cid:0) + : : : ;
`
`=
`
`=
`
`=
`
`HIIR(z)
`
`A(z) =z P + a zP (cid:0) + : : : + aP
`
`where both
`
`and
`
`B(z) =b zP + b zP (cid:0) + : : : + bP
`may be assumed to be relatively prime P -th degree polynomials in z, and A(z) is monic by construction.
`Group delay [] is de(cid:12)ned by
`
`d argfH(ej!)g
`(cid:0)
`d!
`The group delay at normalized frequency ! = (cid:25)f =fs, where fs is the sampling frequency, is the number of samples of
`delay experienced by the amplitude envelope of a narrow-band input signal centered at !.
`A linear-phase (cid:12)lter is one such that the phase response at a given frequency is a linear function of frequency, i.e.,
`
`:
`
`=
`
`(cid:14)(!)
`
`( )
`
`arg(cid:8)H(ej!)(cid:9) = K ! + K for some constants K and K. From this property we see immediately that the group delay
`
`is constant for all frequencies. Filters with linear phase response are often desirable because they have no frequency-
`dependent temporal distortion. A stable IIR (cid:12)lter with non-zero poles cannot have linear phase. However, an FIR (cid:12)lter
`with coe(cid:14)cients fh ; : : : ; hNg has linear phase if there exists a such that for all k f ; : : : ; Ng
`hN (cid:0)k = ej h(cid:3)
`k;
`
`( )
`
`i.e., if the reversed coe(cid:14)cients are the complex conjugates of the forward sequence plus a constant phase shift []. The
`group delay is then
`
`( )
`
`:
`
`N
`
`(cid:14)(f ) =
`
`III. Truncated IIR (TIIR) Filters
`Consider an FIR (cid:12)lter having a truncated geometric sequence fh ; h p; : : : ; h pNg as an impulse response. This (cid:12)lter
`has the same impulse response for the (cid:12)rst N + terms as the one-pole IIR (cid:12)lter with transfer function
`
`If we subtract o(cid:11) the tail of the impulse response we obtain
`HFIR(z) =h + h pz(cid:0) + : : : + h pN z(cid:0)N
` (cid:0) pN + z(cid:0)(N + )
` (cid:0) pz(cid:0)
`
`HIIR(z) =
`
`h
`
` (cid:0) pz(cid:0) :
`
`= h
`
`:
`
`The time-domain recursion for this (cid:12)lter is
`
`( )
`
`( )
`
`( )
`
`( )
`
`( )
`
`NXk
`
`=
`
`y[n] =
`
`h pkx[n (cid:0) k]
`= py[n (cid:0) ] + h (cid:0)x[n] (cid:0) pN + x[n (cid:0) (N + )](cid:1) :
`
`We see that the (cid:12)rst formulation, Eqn. ( ), requires N + multiplies and N adds to implement directly, whereas the
`second formulation, Eqn. ( ), requires only multiplies and adds, independent of N . Thus we see that if we can
`represent an FIR sequence as a truncated exponential sequence, a tremendous savings in computation can be achieved.
`Note that the x[n(cid:0) (N + )] term in Eqn. ( ) still requires a delay line to be maintained, and thus there are no savings
`in storage. With modern digital signal processing (DSP) chips, however, ring bu(cid:11)ers may be implemented with virtually
`no computational overhead and the full savings in computational cost may be achieved.
`Notice that there is a pole-zero cancellation in the representation given by Eqn. ( ). If jpj < there is no problem
`since the system is inherently stable. If jpj (cid:21) , however, then there is a potential problem due to the hidden mode. We
`will deal with this in Section IV where we will see how to run TIIR (cid:12)lters with unstable modes.
`The idea of this section was used by Fam [ ], where a partial fraction expansion of a transfer function is taken and
`each mode is truncated separately using Eqn. ( ). This method works only for cases in which the multiplicity of each
`pole is one, so that the each mode exhibits a simple exponential decay.
`
`Ex. 1033 / Page 2 of 17
`
`
`
`WANG AND SMITH: ON FAST FIR FILTERS IMPLEMENTED AS TAIL-CANCELING IIR FILTERS
`
`
`
`A. Extension to Higher-Order TIIR Sequences
`
`We may extend the idea illustrated in the previous section for the one-pole case to computing the TIIR sequence of
`any rational H(z). The general procedure is to (cid:12)nd the \tail" IIR transfer function
`
`( )
`
`H
`IIR(z) =h z(cid:0) + h z(cid:0) + : : :
`
`
`hN + z(cid:0) + hN +z(cid:0) + : : :
`
`=
`
`whose impulse response, except for a time shift of N steps, matches the tail of the transfer function HIIR(z) which we
`would like to truncate after time step N .
`We multiply Eqn. () by zN and obtain
`
`zN HIIR(z) = h zN + : : : + hN (cid:0) z + hN
`+ hN + z(cid:0) + hN +z(cid:0) + : : :
`= C(z) + H
`IIR(z)
`zN B(z)
`A(z)
`
`=
`
`= C(z) +
`
`B (z)
`A(z)
`
`;
`
`( )
`
`( )
`
`( )
`
`()
`
`where Deg fB (z)g < Deg fA(z)g = P . We may assume that Deg fB (z)g = P (cid:0) . B (z) is unique and may be obtained
`by performing synthetic division on zN B(z) by A(z) and (cid:12)nding the remainder. Thus, zN B(z) (cid:17) B (z)
`(mod A(z)).
`
`Once we have obtained B (z), we have H IIR(z) =B (z)=A(z) and we may write
`HFIR(z) = HIIR(z) (cid:0) z(cid:0)N H
`IIR(z)
`B(z) (cid:0) z(cid:0)N B (z)
`=
`A(z)
`
`:
`
`( )
`
`()
`
`The corresponding system is
`
`()
`
`b‘x[n (cid:0) ‘]
`
`PX‘
`
`=
`
`aky[n (cid:0) k] +
`
`PXk
`
`=
`
`y[n] = (cid:0)
`
`P (cid:0)
`
`Xm=
`
`(cid:0)
`
`b
`mx[n (cid:0) m (cid:0) (N + )];
`
`using the representation in Eqn. ().
`The fact that the denominators of the transfer functions HIIR(z) and H
`IIR(z) are the same allows additional savings
`in computational cost due to the fact that the original IIR and tail IIR dynamics are the same and do not need to
`be performed twice. The term z(cid:0)(N + )z(cid:0)(P (cid:0) )B (z) serves to zero out the dynamics at the end of the delay line and
`requires only an additional P multiplies and P (cid:0) adds. The computational cost of this general truncated P -th order
`IIR system is P + multiplies and P (cid:0) adds, independent of N . Thus, a net computational savings with this class
`of FIR (cid:12)lters is achieved if N > P .
`The storage costs for this (cid:12)lter are P output samples for the IIR feedback dynamics, N input samples of the FIR
`(cid:12)lter, and an additional P input samples for the tail-cancellation dynamics, yielding N + P input delay samples, of
`which only the (cid:12)rst and last P are used, and P output delay samples. Thus, the fast FIR algorithm requires P more
`storage samples than a direct FIR implementation.
`As in the previous section, we observe cautiously that the e(cid:11)ect of subtracting the tail IIR response z(cid:0)N H
`IIR(z) from
`HIIR(z) is to cancel all the poles in Eqn. (), which is to be expected since an FIR (cid:12)lter is an all-zero (cid:12)lter.
`
`B. Other Architectures
`
`The direct implementation speci(cid:12)ed by Eqn. () may notbe desirable for various reasons. For example, one may
`choose to use a factored structure such as the cascaded biquad or the parallel partial fraction form. The latter form is
`given as
`
`H(z) =
`
`Np
`
`Xk=
`
`Mk
`
`X‘=
`
`Ck;‘
`
`( (cid:0) pkz(cid:0) )‘ ;
`
`()
`
`Ex. 1033 / Page 3 of 17
`
`
`
`
`
`IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. , NO. , PP. { , JUNE
`
`where Np is the number of distinct poles, and Mk is the multiplicity of the k-th pole. The (k; ‘) term of the partial
`fraction expansion corresponds to a (cid:12)lter with impulse response
`
`hk;‘[n] =C k;‘(cid:18)n + ‘ (cid:0)
`‘ (cid:0) (cid:19)pn
`
`k :
`
`()
`
`To form the TIIR (cid:12)lter, a tail IIR (cid:12)lter is derived for each partial fraction using synthetic division as outlined in
`Section III-A. Each TIIR response is calculated separately, and the results are added together to form the complete
`response. The factorization need not be as complete as outlined in Eqn. (). One may choose an intermediate level of
`factorization, leaving some factors lumped together and others separated from each other. For example, one may want
`to group complex-conjugate pairs together to avoid complex arithmetic. Alternatively, one may wish to leave terms with
`the same poles together as in
`
`Bk(z)
`Xk=
`(z (cid:0) pk)Mk
`since calculating the tail IIR response for each n-th order multiplicity term yields a degree n (cid:0) polynomial numerator
`anyway. The impulse response of the k-th partial fraction in this case is
`
`Np
`
`H(z) =
`
`;
`
`()
`
`hk[n] =
`
`Mk
`
`X‘=
`
`bk;‘(cid:18)n (cid:0) ‘ + Mk (cid:0) (cid:19)pn(cid:0)‘
`
`
`Mk (cid:0)
`
`k
`
`( )
`
`where the bk;‘; ‘ = ; : : : ; Mk are the coe(cid:14)cients of Bk(z).
`Another example is to group together the stable factors, i.e., those with poles pk such thatjp kj < , and implement
`separately the unstable poles for which jpkj > . These strategies will become useful in Section V.
`IV. Unstable Hidden Modes
`
`n (cid:17)
`
`(mod N )
`
`n (cid:17)
`
`(mod N )
`
`Primary TIIR Filter
`
`w [k] w [k (cid:0) ];
`w [ ] x[n]
`q [k] q [k (cid:0) ];
`
`k = N + P; : : : ;
`
`k = P; : : : ;
`
`b‘w [‘]
`
`PX‘
`
`=
`
`PXk
`
`=
`P (cid:0)
`
`q [ ] (cid:0)
`
`ak q [k] +
`
`w [m + N + ]
`
` m
`
`b
`
`Xm=
`
`(cid:0)
`
`y[n] q [ ]
`
`k = ; : : : ; P (cid:0)
`w [N + k] ;
`w [k] w [k (cid:0) ];
`k = N; : : : ;
`w [ ] x[n]
`q [k] q[k (cid:0) ];
`
`k = P; : : : ;
`
`b‘w [‘]
`
`PX‘
`
`=
`
`ak q [k] +
`
`PXk
`
`q [ ] (cid:0)
`
`=
`y[n] q [ ]
`
`Auxiliary TIIR Filter
`
`w[k] w[k (cid:0) ];
`w[ ] x[n]
`q[k] q[k (cid:0) ];
`
`k = P; : : : ;
`
`k = P; : : : ;
`
`b‘w[‘]
`
`PX‘
`
`=
`
`ak q[k] +
`
`PXk
`
`=
`
`q[ ] (cid:0)
`
`k = ; : : : ; P (cid:0)
`
`w[k] ;
`w[ ] x[n]
`k = ; : : : ; P (cid:0)
`q[k] ;
`q[ ] b w[ ]
`
`Fast FIR algorithm for an unstable Truncated IIR (TIIR) filter.
`
`TABLE I
`
`We now address the issue of pole-zero cancellation and the resulting hidden modes in the fast FIR algorithm. Although
`the na(cid:127)(cid:16)ve Fast FIR algorithm of Eqn. () works in theory, there is the practical matter of quantization error due to
`(cid:12)nite register lengths when dealing with truncated unstable IIR systems, i.e., those with poles pk such that jpkj > .
`Consider the system in equation ( ). We may model quantization error as an independent, identically distributed (IID)
`
`Ex. 1033 / Page 4 of 17
`
`
`
`WANG AND SMITH: ON FAST FIR FILTERS IMPLEMENTED AS TAIL-CANCELING IIR FILTERS
`
`
`
`n (cid:17)
`
`(mod N )
`
`n (cid:17)
`
`(mod N )
`
`Primary TIIR Filter
`
`k = N + P; : : : ;
`
`k = P; : : : ;
`
`q [P ]
`
`(cid:3) P
`
`a
`
`(cid:0)
`
`q [P (cid:0) k]
`
`(cid:3) P
`
`a
`
`w [P (cid:0) (cid:0) m]
`
` m(cid:3)
`
`b
`
`(cid:3) P
`
`a
`
`w [N + P (cid:0) ‘]
`
`(cid:3) ‘
`
`b
`
`(cid:3) P
`
`a
`
`Xk=
`Xm=
`
`P (cid:0)
`
`+
`
`PX‘
`
`=
`
`w [k] w [k (cid:0) ];
`w [ ] x[n]
`q [k] q [k (cid:0) ];
`P (cid:0)
`
`(cid:3) k
`
`q [ ] (cid:0)
`
`a
`
`(cid:0)
`
`y[n] q [ ]
`
`Auxiliary TIIR Filter
`
`k = ; : : : ; P (cid:0)
`w [N + k] ;
`w [k] w [k (cid:0) ];
`k = N; : : : ;
`w [ ] x[n]
`q [k] q[k (cid:0) ];
`P (cid:0)
`
`k = P; : : : ;
`
`q [P (cid:0) k]
`
`(cid:3) k
`
`q [P ]
`
`(cid:3) P
`
`a
`
`(cid:0)
`
`(cid:3) P
`
`a
`
`q [ ] (cid:0)
`
`a
`
`w [P (cid:0) (cid:0) m]
`
` m(cid:3)
`
`b
`
`(cid:3) P
`
`a
`
`P (cid:0)
`
`+
`
`Xk=
`Xm=
`
`w[k] ;
`w[ ] x[n]
`q[k] ;
`b
`
` P
`
`(cid:3)
`
`k = ; : : : ; P (cid:0)
`
`k = ; : : : ; P (cid:0)
`w[ ]
`
`(cid:0)
`
`q[ ]
`
`w[P (cid:0) (cid:0) m]
`
` m(cid:3)
`
`b
`
`(cid:3) P
`
`a
`
`P (cid:0)
`
`+
`
`Fast FIR algorithm for a reversed unstable Truncated IIR (TIIR) filter derived from a stable TIIR filter.
`
`TABLE II
`
`
`noise signal (cid:23)[n] with zero mean and variance (cid:27)(cid:23) input to the system of Eqn. ( ). The output equation is then
`y[n] =py [n (cid:0) ] + h (cid:0)x[n] (cid:0) pN + x[n (cid:0) (N + )](cid:1) + (cid:23)[n]
`Xk=
`pkx[n (cid:0) k] +
`pk(cid:23)[n (cid:0) k]:
`
`N (cid:0)
`
`= h
`
`nXk
`
`=
`
`Thus, the accumulated noise has zero mean, but its variance grows as
`
`(cid:23)
`
`nXk
`
`=
`
`(cid:27)
`acc[n] =
`
`jpjk(cid:27)
` (cid:0) jpj(n+ )
` (cid:0) jpj
`
`= (cid:27)
`(cid:23)
`
`( )
`
`( )
`
`( )
`
`( )
`
`( )
`
`so that
`
`lim
`n!
`
`(cid:27)
`
`acc[n] = <
`:
`
`+ ;
`(cid:27)
`(cid:23)
` (cid:0) jpj ;
`
`jpj (cid:21)
`jpj < .
`For a higher-order system, as described in Section III-A, this analysis gives an estimate of the noise accumulation behavior
`if p is the largest-magnitude pole, since its dynamics dominates the behavior of the system. A direct implementation of
`an FIR system, such as in equation ( ) does not have noise accumulation problems because of its (cid:12)nite memory. We
`trade computational cost for noise sensitivity in TIIR systems.
`
` Finite-register e(cid:11)ects may be modeled as the sum of q uniformly distributed IID [(cid:0)(cid:15); +(cid:15)], where (cid:15) > is the smallest quantity such
`that x + (cid:15) = x in machine arithmetic, and q is the number of additive terms. If q is relatively large, the total noise may be modeled as a
`Gaussian distribution. For (cid:12)xed-point arithmetic, (cid:15) is constant x. Floating point arithmetic presents di(cid:14)culties since the e(cid:11)ective (cid:15) varies
`proportionally to the magnitude of x. Assuming independence and (cid:12)xed-point arithmetic, we may model quantization noise as a Gaussian
`
`random variable with variance (cid:27)
`= q(cid:15)
`= . A thorough analysis of quantization noise statistics is beyond the scope of this paper.
`
` (cid:23)
`
`k = P; : : : ;
`
`k = P; : : : ;
`
`q[P ]
`
`(cid:3) P
`
`a
`
`(cid:0)
`
`q[P (cid:0) k]
`
`(cid:3) P
`
`a
`
`(cid:3) k
`
`a
`
`Xk=
`Xm=
`
`w[k] w[k (cid:0) ];
`w[ ] x[n]
`q[k] q[k (cid:0) ];
`P (cid:0)
`
`q[ ] (cid:0)
`
`(cid:3) P
`
`a
`
`Ex. 1033 / Page 5 of 17
`
`
`
`
`
`IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. , NO. , PP. { , JUNE
`
`Recalling the discussion in Section III-B, we may factor the unstable modes of a system apart from the transfer
`function so that
`
`H(z) = H #(z) +H "(z)
`B"(z)
`B#(z)
`A#(z)
`A"(z)
`where the \#" and \"" superscripts denote \stable hidden modes" and \unstable hidden modes," respectively. Thus, we
`may implement the two (cid:12)lters of orders P # and P ", where P # + P " = P , and add together the outputs. H #(z) may be
`implemented as described in Section III-A, but H "(z) must be handled carefully.
`Alternatively, we may factor the transfer function into stable and unstable parts
`
`=
`
`+
`
`;
`
`( )
`
`( )
`
`H(z) =H #(z)H "(z)
`B"(z)
`B#(z)
`A#(z)
`A"(z)
`
`=
`
`;
`
`( )
`
`( )
`
`in which case the stable and unstable (cid:12)lters are cascaded. Note that in this factorization B#(z) and B"(z) are not
`uniquely determined.
`A possible algorithm for stabilizing an unstable TIIR (cid:12)lter using two parallel sets of state variables is given in Table
`I. The algorithm cycles with a period of N time steps. For cycle number k, at time stepn = kN , we clear the auxiliary
`(cid:12)lter’s input delay line w[(cid:1)] and vector of state variables q[(cid:1)], each of length P . Otherwise the primary and auxiliary
`systems evolve with the same dynamics. We have taken into account the fact that during the (cid:12)rst N time steps the tail
`canceling terms due to z(cid:0)N B (z) in the auxiliary (cid:12)lter are identically zero, hence those terms are not included. Since
`the FIR impulse response has length N + we see that at time n = kN + N the two systems have identical output
`values, except for the amount of accumulated noise. The state variables q [(cid:1)] and q[(cid:1)] are not identical because the
`input histories are slightly di(cid:11)erent. However, since we are only concerned with the output, this does not matter. We
`then copy the auxiliary system’s state variables to the primary system and repeat the cycle. This algorithm may be
`optimized by realizing that the values for w[m] are zero for m > N during each cycle. Thus the last summation in the
`dynamics for the auxiliary system shown in Table I is omitted for the cases where n (cid:17) ; : : : ; N (mod N ).
`Under this scheme the amount of time that quantization errors can accumulate varies from a minimum of N time
`steps to a maximum of N , depending on when each error occurs. The minimum is due to the amount of time necessary
`to allow the auxiliary system to evolve to having the same output value as the primary system. For noise accumulation
`analysis in this system, we may assume that the unstable system dynamics are dominated by the mode with the largest
`eigenvalue pmax such thatjp maxj > . We assume for the present analysis that the multiplicity of pmax is one. Thus we
`must reckon for tolerable performance with
`
`
`
`(cid:23)
`
` (cid:0) jpmaxjN
`(cid:27)
`max (cid:21) (cid:27)
` (cid:0) jpmaxj :
`
`
`max < (cid:27)If the maximum tolerable noise variance is (cid:27)tol then we must have (cid:27) tol so that, approximately,
`(jpmaxj (cid:0) )(cid:27)
`log(cid:18) +
`(cid:19)
`(cid:27)
`(cid:23)
` log(jpmaxj)
`Thus, given pmax, (cid:27)tol, and quantization noise variance (cid:27)
`
`(cid:23) , we have a fundamental limitation on the length of the
`e(cid:11)ective impulse response. A re(cid:12)nement of this bound is given below in Section V-C.
`The computational cost for implementing an unstable TIIR (cid:12)lter is at most twice the cost of a stable TIIR (cid:12)lter. The
`extra cost is due to the cost of implementing the auxiliary TIIR (cid:12)lter, which costs about P operations per sample.
`Thus the computational cost of the unstable TIIR (cid:12)lter is approximately P multiplies per input sample. The number
`of adds is about the same. If we take into account the fact that the feed-forward terms due to B(z) are the same (except
`for cases where w[k] has been set to zero) in both the primary and auxiliary systems, we may optimize by avoiding
`duplicate calculation of these terms and thus reduce the total number of multiplies and adds to about P per input
`sample. The shifts of the state variables and input delay lines may be simulated by using pointer arithmetic, and need
`not actually be performed.
`
`( )
`
`( )
`
`N <
`
`tol
`
`:
`
`V. Fast Linear-Phase FIR Filters
`
`The implementation of (cid:12)lters as TIIR systems is rather pointless unless there is an advantage that is unavailable to
`untruncated IIR (cid:12)lters. That advantage is that it is possible to implement exactly linear-phase (cid:12)lters. We present two
`basic strategies for designing (cid:12)lters based on TIIR (cid:12)lters. The (cid:12)rst uses the factorization given in Eqn. ( ), and the
`second uses the factorization used in Eqn. ( ).
`
`Ex. 1033 / Page 6 of 17
`
`
`
`WANG AND SMITH: ON FAST FIR FILTERS IMPLEMENTED AS TAIL-CANCELING IIR FILTERS
`
`
`
`A. Additive Factorization Design Method
`
`We saw in Section II that an FIR (cid:12)lter has linear phase if Eqn. ( ) holds. It is possible to attain such a relation
`using TIIR (cid:12)lters. Let H +
`FIR(z) be a TIIR transfer function such that
`
`( )
`
`()
`
`( )
`
`
`
`h+k z(cid:0)k
`
`NXk
`
`=
`
`H +
`FIR(z) =
`
`=
`
`B+(z) (cid:0) z(cid:0)N B +(z)
`A+(z)
`
`:
`
`We form the time-reversed truncated transfer function
`
`h(cid:0)
`k z(cid:0)k
`
`NXk
`
`=
`
`H (cid:0)
`FIR(z) =
`
`;
`
`NXk
`
`=
`
`=
`
`=
`
`h+(cid:3)
`k zk(cid:0)N
`= z(cid:0)N (cid:8)H +FIR ( =z(cid:3))(cid:9)(cid:3)
`
`z(cid:0)N fB+ ( =z(cid:3))g(cid:3) (cid:0) fB + ( =z(cid:3))g(cid:3)
`fA+ ( =z(cid:3))g(cid:3)
`= (cid:0)zB (cid:0)(z) + z(cid:0)N B(cid:0)(z)
`A(cid:0)(z)
`where the \+" and \(cid:0)" superscripts denote \forward" and \reverse-conjugate" (cid:12)lter respectively, and the \*" superscript
`denotes complex conjugation, as usual. Thus, comparing with Eqns. () and ( ), we have
`
`()
`
`()
`
`()
`
`()
`
`and
`
`A(cid:0)(z) = + a(cid:3)
` z + : : : + a(cid:3)
`P zP ;
`B(cid:0)(z) =b (cid:3)
` + b(cid:3)
` z + : : : + b(cid:3)
`P zP ;
`
`()
`
`( )
`
`B (cid:0)(z) =b (cid:3)
` + b (cid:3)
` z + : : : + b (cid:3)
`P (cid:0) zP (cid:0) :
`We have assumed that B+(z) andB (cid:0)(z) have degreesP and P (cid:0) , respectively. If we assume that H +
`FIR(z) is a stable
`TIIR (cid:12)lter then H (cid:0)
`
`FIR(z) is an unstable TIIR (cid:12)lter whose hidden modes are conjugate-reciprocals of those of H +FIR(z).
`An example implementation for H (cid:0)
`FIR(z) is given in Table II. There we have normalized the numerator and denominator
`of the (cid:12)lter by dividing through by a(cid:3)
`P .
`Using the arbitrary time shift M (cid:21) and phase shift , we de(cid:12)ne the (cid:12)lter
`
`FIR(z) +e j z(cid:0)M H (cid:0)FIR(z)
`H +
`HLPFIR(z)
`
`( )
`
`( )
`
`=
`
`()
`
`h+(cid:3)
`k zk(cid:0)N (cid:0)M :
`
`NXk
`
`=
`
`
`
`h+k z(cid:0)k + ej
`
`NXk
`
`=
`
`=
`
`We note that this new FIR (cid:12)lter of length M + N + is invariant with respect to reversing the order, conjugating the
`coe(cid:14)cients, and multiplying by the phase factor exp(j ). Eqn. ( ) holds, and thus HLPFIR(z) is a linear-phase FIR
`(cid:12)lter. We may see the linear-phase property more directly by (cid:12)rst noticing that, on the unit circle,
`
`so that
`
`H (cid:0)
`
`;
`
`FIR(cid:0)ej!(cid:1)(cid:9)(cid:3)
`FIR(cid:0)ej!(cid:1) = e(cid:0)jN ! (cid:8)H +
`FIR(cid:0)ej!(cid:1) + ej e(cid:0)jM !H (cid:0)
`HLPFIR(cid:0)ej!(cid:1) = H +
`FIR(cid:0)ej!(cid:1)
`
`
`Eqn. ( ) gives us the result
`
`= e(cid:0)j((N +M )!+ )=Renej((N +M )!+ )=H +FIR(cid:0)ej!(cid:1)o :
`
`( )
`
`()
`
`()
`
`M + N
`
`It is rather di(cid:14)cult to design fast FIR (cid:12)lters to a given set of speci(cid:12)cations using this additive factorization design
`
`(cid:14) =
`
`:
`
`()
`
`method because it is not intuitively obvious how to control the magnitude (cid:12)(cid:12)HLPFIR(cid:0)ej!(cid:1)(cid:12)(cid:12) due to Eqn. (). Nonetheless,
`
`for certain impulse response waveforms which are well characterized, this technique provides a useful tool for designing
`corresponding fast FIR (cid:12)lters to realize them. Some examples are given in Section VI-B.
`
`Ex. 1033 / Page 7 of 17
`
`
`
`
`
`IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. , NO. , PP. { , JUNE
`
`B. Magnitude-Squared Design Method
`
`We begin by choosing a desired non-negative real-valued transfer function H (z) > for which there exists a stable
`transfer function H(z) such that
`
`()
`
`()
`
`( )
`
`( )
`
`( )
`
`()
`
`( )
`
`and polynomials A+(z) and B+(z) such that
`
`
`
`H (cid:0)ej!(cid:1)
`= (cid:12)(cid:12)H (cid:0)ej!(cid:1)(cid:12)(cid:12)
`= H (cid:0)ej!(cid:1)(cid:3)
`H (cid:0)ej!(cid:1) ;
`
` Xk
`
`=
`B+(z)
`A+(z)
`
`:
`
`hkz(cid:0)k
`
`H(z) =
`
`=
`
`If the coe(cid:14)cients of A+(z) and B+(z) are real then Eqn. () yields
`
`H (cid:0)ej!(cid:1) = H (cid:0)ej!(cid:1)H (cid:0)e(cid:0)j!(cid:1) :
`
`hkz(cid:0)k
`
`NXk
`
`=
`
`H +
`FIR(z) =
`
`=
`
`B+(z) (cid:0) z(cid:0)N B +(z)
`A+(z)
`
`;
`
`We form the TIIR transfer function
`
`as in Eqn. (). Similarly, we form H (cid:0)
`FIR(z), as in Eqn. (). We see immediately, in light of Eqn. ( ), that the (cid:12)lter
`
`
`H +
`
`FIR(z)H (cid:0)FIR(z)
`
`=
`
`H
`FIR(z)
`
`has the property that
`
`FIR(cid:0)ej!(cid:1) = e(cid:0)jN ! (cid:12)(cid:12)H +FIR(cid:0)ej!(cid:1)(cid:12)(cid:12)
`
`
`This (cid:12)lter may be simply implemented as a cascade of H +
`and Table II, respectively.
`
`
`FIR(z) andH (cid:0)FIR(z), which may be implemented by Eqn. ()
`
`The relationship between H FIR(z) and H (z) is seen by considering the cyclic convolution []
`(cid:25) Z (cid:25)
`
`H +
`
`FIR(cid:0)ej!(cid:1) =
`
`
`
`
`
`H (cid:0)ej(cid:18)(cid:1) WN (cid:16)ej(!(cid:0)(cid:18))(cid:17) d(cid:18)
`sinf(N + =)!g
`sin(!=)
`
`:
`
`()
`
`()
`
`where
`
`WN (cid:0)ej!(cid:1) =
`
`FIR(z) and H (cid:0)FIR(z) must be chosen long enough so that the blurring induced by the periodic
`
`The (cid:12)lter length N for H +
`convolution of H(ej!) by WN (ej!) does not induce too much distortion in the frequency response. Indeed, disregarding
`quantization noise for the moment, as N grows to in(cid:12)nity, WN (cid:0)ej!(cid:1) approaches an impulse centered at ! = , and
`
`H +
`FIR(z) converges to H(z). There are constraints due to Eqn. ( ) on the (cid:12)lter length which are considered in the next
`section.
`We notice that the phase of the (cid:12)lter H(z) utilized in the design of H
`FIR(z) is irrelevant, and thus IIR (cid:12)lters, which are
`often considered to have excessive phase distortion near sharp cuto(cid:11)s, may be used. Thus, any stable discrete-time IIR
`(cid:12)lter design such as elliptic, Chebyshev, and Butterworth (cid:12)lters may be chosen as H(z) in Eqn. () and transformed
`into a magnitude-squared (cid:12)lter with linear phase. Additionally, the fact that the magnitude response for H (z) is twice
`the distance from dB compared to the magnitude response of H(z) implies that constraints in the stop band of H(z)
`need only be half the desired dB design speci(cid:12)cation, whereas, on the other hand, the ripples in the passband of H(z)
`must be better than half the desired dB design speci(cid:12)cation.
`
`H
`
`and this is obviously a linear-phase (cid:12)lter with group delay
`
`(cid:14) = N:
`
`
`
`
`
`;
`
`()
`
`()
`
`()
`
`Ex. 1033 / Page 8 of 17
`
`
`
`WANG AND SMITH: ON FAST FIR FILTERS IMPLEMENTED AS TAIL-CANCELING IIR FILTERS
`
`
`
`C. A Re(cid:12)ned Truncation Algorithm
`
`In both the additive and multiplicative designs of the previous two sections, the length N is constrained by the growth
`of quantization error in the unstable, reverse-conjugate (cid:12)lter H (cid:0)
`FIR(z). This bound is given in Eqn. ( ). We note that
`
`FIR(z) which gives rise to the most unstable hidden mode of H (cid:0)FIR(z) due to the
`it is the most stable hidden mode of H +
`conjugate-reciprocal correspondence between their modes. These problematic conjugate-reciprocal modes may restrict
`the direct implementation in Table II to undesirably short lengths. One way around this problem is to set a signi(cid:12)cance
`(cid:13)oor (cid:21)S above the (cid:13)oor for numerical precision (cid:21)P so that errors with magnitude less than (cid:21)S are insigni(cid:12)cant, and
`quantization errors are bounded by (cid:21)P . Let H(z) be the untruncated IIR impulse response associated with H +
`FIR(z)
`as in Eqn. ( ). Since H(z) is stable, we have jpkj < , where the pk’s are the poles of H(z), and consequently the
`
`FIR(z). The hidden modes of H (cid:0)FIR(z) are thus =p(cid:3)
`hidden modes of H +
`k and are unstable. We perform a partial fraction
`expansion on H(z) as in Eqns. () and ( ). We then observe the impulse response for each partial fraction. We set
`the cuto(cid:11) point Nk for the k-th partial fraction response to be the smallest time after which the maximum impulse
`response becomes insigni(cid:12)cant, i.e., n > Nk; j(cid:22)hk;nj (cid:20) (cid:21)S , where (cid:22) is the largest magnitude input. We may solve
`jhk;Nkj = (cid:21)S=(cid:22)
`numerically for Nk by using Eqn. ( ) or by using the approximation
`
`( )
`
`hk;n (cid:25)
`
`Mk
`
`X‘=
`
`k
`
`Mk
`
`k
`
`bk;‘(n (cid:0) ‘)Mk (cid:0) pn(cid:0)‘
`(Mk (cid:0) )!
`bk;‘p(cid:0)‘
`X‘=
`(cid:25) nMk(cid:0) pn
`k
`(Mk (cid:0) )!
`= BknMk(cid:0) pn
`k
`
`for large n, where Bk is implicitly de(cid:12)ned by Eqns. ( ) and (). For Mk = , we have exactly
`
`( )
`
`( )
`
`()
`
`( )
`
`Nk =
`
`log(cid:16)(cid:12)(cid:12)(cid:12)
`
`(cid:21)S
`
`(cid:22)Bk(cid:12)(cid:12)(cid:12)(cid:17)
`
`:
`
`log(jpkj)
`Thus, if Nk < N we may truncate the k-th partial fraction response at Nk instead of N without losing signi(cid:12)cance.
`However, since H(z) is stable and responses due to the k-th partial fraction beyond the Nk-th time step are below the
`signi(cid:12)cance (cid:13)oor, this does not make any di(cid:11)erence. The re(cid:12)nement comes from implementing H (cid:0)
`FIR(z) as a sum of
`reversed partial fractions based on Eqn. () and Eqn. (), but with the k-th partial fraction truncated after n = Nk
`samples instead of n = N if Nk < N . Thus, the unstable mode of H (cid:0)
`FIR(z) due to the pole at =p(cid:3)
`k only needs to grow
`from having an initial magnitude above the signi(cid:12)cance (cid:13)oor (cid:21)S and has less time to accumulate exponentially growing
`quantization noise. We have
`
`where
`
`H (cid:0)
`FIR(z) =
`
`Np
`
`Xk=
`
`k (z) +z (cid:0)N
`k B(cid:0)
`(cid:0)zB (cid:0)
`k (z; N
`k)
`kz)Mk
`( (cid:0) p(cid:3)
`
`;
`
`N
`
`k = (cid:26) Nk; Nk (cid:20) N
`
`N; Nk > N .
`
`()
`
`()
`
`Eqn. ( ) indicates how much quantization noise will accumulate due to the truncated response of length N = Nk of
`the k-th partial fraction of H (cid:0)
`FIR(z) . Assuming that quantization error occurs on the order of the precision (cid:13)oor (cid:21)P ,
`
`
`tol = (cid:21)(cid:23) (cid:25) (cid:21)we have (cid:27) P . Also, the signi(cid:12)cance (cid:13)oor sets a convenient level of noise tolerance, so that may set (cid:27)
`
`S. We
`may recast Eqn. ( ) as
`
`
`
`P
`
` (cid:0) jpkj(cid:0)N
`S (cid:21) (cid:21)(cid:21)
` (cid:0) jpkj(cid:0)
`P jpkj(cid:0)N
`(c