Williamson, G.A. “Adaptive IIR Filters.”
Digital Signal Processing Handbook.
Ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999.
23 Adaptive IIR Filters

23.1 Introduction
23.2 The Equation Error Approach
    The LMS and LS Equation Error Algorithms • Instrumental Variable Algorithms • Equation Error Algorithms with Unit Norm Constraints
23.3 The Output Error Approach
    Gradient-Descent Algorithms • Output Error Algorithms Based on Stability Theory
23.4 Equation-Error/Output-Error Hybrids
    The Steiglitz-McBride Family of Algorithms
23.5 Alternate Parametrizations
23.6 Conclusions
23.1 Introduction

Adaptive infinite impulse response (IIR) filters offer potential performance and computational advantages over their FIR counterparts, owing to the “pole-zero” form of the IIR structure, compared to the “all-zero” form of the FIR structure.
However, adapting an IIR filter brings with it a number of challenges in obtaining stable and optimal behavior of the algorithms used to adjust the filter parameters. Since the 1970s, there has been much active research focused on adaptive IIR filters, but many of these challenges to date have not been completely resolved. As a consequence, adaptive IIR filters are not found in commercial practice anywhere near as frequently as adaptive FIR filters are. Nonetheless, recent advances in adaptive IIR filter research have provided new results and insights into the behavior of several methods for adapting the filter parameters, and new algorithms have been proposed that address some of the problems and open issues in these systems. Hence, this class of adaptive filter continues to hold promise as a potentially effective and efficient adaptive filtering option.
In this section, we provide an up-to-date overview of the different approaches to the adaptive IIR filtering problem. Due to the extensive literature on the subject, many readers may wish to peruse several earlier general treatments of the topic. Johnson’s 1984 paper [11] and Shynk’s 1989 paper [23] are still current in the sense that a number of open issues cited therein remain open today. More recently, Regalia’s 1995 book [19] provides a comprehensive view of the subject.
23.1.1 The System Identification Framework for Adaptive IIR Filtering
The spread of issues associated with adaptive IIR filters is most easily understood if one adopts a system identification perspective on the filtering problem. To this end, consider the diagram presented in Fig. 23.1. Available to the adaptive filter are two external signals: the input signal x(n) and the desired output signal d(n). The adaptive filtering problem is to adjust the parameters of the filter acting on x(n) so that its output y(n) approximates d(n). From the system identification perspective, the task at hand is to adjust the parameters of the filter generating y(n) from x(n) in Fig. 23.1 so that the filtering operation itself matches, in some sense, the system generating d(n) from x(n). These two viewpoints are closely related because if the systems are the same, then their outputs will be close. However, by adopting the convention that there is a system generating d(n) from x(n), clearer insights into the behavior and design of adaptive algorithms are obtained. This insight is useful even if the “system” generating d(n) from x(n) has only a statistical, and not a physical, basis in reality.
FIGURE 23.1: System identification configuration of the adaptive IIR filter.
The standard adaptive IIR filter is described by the difference equation

    y(n) + a_1(n) y(n-1) + ... + a_N(n) y(n-N)
        = b_0(n) x(n) + b_1(n) x(n-1) + ... + b_M(n) x(n-M) ,     (23.1)

which we write compactly as

    A(q^{-1}, n) y(n) = B(q^{-1}, n) x(n) ,     (23.2)

or equivalently as

    y(n) = ( B(q^{-1}, n) / A(q^{-1}, n) ) x(n) ,     (23.3)

where B(q^{-1}, n) and A(q^{-1}, n) are the time-dependent polynomials in the delay operator q^{-1} appearing in (23.2). The parameters that are updated by the adaptive algorithm are the coefficients of these polynomials. Note that the polynomial A(q^{-1}, n) is constrained to be monic, such that a_0(n) = 1.
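To make the recursion (23.1) concrete, the following minimal NumPy sketch computes one output sample; the function name and argument layout are our own illustration, not notation from the chapter.

```python
import numpy as np

def adaptive_iir_output(x_hist, y_hist, a, b):
    """One step of the adaptive IIR filter of (23.1).

    x_hist : array [x(n), x(n-1), ..., x(n-M)] of current and past inputs
    y_hist : array [y(n-1), ..., y(n-N)] of past outputs
    a      : array [a_1(n), ..., a_N(n)]  (a_0(n) = 1 is implicit: A is monic)
    b      : array [b_0(n), ..., b_M(n)]
    Returns y(n).
    """
    return np.dot(b, x_hist) - np.dot(a, y_hist)
```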
We adopt a rather more general description for the unknown system, assuming that d(n) is generated from the input signal x(n) via some linear time-invariant system H(q^{-1}), with the addition of a noise signal v(n) to reflect components in d(n) that are independent of x(n):

    d(n) = H(q^{-1}) x(n) + v(n) .     (23.4)

We further break down H(q^{-1}) into a transfer function H_m(q^{-1}) that is explicitly modeled by the adaptive filter, and a transfer function H_u(q^{-1}) that is unmodeled. In this way, we view d(n) as a sum of three components: the signal y_m(n) that is modeled by the adaptive filter, the signal y_u(n) that is unmodeled but that depends on the input signal, and the signal v(n) that is independent of the input. Hence,

    d(n) = y_m(n) + y_u(n) + v(n) ,     (23.5)

where y_u(n) = H_u(q^{-1}) x(n) and

    y_m(n) = H_m(q^{-1}) x(n) = ( B_opt(q^{-1}) / A_opt(q^{-1}) ) x(n) ,     (23.6)

with B_opt(q^{-1}) = \sum_{i=0}^{M} b_{i,opt} q^{-i} and A_opt(q^{-1}) = 1 + \sum_{i=1}^{N} a_{i,opt} q^{-i}. Note that (23.6) has the same form as (23.3). The parameters {a_{i,opt}} and {b_{i,opt}} are considered to be the optimal values for the adaptive filter parameters, in a manner that we describe shortly.
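As an illustration of this three-component decomposition, the sketch below generates synthetic data d(n) = y_m(n) + y_u(n) + v(n). All coefficient values, the choice of H_u as a small delayed tap, and the noise level are hypothetical examples, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def filter_iir(b, a, x):
    """Filter x through B(q^-1)/A(q^-1), with a = [1, a_1, ..., a_N] monic."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[i] * y[n - i] for i in range(1, len(a)) if n - i >= 0)
        y[n] = acc
    return y

# Hypothetical modeled part H_m = B_opt/A_opt, unmodeled part H_u, noise v(n)
b_opt, a_opt = [1.0, 0.5], [1.0, -0.8]            # assumed example coefficients
x = rng.standard_normal(10_000)
y_m = filter_iir(b_opt, a_opt, x)                  # modeled component
y_u = 0.05 * np.concatenate(([0.0, 0.0, 0.0], x[:-3]))  # small unmodeled tap
v = 0.1 * rng.standard_normal(len(x))              # noise independent of x
d = y_m + y_u + v                                  # the three-component view
```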
Figure 23.1 shows two error signals: e_e(n), termed the equation error, and e_o(n), termed the output error. The parameters of the adaptive filter are usually adjusted so as to minimize some positive function of one or the other of these error signals. However, the figure of merit for judging adaptive filter performance that we will apply throughout this section is the mean-square output error E{e_o^2(n)}. In most adaptive filtering applications, the desired signal d(n) is available only during a “training phase” in which the filter parameters are adapted. At the conclusion of the training phase, the filter is operated to produce the output signal y(n) as shown in the figure, with the difference between the filter output y(n) and the (now unmeasurable) system output d(n) the error. Thus, we adopt the convention that {a_{i,opt}} and {b_{i,opt}} are defined such that when a_i(n) ≡ a_{i,opt} and b_i(n) ≡ b_{i,opt}, E{e_o^2(n)} is minimized, with A_opt(q^{-1}) constrained to be stable.
At this point it is convenient to set down some notation and terminology. Define the regressor vectors

    U_e(n) = [x(n) ... x(n-M)  -d(n-1) ... -d(n-N)]^T ,     (23.7)
    U_o(n) = [x(n) ... x(n-M)  -y(n-1) ... -y(n-N)]^T ,     (23.8)
    U_m(n) = [x(n) ... x(n-M)  -y_m(n-1) ... -y_m(n-N)]^T .     (23.9)

These vectors are the equation error regressor, output error regressor, and modeled system regressor vectors, respectively. Define a noise regressor vector

    V(n) = [0 ... 0  -v(n-1) ... -v(n-N)]^T ,     (23.10)

with M + 1 leading zeros corresponding to the x(n-i) values in the preceding regressors. Furthermore, define the parameter vectors

    W(n) = [b_0(n) ... b_M(n)  a_1(n) ... a_N(n)]^T ,     (23.11)
    W_opt = [b_{0,opt} ... b_{M,opt}  a_{1,opt} ... a_{N,opt}]^T .     (23.12)

We will have occasion to use W to refer to the adaptive filter parameter vector when the parameters are considered to be held at fixed values. With this notation, we may for instance write y_m(n) = U_m^T(n) W_opt and y(n) = U_o^T(n) W(n).
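In code, any of the regressors (23.7) through (23.9) can be built by the same helper, differing only in which signal supplies the delayed, negated entries. The sketch below (with zero initial conditions assumed) is our own illustration; with W(n) laid out as in (23.11), predictions such as U_e^T(n) W(n) are then plain dot products.

```python
import numpy as np

def regressor(x, s, n, M, N):
    """U(n) = [x(n) ... x(n-M)  -s(n-1) ... -s(n-N)]^T, cf. (23.7)-(23.9).

    Passing s = d gives the equation error regressor U_e(n); s = y gives
    the output error regressor U_o(n); s = y_m gives U_m(n).  Samples
    before time 0 are taken as 0 (zero initial conditions).
    """
    past_x = [x[n - i] if n - i >= 0 else 0.0 for i in range(M + 1)]
    past_s = [-s[n - i] if n - i >= 0 else 0.0 for i in range(1, N + 1)]
    return np.array(past_x + past_s)
```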
The situation in which y_u(n) ≡ 0 is referred to as the sufficient order case. The situation in which y_u(n) ≢ 0 is termed the undermodeled case.
23.1.2 Algorithms and Performance Issues
A number of different algorithms for the adaptation of the parameter vector W(n) in (23.11) have been suggested. These may be characterized with respect to the form of the error criterion employed by the algorithm. Each algorithm attempts to drive to zero either the equation error, the output error, or some combination or hybrid of these two error criteria. Major algorithm classes that we consider for the equation error approach include the standard least-squares (LS) and least mean-square (LMS) algorithms, which parallel the algorithms used in adaptive FIR filtering. For equation error methods, we also examine the instrumental variables (IV) algorithm, as well as algorithms that constrain the parameters in the denominator of the adaptive filter’s transfer function to improve estimation properties. In the output error class, we examine gradient algorithms and hyperstability-based algorithms. Within the equation and output error hybrid algorithm class, we focus predominantly on the Steiglitz-McBride (SM) algorithm, though there are several algorithms that are more straightforward combinations of equation and output error approaches.
In general, we desire that the adaptive filtering algorithm adjust the parameter vector W(n) so that it converges to W_opt, the parameters that minimize the mean-square output error. The major issues for adaptive IIR filtering on which we will focus herein are

1. conditions for the stability and convergence of the algorithm used to adapt W(n), and
2. the asymptotic value of the adapted parameter vector W_∞, and its relationship to W_opt.

This latter issue relates to the minimum mean-square error achievable by the algorithm, as noted above. Other issues of importance include the convergence speed of the algorithm, its ability to track time variations of the “true” parameter values, and numerical properties, but these will receive less attention here. Of these, convergence speed is of particular concern to practitioners, especially as adaptive IIR filters tend to converge at a far slower rate than their FIR counterparts. However, we emphasize the stability and nature of convergence over the speed because, if the algorithm fails to converge or converges to an undesirable solution, the rate at which it does so is of less concern. Furthermore, convergence speed is difficult to characterize for adaptive IIR filters due to a number of factors, including complicated dependencies on algorithm initializations, input signal characteristics, and the relationship between x(n) and d(n).
23.1.3 Some Preliminaries
Unless otherwise indicated, we assume in our discussion that all signals in Fig. 23.1 are stationary, zero mean, random signals with finite variance. In particular, the properties we ascribe to the various algorithms are stated with this assumption and are presumed to be valid. Results that are based on a deterministic framework are similar to those developed here; see [1] for an example.

We shall also make use of the following definitions.
DEFINITION 23.1  A (scalar) signal x(n) is persistently exciting (PE) of order L if, with

    X(n) = [x(n) x(n-1) ... x(n-L+1)]^T ,     (23.15)

there exist α and β satisfying 0 < α < β < ∞ such that αI < E{X(n) X^T(n)} < βI. The (vector) signal X(n) is then also said to be PE.
If x(n) contains at least L/2 distinct sinusoidal components, then x(n) is PE of order L. Any random signal x(n) whose power spectrum is nonzero over an interval of nonzero width will be PE for any value of L in (23.15). Such is the case, for example, if x(n) is uncorrelated or if x(n) is modeled as an AR, MA, or ARMA process driven by uncorrelated noise. PE conditions are required of all adaptive algorithms to ensure good behavior, because if there is inadequate excitation to provide information to the algorithm, convergence of the adapted parameter estimates will not necessarily follow [22].
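In practice a PE condition can be checked only empirically. A minimal NumPy sketch along the lines of Definition 23.1 follows; the sample-average estimate of E{X(n)X^T(n)} and the tolerance are our own choices.

```python
import numpy as np

def is_pe(x, L, tol=1e-8):
    """Empirical check of persistence of excitation of order L.

    Estimates E{X(n) X^T(n)} for X(n) = [x(n) ... x(n-L+1)]^T by a sample
    average and tests that its eigenvalues are bounded away from zero.
    """
    X = np.array([x[n - L + 1:n + 1][::-1] for n in range(L - 1, len(x))])
    R = X.T @ X / len(X)                 # sample estimate of E{X(n) X^T(n)}
    eigs = np.linalg.eigvalsh(R)
    return eigs.min() > tol and np.isfinite(eigs.max())
```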
DEFINITION 23.2  A transfer function H(q^{-1}) is said to be strictly positive real (SPR) if H(q^{-1}) is stable and the real part of its frequency response is positive at all frequencies.

An SPR condition will be required to ensure convergence for a few of the algorithms that we discuss. Note that such a condition cannot be guaranteed in practice when H(q^{-1}) is an unknown transfer function, or when H(q^{-1}) depends on an unknown transfer function.
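Although SPR of an unknown transfer function cannot be verified in practice, for a known candidate H(q^{-1}) = B(q^{-1})/A(q^{-1}) the condition is easy to test numerically on a frequency grid. This sketch is our own; it assumes the stability of A has been checked separately (e.g., via a root test).

```python
import numpy as np

def is_spr(b, a, n_grid=1024):
    """Numerical SPR test: Re{H} > 0 on a dense frequency grid.

    b, a are the coefficient lists of B(q^-1) and A(q^-1) in powers of q^-1.
    """
    w = np.linspace(0.0, np.pi, n_grid)
    qinv = np.exp(-1j * w)                          # q^-1 on the unit circle
    H = np.polyval(b[::-1], qinv) / np.polyval(a[::-1], qinv)
    return bool(np.all(H.real > 0))
```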
23.2 The Equation Error Approach
To motivate the equation error approach, consider again Fig. 23.1. Suppose that y(n) in the figure were actually equal to d(n). Then the system relationship A(q^{-1}, n) y(n) = B(q^{-1}, n) x(n) would imply that A(q^{-1}, n) d(n) = B(q^{-1}, n) x(n). But of course this last equation does not hold exactly, and we term its error the “equation error” e_e(n). Hence, we define

    e_e(n) = A(q^{-1}, n) d(n) - B(q^{-1}, n) x(n) .     (23.16)

Using the notation developed in (23.7) through (23.14), we find that

    e_e(n) = d(n) - U_e^T(n) W(n) .     (23.17)
Equation error methods for adaptive IIR filtering typically adjust W(n) so as to minimize the mean-squared error (MSE) J_MSE(n) = E{e_e^2(n)}, where E{·} denotes statistical expectation, or the exponentially weighted least-squares (LS) error J_LS(n) = \sum_{k=0}^{n} λ^{n-k} e_e^2(k).
23.2.1 The LMS and LS Equation Error Algorithms
The equation error e_e(n) of (23.17) is the difference between d(n) and a prediction of d(n) given by U_e^T(n) W(n). Noting that U_e(n) does not depend on W(n), we see that equation error adaptive IIR filtering is a type of linear prediction, and in particular the form of the prediction is identical to that arising in adaptive FIR filtering. One would suspect that many adaptive FIR filter algorithms would then apply directly to adaptive IIR filters with an equation error criterion, and this is in fact the case. Two adaptive algorithms applicable to equation error adaptive IIR filtering are the LMS algorithm given by

    W(n+1) = W(n) + µ(n) U_e(n) e_e(n) ,     (23.18)

and the RLS algorithm given by

    W(n+1) = W(n) + P(n) U_e(n) e_e(n) ,     (23.19)

    P(n) = (1/λ) [ P(n-1) - ( P(n-1) U_e(n) U_e^T(n) P(n-1) ) / ( λ + U_e^T(n) P(n-1) U_e(n) ) ] ,     (23.20)

where the above expression for P(n) is a recursive implementation of

    P(n) = [ \sum_{k=0}^{n} λ^{n-k} U_e(k) U_e^T(k) ]^{-1} .     (23.21)

These algorithms behave well when the step size µ(n) in (23.18) and the forgetting factor λ in (23.20) are chosen appropriately; a common choice for the step size is the normalized form µ(n) = µ̄/(ε + U_e^T(n) U_e(n)). With the normalized step size, we require 0 < µ̄ < 2 and ε > 0 for stability, with typical choices of µ̄ = 0.1 and ε = 0.001. In (23.20), we require that λ satisfy 0 < λ ≤ 1, with λ typically close to or equal to one, and we initialize P(0) = γI with γ a large, positive number. These results are analogous to the FIR filter cases considered in the earlier sections of this chapter.
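A minimal NumPy sketch of the equation error LMS recursion (23.18), using the normalized step size described above and zero initial conditions, follows; the function name and default values are our own.

```python
import numpy as np

def ee_lms(x, d, M, N, mu_bar=0.1, eps=1e-3):
    """Equation error LMS, per (23.17)-(23.18), with normalized step size.

    Returns the trajectory of W(n) = [b_0 ... b_M, a_1 ... a_N]^T.
    """
    W = np.zeros(M + 1 + N)
    traj = []
    for n in range(len(x)):
        xs = [x[n - i] if n - i >= 0 else 0.0 for i in range(M + 1)]
        ds = [-d[n - i] if n - i >= 0 else 0.0 for i in range(1, N + 1)]
        U_e = np.array(xs + ds)               # equation error regressor (23.7)
        e_e = d[n] - U_e @ W                  # equation error (23.17)
        mu = mu_bar / (eps + U_e @ U_e)       # normalized step size
        W = W + mu * U_e * e_e                # LMS update (23.18)
        traj.append(W.copy())
    return np.array(traj)
```

The RLS recursion (23.19) and (23.20) replaces the scalar step with the matrix gain P(n)U_e(n), at O((N+M)^2) cost per sample.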
These algorithms possess nice convergence properties, as we now discuss.
Property 1: Given that x(n) is PE of order N + M + 1, then under (23.18), or under (23.19) and (23.20), with algorithm parameters chosen to satisfy the conditions noted above, E{W(n)} converges as n → ∞ to a value W_∞ minimizing J_MSE(n) or J_LS(n), respectively.
This property is desirable in that global convergence to parameter values optimal for the equation error cost function is guaranteed, just as with adaptive FIR filters. The convergence result holds whether the filter is operating in the sufficient order case or the undermodeled case. This is an important advantage of the equation error approach over other approaches. The reader is referred to Chapters 19, 20, and 21 for further details on the convergence behaviors of these algorithms and their variations. As in the FIR case, the eigenvalues of the matrix R = E{U_e(n) U_e^T(n)} determine the rates of convergence for the LMS algorithm. A large eigenvalue disparity in R engenders slow convergence in the LMS algorithm, and ill-conditioning, with the attendant numerical instabilities, in the RLS algorithm. For adaptive IIR filters, compared to the FIR case, the presence of d(n) in U_e(n) tends to increase the eigenvalue disparity, so that slower convergence is typically observed for these algorithms.
Of importance is the value of the convergence points for the LMS and RLS algorithms with respect to the modeling assumptions of the system identification configuration of Fig. 23.1. For simplicity, let us first assume that the adaptive filter is capable of modeling the unknown system exactly; that is, H_u(q^{-1}) = 0. One may readily show that the parameter vector W that minimizes the mean-square equation error (or, equivalently, the asymptotic least squares equation error, given ergodic stationary signals) is

    W = E{U_e(n) U_e^T(n)}^{-1} E{U_e(n) d(n)}     (23.22)
      = [ E{U_m(n) U_m^T(n)} + E{V(n) V^T(n)} ]^{-1} [ E{U_m(n) y_m(n)} + E{V(n) v(n)} ] .     (23.23)

Clearly, if v(n) ≡ 0, the W so obtained must equal W_opt, so that we have

    W_opt = E{U_m(n) U_m^T(n)}^{-1} E{U_m(n) y_m(n)} .     (23.24)

By comparing (23.23) and (23.24), we can easily see that when v(n) ≢ 0, W ≠ W_opt. That is, the parameter estimates provided by (23.18) through (23.20) are, in general, biased from the desired values, even when the noise term v(n) is uncorrelated.
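The bias is easy to observe numerically. The following sketch, with hypothetical first-order system values, solves a sample version of the normal equations in (23.22); the estimated denominator coefficient is pulled toward zero by the E{V(n)V^T(n)} term in (23.23).

```python
import numpy as np

# Fit a first-order model (N = 1, M = 0) to data from a matching system,
# with white output noise v(n); all numbers are assumed example values.
rng = np.random.default_rng(1)
a1_opt, b0_opt, sigma_v = -0.9, 1.0, 0.5
x = rng.standard_normal(200_000)
y_m = np.zeros_like(x)
for n in range(1, len(x)):                 # y_m(n) = b0 x(n) - a1 y_m(n-1)
    y_m[n] = b0_opt * x[n] - a1_opt * y_m[n - 1]
d = y_m + sigma_v * rng.standard_normal(len(x))   # sufficient order + noise

U = np.column_stack([x[1:], -d[:-1]])      # rows are U_e(n) = [x(n), -d(n-1)]
W = np.linalg.solve(U.T @ U, U.T @ d[1:])  # sample version of (23.22)
print(W)  # b_0 is near 1.0, but a_1 is biased from -0.9 toward 0
```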
What effect on adaptive filter performance does this bias impose? Since the parameters that minimize the mean-square equation error are not the same as W_opt, the values that minimize the mean-square output error, the adaptive filter performance will not be optimal. Situations can arise in which this bias is severe, with correspondingly significant degradation of performance.
Furthermore, a critical issue with regard to the parameter bias is the input-output stability of the resulting IIR filter. Because the equation error is formed as A(q^{-1}) d(n) - B(q^{-1}) x(n), a difference of two FIR filtered signals, there are no built-in constraints to keep the roots of A(q^{-1}) within the unit circle in the complex plane. Clearly, if an unstable polynomial results from the adaptation, then the filter output y(n) can grow unboundedly in operational mode, so that the adaptive filter fails. An example of such a situation is given in [25]. An important feature of this example is that the adaptive filter is capable of precisely modeling the unknown system, and that interactions of the noise process within the algorithm are all that is needed to destabilize the resulting model.
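Monitoring for this failure mode is straightforward: after each update (or block of updates) one can test whether the adapted denominator is stable, as in this sketch of a root check (the margin parameter is our own addition).

```python
import numpy as np

def denominator_stable(a, margin=0.0):
    """True if all roots of A(q^-1) = 1 + a_1 q^-1 + ... + a_N q^-N lie
    strictly inside the unit circle (optionally with a safety margin).

    a : array [a_1, ..., a_N] of adapted denominator coefficients.
    """
    roots = np.roots(np.concatenate(([1.0], np.asarray(a))))
    return bool(np.all(np.abs(roots) < 1.0 - margin))
```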
Nonetheless, under certain operating conditions, this kind of instability can be shown not to occur, as described in the following.

Property 2: [18] Consider the adaptive filter depicted in Fig. 23.1, where y(n) is given by (23.2). If x(n) is an autoregressive process of order no more than N, and v(n) is independent of x(n) and of finite variance, then the adaptive filter parameters minimizing the mean-square equation error E{e_e^2(n)} are such that A(q^{-1}) is stable.
For instance, if x(n) is an uncorrelated signal, then the convergence point of the equation error algorithms corresponds to a stable filter.

To summarize, for LMS and RLS adaptation in an equation error setting, we have guaranteed global convergence, but bias in the presence of additive noise even in the exact modeling case, and an estimated model guaranteed to be stable only under a limited set of conditions.
23.2.2 Instrumental Variable Algorithms
A number of different approaches to adaptive IIR filtering have been proposed with the intention of mitigating the undesirable bias properties of the LMS- and RLS-based equation error adaptive IIR filters. One such approach, still within the equation error context, is the instrumental variables (IV) method. Observe that the bias problem illustrated above stems from the presence of v(n) in both U_e(n) and e_e(n) in the update terms in (23.18) and (23.19), so that second order terms in v(n) then appear in (23.23). This simultaneous presence creates, in expectation, a nonzero, noise-dependent driving term to the adaptation. The IV algorithm approach addresses this by replacing U_e(n) in these algorithms with a vector U_iv(n) of instrumental variables that are independent of v(n). If U_iv(n) remains correlated with U_m(n), the noiseless regressor, convergence to unbiased filter parameters is then possible. An RLS-style IV algorithm takes the form

    W(n+1) = W(n) + µ(n) P(n) U_iv(n) e_e(n) ,     (23.25)

    P(n) = (1/λ(n)) [ P(n-1) - ( µ(n) P(n-1) U_iv(n) U_e^T(n) P(n-1) ) / ( λ(n) + µ(n) U_e^T(n) P(n-1) U_iv(n) ) ] ,     (23.26)
with λ(n) = 1 - µ(n). Common choices for λ(n) are to set λ(n) ≡ λ_0, a fixed constant in the range 0 < λ_0 < 1 and usually chosen between 0.9 and 0.99, or to choose µ(n) = 1/n and λ(n) = 1 - µ(n). As with RLS methods, P(0) = γI with γ a large, positive number. The vector U_iv(n) is typically chosen as
    U_iv(n) = [x(n) ... x(n-M)  -z(n-1) ... -z(n-N)]^T ,     (23.27)

with either

    z(n) = -x(n-M)   or   z(n) = ( B̄(q^{-1}) / Ā(q^{-1}) ) x(n) .     (23.28)
In the first case, U_iv(n) is then simply an extended regressor in the input x(n), while the second choice may be viewed as a regressor parallel to U_m(n), with z(n) playing the role of y_m(n). For this choice, one may think of Ā(q^{-1}) and B̄(q^{-1}) as fixed filters chosen to approximate A_opt(q^{-1}) and B_opt(q^{-1}), but the exact choice of Ā(q^{-1}) and B̄(q^{-1}) is not critical to the qualitative behavior of the algorithm. In both cases, note that U_iv(n) is independent of v(n), since d(n) is not employed in its construction.
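As a sketch, the two instrument choices in (23.27) and (23.28) can be realized as follows (zero initial conditions assumed; for the second choice, z would instead be x filtered through the fixed B̄/Ā):

```python
import numpy as np

def iv_regressor(x, z, n, M, N):
    """U_iv(n) of (23.27): current/past inputs plus negated, delayed z."""
    xs = [x[n - i] if n - i >= 0 else 0.0 for i in range(M + 1)]
    zs = [-z[n - i] if n - i >= 0 else 0.0 for i in range(1, N + 1)]
    return np.array(xs + zs)

def z_delayed(x, M):
    """First choice in (23.28): z(n) = -x(n - M), so the last N entries of
    U_iv(n) become the further-delayed inputs x(n-M-1), ..., x(n-M-N)."""
    z = np.zeros_like(x)
    z[M:] = -x[:len(x) - M]
    return z
```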
The convergence of the IV algorithm is described by the following property, derived in [15].

Property 3: In the sufficient order case with x(n) PE of order at least N + M + 1, the IV algorithm in (23.25) and (23.26), with U_iv(n) chosen according to (23.27) or (23.28), causes E{W(n)} to converge to W_∞ = W_opt.
There are a few additional technical conditions on A_opt(q^{-1}), B_opt(q^{-1}), Ā(q^{-1}), and B̄(q^{-1}) that are required for the property to hold. These conditions will be satisfied in almost all circumstances; for details, the reader is referred to [15]. This convergence property demonstrates that the IV algorithm does in fact achieve unbiased parameter estimates in the sufficient order case.
In the undermodeled case, little has been said regarding the behavior and performance of the IV algorithm. A convergence point W_∞ must satisfy E{U_iv(n) (d(n) - U_e^T(n) W_∞)} = 0, but no characterization of such points exists if N and M are not of sufficient order. Furthermore, it is possible for the IV algorithm to converge to a point such that 1/A(q^{-1}) is unstable [9].
Notice that (23.25) and (23.26) are similar in form to the RLS algorithm. One may postulate an “LMS-style” IV algorithm as

    W(n+1) = W(n) + µ(n) U_iv(n) e_e(n) ,     (23.29)

which is computationally much simpler than the “RLS-style” IV algorithm of (23.25) and (23.26). However, the guarantee of convergence to W_opt in the sufficient order case, available for the RLS-style algorithm, is now complicated by an additional requirement on U_iv(n) for convergence of the algorithm in (23.29). In particular, all eigenvalues of the matrix

    E{ U_iv(n) U_e^T(n) }     (23.30)

must lie strictly in the right half of the complex plane. Since the properties of U_e(n) depend on the unknown relationship between x(n) and d(n), one is generally unable to guarantee a priori satisfaction of such conditions. This situation has parallels with the stability-theory approach to output error algorithms, as discussed later in this section.
Summarizing the IV algorithm properties, we have that in the sufficient order case, the RLS-style IV algorithm is guaranteed to converge to unbiased parameter values. However, an understanding and characterization of its behavior in the undermodeled case is yet incomplete, and the IV algorithm may produce unstable filters.
23.2.3 Equation Error Algorithms with Unit Norm Constraints
A different approach to mitigating the parameter bias in equation error methods arises as follows. Consider modifying the equation error of (23.17) to

    e_e(n) = A(q^{-1}, n) d(n) - B(q^{-1}, n) x(n) ,     (23.31)

where now

    A(q^{-1}, n) = a_0(n) + a_1(n) q^{-1} + ... + a_N(n) q^{-N} ,     (23.32)

and allowing for adaptation of the new parameter a_0(n). One can view the equation error algorithms that we have already discussed as adapting the coefficients of this version of A(q^{-1}, n), but with a monic constraint that imposes a_0(n) = 1. Recently, several algorithms have been proposed that consider instead equation error methods with a unit norm constraint. In these schemes, one adapts W(n) and a_0(n) subject to the constraint

    a_0^2(n) + a_1^2(n) + ... + a_N^2(n) = 1 .     (23.33)
Property 4: [18] Consider the adaptive filter in Fig. 23.1 with A(q^{-1}, n) given by (23.32), with v(n) an uncorrelated signal and with H_u(q^{-1}) = 0 (the sufficient order case). Then the parameter values W and a_0 that minimize E{e_e^2(n)} subject to the unit norm constraint (23.33) satisfy W/a_0 = W_opt.
That is, the parameter estimates are unbiased in the sufficient order case with uncorrelated output noise. Note that normalizing the coefficients in W by a_0 recovers the monic character of the denominator for W_opt:

    B(q^{-1}) / A(q^{-1}) = ( b_0 + b_1 q^{-1} + ... + b_M q^{-M} ) / ( a_0 + a_1 q^{-1} + ... + a_N q^{-N} )     (23.34)
                          = ( (b_0/a_0) + (b_1/a_0) q^{-1} + ... + (b_M/a_0) q^{-M} ) / ( 1 + (a_1/a_0) q^{-1} + ... + (a_N/a_0) q^{-N} ) .     (23.35)
In the undermodeled case, we have the following.

Property 5: [18] Consider the adaptive filter in Fig. 23.1 with A(q^{-1}, n) given by (23.32). If x(n) is an autoregressive process of order no more than N, and v(n) is independent of x(n) and of finite variance, then the parameter values W and a_0 that minimize E{e_e^2(n)} subject to the unit norm constraint (23.33) are such that A(q^{-1}) is stable. Furthermore, at those minimizing parameter values, if x(n) is an uncorrelated input, then

    E{e_e^2(n)} ≤ σ_{N+1}^2 E{x^2(n)} + E{v^2(n)} ,     (23.36)

where σ_{N+1} is the (N+1)st Hankel singular value of H(z).
Notice that Property 5 is similar to Property 2, except that we have the added bonus of a bound on the mean-square equation error in terms of the Hankel singular values of H(q^{-1}). Note that the (N+1)st Hankel singular value of H(q^{-1}) is related to the achievable modeling error in an Nth order, reduced order approximation to H(q^{-1}) (see [19, Ch. 4] for details). This bound thus indicates that the optimal unit norm constrained equation error filter will in fact do about as well as can be expected with an Nth order filter. However, this adaptive filter will suffer, just as with the equation error approaches with the monic constraint on the denominator, from a possibly unstable denominator if the input x(n) is not an autoregressive process.
An adaptive algorithm for minimizing the mean-square equation error subject to the unit norm constraint can be found in [4]. The algorithm of [4] is formulated as a recursive total least squares algorithm using a two-channel, fast transversal filter implementation. The connection between total least squares and the unit norm constrained equation error adaptive filter implies that the correlation matrices embedded within the adaptive algorithm will be more poorly conditioned than the correlation matrices arising in the RLS algorithm. Consequently, convergence will be slower for the unit norm constrained approach than for the standard, monic constraint approach.
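The recursive total least squares filter of [4] is beyond the scope of a short example, but a batch sketch shows the structure of the unit norm constrained problem: eliminating the unconstrained numerator coefficients leaves a Rayleigh quotient in [a_0 ... a_N], minimized by an eigenvector. The function below is our own offline stand-in, not the algorithm of [4].

```python
import numpy as np

def unit_norm_ee_fit(x, d, M, N):
    """Batch sketch of the unit norm constrained equation error fit.

    Minimizes the sample mean square of
        e_e(n) = sum_i a_i d(n-i) - sum_i b_i x(n-i)           cf. (23.31)
    over ||[a_0 ... a_N]|| = 1, with b unconstrained.  Eliminating b for
    fixed a leaves a Rayleigh quotient, so the optimal a is the minimum
    eigenvector of a Schur complement -- the total least squares
    connection noted above.
    """
    n0 = max(M, N)
    D = np.column_stack([d[n0 - i:len(d) - i] for i in range(N + 1)])
    X = np.column_stack([x[n0 - i:len(x) - i] for i in range(M + 1)])
    Rdd, Rdx, Rxx = D.T @ D, D.T @ X, X.T @ X
    S = Rdd - Rdx @ np.linalg.solve(Rxx, Rdx.T)   # Schur complement
    eigvals, eigvecs = np.linalg.eigh(S)
    a = eigvecs[:, 0]                             # minimum-eigenvalue eigenvector
    if a[0] < 0:                                  # fix sign so that a_0 > 0
        a = -a
    b = np.linalg.solve(Rxx, Rdx.T @ a)           # optimal b for this a
    return a, b   # divide both by a[0] to recover the monic form, cf. (23.35)
```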