Adaptive Filtering
As discussed in previous chapters, filtering refers to a linear process designed to alter the spectral content of an input signal in a specified manner. In Chapters 5 and 6, we introduced techniques for designing and implementing FIR and IIR filters for given specifications. Conventional FIR and IIR filters are time-invariant: they perform linear operations on an input signal to generate an output signal based on fixed coefficients. Adaptive filters are time varying; filter characteristics such as bandwidth and frequency response change with time. Thus the filter coefficients cannot be determined when the filter is implemented. The coefficients of the adaptive filter are adjusted automatically by an adaptive algorithm based on incoming signals. This has the important effect of enabling adaptive filters to be applied in areas where the exact filtering operation required is unknown or is non-stationary.
In Section 8.1, we will review the concepts of random processes that are useful in the development and analysis of various adaptive algorithms. The most popular least-mean-square (LMS) algorithm will be introduced in Section 8.2, and its important properties will be analyzed in Section 8.3. Two widely used modified adaptive algorithms, the normalized and leaky LMS algorithms, will be introduced in Section 8.4. In this chapter, we introduce and analyze the LMS algorithm following the derivation and analysis given in [8]. In Section 8.5, we will briefly introduce some important applications of adaptive filtering. The implementation considerations will be discussed in Section 8.6, and the DSP implementations using the TMS320C55x will be presented in Section 8.7.
8.1 Introduction to Random Processes
A signal is called a deterministic signal if it can be described preciselyand be reproducedexactlyand repeatedly However, the signals encountered in practice are not necessarily
of this type A signal that is generated in a random fashion and cannot be described bymathematical expressions or rules is called a random (or stochastic) signal The signals
in the real world are often random in nature Some common examples of randomsignals are speech, music, and noises These signals cannot be reproduced and need to
be modeled and analyzed using statistical techniques We have briefly introducedprobabilityand random variables in Section 3.3 In this section, we will review theimportant properties of the random processes and introduce fundamental techniquesfor processing and analyzing them
A random process may be defined as a set of random variables. We associate a time function x(n) = x(n, A) with every possible outcome A of an experiment. Each time function is called a realization of the random process, or a random signal. The ensemble of all these time functions (called sample functions) constitutes the random process x(n). If we sample this process at some particular time n0, we obtain a random variable. Thus a random process is a family of random variables.

We may consider the statistics of a random process in two ways. If we fix the time n at n0 and consider the random variable x(n0), we obtain statistics over the ensemble. For example, E[x(n0)] is the ensemble average, where E[·] is the expectation operation introduced in Chapter 3. If we fix A and consider a particular sample function, we have a time function, and the statistics we obtain are temporal. For example, E[x(n, Ai)] is the time average. If the time average is equal to the ensemble average, we say that the process is ergodic. The property of ergodicity is important because in practice we often have access to only one sample function. Since we generally work only with temporal statistics, it is important to be sure that the temporal statistics we obtain are a true representation of the process as a whole.
8.1.1 Correlation Functions
For many applications, one signal is often compared with another in order to determine the similarity between the pair, and to determine additional information based on that similarity. Autocorrelation is used to quantify the similarity between two segments of the same signal. The autocorrelation function of the random process x(n) is defined as

rxx(n, k) = E[x(n)x(k)].

This function specifies the statistical relation of two samples at the different time indices n and k, and gives the degree of dependence between two random variables (n − k) units apart. For example, consider a digital white noise x(n) consisting of uncorrelated random variables with zero mean and variance σx². Its autocorrelation function is

rxx(n, k) = σx² δ(n − k).
The crosscorrelation function is used to measure the degree to which two different signals are similar. The crosscorrelation and crosscovariance functions between two random processes x(n) and y(n) are defined as

rxy(n, k) = E[x(n)y(k)]

and

γxy(n, k) = E{[x(n) − mx(n)][y(k) − my(k)]} = rxy(n, k) − mx(n)my(k). (8.1.5)

Correlation is a very useful DSP tool for detecting signals that are corrupted by additive random noise, measuring the time delay between two signals, determining the impulse response of a system (such as obtaining the room impulse response used in Section 4.5.2), and many others. Signal correlation is often used in radar, sonar, digital communications, and other engineering areas. For example, in CDMA digital communications, data symbols are represented with a set of unique key sequences. If one of these sequences is transmitted, the receiver compares the received signal with every possible sequence from the set to determine which sequence has been received. In radar and sonar applications, the received signal reflected from the target is a delayed version of the transmitted signal. By measuring the round-trip delay, one can determine the location of the target.
Both correlation and covariance functions are extensively used in analyzing random processes. In general, the statistical properties of a random signal, such as the mean, variance, autocorrelation, and autocovariance functions, are time-varying functions. A random process is said to be stationary if its statistics do not change with time. The most useful and relaxed form of stationarity is the wide-sense stationary (WSS) process. A random process is called WSS if the following two conditions are satisfied:
1. The mean of the process is independent of time. That is, mx(n) = E[x(n)] = mx, a constant.

2. The autocorrelation function depends only on the time difference (lag) k. That is, rxx(k) = E[x(n + k)x(n)].

For a WSS process, the autocorrelation function is an even function of the lag, rxx(k) = rxx(−k), and is bounded by |rxx(k)| ≤ rxx(0), where rxx(0) = E[x²(n)] is equal to the mean-squared value, or the power, in the random process.
In addition, if x(n) is a zero-mean random process, we have

rxx(0) = E[x²(n)] = σx².
Thus the autocorrelation function of a signal has its maximum value at zero lag. If x(n) has a periodic component, then rxx(k) will contain the same periodic component.

Example 8.1: Given the sequence x(n) = a^n u(n), 0 < a < 1, the autocorrelation function can be computed as

rxx(k) = a^|k| / (1 − a²).

Example 8.2: Consider the sinusoidal signal expressed as
x(n) = cos(ωn). Find the mean and the autocorrelation function of x(n).

(a) mx = E[cos(ωn)] = 0.

(b) rxx(k) = E[x(n + k)x(n)] = E[cos(ωn + ωk) cos(ωn)]
    = (1/2)E[cos(2ωn + ωk)] + (1/2)cos(ωk) = (1/2)cos(ωk).
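As a quick numerical check of this result, the following MATLAB fragment (a sketch, not taken from the book's software package) estimates rxx(k) of x(n) = cos(ωn) by time averaging over one long realization and compares it with (1/2)cos(ωk); the frequency, record length, and maximum lag are illustrative choices.

% Estimate the autocorrelation of x(n) = cos(w0*n) by time averaging
w0 = 0.2*pi;                                 % assumed test frequency
N  = 10000;                                  % assumed record length
n  = 0:N-1;
x  = cos(w0*n);
maxlag = 20;
rxx = zeros(1, maxlag+1);
for k = 0:maxlag
    rxx(k+1) = mean(x(1+k:N).*x(1:N-k));     % time-average estimate of E[x(n+k)x(n)]
end
disp([rxx(1:5); 0.5*cos(w0*(0:4))]);         % compare with 0.5*cos(w0*k)

The two rows printed by disp should agree to within the small error caused by the finite record length.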
The crosscorrelation function of two WSS processes x(n) and y(n) is defined as

rxy(k) = E[x(n + k)y(n)].
In practice, we only have one sample sequence {x(n)} available for analysis. As discussed earlier, a stationary random process x(n) is ergodic if all its statistics can be determined from a single realization of the process, provided that the realization is long enough. Therefore time averages are equal to ensemble averages when the record length is infinite. Since we do not have data of infinite length, the averages we compute differ from the true values. In dealing with a finite-duration sequence, the sample mean of x(n) is defined as

m̂x = (1/N) Σ_{n=0}^{N−1} x(n),

where N is the length of the sequence x(n). Similarly, the sample autocorrelation function can be estimated as

r̂xx(k) = (1/N) Σ_{n=0}^{N−1−k} x(n + k)x(n), k = 0, 1, ..., N − 1. (8.1.15)

Note that for a given sequence of length N, Equation (8.1.15) generates values for up to N different lags. In practice, we can only expect good results for lags of no more than 5 to 10 percent of the length of the signals.
The autocorrelation and crosscorrelation functions introduced in this section can be computed using the MATLAB function xcorr in the Signal Processing Toolbox. The crosscorrelation function rxy(k) of the two sequences x(n) and y(n) can be computed using the statement
c = xcorr(x, y);
where x and y are length-N vectors and the crosscorrelation vector c has length 2N − 1. The autocorrelation function rxx(k) of the sequence x(n) can be computed using the statement

c = xcorr(x);
See the Signal Processing Toolbox User's Guide for details.
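For instance, the round-trip delay estimation mentioned above can be sketched with xcorr. The fragment below is an illustrative example only; the signal, noise level, and true delay are assumed values, not taken from the text.

% Estimate the delay between a signal and its noisy, delayed echo using xcorr
N = 1000; D = 25;                            % assumed record length and true delay
x = randn(1, N);                             % transmitted (white) signal
y = [zeros(1, D) x(1:N-D)] + 0.5*randn(1, N);% delayed, noisy received signal
[c, lags] = xcorr(y, x);                     % crosscorrelation and corresponding lags
[~, imax] = max(c);                          % the peak occurs at the delay
Dhat = lags(imax);
fprintf('estimated delay = %d samples\n', Dhat);

The lag at which the crosscorrelation peaks is the estimate of the delay, which should equal the assumed value D for a reasonable signal-to-noise ratio.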
The correlation functions represent the time-domain description of the statistics of a random process. The frequency-domain statistics are represented by the power density spectrum (PDS), or autopower spectrum. The PDS is the DTFT (or the z-transform) of the autocorrelation function rxx(k) of a WSS signal x(n), defined as

Pxx(ω) = Σ_{k=−∞}^{∞} rxx(k) e^{−jωk}. (8.1.16)

The same windowing considerations as in (7.3.16) and (7.3.17) apply if the DFT is used in computing the PDS of random signals. Equation (8.1.16) implies that the autocorrelation function is the inverse DTFT of the PDS, which is expressed as

rxx(k) = (1/2π) ∫_{−π}^{π} Pxx(ω) e^{jωk} dω,

or, in terms of the z-transform,

rxx(k) = (1/2πj) ∮ Pxx(z) z^{k−1} dz.
The DTFT of the crosscorrelation function rxy(k) of two WSS signals x(n) and y(n) is given by

Pxy(ω) = Σ_{k=−∞}^{∞} rxy(k) e^{−jωk}.

This function is called the cross-power spectrum.
Example 8.3: The autocorrelation function of a WSS white random process can be defined as

rxx(k) = σx² δ(k),

where σx² is the variance of the process. Taking the DTFT of rxx(k), the corresponding PDS is

Pxx(ω) = σx²,

which is of constant value for all frequencies ω.
Consider a linear and time-invariant digital filter defined by the impulse response h(n), or the transfer function H(z). The input of the filter is a WSS random signal x(n) with the PDS Pxx(ω). As illustrated in Figure 8.1, the PDS of the filter output y(n) can be expressed as

Pyy(ω) = |H(ω)|² Pxx(ω), (8.1.28)

or

Pyy(z) = H(z)H(z⁻¹)Pxx(z), (8.1.29)
Figure 8.1 Linear filtering of random processes (input x(n) with PDS Pxx(ω); filter h(n) with frequency response H(ω); output y(n) with PDS Pyy(ω))
where H(ω) is the frequency response of the filter. Therefore the value of the output PDS at frequency ω depends on the squared magnitude response of the filter and the input PDS at the same frequency.
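The relationship (8.1.28) can be illustrated numerically. The following MATLAB sketch (an assumed example, not from the book's software package) filters unit-variance white noise with a simple FIR filter and compares an averaged FFT-based estimate of Pyy(ω) with |H(ω)|²; the filter coefficients, FFT size, and number of averaged segments are arbitrary choices.

% Verify Pyy(w) = |H(w)|^2 * Pxx(w) for white noise input (Pxx = 1)
b = [1 0.5 0.25];                            % assumed FIR filter coefficients of H(z)
N = 512; M = 200;                            % FFT size and number of averaged segments
Pyy = zeros(1, N);
for m = 1:M
    x = randn(1, N);                         % zero-mean, unit-variance white noise
    y = filter(b, 1, x);
    Pyy = Pyy + abs(fft(y, N)).^2 / N;       % periodogram of the output segment
end
Pyy = Pyy / M;                               % averaged output power spectrum
H = fft(b, N);                               % filter frequency response on the same grid
plot(0:N-1, Pyy, 0:N-1, abs(H).^2);          % the two curves should nearly overlap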
Other important relationships between x(n) and y(n) are

my = E[y(n)] = Σ_{l=−∞}^{∞} h(l) E[x(n − l)] = mx Σ_{l=−∞}^{∞} h(l), (8.1.30)

and

ryx(k) = Σ_{l=−∞}^{∞} h(l) rxx(k − l) = h(k) * rxx(k). (8.1.31)

Taking the z-transform of both sides of (8.1.31), we obtain

Pyx(z) = H(z)Pxx(z).
Example 8.4: Let the system shown in Figure 8.1 be a second-order FIR filter whose input x(n) is the zero-mean white noise given in Example 8.3. Using the I/O equation of the filter and the fact that rxx(k) = σx² δ(k), the output autocorrelation ryy(k) is nonzero only for lags no greater than the filter order, and each nonzero value is proportional to σx².
8.2 Adaptive Filters

Many practical applications involve the reduction of noise and distortion for extraction of information from the received signal. The signal degradation in some physical systems is time varying, unknown, or possibly both. Adaptive filters provide a useful approach for these applications. Adaptive filters modify their characteristics to achieve certain objectives, and usually accomplish the modification (adaptation) automatically. For example, consider a high-speed modem for transmitting and receiving data over telephone channels. It employs a filter called a channel equalizer to compensate for the channel distortion. Since dial-up communication channels have different characteristics on each connection and are time varying, the channel equalizers must be adaptive.

Adaptive filters have received considerable attention from many researchers over the past 30 years. Many adaptive filter structures and adaptation algorithms have been developed for different applications. This chapter presents the most widely used adaptive filter based on the FIR filter with the LMS algorithm. Adaptive filters in this class are relatively simple to design and implement, and are well understood with regard to convergence speed, steady-state performance, and finite-precision effects.
8.2.1 Introduction to Adaptive Filtering
An adaptive filter consists of two distinct parts: a digital filter to perform the desired signal processing, and an adaptive algorithm to adjust the coefficients (or weights) of that filter. A general form of adaptive filter is illustrated in Figure 8.2, where d(n) is a desired signal (or primary input signal), y(n) is the output of a digital filter driven by a reference input signal x(n), and an error signal e(n) is the difference between d(n) and y(n). The function of the adaptive algorithm is to adjust the digital filter coefficients to minimize the mean-square value of e(n).
Figure 8.2 Block diagram of adaptive filter

Figure 8.3 Block diagram of FIR filter for adaptive filtering
Therefore the filter weights are updated so that the error is progressively minimized on a sample-by-sample basis.

In general, there are two types of digital filters that can be used for adaptive filtering: FIR and IIR filters. The choice of an FIR or an IIR filter is determined by practical considerations. The FIR filter is always stable and can provide a linear-phase response. On the other hand, the IIR filter involves both zeros and poles. Unless they are properly controlled, the poles in the filter may move outside the unit circle and make the filter unstable. Because the filter is required to be adaptive, the stability problems are much more difficult to handle. Thus the FIR adaptive filter is widely used for real-time applications. The discussions in the following sections will be restricted to the class of adaptive FIR filters.
The most widely used adaptive FIR filter is depicted in Figure 8.3. Given a set of L coefficients, wl(n), l = 0, 1, ..., L − 1, and a data sequence {x(n), x(n − 1), ..., x(n − L + 1)}, the filter output signal is computed as

y(n) = Σ_{l=0}^{L−1} wl(n) x(n − l). (8.2.1)

Defining the input vector at time n as

x(n) = [x(n) x(n − 1) ... x(n − L + 1)]^T, (8.2.2)

and the weight vector at time n as

w(n) = [w0(n) w1(n) ... wL−1(n)]^T, (8.2.3)

the output signal y(n) in (8.2.1) can be expressed using the vector operation

y(n) = w^T(n)x(n) = x^T(n)w(n). (8.2.4)

The filter output y(n) is compared with the desired response d(n), which results in the error signal

e(n) = d(n) − y(n) = d(n) − w^T(n)x(n). (8.2.5)
In the following sections, we assume that d(n) and x(n) are stationary, and our objective is to determine the weight vector so that the performance (or cost) function is minimized.
8.2.2 Performance Function
The general block diagram of the adaptive filter shown in Figure 8.2 updates the coefficients of the digital filter to optimize some predetermined performance criterion. The most commonly used performance measurement is based on the mean-square error (MSE), defined as

ξ(n) = E[e²(n)]. (8.2.6)
For an adaptive FIR filter, ξ(n) will depend on the L filter weights w0(n), w1(n), ..., wL−1(n). The MSE function can be determined by substituting (8.2.5) into (8.2.6), and is expressed as

ξ(n) = E[d²(n)] − 2p^T w(n) + w^T(n)Rw(n), (8.2.7)

where p is the crosscorrelation vector defined as

p = E[d(n)x(n)] = [rdx(0) rdx(1) ... rdx(L − 1)]^T, (8.2.8)

and

R = E[x(n)x^T(n)] =
[ rxx(0)       rxx(1)       ...  rxx(L − 1)
  rxx(1)       rxx(0)       ...  rxx(L − 2)
  ...          ...          ...  ...
  rxx(L − 1)   rxx(L − 2)   ...  rxx(0)    ]   (8.2.10)

is the autocorrelation matrix of x(n).
Example 8.5: Given an optimum filter illustrated in the following figure:
The optimum filter wo minimizes the MSE cost function ξ(n). Vector differentiation of (8.2.7) gives wo as the solution to

Rwo = p. (8.2.12)

This system of equations defines the optimum filter coefficients in terms of two correlation functions: the autocorrelation function of the filter input and the crosscorrelation function between the filter input and the desired response. Equation (8.2.12) provides a solution to the adaptive filtering problem in principle. However, in many applications the signal may be non-stationary. The linear algebraic solution, wo = R⁻¹p, requires continuous estimation of R and p, a considerable amount of computation. In addition, when the dimension of the autocorrelation matrix is large, the calculation of R⁻¹ may present a significant computational burden. Therefore a more useful algorithm is obtained by developing a recursive method for computing wo, which will be discussed in the next section.
To obtain the minimum MSE, we substitute the optimum weight vector wo = R⁻¹p for w(n) in (8.2.7), resulting in

ξmin = E[d²(n)] − p^T wo. (8.2.13)
Since R is positive semidefinite, the quadratic form on the right-hand side of (8.2.7) indicates that any departure of the weight vector w(n) from the optimum wo would increase the error above its minimum value. In other words, the error surface is concave and possesses a unique minimum. This feature is very useful when we utilize search techniques in seeking the optimum weight vector. In such cases, our objective is to develop an algorithm that can automatically search the error surface to find the optimum weights that minimize ξ(n) using the input signal x(n) and the error signal e(n).

Example 8.6: Consider a second-order FIR filter with two coefficients w0 and
w1, the desired signal d(n) = √2 sin(nω0), n ≥ 0, and the reference signal x(n) = d(n − 1). Find wo and ξmin.

Similar to Example 8.2, we can obtain rxx(0) = E[x²(n)] = E[d²(n)] = 1, rxx(1) = cos ω0, rxx(2) = cos 2ω0, rdx(0) = rxx(1), and rdx(1) = rxx(2). From (8.2.12), we have

wo = R⁻¹p, where R = [ 1  cos ω0 ; cos ω0  1 ] and p = [ cos ω0  cos 2ω0 ]^T,

which gives wo = [ 2 cos ω0  −1 ]^T. From (8.2.13), the minimum MSE is

ξmin = E[d²(n)] − p^T wo = 1 − (2 cos²ω0 − cos 2ω0) = 0.
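The closed-form result of Example 8.6 can be verified numerically. In the following MATLAB sketch (an illustrative fragment; the value of ω0 is chosen arbitrarily), R and p are formed from the correlation values above and wo and ξmin are computed directly.

% Numerical check of Example 8.6: wo = R\p and xi_min = E[d^2] - p'*wo
w0 = 0.3*pi;                                 % assumed normalized frequency
R  = [1 cos(w0); cos(w0) 1];                 % autocorrelation matrix of x(n)
p  = [cos(w0); cos(2*w0)];                   % crosscorrelation vector between d(n) and x(n)
wo = R \ p;                                  % optimum weights, should equal [2*cos(w0); -1]
ximin = 1 - p.'*wo;                          % minimum MSE, should be (numerically) zero
disp(wo.'); disp(ximin);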
Equation (8.2.7) is the general expression for the performance function of an adaptive FIR filter with given weights. That is, the MSE is a function of the filter coefficient vector w(n). It is important to note that the MSE is a quadratic function because the weights appear only to the first and second degrees in (8.2.7). For each coefficient vector w(n), there is a corresponding (scalar) value of MSE. Therefore the MSE values associated with w(n) form an (L + 1)-dimensional space, which is commonly called the MSE surface, or the performance surface.

For L = 2, this corresponds to an error surface in a three-dimensional space. The height of ξ(n) corresponds to the power of the error signal e(n) that results from filtering the signal x(n) with the coefficients w(n). If the filter coefficients change, the power in the error signal will also change. This is indicated by the changing height of the surface above the (w0, w1) plane as the component values of w(n) are varied. Since the error surface is quadratic, a unique filter setting w(n) = wo will produce the minimum MSE, ξmin. In this two-weight case, the error surface is an elliptic paraboloid. If we cut the paraboloid with planes parallel to the (w0, w1) plane, we obtain concentric ellipses of constant mean-square error. These ellipses are called the error contours of the error surface.
Example 8.7: Consider a second-order FIR filter with two coefficients w0 and w1. The reference signal x(n) is a zero-mean white noise with unit variance. The desired signal is given as

d(n) = b0 x(n) + b1 x(n − 1).

Plot the error surface and error contours.
From Equation (8.2.10), we obtain

R = [ rxx(0)  rxx(1) ; rxx(1)  rxx(0) ] = [ 1  0 ; 0  1 ],

and from (8.2.8),

p = [ rdx(0)  rdx(1) ]^T = [ b0  b1 ]^T.

From (8.2.7), we get

ξ = E[d²(n)] − 2p^T w + w^T Rw = (b0² + b1²) − 2b0 w0 − 2b1 w1 + w0² + w1².

Letting b0 = 0.3 and b1 = 0.5, we have

ξ = 0.34 − 0.6w0 − w1 + w0² + w1².

The MATLAB script exam8_7a.m (in the software package) is used to plot the error surface shown in Figure 8.4(a), and the script exam8_7b.m is used to plot the error contours shown in Figure 8.4(b).
Figure 8.4 Performance surface and error contours, L = 2
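The scripts exam8_7a.m and exam8_7b.m mentioned above are supplied in the book's software package; the fragment below is only an assumed sketch of what such plotting code might look like for the MSE function ξ = 0.34 − 0.6w0 − w1 + w0² + w1² derived in Example 8.7 (the grid range and number of contour levels are arbitrary).

% Sketch of plotting the error surface and error contours of Example 8.7
[w0g, w1g] = meshgrid(-4:0.1:4, -4:0.1:4);          % grid of weight values
xi = 0.34 - 0.6*w0g - w1g + w0g.^2 + w1g.^2;        % MSE at each grid point
figure; mesh(w0g, w1g, xi);                         % error (performance) surface
xlabel('w_0'); ylabel('w_1'); zlabel('MSE');
figure; contour(w0g, w1g, xi, 20);                  % error contours (concentric ellipses)
xlabel('w_0'); ylabel('w_1');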
One of the most important properties of the MSE surface is that it has only one global minimum point. At that minimum point, the tangents to the surface must be zero. Minimizing the MSE is the objective of many current adaptive methods such as the LMS algorithm.
8.2.3 Method of Steepest Descent
As shown in Figure 8.4, the MSE of (8.2.7) is a quadratic function of the weights that can be pictured as a positive-concave hyperparabolic surface. Adjusting the weights to minimize the error involves descending along this surface until reaching the 'bottom of the bowl.' Various gradient-based algorithms are available. These algorithms are based on making local estimates of the gradient and moving downward toward the bottom of the bowl. The selection of an algorithm is usually decided by the speed of convergence, the steady-state performance, and the computational complexity.

The steepest-descent method reaches the minimum by following the direction in which the performance surface has the greatest rate of decrease. Specifically, the algorithm follows a path along the negative gradient of the performance surface. The steepest-descent method is an iterative (recursive) technique that starts from some initial (arbitrary) weight vector and improves with an increasing number of iterations. Geometrically, it is easy to see that with successive corrections of the weight vector in the direction of the steepest descent on the concave performance surface, we should arrive at its minimum, ξmin, at which point the weight vector components take on their optimum values. Let ξ(0) represent the value of the MSE at time n = 0 with an arbitrary choice of the weight vector w(0). The steepest-descent technique enables us to descend to the bottom of the bowl, wo, in a systematic way. The idea is to move on the error surface in the direction of the tangent at that point. The weights of the filter are updated at each iteration in the direction of the negative gradient of the error surface.

The mathematical development of the method of steepest descent is easily seen from the viewpoint of a geometric approach using the MSE surface. Each selection of a filter weight vector w(n) corresponds to only one point on the MSE surface, [w(n), ξ(n)]. Suppose that an initial filter setting w(0) on the MSE surface, [w(0), ξ(0)], is arbitrarily chosen. A specific orientation to the surface is then described using the directional derivatives of the surface at that point. These directional derivatives quantify the rate of change of the MSE surface with respect to the w(n) coordinate axes. The gradient of the error surface, ∇ξ(n), is defined as the vector of these directional derivatives.
The concept of steepest descent can be implemented as the following algorithm:

w(n + 1) = w(n) − (μ/2)∇ξ(n), (8.2.14)

where μ is a convergence factor (or step size) that controls stability and the rate of descent to the bottom of the bowl. The larger the value of μ, the faster the speed of descent. The vector ∇ξ(n) denotes the gradient of the error function with respect to w(n), and the negative sign increments the adaptive weight vector in the negative gradient direction. The successive corrections to the weight vector in the direction of the steepest descent of the performance surface should eventually lead to the minimum mean-square error ξmin, at which point the weight vector reaches its optimum value wo. When w(n) has converged to wo, that is, when it reaches the minimum point of the performance surface, the gradient ∇ξ(n) = 0. At this time, the adaptation in (8.2.14) stops and the weight vector stays at its optimum solution. The convergence can be viewed as a ball placed on the 'bowl-shaped' MSE surface at the point [w(0), ξ(0)]. If the ball is released, it will roll toward the minimum of the surface, initially rolling in the direction opposite to the gradient, which can be interpreted as rolling toward the bottom of the bowl.
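As an illustration of (8.2.14), the following MATLAB sketch (an assumed example reusing the R and p of Example 8.6, with an arbitrary frequency and step size) iterates the steepest-descent recursion from w(0) = 0; for the quadratic MSE surface of (8.2.7), the gradient is ∇ξ = 2(Rw − p).

% Steepest-descent iterations w(n+1) = w(n) - (mu/2)*grad, with grad = 2*(R*w - p)
w0f = 0.3*pi;                                % assumed frequency used to build R and p
R   = [1 cos(w0f); cos(w0f) 1];
p   = [cos(w0f); cos(2*w0f)];
mu  = 0.1;                                   % step size, must satisfy 0 < mu < 2/lambda_max
w   = [0; 0];                                % initial weight vector w(0)
for n = 1:200
    grad = 2*(R*w - p);                      % gradient of the MSE surface at w(n)
    w = w - (mu/2)*grad;                     % steepest-descent update (8.2.14)
end
disp(w.');                                   % should be close to [2*cos(w0f), -1]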
The steepest-descent algorithm of (8.2.14) requires knowledge of the true gradient ∇ξ(n) at each iteration, which is not available in practice. The LMS algorithm instead uses the instantaneous squared error, ξ̂(n) = e²(n), as an estimate of the MSE. Therefore the gradient estimate used by the LMS algorithm is

∇̂ξ(n) = 2[∇e(n)]e(n). (8.2.16)

Since e(n) = d(n) − w^T(n)x(n), we have ∇e(n) = −x(n), and the gradient estimate becomes

∇̂ξ(n) = −2x(n)e(n). (8.2.17)

Substituting this gradient estimate into the steepest-descent algorithm of (8.2.14), we have

w(n + 1) = w(n) + μx(n)e(n). (8.2.18)
This is the well-known LMS algorithm, or stochastic gradient algorithm. The algorithm is simple and does not require squaring, averaging, or differentiating. The LMS algorithm provides an alternative method for determining the optimum filter coefficients without explicitly computing the matrix inversion suggested in (8.2.12).

Widrow's LMS algorithm is illustrated in Figure 8.5 and is summarized as follows (a MATLAB sketch of these steps is given after step 4):
1. Determine L, μ, and w(0), where L is the order of the filter, μ is the step size, and w(0) is the initial weight vector at time n = 0.
2. Compute the adaptive filter output

y(n) = Σ_{l=0}^{L−1} wl(n) x(n − l). (8.2.19)
Figure 8.5 Block diagram of an adaptive filter with the LMS algorithm
3. Compute the error signal

e(n) = d(n) − y(n). (8.2.20)
4. Update the adaptive weight vector from w(n) to w(n + 1) by using the LMS algorithm

wl(n + 1) = wl(n) + μ x(n − l)e(n), l = 0, 1, ..., L − 1. (8.2.21)
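The four steps translate almost directly into code. The following MATLAB fragment is an illustrative sketch (the filter order, step size, number of iterations, and sinusoid frequency are assumed values, not taken from the book); it applies the LMS algorithm to the same prediction problem as Example 8.6.

% Minimal LMS adaptive filter following steps 1-4
L  = 2;  mu = 0.01;  w = zeros(L, 1);        % step 1: order, step size, and w(0)
Nit = 5000;  w0f = 0.3*pi;
nvec = 0:Nit-1;
d = sqrt(2)*sin(w0f*nvec);                   % desired signal
x = [0 d(1:end-1)];                          % reference signal x(n) = d(n-1)
xbuf = zeros(L, 1);                          % holds [x(n) x(n-1) ... x(n-L+1)]'
for n = 1:Nit
    xbuf = [x(n); xbuf(1:L-1)];              % update the signal buffer
    y = w.' * xbuf;                          % step 2: filter output, (8.2.19)
    e = d(n) - y;                            % step 3: error signal, (8.2.20)
    w = w + mu * xbuf * e;                   % step 4: LMS update, (8.2.21)
end
disp(w.');                                   % should approach [2*cos(w0f), -1]

For this choice of step size, the weights converge to approximately [2 cos ω0, −1], the optimum solution found in Example 8.6.

8.3 Performance Analysis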
A detailed discussion of the performance of the LMS algorithm is available in many textbooks. In this section, we present some important properties of the LMS algorithm, such as stability, convergence rate, and the excess mean-square error due to gradient estimation error.
8.3.1 Stability Constraint
As shown in Figure 8.5, the LMS algorithm involves the presence of feedback. Thus the algorithm is subject to the possibility of becoming unstable. From (8.2.18), we observe that the parameter μ controls the size of the incremental correction applied to the weight vector as we adapt from one iteration to the next. The mean weight convergence of the LMS algorithm from the initial condition w(0) to the optimum filter wo must satisfy

0 < μ < 2/λmax, (8.3.1)

where λmax is the largest eigenvalue of the autocorrelation matrix R defined in (8.2.10). Applying the stability constraint on μ given in (8.3.1) is difficult because of the computation of λmax when L is large.
In practical applications, it is desirable to estimate λmax using a simple method. From (8.2.10), we have

λmax ≤ Σ_{l=0}^{L−1} λl = tr[R] = L rxx(0) = L Px,

where tr[·] denotes the trace of the matrix and

Px = rxx(0) = E[x²(n)] (8.3.4)

denotes the power of x(n). Therefore setting

0 < μ < 2/(L Px) (8.3.5)

assures that (8.3.1) is satisfied.
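In practice the bound of (8.3.5) can be estimated directly from a block of reference data. A small illustrative MATLAB fragment (the filter order and the white test signal are assumptions):

% Estimate the upper bound on the LMS step size from the measured signal power
L  = 8;                                      % assumed filter order
x  = randn(1, 4096);                         % a record of the reference signal
Px = mean(x.^2);                             % power estimate Px = rxx(0)
mu_max = 2/(L*Px);                           % any 0 < mu < mu_max satisfies (8.3.5)
fprintf('choose 0 < mu < %.4f\n', mu_max);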
Equation (8.3.5) provides some important information on how to select μ, summarized as follows:

1. Since the upper bound on μ is inversely proportional to L, a small μ is used for large-order filters.

2. Since μ is made inversely proportional to the input signal power, weaker signals use a larger μ and stronger signals use a smaller μ. One useful approach is to normalize μ with respect to the input signal power Px. The resulting algorithm is called the normalized LMS algorithm, which will be discussed in Section 8.4.
8.3.2 Convergence Speed
In the previous section, we saw that w(n) converges to wo if the selection of μ satisfies (8.3.1). Convergence of the weight vector w(n) from w(0) to wo corresponds to the convergence of the MSE from ξ(0) to ξmin. Therefore convergence of the MSE toward its minimum value is a commonly used performance measurement in adaptive systems because of its simplicity. During adaptation, the squared error e²(n) is non-stationary as the weight vector w(n) adapts toward wo. The corresponding MSE can thus be defined only based on ensemble averages. A plot of the MSE versus time n is referred to as the learning curve for a given adaptive algorithm. Since the MSE is the performance criterion of LMS algorithms, the learning curve is a natural way to describe the transient behavior.
Each adaptive mode has its own time constant, which is determined by the overall adaptation constant μ and the eigenvalue λl associated with that mode. Overall convergence is clearly limited by the slowest mode. Thus the overall MSE time constant can be approximated as

τmse ≈ 1/(μ λmin). (8.3.6)

Because the upper bound of τmse is inversely proportional to λmin, a small λmin can result in a large time constant (i.e., a slow convergence rate). Unfortunately, if λmax is also very large, the selection of μ will be limited by (8.3.1) such that only a small μ can satisfy the stability constraint. Therefore, if λmax is very large and λmin is very small, from (8.3.6), the time constant can be very large, resulting in very slow convergence. As previously noted, the fastest convergence of the dominant mode occurs for μ = 1/λmax. Substituting this step size into (8.3.6) results in

τmse ≈ λmax/λmin.
For stationary input and sufficiently small μ, the speed of convergence of the algorithm is dependent on the eigenvalue spread (the ratio of the maximum to minimum eigenvalues) of the matrix R.

As mentioned in the previous section, the eigenvalues λmax and λmin are very difficult to compute. However, there is an efficient way to estimate the eigenvalue spread from the spectral dynamic range. That is,

λmax/λmin ≤ max_ω Pxx(ω) / min_ω Pxx(ω).
8.3.3 Excess Mean-Square Error
The steepest-descent algorithm of (8.2.14) requires knowledge of the gradient ∇ξ(n), which must be estimated at each iteration. The estimated gradient ∇̂ξ(n) introduces gradient estimation noise. After the algorithm converges, i.e., when w(n) is close to wo, the true gradient ∇ξ(n) = 0. However, the gradient estimate ∇̂ξ(n) ≠ 0. As indicated by the update Equation (8.2.14), perturbing the gradient will cause the weight vector w(n + 1) to move away from the optimum solution wo. Thus the gradient estimation noise prevents w(n + 1) from staying at wo in steady state. The result is that w(n) varies randomly about wo. Because wo corresponds to the minimum MSE, when w(n) moves away from wo it causes ξ(n) to be larger than its minimum value, ξmin, thus producing excess noise at the filter output.
The excess MSE, which is caused by random noise in the weight vector after convergence, is defined as the average increase of the MSE above ξmin. For the LMS algorithm, it can be approximated as

ξexcess ≈ (μ/2) L Px ξmin. (8.3.9)

This approximation shows that the excess MSE is directly proportional to μ. The larger the value of μ, the worse the steady-state performance after convergence. However, Equation (8.3.6) shows that a larger μ results in faster convergence. There is a design trade-off between the excess MSE and the speed of convergence.
The optimal step size μ is difficult to determine. Improper selection of μ might make the convergence speed unnecessarily slow or introduce excess MSE. If the signal is non-stationary and real-time tracking capability is crucial for a given application, use a larger μ. If the signal is stationary and convergence speed is not important, use a smaller μ to achieve better performance in steady state. In some practical applications, we can use a larger μ at the beginning of the operation for faster convergence, and then use a smaller μ to achieve better steady-state performance.
The excess MSE, ξexcess, in (8.3.9) is also proportional to the filter order L, which means that a larger L results in larger algorithm noise. From (8.3.5), a larger L implies a smaller μ, resulting in slower convergence. On the other hand, a large L also implies better filter characteristics, such as a sharp cutoff. There exists an optimum order L for any given application. The selection of L and μ also affects the finite-precision error, which will be discussed in Section 8.6.

In a stationary environment, the signal statistics are unknown but fixed. The LMS algorithm gradually learns the required input statistics. After convergence to a steady state, the filter weights jitter around the desired fixed values. The algorithm performance is determined by both the speed of convergence and the weight fluctuations in steady state. In the non-stationary case, the algorithm must continuously track the time-varying statistics of the input, and performance is more difficult to assess.
8.4 Modified LMS Algorithms
The LMS algorithm described in the previous section is the most widely used adaptive algorithm for practical applications. In this section, we present two modified algorithms that are direct variants of the basic LMS algorithm.
8.4.1 Normalized LMS Algorithm
The stability, convergence speed, and fluctuation of the LMS algorithm are governed by the step size μ and the reference signal power. As shown in (8.3.5), the maximum stable step size μ is inversely proportional to the filter order L and the power of the reference signal x(n). One important technique to optimize the speed of convergence while maintaining the desired steady-state performance, independent of the reference signal power, is known as the normalized LMS algorithm (NLMS). The NLMS algorithm is expressed as

w(n + 1) = w(n) + μ(n)x(n)e(n), (8.4.1)

where the time-varying step size is normalized by an estimate P̂x(n) of the reference signal power,

μ(n) = α / [L P̂x(n)], (8.4.2)

and α is a normalized step size with 0 < α < 2. The power estimate P̂x(n) can be computed recursively from the incoming reference samples, for example by exponentially weighting past values of x²(n). Two practical considerations are:
1. Choose P̂x(0) as the best a priori estimate of the reference signal power.
2. Since it is not desirable for the power estimate P̂x(n) to be zero or very small, a software constraint is required to ensure that μ(n) is bounded even if P̂x(n) becomes very small when the signal is absent for a long time. This can be achieved by modifying (8.4.2) as

μ(n) = α / [c + L P̂x(n)],

where c is a small positive constant. A MATLAB sketch of the normalized update follows this list.
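The fragment below is a minimal MATLAB sketch of the NLMS update applied to the prediction problem of Example 8.6. The normalized step size alpha, the smoothing constant beta of the recursive power estimator, and the safety constant c are illustrative assumptions consistent with (8.4.1), (8.4.2), and the modification above; they are not values from the text.

% Normalized LMS: the step size is scaled by a running estimate of the input power
L = 2;  alpha = 0.5;  beta = 0.01;  c = 1e-6;    % assumed parameters
Nit = 5000;  w0f = 0.3*pi;  nvec = 0:Nit-1;
d = sqrt(2)*sin(w0f*nvec);                       % desired signal
x = [0 d(1:end-1)];                              % reference signal x(n) = d(n-1)
w = zeros(L,1);  xbuf = zeros(L,1);
Px = 1;                                          % step 1: a priori estimate of the signal power
for n = 1:Nit
    xbuf = [x(n); xbuf(1:L-1)];
    Px   = beta*x(n)^2 + (1-beta)*Px;            % recursive power estimate
    mun  = alpha / (c + L*Px);                   % normalized step size mu(n)
    e    = d(n) - w.'*xbuf;
    w    = w + mun*xbuf*e;                       % NLMS coefficient update
end
disp(w.');                                       % should approach [2*cos(w0f), -1]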
8.4.2 Leaky LMS Algorithm

The leakage technique modifies the LMS weight update of (8.2.21) by multiplying each coefficient by a factor slightly less than one before the correction term is added. That is,

w(n + 1) = νw(n) + μx(n)e(n), (8.4.5)

where ν is the leakage factor with 0 < ν ≤ 1. It can be shown that leakage is the deterministic equivalent of adding low-level white noise.
Therefore this approach results in some degradation in adaptive filter performance. The value of the leakage factor is determined by the designer on an experimental basis as a compromise between robustness and loss of performance of the adaptive filter. The leakage factor introduces a bias on the long-term coefficient estimation. The excess error power due to the leakage is proportional to [(1 − ν)/μ]². Therefore (1 − ν) should be kept smaller than μ in order to maintain an acceptable level of performance. For fixed-point hardware realization, multiplication of each coefficient by ν, as shown in (8.4.5), can lead to the introduction of roundoff noise, which adds to the excess MSE. Therefore the leakage effects must be incorporated into the design procedure for determining the required coefficient and internal data wordlengths. The leaky LMS algorithm not only prevents unconstrained weight overflow, but also limits the output power in order to avoid nonlinear distortion.

8.5 Applications
The desirable features of an adaptive filter are the ability to operate in an unknown environment and to track time variations of the input signals, making it a powerful tool for DSP applications. The essential difference between the various applications of adaptive filtering lies in how the signals x(n), d(n), y(n), and e(n) are connected. There are four basic classes of adaptive filtering applications: identification, inverse modeling, prediction, and interference canceling.
8.5.1 Adaptive System Identification
System identification is an experimental approach to the modeling of a process or a plant. The basic idea is to measure the signals produced by the system and to use them to construct a model. The paradigm of system identification is illustrated in Figure 8.6, where P(z) is an unknown system to be identified and W(z) is a digital filter used to model P(z). By exciting both the unknown system P(z) and the digital model W(z) with the same excitation signal x(n), and measuring the output signals y(n) and d(n), we can determine the characteristics of P(z) by adjusting the digital model W(z) to minimize the difference between these two outputs. The digital model W(z) can be an FIR filter or an IIR filter.
Figure 8.6 Block diagram of adaptive system identification using the LMS algorithm
Adaptive system identification is a technique that uses an adaptive filter for the model W(z). This section presents the application of adaptive estimation techniques for direct system modeling. This technique has been widely applied in echo cancellation, which will be introduced in Sections 9.4 and 9.5. A further application of system modeling is to estimate various transfer functions in active noise control systems [8].

Adaptive system identification is a very important procedure that is used frequently in the fields of control systems, communications, and signal processing. The modeling of a single-input/single-output dynamic system (or plant) is shown in Figure 8.6, where x(n), which is usually white noise, is applied simultaneously to the adaptive filter and the unknown system. The output of the unknown system then becomes the desired signal, d(n), for the adaptive filter. If the input signal x(n) provides sufficient spectral excitation, the adaptive filter output y(n) will approximate d(n) in an optimum sense after convergence.

Identification could mean that a set of data is collected from the system and that a separate procedure is used to construct a model. Such a procedure is usually called off-line (or batch) identification. In many practical applications, however, the model is sometimes needed on-line during the operation of the system. That is, it is necessary to identify the model at the same time that the data set is collected. The model is updated at each time instant that a new data set becomes available. The updating is performed with a recursive adaptive algorithm such as the LMS algorithm.
As shown in Figure 8.6, it is desired to learn the structure of the unknown system from knowledge of its input x(n) and output d(n). If the unknown time-invariant system P(z) can be modeled using an FIR filter of order L, the estimation error is given as

e(n) = d(n) − y(n) = Σ_{l=0}^{L−1} [p(l) − wl(n)] x(n − l), (8.5.1)

where p(l) is the impulse response of the unknown plant. By choosing each wl(n) close to each p(l), the error will be made small. For white-noise input, the converse also holds: minimizing e(n) will force the wl(n) to approach p(l), thus identifying the system. That is,

wl(n) → p(l), l = 0, 1, ..., L − 1. (8.5.2)

The basic concept is that the adaptive filter adjusts itself, intending to cause its output to match that of the unknown system. When the difference between the physical system response d(n) and the adaptive model response y(n) has been minimized, the adaptive model approximates P(z). In actual applications, there will be additive noise present at the adaptive filter input, and so the filter structure will not exactly match that of the unknown system. When the plant is time varying, the adaptive algorithm has the task of keeping the modeling error small by continually tracking the time variations of the plant dynamics.
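The following MATLAB sketch illustrates the system identification configuration of Figure 8.6 for an assumed unknown FIR plant; the plant coefficients, filter order, step size, and data length below are illustrative choices, not values from the text.

% Adaptive system identification of an unknown FIR plant with the LMS algorithm
p  = [0.1 -0.3 0.5 0.2];                     % assumed unknown plant impulse response p(l)
L  = 4;  mu = 0.01;  Nit = 20000;
x  = randn(1, Nit);                          % white-noise excitation
d  = filter(p, 1, x);                        % plant output = desired signal d(n)
w  = zeros(L, 1);  xbuf = zeros(L, 1);
for n = 1:Nit
    xbuf = [x(n); xbuf(1:L-1)];
    e = d(n) - w.'*xbuf;                     % modeling error e(n) of (8.5.1)
    w = w + mu*xbuf*e;                       % LMS update
end
disp([w.'; p]);                              % w(l) should approach p(l), as in (8.5.2)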
8.5.2 Adaptive Linear Prediction
Linear prediction is a classic signal processing technique that provides an estimate of the value of an input process at a future time where no measured data is yet available. The technique has been successfully applied to a wide range of applications such as speech coding and separating signals from noise. As illustrated in Figure 8.7, the time-domain predictor consists of a linear prediction filter in which the coefficients wl(n) are updated with the LMS algorithm. The predictor output y(n) is expressed as

y(n) = Σ_{l=0}^{L−1} wl(n) x(n − Δ − l),

where the delayed input x(n − Δ) serves as the reference signal for the adaptive filter.

Now consider the adaptive predictor for enhancing an input of M sinusoids embedded in white noise, which is of the form
x(n) = s(n) + v(n) = Σ_{m=0}^{M−1} Am sin(ωm n + φm) + v(n), (8.5.5)

where v(n) is white noise with uniform noise power σv². In this application, the structure shown in Figure 8.7 is called the adaptive line enhancer, which provides an efficient means for the adaptive tracking of the sinusoidal components of a received signal x(n) and separates these narrowband signals s(n) from the broadband noise v(n). This technique has been shown to be effective in practical applications where there is insufficient a priori knowledge of the signal and noise parameters.
As shown in Figure 8.7, we want the highly correlated components of x(n) to appear in y(n). This is accomplished by adjusting the weights to minimize the expected mean-square value of the error signal e(n). This causes the adaptive filter W(z) to form a passband around the frequencies of the sinusoidal components of x(n).
Figure 8.7 Block diagram of an adaptive predictor (y(n) is the narrowband output)
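A minimal MATLAB sketch of the adaptive line enhancer is given below. It is an illustrative example only; the sinusoid frequency, noise level, filter length, step size, and delay Δ = 1 are assumptions, not values from the text.

% Adaptive line enhancer: separate a sinusoid s(n) from broadband noise v(n)
Nit = 20000;  nvec = 0:Nit-1;
s = sin(0.1*pi*nvec);                        % narrowband (sinusoidal) component
v = 0.5*randn(1, Nit);                       % broadband white noise
dsig = s + v;                                % noisy input, used as the desired signal d(n)
x = [0 dsig(1:end-1)];                       % reference input x(n) = d(n-1), delay of one sample
L = 32;  mu = 0.002;  w = zeros(L,1);  xbuf = zeros(L,1);
y = zeros(1, Nit);
for n = 1:Nit
    xbuf = [x(n); xbuf(1:L-1)];
    y(n) = w.'*xbuf;                         % narrowband output (enhanced sinusoid)
    e = dsig(n) - y(n);                      % broadband output
    w = w + mu*xbuf*e;                       % LMS update
end
plot(nvec(end-200:end), [s(end-200:end); y(end-200:end)]');  % compare s(n) and y(n)

After convergence, the filter output y(n) closely tracks the sinusoid s(n), while most of the broadband noise remains in the error signal.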