6.4 Analysis of the Adaptive-Q Algorithm: Ideal Case

Part of the document High Performance Control (pages 178-185).

In order to obtain some insight into the behavior of the adaptive algorithm, we analyze the closed-loop system described by (2.4), (3.2) and (3.3), not considering the projection or the leakage modification.

Also, in order that the "criterion minimization" task attempted by the adaptive algorithm be well posed, we assume that the disturbance signal w_k is stationary.

More precisely, we assume that:

Assumption 4.1. Stationary signals: w_k is a bounded signal such that there exist constant matrices E_w, C_w(ℓ), ℓ = 0, 1, . . . such that for all integers N ≥ 1

\[
\Bigl\| \sum_{k=k_0}^{k_0+N-1} \bigl( w_k - E_w \bigr) \Bigr\| \le C_2 N^{\gamma},
\qquad
\Bigl\| \sum_{k=k_0}^{k_0+N-1} \bigl( w_k w_{k-\ell}' - C_w(\ell) \bigr) \Bigr\| \le C_3 N^{\gamma},
\quad \ell = 0, 1, \dots,
\tag{4.1}
\]

for some positive constants C_2, C_3 independent of k_0, and γ ∈ (0, 1).

From the condition (4.1) it follows that the criterion (3.1) is well defined for any fixed Θ. In particular, as N goes to infinity, the Cesàro mean in (3.1) converges to J(Θ) at the same rate as N^{-1+γ} converges to 0.
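The convergence of the Cesàro mean is easy to illustrate numerically. The sketch below uses a hypothetical stationary signal w_k = 1.5 + sin(0.7k), which is not from the text; its sinusoidal part has bounded partial sums, so the running mean settles to E_w = 1.5 at roughly the 1/N rate, comfortably within the N^{-1+γ} rate allowed by Assumption 4.1.

```python
import math

def cesaro_mean(n_terms, mean=1.5, freq=0.7):
    """Cesàro (running arithmetic) mean of w_k = mean + sin(freq * k)."""
    total = sum(mean + math.sin(freq * k) for k in range(n_terms))
    return total / n_terms

# The sinusoid has bounded partial sums, so the Cesàro mean converges
# to E_w = 1.5; the deviation shrinks roughly like 1/N, well within the
# N^(gamma - 1) rate permitted by Assumption 4.1.
errors = [abs(cesaro_mean(N) - 1.5) for N in (100, 1000, 10000)]
```

Any bounded signal with suitably decaying sample averages would serve equally well here; the sinusoid merely makes the bounded-partial-sum property explicit.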

We also assume that the disturbance is sufficiently rich. In particular:

Assumption 4.2. Exciting signals: w_k is such that for any stable, rational, matrix transfer function H ∈ RH_∞^{(m+p)} the signal w_f = Hw satisfies

\[
\Bigl\| \sum_{k=k_0}^{k_0+N-1} \bigl( w_{f,k} w_{f,k}' - C_{w_f} \bigr) \Bigr\| \le C_4 N^{\gamma},
\tag{4.2}
\]

for some C_4 > 0, some symmetric positive definite C_{w_f}, and γ ∈ (0, 1). Under Assumption 4.2 there exists a unique Θ* such that J(Θ*) ≤ J(Θ) for all Θ; indeed, J(Θ) is quadratic in Θ with positive definite Hessian.
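The quadratic structure behind this uniqueness claim can be made explicit. As a sketch, anticipating the Γ, E notation used later in Theorem 4.5:

```latex
J(\Theta) = \tfrac{1}{2}\,\operatorname{vec}\Theta'\,\Gamma\,\operatorname{vec}\Theta
            - E'\operatorname{vec}\Theta + c,
\qquad
\frac{\partial J}{\partial \operatorname{vec}\Theta}
  = \Gamma\,\operatorname{vec}\Theta - E,
```

so when the Hessian Γ is positive definite, the stationarity condition Γ vec Θ = E has the unique solution vec Θ* = Γ^{-1} E.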

Remark. Assumption 4.2 not only guarantees the existence of a unique minimizer Θ* of J(Θ) but unfortunately also excludes the possibility of exact tracking. Indeed, Assumption 4.2 implies, inter alia, that the spectral content of w is not finite.

Under Assumption 4.1, and provided µ is sufficiently small, we can use averaging results to analyze the behavior of the adaptive system. For an overview of the required results from averaging theory, see Appendix C. The following result is established.

Theorem 4.3. Consider the adaptive system described by (2.4), (3.2) and (3.3). Let A+BF, A+HC, A_q be stable matrices such that (2.5) and (2.6) are satisfied. Let the disturbance signal w_k = (w_{1k}', w_{2k}')' satisfy Assumptions 4.1 and 4.2, and ‖w_k‖ ≤ W. Then there exists a positive µ* such that for all µ ∈ (0, µ*) the adaptive system has the properties:

1. The system state is bounded: for all initial conditions x_0, x̃_0, z_0, Θ_0,

\[
\|\Theta_k\| \le \theta \quad \text{for some } \theta > 0,
\]
\[
(|\tilde{x}_k|, |z_k|) \le C_0^k (|\tilde{x}_0|, |z_0|) + C_0 (1 - \lambda_0^k) W,
\]
\[
|x_k| \le C_1^k (|x_0| + |\tilde{x}_0| + |z_0| \theta) + C_1 (1 + \theta)(1 - \lambda_1^k) W,
\]

for all k = 0, 1, . . . , for some constants C_0, C_1 independent of µ, W, θ. See also (2.7) and (2.8).

2. Near optimality:

\[
\limsup_{k \to \infty} \| \Theta_k - \Theta^* \| \le C_5\, \mu^{(1-\gamma)/2},
\tag{4.3}
\]

where Θ* is the unique minimizer of the criterion (3.1). The constant C_5 is independent of µ.

3. Convergence is exponential:

\[
\| \Theta_k - \Theta^* \| \le C_6 (1 - L\mu)^k \| \Theta_0 - \Theta^* \| \quad \text{for all } k = 0, 1, \dots,
\tag{4.4}
\]

for some C_6 > 1 and L > 0 independent of µ.

Proof.* Define X_k(Θ) as the state solution of the system (2.4), denoted X_k(Θ_k), with Θ_k ≡ Θ. In a similar manner define e_k(Θ) as its output. These correspond to the so-called frozen system state and output, zero-adaptation approximations for the case when Θ_k ≈ Θ. Because Θ_k is slowly time varying, and because of the stability of the matrices A+BF, A_q and A+HC, it follows that for sufficiently small µ, provided |Θ_k| ≤ θ,

\[
|X_k - X_k(\Theta_k)| \le C_7 \mu, \qquad |e_k - e_k(\Theta_k)| \le C_7 \mu, \quad \text{some } C_7 > 0, \text{ for all } k : |\Theta_k| \le \theta.
\]

* This proof may be omitted on first reading.

Now, (3.2) may be written as

\[
\Theta_{ij,k+1} = \Theta_{ij,k} - \mu\, \gamma_{ij,k}'\, R\, e_k(\Theta_k) + O(\mu^2),
\]

which is valid at least on a time interval k ∈ (0, M/µ), for some M > 0.

The averaged equation becomes

\[
\Theta_{ij,k+1}^{av} = \Theta_{ij,k}^{av} - \mu \left. \frac{\partial J(\Theta)}{\partial \Theta_{ij}} \right|_{\Theta = \Theta_k^{av}}.
\]

Here we observe that for all finite Θ:

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{k=k_0}^{k_0+N-1} \gamma_{ij,k}'\, R\, e_k(\Theta) = \frac{\partial J(\Theta)}{\partial \Theta_{ij}}.
\]

This follows from Assumption 4.1. Provided µ is sufficiently small, it follows that Θ_k^{av} is bounded and converges to Θ*. Indeed, the averaged equation is a steepest descent algorithm for the cost function J(Θ). In view of Assumption 4.2, J(Θ) has a unique minimum, which is therefore a stable and attractive equilibrium; see Hirsch and Smale (1974).

The result then follows from the standard averaging theorems presented in Appendix C, in particular, Theorem C.4.2.
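The steepest-descent character of the averaged equation can be made concrete with a minimal numerical sketch. The quadratic criterion below (matrices G and e) is purely illustrative, standing in for J(Θ) of (3.1); the loop is the averaged update with a small gain µ.

```python
import numpy as np

# Hypothetical quadratic criterion J(theta) = 0.5 theta' G theta - e' theta,
# standing in for J(Theta) of (3.1); G is the positive definite Hessian.
G = np.array([[2.0, 0.5],
              [0.5, 1.0]])
e = np.array([1.0, -1.0])
theta_star = np.linalg.solve(G, e)         # unique minimizer: grad J = 0

mu = 0.05                                  # small adaptation gain
theta = np.zeros(2)
for _ in range(2000):
    theta = theta - mu * (G @ theta - e)   # averaged steepest-descent update

gap = np.linalg.norm(theta - theta_star)
```

Since G is positive definite and µ is small, each step contracts the error towards Θ*, mirroring the exponential convergence asserted in Theorem 4.3.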

Remarks.

1. The same result can be derived under the weaker signal assumption:

Assumption 4.4. The external signal w is such that there exists a unique minimizer Θ* for the criterion (3.1).

Assumption 4.4 allows for situations where exact output tracking can be achieved.

2. Generally, the adaptive algorithm achieves near optimal performance in an exponential manner. Of course, the convergence has a large time constant, since µ is small.

3. The real advantage of the adaptive algorithm is its ability to track near optimal performance when w_k is not stationary. Indeed, we can infer from Theorem 4.3 that, provided the signal w_k is well approximated by a stationary signal over a time horizon of the order of 1/µ, the adaptive algorithm will maintain near optimal performance regardless of the time-varying characteristics of the signal.

If we consider the adaptive algorithm with leakage, we may establish a result which no longer requires sufficiently rich external signals. In this situation, there is not necessarily a unique minimizer for the criterion (3.1). The following result holds:

Theorem 4.5. Consider the adaptive system described by (2.4), (3.3) and (3.5). Let A+BF, A+HC and A_q be stable matrices satisfying the conditions (2.5) and (2.6). Let the external signal be stationary, satisfying Assumption 4.1, and ‖w_k‖ ≤ W. Then there exists a positive µ* such that for all µ ∈ (0, µ*) the adaptive system has the properties:

1. The system state is bounded for all possible initial conditions x_0, x̃_0, z_0, Θ_0:

\[
\|\Theta_k\| \le \theta \quad \text{for some } \theta > 0,
\]
\[
(|\tilde{x}_k|, |z_k|) \le C_0^k (|\tilde{x}_0|, |z_0|) + C_0 (1 - \lambda_0^k) W,
\]
\[
|x_k| \le C_1^k (|x_0| + |\tilde{x}_0| + |z_0| \theta) + C_1 (1 + \theta)(1 - \lambda_1^k) W,
\]

for all k = 0, 1, . . . and constants C_0, C_1 > 0 independent of µ, W and θ.

2. Writing ∂J/∂Θ = 0 in the form Γ vec Θ − E = 0†, there exists a

\[
\operatorname{vec} \Theta^* = (\Gamma + \lambda I)^{-1} E
\]

such that, for some C_2 independent of µ,

\[
\limsup_{k \to \infty} \| \Theta_k - \Theta^* \| \le C_2\, \mu^{(1-\gamma)/2}.
\]

3. Convergence is exponential (for some C_3 > 0, L > 0):

\[
\| \Theta_k - \Theta^* \| \le C_3 (1 - L\mu)^k \| \Theta_0 - \Theta^* \|, \quad k = 0, 1, \dots.
\]

Proof. The proof follows along the same lines as the proof of Theorem 4.3. It suffices to observe that the averaged version of equation (3.4), governing the update for the estimate Θ_k, becomes:

\[
\Theta_{ij,k+1}^{av} = \Theta_{ij,k}^{av} - \mu \left( \left. \frac{\partial J(\Theta)}{\partial \Theta_{ij}} \right|_{\Theta = \Theta_k^{av}} + \lambda\, \Theta_{ij,k}^{av} \right).
\tag{4.5}
\]

The existence of the averages is guaranteed by Assumption 4.1. It follows that there exists a unique equilibrium for (4.5), given by Θ*. Because Γ = Γ' ≥ 0 and Γ + λI > 0 for all λ > 0, Θ* is a locally exponentially stable solution of (4.5), that is, for sufficiently small µ, such that |eig(I − µ(Γ + λI))| < 1. This establishes the result upon invoking Theorem C.4.2.
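A numerical sketch of the leaky averaged update (4.5), with illustrative data: Γ is deliberately singular (information in one direction only), yet the leakage term λΘ makes the iteration contract to the unique equilibrium vec Θ* = (Γ + λI)^{-1}E.

```python
import numpy as np

# Rank-deficient "Hessian": information only in direction g1 (Gamma singular),
# as in the leakage discussion; all numerical values are illustrative.
g1 = np.array([1.0, 0.0])
Gamma = np.outer(g1, g1)                  # Gamma = g1 g1', psd but singular
E = g1 * 0.8                              # E = g1 * a1
lam, mu = 0.1, 0.05

theta = np.array([2.0, 3.0])              # start away from the minimizing set
for _ in range(5000):
    # averaged leaky update (4.5): gradient step plus leakage term
    theta = theta - mu * ((Gamma @ theta - E) + lam * theta)

theta_star = np.linalg.solve(Gamma + lam * np.eye(2), E)
```

Note that the unexcited second coordinate, which a pure gradient iteration would leave to drift, is pulled to zero by the leakage, which is exactly the boundedness mechanism described in the remarks below.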

Remarks.

1. The result of Theorem 4.5 establishes the rationale for having the exponential forgetting factor (1 − µλ) in (3.5) satisfy 0 < 1 − µλ < 1. The exponential forgetting of past observations should not dominate the adaptive mechanism; otherwise the only benefit which can be derived from the adaptation will be due to the most recent measurements.

† vec Θ denotes the column vector obtained by collecting all columns of the matrix Θ from left to right and stacking them under one another.

2. The minimizers of the criterion (3.1) are of course the solutions of Γ vec Θ − E = 0. In the case of Γ being only positive semi-definite and not positive definite, there is a stable linear subspace of the Θ parameter space achieving optimal performance. In this case the exponential forgetting is essential to maintain bounded Θ. Without it, the adaptation mechanism will force Θ_k towards this linear subspace of minimizers, and subsequently Θ_k will wander aimlessly in this subspace. Under such circumstances, there are combinations of w signals that will cause ‖Θ_k‖ to become unbounded. The forgetting factor λ prevents this.

3. The forgetting factor guarantees boundedness, but at the cost of not achieving optimality. Indeed, Θ* (as established in Theorem 4.5, Item 2) solves (Γ + λI) vec Θ = E, not Γ vec Θ = E. The penalty for this is, however, a small one. If Γ were invertible, then for sufficiently small λ,

\[
\operatorname{vec} \Theta^* = \Gamma^{-1} E - \lambda \Gamma^{-2} E + \lambda^2 \Gamma^{-3} E - \cdots.
\]

Hence there is but an order-λ error between the obtained Θ* and the optimal one, Γ^{-1}E. More generally, Γ and E may be expressed as

\[
\Gamma = \sum_{i=1}^{k} \Gamma_i \Gamma_i', \qquad E = \sum_{i=1}^{k} \Gamma_i a_i,
\]

with Γ_i'Γ_i = 1, i = 1, . . . , k, and Γ_i'Γ_j = 0 for i ≠ j, i, j = 1, . . . , k. The Γ_i may be interpreted as those directions in parameter space in which information is obtained from the external signal w. If k < dim(vec Θ), that is, Γ is singular, then we have

\[
\Lambda_j' \operatorname{vec} \Theta^* = 0, \quad j = k+1, \dots, \dim(\operatorname{vec} \Theta),
\]
\[
\Gamma_i' \operatorname{vec} \Theta^* = \frac{a_i}{1 + \lambda}, \quad i = 1, \dots, k,
\]

where the Λ_j, j = k+1, . . . , dim(vec Θ), complement the Γ_i, i = 1, . . . , k, to form an orthonormal basis for the parameter space. Again we see that near optimal performance is obtained.
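These subspace formulas are easy to verify numerically. The sketch below (values illustrative, not from the text) builds a singular Γ from k = 2 orthonormal directions in a 3-dimensional parameter space and checks that the excited directions recover a_i/(1+λ), while the complementary direction carries nothing.

```python
import numpy as np

# Orthonormal information directions Gamma_i and corresponding a_i;
# the numerical values are illustrative.
G1 = np.array([1.0, 0.0, 0.0])
G2 = np.array([0.0, 1.0, 0.0])
a1, a2 = 0.6, -0.4

Gamma = np.outer(G1, G1) + np.outer(G2, G2)   # k = 2 < dim = 3: singular
E = G1 * a1 + G2 * a2
lam = 0.05

# Equilibrium of the leaky adaptation: vec Theta* = (Gamma + lam I)^{-1} E.
theta_star = np.linalg.solve(Gamma + lam * np.eye(3), E)
```

Here the third coordinate plays the role of Λ_3: no information arrives in that direction, and the leaky solution leaves it at zero, while the excited coordinates are attenuated by the factor 1/(1+λ).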

6.5 Q-augmented Controller Structure: Plant-model Mismatch

In Sections 6.2, 6.3 and 6.4, the ideal situation where the plant is precisely modeled has been discussed. Normally, we expect the model to be an approximation.

Let the controller be as defined in (2.2) and (2.3). The nominal controller corresponds to Θ = 0. As usual, we assume that the nominal controller stabilizes the actual plant. Using the representations developed in Chapters 2 and 3, the plant is here represented by:

\[
\begin{bmatrix} x_{k+1} \\ v_{k+1} \\ y_k \end{bmatrix}
=
\begin{bmatrix}
A & H C_s & B & B & 0 \\
-B_s F & A_s & B_s & B_s & 0 \\
C & C_s & D & 0 & I
\end{bmatrix}
\begin{bmatrix} x_k \\ v_k \\ u_k \\ w_{1k} \\ w_{2k} \end{bmatrix},
\tag{5.1}
\]

where

\[
G : \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]

is the realization for the model (see also (2.1)), and

\[
S : \begin{bmatrix} A_s & B_s \\ C_s & 0 \end{bmatrix}
\tag{5.2}
\]

represents a stable system characterizing the plant-model mismatch. The vector v_k ∈ R^s is the state associated with the unmodeled dynamics. The matrices H and F are, respectively, the gain matrices used to stabilize the observer and the nominal system model. The representation (5.1) is derived from the earlier result (3.5.1), with the assumption D_s = 0 and using a nominal stabilizing controller in observer form, that is, with the Z of (3.5.4) having the form

\[
Z := \begin{bmatrix} M & U \\ N & V \end{bmatrix}
 := \begin{bmatrix}
A + BF & B & H \\
F & I & 0 \\
C + DF & D & I
\end{bmatrix}.
\tag{5.3}
\]

From the results of Chapter 3, we recall that the controller (2.2) with s_k ≡ 0, that is, the nominal controller, stabilizes any system of the form (5.1) as long as A_s is a stable matrix. It is also clear that any system of the form (5.1) with A_s stable and C_s = 0 is stabilized by the controller (2.2) with the stable Q-filter (2.3). More importantly, it is established in Theorem 3.4.2 that the controlled system described by equations (5.1), (2.2) and (2.3) with Θ_k ≡ Θ is stable if and only if the matrix

\[
\begin{bmatrix} A_s & B_s \Theta \\ -B_q C_s & A_q \end{bmatrix}
\tag{5.4}
\]

is a stable matrix, that is, has all its eigenvalues in the open unit disk. This condition is equivalent to requiring that the pair (Q, S) be stabilizing. It is clear that some prior knowledge about the "unmodeled dynamics" S is required in order to be able to guarantee that (Q, S) is a stabilizing pair. (Also recall that this result does not require A_s to be stable!)
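The eigenvalue condition on (5.4) is straightforward to test numerically. The sketch below (with hypothetical scalar data, not from the text) checks whether the (Q, S) interconnection matrix has spectral radius below one; the sign convention on B_qC_s follows (5.5).

```python
import numpy as np

def q_s_loop_stable(A_s, B_s, C_s, A_q, B_q, Theta):
    """True iff the (5.4)-type matrix has all eigenvalues in the open
    unit disk, i.e. the pair (Q, S) is stabilizing (Theorem 3.4.2)."""
    top = np.hstack([A_s, B_s @ Theta])
    bot = np.hstack([-B_q @ C_s, A_q])
    M = np.vstack([top, bot])
    return np.max(np.abs(np.linalg.eigvals(M))) < 1.0

# Illustrative scalar data: stable A_s and A_q.
A_s = np.array([[0.5]]); B_s = np.array([[1.0]]); C_s = np.array([[1.0]])
A_q = np.array([[0.3]]); B_q = np.array([[1.0]])

ok_small = q_s_loop_stable(A_s, B_s, C_s, A_q, B_q, Theta=np.array([[0.1]]))
ok_large = q_s_loop_stable(A_s, B_s, C_s, A_q, B_q, Theta=np.array([[5.0]]))
```

For this toy data a small Θ leaves the loop stable while a large Θ destabilizes it, illustrating why the adaptation must be confined to a bounded Θ region.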

The closed loop, apart from the adaptation algorithm, may be represented in state-space form as follows:

\[
\begin{bmatrix} x_{k+1} \\ v_{k+1} \\ z_{k+1} \\ \tilde{x}_{k+1} \\ y_k \\ u_k \end{bmatrix}
=
\begin{bmatrix}
A + BF & H C_s & B \Theta_k & -BF & B & 0 \\
0 & A_s & B_s \Theta_k & -B_s F & B_s & 0 \\
0 & -B_q C_s & A_q & -B_q C & 0 & -B_q \\
0 & 0 & 0 & A + HC & B & H \\
C + DF & C_s & D \Theta_k & -DF & 0 & I \\
F & 0 & \Theta_k & -F & 0 & 0
\end{bmatrix}
\begin{bmatrix} x_k \\ v_k \\ z_k \\ \tilde{x}_k \\ w_{1,k} \\ w_{2,k} \end{bmatrix}.
\tag{5.5}
\]

The state variable is now X_k = (x_k' v_k' z_k' x̃_k')', where again x̃_k is the state estimation error x̃_k = x_k − x̂_k.

Obviously the stability of the above closed-loop system (5.5) hinges on the stability of the matrix

\[
\begin{bmatrix} A_s & B_s \Theta_k \\ -B_q C_s & A_q \end{bmatrix}.
\]

Due to the presence of the unmodeled dynamics, it is also obvious that the interconnection between the disturbance signal w_k and our performance signal e_k = (y_k' u_k')' is no longer affine in Θ_k; C_s ≠ 0 is the offending matrix.

In order to present the equivalent of Lemma 2.1 for the system (5.5) we make the following assumption concerning the effect of the unmodeled dynamics:

Assumption 5.1. There exists a positive constant Θ_s such that for all ‖Θ‖ ≤ Θ_s,

\[
\Bigl| \operatorname{eig} \begin{bmatrix} A_s & B_s \Theta \\ -B_q C_s & A_q \end{bmatrix} \Bigr| < \lambda_s < \lambda_0 < 1
\tag{5.6}
\]

for some λ_s > 0. (λ_0 is defined in (2.6).)

In essence we are requiring that the unmodeled dynamics be well outside the bandwidth of the nominal controller, to the extent that, for any Θ modification of the controller with "gain" bounded by Θ_s, the dominant dynamics are determined by the nominal model and nominal controller system. The bound Θ_s can be conservatively estimated from a small gain argument. In order to obtain a reasonable margin for adaptive control, with Θ_s not too small, it is important that the nominal controller have small gain in the frequency band where the unmodeled dynamics are significant.

The small gain argument is obtained as follows. Denote S(z) = C_s(zI − A_s)^{-1}B_s and H_2(z) = (zI − A_q)^{-1}B_q. Then the interconnection of (Q, S) will be stable provided that

\[
\| \Theta H_2(z) S(z) \| < 1 \quad \text{on } |z| = 1,
\]

or

\[
\| \Theta \| < \frac{1}{\| H_2(z) S(z) \|} \quad \text{on } |z| = 1.
\tag{5.7}
\]

From these inequalities we deduce that, in order to have an effective plug-in controller Q, the controller transfer function H_2(z) should be negligible in the frequency band where significant model-plant mismatch is expected. As explained before, due to the frequency weighting, S(z) is only significant outside the passband of the nominal controller.
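The bound (5.7) can be estimated by gridding the unit circle. This sketch (hypothetical scalar data, not from the text) computes the maximum of ‖H_2(z)S(z)‖ over a frequency grid and returns the resulting conservative bound on ‖Θ‖.

```python
import numpy as np

def small_gain_theta_bound(A_s, B_s, C_s, A_q, B_q, n_grid=512):
    """Conservative bound on ||Theta|| from (5.7): 1 / max_{|z|=1} ||H2(z) S(z)||,
    with S(z) = C_s (zI - A_s)^-1 B_s and H2(z) = (zI - A_q)^-1 B_q."""
    I_s = np.eye(A_s.shape[0])
    I_q = np.eye(A_q.shape[0])
    worst = 0.0
    for w in np.linspace(0.0, np.pi, n_grid):
        z = np.exp(1j * w)
        S = C_s @ np.linalg.solve(z * I_s - A_s, B_s)
        H2 = np.linalg.solve(z * I_q - A_q, B_q)
        worst = max(worst, np.linalg.norm(H2 @ S, 2))   # largest singular value
    return 1.0 / worst

# Scalar illustration: slow unmodeled dynamics (pole at 0.9)
# and a Q-filter pole at 0.3; the worst case sits at z = 1.
bound = small_gain_theta_bound(np.array([[0.9]]), np.array([[1.0]]),
                               np.array([[1.0]]), np.array([[0.3]]),
                               np.array([[1.0]]))
```

For these numbers the peak of |H_2S| occurs at z = 1, giving a bound of 1/((1/0.1)(1/0.7)) = 0.07; a finer grid or a true H-infinity norm computation would only tighten such an estimate.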

With the above assumption (5.6) we are now in a position to state the equivalent of Lemma 2.1 for the system (5.5).

Lemma 5.2. Consider the system (5.5). Let Assumption 5.1 hold. Let ‖w_k‖ ≤ W and let conditions (2.5) and (2.6) from Lemma 2.1 hold. There exists a Δ > 0 such that for all sequences Θ_k with ‖Θ_k‖ ≤ Θ_s and ‖Θ_{k+1} − Θ_k‖ ≤ Δ, the state of the system (5.5) is bounded. In particular, there exist positive constants C_0, C_1 such that:

\[
(|z_k|, |v_k|, |\tilde{x}_k|) \le C_0^k (|z_0|, |v_0|, |\tilde{x}_0|) + C_0 (1 - \lambda_0^k) W,
\]
\[
|x_k| \le C_1^k (|x_0| + |\tilde{x}_0| + |v_0| + \Theta_s |z_0|) + C_1 (1 + \Theta_s)(1 - \lambda_1^k) W.
\]

Lemma 5.2 is considerably weaker than Lemma 2.1. Due to the presence of the unmodeled dynamics, we not only need to restrict the amount of adaptation (‖Θ_k‖ ≤ Θ_s), but also need to restrict the adaptation speed, at least if we want to guarantee that adaptation is not going to alter the stability properties of the closed loop. We argue that the requirement that adaptation improve the performance of a nominal control design is essential in practical applications. From this perspective, Lemma 5.2 establishes the minimal (albeit conservative) requirements that need to be imposed on any adaptation algorithm we wish to implement. It provides yet another pointer to why slow adaptation is essential.
