10 Change detection based on multiple models
10.1 Basics
10.2 Examples of applications
10.3 On-line algorithms
10.3.1 General ideas
10.3.2 Pruning algorithms
10.3.3 Merging strategies
10.3.4 A literature survey
10.4 Off-line algorithms
10.4.1 The EM algorithm
10.4.2 MCMC algorithms
10.5 Local pruning in blind equalization
10.5.1 Algorithm
10.A Posterior distribution
10.A.1 Posterior distribution of the continuous state
10.A.2 Unknown noise level
10.1 Basics
This chapter addresses the most general problem formulation of detection in linear systems. Basically, all problem formulations that have been discussed so far are included in the framework considered. The main purpose is to survey multiple model algorithms, and a secondary purpose is to overview and compare the state of the art in different application areas, where similar algorithms for reducing complexity have been developed independently. The goal is to detect abrupt changes in the state space model
x_{t+1} = A_t(δ_t) x_t + v_t
y_t = C_t(δ_t) x_t + e_t                                   (10.1)
v_t ∈ N(m_{v,t}(δ_t), Q_t(δ_t))
e_t ∈ N(m_{e,t}(δ_t), R_t(δ_t))
Adaptive Filtering and Change Detection
Fredrik Gustafsson Copyright © 2000 John Wiley & Sons, Ltd ISBNs: 0-471-49287-6 (Hardback); 0-470-84161-3 (Electronic)
Here δ_t is a discrete parameter representing the mode of the system (linearized mode, faulty mode, etc.), and it takes on one of S different values (mostly we have the case S = 2). This model incorporates all previously discussed problems in this book, and is therefore the most general formulation of the estimation and detection problem. Section 10.2 gives a number of applications, including change detection and segmentation, but also model structure selection, blind and standard equalization, missing data and outliers. The common theme in these examples is that there is an unknown discrete parameter, the mode, in a linear system.
One natural strategy for choosing δ is the following:

• For each possible δ, filter the data through a Kalman filter for the (conditional) known state space model (10.1).

• Choose the particular value of δ whose Kalman filter gives the smallest prediction errors.
In fact, this is basically how the MAP estimator

δ̂^MAP = arg max_δ p(δ | y^N)                               (10.2)

works, as will be proven in Theorem 10.1. The structure is illustrated in Figure 10.1.
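As a rough illustration of this bank-of-filters strategy (a minimal sketch, not code from the book; the scalar two-mode model and all numbers below are invented for the example), one can run one conditional Kalman filter per mode and compare accumulated prediction-error costs:

```python
import math

def kalman_prediction_error_cost(y, A, C, Q, R, x0=0.0, P0=1.0):
    """Scalar conditional Kalman filter; returns the accumulated
    normalized squared prediction error (smaller = better fit)."""
    x, P = x0, P0
    cost = 0.0
    for yt in y:
        S = C * P * C + R          # innovation variance
        eps = yt - C * x           # prediction error (innovation)
        cost += eps * eps / S + math.log(S)
        K = P * C / S              # measurement update
        x = x + K * eps
        P = (1.0 - K * C) * P
        x = A * x                  # time update
        P = A * P * A + Q
    return cost

# Data roughly constant around 2. Mode 1 models a constant state;
# under mode 2 the state decays towards zero, which fits poorly.
y = [2.0, 1.9, 2.1, 2.0, 1.8, 2.2]
modes = {1: dict(A=1.0, C=1.0, Q=0.0, R=0.1),
         2: dict(A=0.5, C=1.0, Q=0.0, R=0.1)}
costs = {d: kalman_prediction_error_cost(y, **m) for d, m in modes.items()}
delta_hat = min(costs, key=costs.get)
```

With a flat prior over the modes, picking the smallest cost is exactly the MAP rule (10.2).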
The key tool in this chapter is a repeated application of Bayes' law to compute a posteriori probabilities:

p(δ^t | y^t) ∝ p(y_t | δ^t, y^{t-1}) p(δ^t | y^{t-1}),        (10.3)

where the one-step prediction density is Gaussian,

p(y_t | δ^t, y^{t-1}) = N(y_t − C_t(δ_t) x̂_{t|t-1}(δ^t); 0, R_t(δ_t) + C_t(δ_t) P_{t|t-1}(δ^t) C_t^T(δ_t)).
A proof is given in Section 10.A. The latter equation is recursive and suitable for implementation. This recursion immediately leads to a multiple model algorithm, summarized in Table 10.1. This table also serves as a summary of the chapter.
10.2 Examples of applications

A classical signal processing problem is to find a sinusoid in noise, where the phase, amplitude and frequency may change in time. Multiple model approaches are found in Caciotta and Carbone (1996) and Spanjaard and White
Table 10.1 A generic multiple model algorithm
1. Kalman filtering: conditioned on a particular sequence δ^t, the state estimation problem in (10.1) is solved by a Kalman filter. This will be called the conditional Kalman filter, and its outputs are x̂_{t|t}(δ^t), P_{t|t}(δ^t).

2. Mode evaluation: for each sequence, we can compute, up to an unknown scaling factor, the posterior probability given the measurements, using (10.3).

3. Distribution: at time t, there are S^t different sequences δ^t, which will be labeled δ^t(i), i = 1, 2, ..., S^t. It follows from the theorem of total probability that the exact posterior density of the state vector is

   p(x_t | y^t) = Σ_{i=1}^{S^t} p(δ^t(i) | y^t) p(x_t | δ^t(i), y^t).

   This distribution is a Gaussian mixture with S^t modes.

4. Pruning and merging (on-line): for on-line applications, there are two approaches to approximate the Gaussian mixture, both aiming at removing modes so that only a fixed number of modes in the Gaussian mixture is kept. The exponential growth can be interpreted as a tree, and the approximation strategies are merging and pruning. Pruning simply cuts off modes in the mixture with low probability. In merging, two or more modes are replaced by one new Gaussian distribution.

5. Numerical search (off-line): for off-line analysis, there are numerical approaches based on the EM algorithm or MCMC methods. We will detail some suggestions for how to generate sequences of δ^t which will theoretically belong to the true posterior distribution.
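Step 2 of the table can be sketched as follows (a toy illustration, not the book's code; the innovation values and variances are made up, and the unknown scaling factor is removed by normalizing at the end):

```python
import math

def gauss(eps, var):
    """Likelihood N(eps; 0, var) of a scalar innovation."""
    return math.exp(-0.5 * eps * eps / var) / math.sqrt(2 * math.pi * var)

def measurement_update_weights(weights, innovations, variances):
    """Mode evaluation: scale each branch weight by the Gaussian
    likelihood of its innovation, then normalize to sum to one."""
    w = [wi * gauss(e, S) for wi, e, S in zip(weights, innovations, variances)]
    total = sum(w)
    return [wi / total for wi in w]

# three branches with equal priors; branch 0 explains the data best
w = measurement_update_weights([1 / 3, 1 / 3, 1 / 3],
                               innovations=[0.1, 1.0, 2.5],
                               variances=[1.0, 1.0, 1.0])
```

The branch with the smallest normalized innovation receives the largest posterior weight, which is the mechanism behind (10.3).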
Figure 10.1 The multiple model approach
(1995). In Daumera and Falka (1998), multiple models are used to find the change points in biomedical time series. In Caputi (1995), the multiple model is used to model the input to a linear system as a switching Gaussian process. Actuator and sensor faults are modeled by multiple models in Maybeck and Hanlon (1995). Wheaton and Maybeck (1995) used the multiple model approach for acceleration modeling in target tracking, and Yeddanapudi et al. (1997) applied the framework to target tracking in ATC. These are just a few examples; more references can be found in Section 10.3.4. Below, important special cases of the general model are listed as examples. It should be stressed that the general algorithm and its approximations can be applied to all of them.
Example 10.1 Detection in changing mean model
Consider the case of an unknown constant in white noise. Suppose that we want to test the hypothesis that the 'constant' has changed at some unknown time instant. We can then model the signal by

y_t = θ_1 + σ(t − δ + 1) θ_2 + e_t,

where σ(t) is the step function. If all possible change instants are to be considered, the variable δ takes its value from the set {1, 2, ..., t − 1, t}, where δ = t
should be interpreted as no change (yet). This example can be interpreted as a special case of (10.1), where

δ ∈ {1, 2, ..., t},  x_t = (θ_1, θ_2)^T,  A_t(δ) = ( 1  0
                                                     0  σ(t − δ + 1) ),
C_t(δ) = (1, 1),  Q_t(δ) = 0,  R_t(δ) = λ.

The detection problem is to estimate δ.
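A brute-force sketch of this detection problem (toy data invented here; a flat prior over δ is assumed, so the MAP estimate reduces to the least-squares fit of the two segment means):

```python
def rss(seg):
    """Residual sum of squares around the segment mean."""
    if not seg:
        return 0.0
    m = sum(seg) / len(seg)
    return sum((v - m) ** 2 for v in seg)

def map_change_point(y):
    """For each hypothesized change instant delta (1-indexed; samples
    delta, delta+1, ... have mean theta1 + theta2), fit the two segment
    means by least squares and pick the delta minimizing the total
    residual sum of squares."""
    N = len(y)
    costs = {d: rss(y[:d - 1]) + rss(y[d - 1:]) for d in range(2, N + 1)}
    return min(costs, key=costs.get), costs

# mean jumps from 0 to 5 at sample 5
y = [0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
delta_hat, costs = map_change_point(y)
```

Each hypothesis here plays the role of one conditional filter in the bank; with noise-free data the true change instant gives zero residual.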
Example 10.2 Segmentation in changing mean model
Suppose in Example 10.1 that there can be arbitrarily many changes in the mean. The model used can be extended by including more step functions, but such a description would be rather inconvenient. A better alternative is to model the signal by
θ_{t+1} = θ_t + δ_t v_t
y_t = θ_t + e_t
δ_t ∈ {0, 1}

Here the changes are modeled by the noise v_t, and the discrete parameter δ_t is 1 if a change occurs at time t and 0 otherwise. Obviously, this is a special case of (10.1), where the discrete variable is δ^N = (δ_1, δ_2, ..., δ_N) and

δ^N ∈ {0, 1}^N,  x_t = θ_t,  A_t(δ) = 1,  C_t = 1,  Q_t(δ) = δ_t Q_t,  R_t = λ.

Here {0, 1}^N denotes all possible sequences of zeros and ones of length N. The problem of estimating the sequence δ^N is called segmentation.
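For short records, the segmentation problem can be attacked by exhaustive enumeration over all 2^(N−1) jump sequences, which mirrors the exponential tree discussed later in this chapter. The sketch below is invented for illustration: each jump sequence implies a piecewise-constant fit, scored by residual sum of squares plus a per-jump penalty (the penalty value is an assumption, standing in for the prior on δ_t):

```python
from itertools import product

def segment_cost(y, jumps, penalty=1.0):
    """Cost of a jump sequence (delta_1, ..., delta_{N-1}); delta_t = 1
    means the mean may change between samples t and t+1.  The cost is
    the residual sum of squares of the implied piecewise-constant fit
    plus a penalty per jump."""
    segs, start = [], 0
    for t, d in enumerate(jumps, start=1):
        if d:
            segs.append(y[start:t])
            start = t
    segs.append(y[start:])
    cost = 0.0
    for s in segs:
        m = sum(s) / len(s)
        cost += sum((v - m) ** 2 for v in s)
    return cost + penalty * sum(jumps)

# one jump, between samples 3 and 4
y = [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]
best = min(product([0, 1], repeat=len(y) - 1),
           key=lambda j: segment_cost(y, j))
```

The exhaustive search is only feasible for small N; the pruning and merging algorithms of Section 10.3 exist precisely to avoid it.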
Example 10.3 Model structure selection
Suppose that there are two possible model structures for describing a measured signal, namely two auto-regressions with one or two parameters,

δ = 1 : y_t = −a_1 y_{t−1} + e_t
δ = 2 : y_t = −a_1 y_{t−1} − a_2 y_{t−2} + e_t.

Here, e_t is white Gaussian noise with variance λ. We want to determine from a given data set which model is the most suitable. One solution is to refer to the general problem with discrete parameters in (10.1). Here we can take

A_t(δ) = I,  Q_t(δ) = 0,  R_t(δ) = λ
and

x_t = (a_1, a_2)^T,  C_t(1) = (−y_{t−1}, 0),  C_t(2) = (−y_{t−1}, −y_{t−2}).

The problem of estimating δ is called model structure selection.
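A minimal sketch of comparing the two structures by their one-step prediction errors (the data below are generated from a noiseless AR(2), an invented example; for noisy data a complexity penalty would also be needed, since the larger model never fits worse):

```python
def ar_mse(y, order):
    """Least-squares fit of y_t = -a_1 y_{t-1} - ... - a_p y_{t-p} + e_t;
    returns the mean squared one-step prediction error."""
    rows = [[-y[t - k] for k in range(1, order + 1)]
            for t in range(order, len(y))]
    targets = y[order:]
    n = order
    # normal equations M a = v
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    v = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    # Gaussian elimination (no pivoting; fine for this tiny system)
    for i in range(n):
        for j in range(i + 1, n):
            f = M[j][i] / M[i][i]
            for k in range(i, n):
                M[j][k] -= f * M[i][k]
            v[j] -= f * v[i]
    a = [0.0] * n
    for i in range(n - 1, -1, -1):
        a[i] = (v[i] - sum(M[i][k] * a[k] for k in range(i + 1, n))) / M[i][i]
    res = [t - sum(ri * ai for ri, ai in zip(r, a))
           for r, t in zip(rows, targets)]
    return sum(e * e for e in res) / len(res)

# noiseless AR(2) data: y_t = 1.5 y_{t-1} - 0.7 y_{t-2}
y = [1.0, 1.0]
for _ in range(28):
    y.append(1.5 * y[-1] - 0.7 * y[-2])
mse1, mse2 = ar_mse(y, 1), ar_mse(y, 2)
```

Here the AR(2) structure reproduces the data essentially exactly, while the AR(1) structure cannot, so δ = 2 would be selected.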
Example 10.4 Equalization
A typical digital communication problem is to estimate a binary signal, u_t, transmitted through a channel with a known characteristic and measured at the output. A simple example is an FIR channel. We refer to the problem of estimating the input sequence with a known channel as equalization.
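The idea can be made concrete by brute force. Note that the book's explicit channel equation did not survive in this copy, so the two-tap FIR channel h = [1.0, 0.5] and the data below are invented stand-ins:

```python
from itertools import product

def equalize(y, h):
    """Brute-force ML equalization: enumerate all binary input
    sequences u_t in {-1, +1} and pick the one whose channel output
    (convolution with h) best matches the measurements."""
    N = len(y)

    def output(u):
        return [sum(h[k] * (u[t - k] if t - k >= 0 else 0.0)
                    for k in range(len(h))) for t in range(N)]

    best_u, best_cost = None, float("inf")
    for u in product([-1.0, 1.0], repeat=N):
        z = output(u)
        cost = sum((yt - zt) ** 2 for yt, zt in zip(y, z))
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

h = [1.0, 0.5]                        # hypothetical known channel
u_true = (1.0, -1.0, -1.0, 1.0, 1.0)
y = [1.0, -0.5, -1.5, 0.5, 1.5]       # noiseless channel output of u_true
u_hat = equalize(y, h)
```

The enumeration over S^t input sequences is exactly the tree of the general framework; practical equalizers prune it, as in Section 10.5.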
Example 10.5 Blind equalization
Consider again the communication problem in Example 10.4, but assume now that both the channel model and the binary signal are unknown a priori
We can try to estimate the channel parameters as well by using the model
The problem of estimating the input sequence with an unknown channel is called blind equalization
Example 10.6 Outliers

Suppose that some of the measurements are known to be bad. One possible approach
to this problem is to model the measurement noise as a Gaussian mixture,

e_t ∈ Σ_{i=1}^{M} α_i N(μ_i, Q_i),

where Σ α_i = 1. With this notation we mean that the density function for e_t is

e_t ∈ { N(μ_1, Q_1) with probability α_1
        N(μ_2, Q_2) with probability α_2
        ...
        N(μ_M, Q_M) with probability α_M.

Hence, the noise distribution can be written

e_t ∈ N(μ_{δ_t}, Q_{δ_t}),

where δ_t ∈ {1, 2, ..., M} and the prior is chosen as p(δ_t = i) = α_i.
The simplest way to describe possible outliers is to take μ_1 = μ_2 = 0, Q_1 equal to the nominal noise variance, Q_2 much larger than Q_1, and α_2 = 1 − α_1 equal to a small number. This models the fraction α_2 of all measurements as outliers with a very large variance. The Kalman filter will then ignore these measurements, and the a posteriori probabilities are almost unchanged.
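The posterior probability that a given measurement is an outlier follows directly from Bayes' law on this two-component mixture. A small sketch (the variances and prior below are invented example values):

```python
import math

def normal_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def outlier_posterior(y, q1=1.0, q2=100.0, alpha2=0.05):
    """Posterior probability that measurement y came from the outlier
    component N(0, Q2) rather than the nominal component N(0, Q1)."""
    alpha1 = 1.0 - alpha2
    p1 = alpha1 * normal_pdf(y, 0.0, q1)
    p2 = alpha2 * normal_pdf(y, 0.0, q2)
    return p2 / (p1 + p2)

p_small = outlier_posterior(0.5)   # plausible under the nominal model
p_large = outlier_posterior(8.0)   # far outside the nominal model
```

A measurement near zero is attributed to the nominal component, while a large one is classified as an outlier with probability close to one.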
Example 10.7 Missing data
In some applications it frequently happens that measurements are missing, typically due to sensor failure. A suitable model for this situation is

y_t = (1 − δ_t) C_t x_t + e_t.                              (10.5)

This model is used in Lainiotis (1971). The model (10.5) corresponds to the choices

C_t(δ) = (1 − δ_t) C_t,  δ_t ∈ {0, 1},

in the general formulation (10.1). For a thorough treatment of missing data, see Tanaka and Katayama (1990) and Parzen (1984).
Example 10.8 Markov models
Consider again the case of missing data, modeled by (10.5). In applications, one can expect that a very low fraction, say p_11, of the data is missing. On the other hand, if one measurement is missing, there is a fairly high probability, say p_22, that the next one is missing as well. This is nothing but a prior assumption on δ, corresponding to a Markov chain. Such a state space model is commonly referred to as a jump linear model. A Markov chain is completely specified by its transition probabilities p_ij = p(δ_{t+1} = i | δ_t = j) and the initial probabilities p(δ_1 = i) = p_i. Here we must have p_12 = 1 − p_22 and p_21 = 1 − p_11. In our framework, this is only a recursive description of the prior probability of each sequence. For outliers, and especially missing data, the assumption of an underlying Markov chain is particularly logical. It is used, for instance, in MacGarty (1975).
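The recursive description of the sequence prior is just a product of transition probabilities. A small sketch (the transition matrix and initial probabilities are invented numbers; note the row convention P[i][j] = p(next mode j | current mode i), which differs from the column convention in the text):

```python
from itertools import product

def sequence_prior(seq, p_init, P):
    """Prior probability of a mode sequence (delta_1, ..., delta_N)
    under a Markov chain: p(delta_1) * prod_t p(delta_{t+1} | delta_t).
    Row convention: P[i][j] = transition probability from mode i to j."""
    p = p_init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= P[prev][cur]
    return p

# modes: 0 = measurement present, 1 = measurement missing (toy numbers)
P = [[0.95, 0.05],
     [0.30, 0.70]]
p_init = [0.99, 0.01]
total = sum(sequence_prior(s, p_init, P) for s in product([0, 1], repeat=4))
```

Summing over all 2^N sequences gives one, confirming that the recursion defines a proper prior over the tree of sequences.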
10.3 On-line algorithms
Interpret the exponentially increasing number of discrete sequences δ^t as a growing tree, as illustrated in Figure 10.2. It is inevitable that we either prune or merge this tree. In this section, we examine how one can discard elements in S by cutting off branches in the tree, and lump sequences into subsets of S by merging branches.

Thus, the basic possibilities for pruning the tree are to cut off branches and to merge two or more branches into one; that is, two state sequences are merged and in the following treated as just one. There is also a timing question: at what instant in the time recursion should the pruning be performed? To understand this, the main steps in updating the a posteriori probabilities can be divided into a time update and a measurement update as follows:
Figure 10.2 A growing tree of discrete state sequences. In GPB(2), the sequences (1,5), (2,6), (3,7) and (4,8), respectively, are merged. In GPB(1), the sequences (1,3,5,7) and (2,4,6,8), respectively, are merged.
• Time update:

p(δ^t | y^{t−1}) = p(δ_t | δ^{t−1}) p(δ^{t−1} | y^{t−1}).     (10.6)

• Measurement update:

p(δ^t | y^t) ∝ p(y_t | δ^t, y^{t−1}) p(δ^t | y^{t−1}).        (10.7)
First, a quite general pruning algorithm is given.
1. Compute recursively the conditional Kalman filter for a bank of M sequences δ^t(i) = (δ_1(i), δ_2(i), ..., δ_t(i))^T, i = 1, 2, ..., M.

2. After the measurement update at time t, prune all but the M/S most probable branches δ^t(i).

3. At time t + 1: let the M/S considered branches split into S · M/S = M branches, δ^{t+1}(j) = (δ^t(i), δ_{t+1}), for all δ^t(i) and δ_{t+1}. Update their a posteriori probabilities according to Theorem 10.1.
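The split-reweight-prune cycle can be sketched on a toy changing-mean model (everything below is invented for illustration: the branch state is a single mean, δ_t = 0 keeps it, δ_t = 1 jumps to the new measurement, and the jump prior 0.1 is an assumed number):

```python
import math

def gauss(eps, var):
    return math.exp(-0.5 * eps * eps / var) / math.sqrt(2 * math.pi * var)

def prune_update(branches, y, M, R=0.1, jump_prior=0.1):
    """One step of the pruning recursion.  A branch is a tuple
    (weight, sequence, mean).  Every branch splits into one child per
    mode, children are reweighted by the innovation likelihood (times
    the jump prior for delta = 1), and only the M most probable
    children survive; weights are then renormalized."""
    children = []
    for w, seq, mean in branches:
        for d in (0, 1):
            new_mean = mean if d == 0 else y
            lik = gauss(y - new_mean, R)
            if d == 1:
                lik *= jump_prior
            children.append((w * lik, seq + (d,), new_mean))
    children.sort(key=lambda b: -b[0])
    kept = children[:M]
    total = sum(w for w, _, _ in kept)
    return [(w / total, s, m) for w, s, m in kept]

# mean jumps from about 0 to about 5 at the third sample
branches = [(1.0, (), 0.0)]
for y in [0.0, 0.1, 5.0, 5.1]:
    branches = prune_update(branches, y, M=4)
best = max(branches, key=lambda b: b[0])
```

The most probable surviving sequence places its single jump at the third sample, where the data actually changed.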
For change detection purposes, where δ_t = 0 is the normal outcome and δ_t ≠ 0 corresponds to different fault modes, we can save a lot of filters in the filter bank by using a local search scheme similar to that in Algorithm 7.1.
Algorithm 10.2 Local pruning for multiple models
1. Compute recursively the conditional Kalman filter for a bank of M hypotheses δ^t(i) = (δ_1(i), δ_2(i), ..., δ_t(i))^T, i = 1, 2, ..., M.

2. After the measurement update at time t, prune the S − 1 least probable branches δ^t(i).
3. At time t + 1: let only the most probable branch split into S branches, δ^{t+1}(j) = (δ^t(i), δ_{t+1}).

4. Update their posterior probabilities according to Theorem 10.1.
Some restrictions on the rules above can sometimes be useful:

• Assume a minimum segment length: let the most probable sequence split only if it is not too young.

• Assure that sequences are not cut off immediately after they are born: cut off the least probable sequences among those that are older than a certain minimum life-length, until only M remain.
The exact posterior density of the state vector is a mixture of S^t Gaussian distributions. The key point in merging is to replace, or approximate, a number of Gaussian distributions by one single Gaussian distribution in such a way that the first and second moments are matched. That is, a sum of L Gaussian distributions

Σ_{i=1}^{L} α^(i) N(x̂^(i), P^(i))

is approximated by N(x̄, P̄), where (with the weights normalized so that Σ α^(i) = 1)

x̄ = Σ_{i=1}^{L} α^(i) x̂^(i),
P̄ = Σ_{i=1}^{L} α^(i) (P^(i) + (x̂^(i) − x̄)(x̂^(i) − x̄)^T).
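The standard moment-matching merge can be sketched for the scalar case (a minimal illustration; the mixture below is an invented example):

```python
def merge_gaussians(weights, means, variances):
    """Replace a scalar Gaussian mixture sum_i w_i N(m_i, P_i) by a
    single Gaussian N(m_bar, p_bar) with the same first two moments."""
    total = sum(weights)
    w = [wi / total for wi in weights]           # normalize the weights
    m_bar = sum(wi * mi for wi, mi in zip(w, means))
    # total variance = within-component variance + between-component spread
    p_bar = sum(wi * (pi + (mi - m_bar) ** 2)
                for wi, mi, pi in zip(w, means, variances))
    return m_bar, p_bar

# symmetric two-component mixture: merged mean 0, merged variance 1 + 1
m, p = merge_gaussians([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

Note that the merged variance exceeds each component variance: the spread between the component means is absorbed into P̄, which is what makes the single-Gaussian approximation consistent in its first two moments.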
The GPB algorithm
The idea of the Generalized Pseudo-Bayesian (GPB) approach is to merge the mixture after the measurement update.
The mode parameter δ_t is an independent sequence with S outcomes, used to switch modes in a linear state space model. Decide on the sliding window memory L. Represent the posterior distribution of the state at time t by a Gaussian mixture of M = S^{L−1} distributions,

Σ_{i=1}^{M} α(i) N(x̂_{t|t}(i), P_{t|t}(i)).
Repeat the following recursion:
1. Let these split into S · M = S^L sequences by considering all S new branches at time t + 1.

2. For each i, apply the conditional Kalman filter measurement and time update, giving x̂_{t+1|t}(i), x̂_{t+1|t+1}(i), P_{t+1|t}(i), P_{t+1|t+1}(i), ε_{t+1}(i) and S_{t+1}(i).

3. Time update the weight factors α(i) according to (10.6).

4. Measurement update the weight factors α(i) according to (10.7).

5. Merge the S sequences corresponding to the same history up to time t − L. This requires S^{L−1} separate merging steps, using the moment-matching formula.