4
Off-line approaches
4.1 Basics
4.2 Segmentation criteria
    4.2.1 ML change time sequence estimation
    4.2.2 Information based segmentation
4.3 On-line local search for optimum
    4.3.1 Local tree search
    4.3.2 A simulation example
4.4 Off-line global search for optimum
    4.4.1 Local minima
    4.4.2 An MCMC approach
4.5 Change point estimation
    4.5.1 The Bayesian approach
    4.5.2 The maximum likelihood approach
    4.5.3 A non-parametric approach
4.6 Applications
    4.6.1 Photon emissions
    4.6.2 Altitude sensor quality
    4.6.3 Rat EEG
4.1 Basics
This chapter surveys off-line formulations of single and multiple change point estimation. Although the problem formulation is to analyze the data batch-wise, many important algorithms have natural on-line implementations. In segmentation, the goal is to find a sequence k^n = (k_1, k_2, ..., k_n) of time indices, where both the number n and the locations k_i are unknown, such that
the signal can be accurately described as piecewise constant, i.e.,

y_t = θ(i) + e_t, k_{i-1} < t ≤ k_i, i = 1, 2, ..., n, (4.1)

where θ(i) is the parameter in segment i. Models where each segment is instead described by a linear regression are other possibilities. Equation (4.1) will be the signal model used throughout this chapter, but it should be noted that an important extension to the case where the parameter is slowly varying within each segment is possible with minor modifications. However, equation (4.1) illustrates the basic ideas. One way to guarantee that the best possible solution is found is to consider all possible change point sequences and
choose the particular k^n that minimizes an optimality criterion,

k̂^n = arg min_{n ≥ 1, 0 < k_1 < ... < k_n = N} V(k^n).

The procedure is illustrated below:
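A minimal sketch of this exhaustive search, assuming a scalar signal, a per-segment least squares loss V(i), and a simple penalty on the number of change points (the convention k_n = N is left implicit, so only the interior change points are enumerated; the names are illustrative, not the book's):

    from itertools import combinations
    import numpy as np

    def sse(segment):
        # Loss V(i) of one segment: squared residuals around the segment mean
        return float(np.sum((segment - segment.mean()) ** 2))

    def exhaustive_segmentation(y, penalty=10.0):
        # Enumerate every change point vector and minimize the total criterion
        N = len(y)
        best_cost, best_k = np.inf, ()
        for n in range(N):                          # number of change points
            for k in combinations(range(1, N), n):  # 0 < k_1 < ... < k_n < N
                bounds = (0, *k, N)
                cost = sum(sse(y[a:b]) for a, b in zip(bounds, bounds[1:]))
                cost += penalty * n                 # complexity penalty
                if cost < best_cost:
                    best_cost, best_k = cost, k
        return best_k

    rng = np.random.default_rng(0)
    y = np.concatenate([np.zeros(8), 3 * np.ones(8)]) + 0.1 * rng.standard_normal(16)
    print(exhaustive_segmentation(y))               # close to (8,)

The exponential number of candidates is what makes this brute force approach infeasible for all but very short data records.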
Basically, two kinds of optimality criteria have been proposed:

• Statistical criteria, where the maximum likelihood or maximum a posteriori estimate of k^n is studied.
• Information based criteria, where the information in each segment is measured by V(i) (the sum of squared residuals), and the total information is the sum of these. Since the total information is minimized for the degenerated solution with one segment per sample, a penalty term on the number of segments is needed. Similar problems have been studied in the context of model structure selection, and from this literature Akaike's AIC and BIC criteria have been proposed for segmentation.
The main problem in segmentation is of combinatorial nature: there are 2^N possible segmentations of N data points (a change or no change at each time instant). Here, several strategies have been proposed:

• On-line local search schemes, which recursively prune the tree of hypotheses; see Section 4.3.
• Off-line global search schemes, including Markov Chain Monte Carlo (MCMC) techniques, which randomize the search over candidate change point vectors; see Section 4.4.
4.2 Segmentation criteria
This section describes the available statistical and information based optimization criteria.
4.2.1 ML change time sequence estimation

Here, the change time sequence k_1, k_2, ..., k_n is estimated from the data sequence y^t. Later, on-line algorithms will be derived from this approach. We will use the likelihood for data, given the vector of change points, p(y^t | k^n).
The key property exploited below is independence between the segments. That is, the likelihood splits into one factor per segment.
The a posteriori probability for k^n is defined by p(k^n | y^t). Bayes' rule, P(A|B) = P(A)P(B|A)/P(B), thus gives

p(k^n | y^t) = p(k^n) p(y^t | k^n) / p(y^t).

With a flat prior, p(k^n) is taken as a probability density function equal to one, and the scaling factor p(y^t) does not depend on k^n. The maximizing argument of p(k^n | y^t) is the maximum a posteriori (MAP) estimate, which is not influenced by the scaling factor p(y^t), so with a flat prior the MAP and ML estimators coincide.
With the convention k_0 = 0 and k_n = N, the independence between segments gives

p(y^N | k^n) = ∏_{i=1}^{n} p(y_{k_{i-1}+1}, ..., y_{k_i}), (4.10)

and, taking logarithms,

log p(y^N | k^n) = ∑_{i=1}^{n} log p(y_{k_{i-1}+1}, ..., y_{k_i}). (4.11)
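A sketch of how a segmentation is scored according to (4.10)-(4.11), assuming Gaussian noise with ML-fitted mean and variance in each segment (one possible case; the function names are illustrative):

    import numpy as np

    def segment_loglik(seg):
        # Gaussian log likelihood of one segment with ML mean and variance
        n = len(seg)
        var = max(float(np.var(seg)), 1e-12)  # guard against zero residuals
        return -0.5 * n * (np.log(2 * np.pi * var) + 1.0)

    def loglik(y, jumps):
        # log p(y^N | k^n): the sum over segments, cf. (4.10)-(4.11)
        bounds = (0, *jumps, len(y))
        return sum(segment_loglik(y[a:b]) for a, b in zip(bounds, bounds[1:]))

Such a function can be plugged directly into the exhaustive search sketched earlier, in place of the least squares loss.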
4.2.2 Information based segmentation

A fundamental problem with criteria based on the loss function alone is that the more change points, the smaller the loss function. The easiest way to see this is to consider the extreme case where the number of change points equals the number of data points: the loss is then zero, because there is no error. In fact, the loss function is monotonously decreasing in the number of change points. A penalty term is needed, in the spirit of the parsimonious principle, which says that the best data description is a compromise between performance (small loss function) and complexity (few parameters). This is a general idea; a classical application of the principle is the choice of model structures in system identification. Penalty terms occurring in model order selection problems can also be used in this application, for instance:
• Akaike's AIC.
• The asymptotically equivalent criteria Akaike's BIC (Akaike, 1977) and the minimum description length (MDL) criterion; see Section 5.3.2.
AIC is proposed in Kitagawa and Akaike (1978) for segmentation of auto-regressive models. Applied to the signal model used here, it would read as equation (4.12): the log likelihood penalized by twice the number of estimated parameters. The criterion assumes a constant noise variance, leading to a pooled variance estimate λ̂ over all segments. Whether the variance should instead be treated as known is not commented upon in Kitagawa and Akaike (1978).
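For reference, the generic forms of these penalized criteria from the model order selection literature, with p denoting the total number of estimated parameters (these standard expressions are stated as an assumption; the chapter's (4.12) is a specialization of the first one), are

AIC: W = N log λ̂ + 2p,
BIC/MDL: W = N log λ̂ + p log N,

where λ̂ is the pooled noise variance estimate and the segmentation minimizing W is selected.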
The MDL theory provides a nice interpretation of the segmentation problem: choose the segments such that the fewest possible data bits are used to describe the signal, when both the parameter vectors and the prediction errors are stored with finite accuracy. This is also a setting where marginalized ML works fine.
4.3 On-line local search for optimum
The optimality criteria from the previous section are used in the following search. The local search in this section decides which branches of the hypothesis tree to examine on an on-line basis. The other alternative, using global search strategies examined in the next section, decides which branches to examine on an off-line basis.

4.3.1 Local tree search

The on-line local search has much in common with the famous Viterbi algorithm in equalization. The search is over the tree of jump sequences depicted in Figure 4.1.

Figure 4.1 The tree of jump sequences. A path labeled 0 corresponds to no jump, while 1 corresponds to a jump.
Algorithm 4.1 Recursive signal segmentation

Choose an optimality criterion from Section 4.2: MAP probabilities or information-based criteria. Compute recursively the optimality criterion using a bank of least squares estimators, one for each considered change point sequence. Use the following rules for maintaining the hypotheses and keeping their number at a fixed value M:

a) Let every considered sequence split into two at each time instant: one branch assuming a jump and one assuming no jump (the labels 1 and 0 in Figure 4.1).
b) Cut off all but the M most probable sequences.
c) Assume a minimum segment length: let the most probable sequence split only if its most recent jump occurred at least a certain number of samples ago.
d) Assure that sequences are not cut off immediately after they are born: let each new sequence live for a certain minimum life-length, until only M are left.

The last two restrictions are optional, but might be useful in some cases.
In this way, the algorithm has linear, rather than exponential, complexity in the data size, which would be the consequence of keeping all branches of the tree.
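A minimal sketch of the pruning idea in Algorithm 4.1, using a running least squares cost with a fixed penalty per jump in place of the MAP probabilities; rules c) and d) are omitted, and all names are illustrative:

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class Hyp:
        jumps: tuple       # change times decided so far
        n: int = 0         # samples in the open segment
        s: float = 0.0     # running sum of the open segment
        q: float = 0.0     # running sum of squares of the open segment
        cost: float = 0.0  # cost of closed segments plus jump penalties

        def open_sse(self):
            # SSE of the open segment around its mean
            return self.q - self.s ** 2 / self.n if self.n else 0.0

    def local_search(y, M=8, penalty=10.0):
        hyps = [Hyp(jumps=())]
        for t, yt in enumerate(y, start=1):
            new = []
            for h in hyps:
                # branch 0: no jump, extend the open segment
                new.append(Hyp(h.jumps, h.n + 1, h.s + yt, h.q + yt * yt, h.cost))
                # branch 1: jump, close the segment and start a new one
                if h.n > 0:
                    new.append(Hyp(h.jumps + (t - 1,), 1, yt, yt * yt,
                                   h.cost + h.open_sse() + penalty))
            # rule b): keep only the M best hypotheses
            hyps = sorted(new, key=lambda h: h.cost + h.open_sse())[:M]
        return hyps[0].jumps

The complexity is O(MN) instead of O(2^N), at the price of possibly pruning the globally best sequence.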
4.3.2 A simulation example

Figure 4.2 A change in the mean signal with three abrupt changes of increasing magnitude (plot title: 'Measurements and real parameters').
The evolution of the hypotheses is shown in Figure 4.3. The plot mimics Figure 4.1, but is 'upside down'. Each line represents one hypothesis and shows how the number of change points evolves for that hypothesis over time. At each time instant there is one filter that performs best, and the other filters are used to evaluate candidate change points. After having lived for three samples without becoming most probable, hypotheses are cut off. At time 22 one filter reacts, and at time 23 the correct change time is found. After the last change, it takes three samples until the correct hypothesis becomes the most likely (the likelihoods are computed using the result from Appendix 7.A). Figure 4.4 shows how the hypotheses examine all branches that need to be considered.
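To indicate the flavor of such marginalization results (a standard computation under assumptions stated here, not necessarily identical to the one in Appendix 7.A): for one segment y_1, ..., y_m with unknown constant mean θ, known variance σ², and a flat prior on θ,

∫ ∏_{t=1}^{m} N(y_t; θ, σ²) dθ = (2πσ²)^{-(m-1)/2} m^{-1/2} exp(−V/(2σ²)), V = ∑_{t=1}^{m} (y_t − ȳ)²,

so the marginalized likelihood of a change hypothesis is a product of such closed-form segment terms.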
Figure 4.4 Evolution of change hypotheses for a local search with M = 5 and M = 8 (time axis in samples). A small offset is added to the number of change points for each hypothesis.
4.4 Off-line global search for optimum
4.4.1 Local minima

Example 4.1 Local minimum for one change point

The signal and its negative log likelihood as a function of the change point are shown in Figure 4.5. This example shows that we must do an exhaustive search for the change point. However, this might not improve the likelihood if there are two change points, as the following example demonstrates.
Example 4.2 Local minimum for two change points

The signal in Figure 4.6 is constant and zero, except for a small segment in the middle of the data record. The global minimum of the likelihood is attained at k² = (100, 110), but no single change point k¹ = m gives any improvement over no change at all.

Such an example might motivate an approach where a complete search over one and two change points is performed. This will work in most cases, but not always, as Example 4.4 will show.
Figure 4.5 Upper plot: signal. Lower plot: negative log likelihood as a function of k, with global minimum at k = 100 but local minimum at k = 0.

Figure 4.6 Negative log likelihood with global minimum at k² = (100, 110) but local minimum at k = 0. No improvement for k¹ = m, for any m.
Example 4.3 Global search using one and two change points

For signals like the ones in the previous examples, a complete search over all combinations of one and two change points finds the true change times generally.
Example 4.4 Counterexample of convergence of global search

Assume an exponential distribution for the noise. The negative log likelihood for no change point can then be written in closed form and compared to the segmented alternatives.
Figure 4.8 The signal in the counterexample with M = 10.

For the best hypotheses with two change points, the residuals in two of the segments are identically zero, so their likelihood contributions vanish. Even so, the total negative log likelihood is larger than that given no change at all. Thus, we will never find the global optimum by trying all combinations of one and two change points; it is only found by an exhaustive search for three change points.
4.4.2 An MCMC approach

Markov Chain Monte Carlo (MCMC) approaches are surveyed in a later chapter. The MCMC algorithm proposed in Fitzgerald et al. (1994) for signal estimation is a combination of Gibbs sampling and the Metropolis algorithm. The algorithm below is based solely on knowledge of the likelihood function for data, given the change point sequence. A random rejection step is applied, which defines the Metropolis algorithm: the candidate will be rejected with large probability if its value is unlikely.
Algorithm 4.2 MCMC signal segmentation

Fix the number of change points n, initialize the sequence k^n, and iterate the following steps:

1. Choose one of the change points at random, say k_j.
2. Draw a candidate value for it from the proposal distribution p(k_j | k^n except k_j), which can be taken as flat, or Gaussian centered around the previous estimate.
3. The candidate is accepted with probability min(1, p(y^N | candidate k^n) / p(y^N | previous k^n)), i.e., in proportion to the likelihood ratio.

Drawbacks with the approach are that the a posteriori distribution of the change points has to be computed by Monte Carlo techniques, for instance as a histogram over the visited sequences, and that one has to decide what the burn-in time is.
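A minimal sketch of Algorithm 4.2, assuming Gaussian noise with unit variance, a flat proposal, and a fixed number of change points; the helper names are illustrative, not the book's:

    import numpy as np

    def neg_log_lik(y, jumps):
        # -log p(y | k^n) up to constants, Gaussian unit-variance noise
        bounds = (0, *sorted(jumps), len(y))
        return sum(float(np.sum((y[a:b] - y[a:b].mean()) ** 2)) / 2
                   for a, b in zip(bounds, bounds[1:]))

    def mcmc_segmentation(y, n=2, iters=2000, seed=0):
        rng = np.random.default_rng(seed)
        N = len(y)
        k = list(rng.choice(np.arange(1, N), size=n, replace=False))
        best = (neg_log_lik(y, k), sorted(k))
        for _ in range(iters):
            j = rng.integers(n)                # step 1: pick one change point
            cand = k.copy()
            cand[j] = int(rng.integers(1, N))  # step 2: flat proposal
            if len(set(cand)) < n:
                continue
            # step 3: Metropolis accept/reject via the likelihood ratio
            if rng.random() < min(1.0, np.exp(neg_log_lik(y, k) - neg_log_lik(y, cand))):
                k = cand
            c = neg_log_lik(y, k)
            if c < best[0]:
                best = (c, sorted(k))
        return best[1]

Keeping track of all visited sequences, rather than only the best one, gives the kind of histogram shown in Figure 4.9.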
Example 4.5 MCMC search for two change points

The result of applying Algorithm 4.2 is illustrated in Figure 4.9.
4.5 Change point estimation
Figure 4.9 Result of the MCMC Algorithm 4.2. The left plot (a) shows the examined jump sequence in each iteration; the best encountered sequence is marked with dashed lines. The right plot (b) shows a histogram over all considered jump sequences. The burn-in time is not excluded in the histogram.
The classical single change point problem is covered in a survey from 1975, where different procedures to test H0 against H1 are described. The methods are off-line, and only one change point may exist in the data. The Bayesian and likelihood based methods are closely related to the algorithms already described. However, the non-parametric approaches below are interesting, and unique to this problem formulation.
4.5.1 The Bayesian approach

Assuming that the noise variance before and after the change is the same gives test statistics built from double sums of the data over the hypothesized segments, with summation ranges t = 2, ..., N and k = 1, ..., N − 1, t = k + 1, ..., N; the resulting statistics are denoted P1 to P4 below.
One possible estimate of the jump time (change point) is given by the maximizing argument of the statistic; this defines the estimate for P3, and similarly for P4.
4.5.2 The maximum likelihood approach

Using the ML method, the test statistics are as follows:
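As a standard point of reference (the generic form, stated as an assumption rather than the book's exact expression), the ML (GLR) statistic for a single change in the mean of a Gaussian signal with known variance σ² is

T(k) = k(N − k)/(N σ²) · (ȳ_{1:k} − ȳ_{k+1:N})², k̂ = arg max_k T(k),

where ȳ_{1:k} and ȳ_{k+1:N} are the sample means before and after the hypothesized change.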
These statistics rely on an assumed parametric distribution for the noise.

4.5.3 A non-parametric approach

Non-parametric tests for the first problem, assuming only whiteness, are based on the decision rule:
where the distance measure s_i is one of a few sign- and rank-based alternatives. Here, med denotes the median and sign is the sign function. The first method is based on the signs of the deviations from the median. In each case, the change point estimate is given by the maximizing argument.
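A sketch of one such sign-based statistic (a hypothetical variant in the spirit of this section, not one of the book's exact measures): for each candidate k, sum the signs of the deviations from the median after k, and maximize the magnitude:

    import numpy as np

    def sign_change_estimate(y):
        # T[k] = |sum over t >= k of sign(y_t - med(y))|; estimate = arg max, k >= 1
        s = np.sign(y - np.median(y))
        T = np.abs(np.cumsum(s[::-1])[::-1])
        return int(np.argmax(T[1:])) + 1

    rng = np.random.default_rng(0)
    y = np.concatenate([np.zeros(50), np.ones(50)]) + rng.standard_normal(100)
    print(sign_change_estimate(y))  # expected: near 50

Only the signs of the data enter the statistic, which is what makes it largely insensitive to the noise distribution as long as the noise is white.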
Example 4.6 Change point estimation

To get a feeling for the different test statistics, we compare them on two signals whose mean changes from zero to one, one abruptly and one slowly. Both signals have white Gaussian measurement noise. The comparison shows that the statistics peak near the true change time for the abruptly changing data (though the Bayesian statistic seems to have some probability of over-estimating the change time), and that there should be no problem in designing a good threshold.
The explicit formulas given in this section, using the maximum likelihood approach, are special cases of the more general formulas in Section 4.2, as can be verified by the reader.
4.6 Applications
4.6.1 Photon emissions

This application illustrates the flexibility of likelihood based algorithms with respect to the distribution of the noise. The exponential distribution has the nice property of offering explicit and compact expressions for the maximum generalized likelihood (MGL). The standard algorithm assumes Gaussian noise, but can still be applied.
The result is shown in Figure 4.11. Clearly, there are non-stationarities in the emission rate, and the segmentation might be used in real-time automatic surveillance.
4.6.2 Altitude sensor quality

The altitude sensor data were presented in Section 2.2.1. One problem is to detect the critical regions of variance increases.
Figure 4.13 shows the same low-pass filtered variance estimate (in logarithmic scale) and the result from the ML variance segmentation algorithm.

Figure 4.13 Low-pass filtered and segmented squared residuals for altitude data.

The estimated noise variance in each segment serves as a quality number. That is, we know precisely where the measurements are useful.
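A sketch of the quality-number idea, assuming zero-mean Gaussian residuals and already estimated change points (the names are illustrative):

    import numpy as np

    def segment_variances(residuals, jumps):
        # ML noise variance per segment: the quality number of each region
        bounds = (0, *jumps, len(residuals))
        return [float(np.mean(residuals[a:b] ** 2))
                for a, b in zip(bounds, bounds[1:])]

Segments whose variance exceeds a threshold can then be flagged as regions where the sensor should not be trusted.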
4.6.3 Rat EEG

The conventional approach to detecting changes in brain activity is based on a band pass filter and level thresholding on the output power. Here, the segmentation algorithm is instead applied to an EEG signal on a rat.
Applying the segmentation algorithm gives the change points

[1096 1543 1887 2265 2980 3455 3832 3934].

As an alternative, a second design of the algorithm was applied. This gives

[754 1058 1358 1891 2192 2492 2796 3098 3398 3699].

It can be noted that the changes are hardly abrupt for this signal.