
Tugnait, J.K. “Validation, Testing, and Noise Modeling,” Digital Signal Processing Handbook, Ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999.


16 Validation, Testing, and Noise Modeling

Jitendra K Tugnait

Auburn University

16.1 Introduction

16.2 Gaussianity, Linearity, and Stationarity Tests
Gaussianity Tests • Linearity Tests • Stationarity Tests

16.3 Order Selection, Model Validation, and Confidence Intervals
Order Selection • Model Validation • Confidence Intervals

16.4 Noise Modeling
Generalized Gaussian Noise • Middleton Class A Noise • Stable Noise Distribution

16.5 Concluding Remarks

References

16.1 Introduction

Linear parametric models of stationary random processes, whether signal or noise, have been found to be useful in a wide variety of signal processing tasks such as signal detection, estimation, filtering, and classification, and in a wide variety of applications such as digital communications, automatic control, radar and sonar, and other engineering disciplines and sciences. A general representation of a linear discrete-time stationary signal $x(t)$ is given by

$$x(t) = \sum_{i=0}^{\infty} h(i)\,\epsilon(t-i) \tag{16.1}$$

where $\{\epsilon(t)\}$ is a zero-mean, i.i.d. (independent and identically distributed) random sequence with finite variance, and $\{h(i),\ i \ge 0\}$ is the impulse response of the linear system such that $\sum_{i=-\infty}^{\infty} h^2(i) < \infty$. Much effort has been expended on developing approaches to linear model fitting given a single measurement record of the signal (or noisy signal). Parsimonious parametric models such as AR (autoregressive), MA (moving average), ARMA, or state-space, as opposed to impulse response modeling, have been popular together with the assumption of Gaussianity of the data.

Define

$$H(q) = \sum_{i=0}^{\infty} h(i)\,q^{-i} \tag{16.2}$$

where $q^{-1}$ is the backward shift operator (i.e., $q^{-1}x(t) = x(t-1)$, etc.). If $q$ is replaced with the complex variable $z$, then $H(z)$ is the Z-transform of $\{h(i)\}$, i.e., it is the system transfer function.


Using (16.2), (16.1) may be rewritten as

$$x(t) = H(q)\,\epsilon(t). \tag{16.3}$$

Fitting linear models to the measurement record requires estimation of $H(q)$, or equivalently of $\{h(i)\}$ (without observing $\{\epsilon(t)\}$). Typically $H(q)$ is parameterized by a finite number of parameters, say by the parameter vector $\theta^{(M)}$ of dimension $M$. For instance, an AR model representation of order $M$ means that

$$H_{AR}(q;\theta^{(M)}) = \frac{1}{1+\sum_{i=1}^{M} a_i q^{-i}}, \quad \theta^{(M)} = (a_1, a_2, \cdots, a_M)^T. \tag{16.4}$$

This reduces the number of estimated parameters from a “large” number to $M$.
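As a minimal sketch of what (16.3) with the AR parameterization (16.4) means in the time domain, the recursion $x(t) = -\sum_{i=1}^{M} a_i x(t-i) + \epsilon(t)$ can be run directly. The function name `simulate_ar`, the zero initial conditions, and the exponential driving sequence are assumptions made for illustration only.

```python
import numpy as np

def simulate_ar(a, eps):
    """Run x(t) = -a_1 x(t-1) - ... - a_M x(t-M) + eps(t), i.e. (16.3) with H_AR of (16.4).

    Zero initial conditions are assumed (x(t) = 0 for t < 0).
    """
    a = np.asarray(a, dtype=float)
    eps = np.asarray(eps, dtype=float)
    x = np.zeros(len(eps))
    for t in range(len(eps)):
        past = sum(a[i - 1] * x[t - i] for i in range(1, len(a) + 1) if t - i >= 0)
        x[t] = eps[t] - past
    return x

# Driving the filter with a unit impulse returns the impulse response h(i) of (16.1).
impulse = np.zeros(6)
impulse[0] = 1.0
h = simulate_ar([-0.5], impulse)      # AR(1) with a_1 = -0.5, so h(i) = 0.5**i

# Illustrative zero-mean, non-Gaussian i.i.d. input (an assumption, not from the text).
rng = np.random.default_rng(0)
x = simulate_ar([-0.5], rng.exponential(1.0, 500) - 1.0)
```

A single coefficient here stands in for the whole infinite impulse response, which is the parsimony that the text attributes to parametric models.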

In this section several aspects of fitting models such as (16.1) to (16.3) to the given measurement record are considered. These aspects are (see also Fig. 16.1):

• Is the model of the type (16.1) appropriate to the given record? This requires testing for linearity and stationarity of the data.

• Linear Gaussian models have long been dominant both for signals and for noise processes. Assumption of Gaussianity allows implementation of statistically efficient parameter estimators such as maximum likelihood estimators. A Gaussian process is completely characterized by its second-order statistics (autocorrelation function or, equivalently, its power spectral density). Since the power spectrum of $\{x(t)\}$ of (16.1) is given by

$$S_{xx}(\omega) = \sigma_\epsilon^2\,|H(e^{j\omega})|^2, \quad \sigma_\epsilon^2 = E\{\epsilon^2(t)\}, \tag{16.5}$$

one cannot determine the phase of $H(e^{j\omega})$ independent of $|H(e^{j\omega})|$. Determination of the true phase characteristic is crucial in several applications such as blind equalization of digital communications channels. Use of higher-order statistics allows one to uniquely identify nonminimum-phase parametric models. Higher-order cumulants of Gaussian processes vanish; hence, if the data are stationary Gaussian, a minimum-phase (or maximum-phase) model is the “best” that one can estimate. Therefore, another aspect considered in this section is testing for non-Gaussianity of the given record.

• If the data are Gaussian, one may fit models based solely upon the second-order statistics of the data; otherwise, use of higher-order statistics in addition to or in lieu of the second-order statistics is indicated, particularly if the phase of the linear system is crucial. In either case, one typically fits a model $H(q;\theta^{(M)})$ by estimating the $M$ unknown parameters through optimization of some cost function. In practice, the model order $M$ is unknown and its choice has a significant impact on the quality of the fitted model. Another aspect of the model-fitting problem considered in this section is that of order selection.

• Having fitted a model $H(q;\theta^{(M)})$, one would also like to know how good the estimated parameters are. Typically this is expressed in terms of error bounds or confidence intervals on the fitted parameters and on the corresponding model transfer function.

• Having fitted a model, a final step is that of model falsification. Is the fitted model an appropriate representation of the underlying system? This is referred to variously as model validation, model verification, or model diagnostics.

• Finally, various models of univariate noise pdf (probability density function) are discussed to complete the discussion of model fitting.
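The phase ambiguity behind (16.5) can be illustrated numerically. The sketch below compares a minimum-phase and a maximum-phase MA(1) impulse response that share the same $|H(e^{j\omega})|^2$, and hence the same power spectrum, while their phases differ. The two impulse responses and the helper name `freq_response` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def freq_response(h, omega):
    """H(e^{j w}) = sum_i h(i) exp(-j w i) for a finite impulse response h."""
    i = np.arange(len(h))
    return np.exp(-1j * np.outer(omega, i)) @ np.asarray(h, dtype=complex)

omega = np.linspace(0.1, np.pi - 0.1, 50)
H_min = freq_response([1.0, 0.5], omega)   # zero at z = -0.5 (minimum phase)
H_max = freq_response([0.5, 1.0], omega)   # zero at z = -2.0 (maximum phase)

# Second-order statistics see only |H|^2, which is 1.25 + cos(w) for both systems.
same_spectrum = np.allclose(np.abs(H_min) ** 2, np.abs(H_max) ** 2)
same_phase = np.allclose(np.angle(H_min), np.angle(H_max))
```

Because the two systems are indistinguishable from the power spectrum alone, separating them requires the higher-order statistics discussed in the list above.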


FIGURE 16.1: Section outline (SOS — second-order statistics; HOS — higher-order statistics).

16.2 Gaussianity, Linearity, and Stationarity Tests

Given a zero-mean, stationary random sequence $\{x(t)\}$, its third-order cumulant function $C_{xxx}(i,k)$ is given by [12]

$$C_{xxx}(i,k) := E\{x(t+i)\,x(t+k)\,x(t)\}. \tag{16.6}$$

Its bispectrum $B_{xxx}(\omega_1,\omega_2)$ is defined as [12]

$$B_{xxx}(\omega_1,\omega_2) = \sum_{i=-\infty}^{\infty}\sum_{k=-\infty}^{\infty} C_{xxx}(i,k)\,e^{-j(\omega_1 i+\omega_2 k)}. \tag{16.7}$$

Similarly, its fourth-order cumulant function $C_{xxxx}(i,k,l)$ is given by [12]

$$\begin{aligned}
C_{xxxx}(i,k,l) := {} & E\{x(t)x(t+i)x(t+k)x(t+l)\} \\
& - E\{x(t)x(t+i)\}\,E\{x(t+k)x(t+l)\} \\
& - E\{x(t)x(t+k)\}\,E\{x(t+l)x(t+i)\} \\
& - E\{x(t)x(t+l)\}\,E\{x(t+k)x(t+i)\}.
\end{aligned} \tag{16.8}$$

Its trispectrum is defined as [12]

$$T_{xxxx}(\omega_1,\omega_2,\omega_3) := \sum_{i=-\infty}^{\infty}\sum_{k=-\infty}^{\infty}\sum_{l=-\infty}^{\infty} C_{xxxx}(i,k,l)\,e^{-j(\omega_1 i+\omega_2 k+\omega_3 l)}. \tag{16.9}$$
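A sample version of (16.6) replaces the expectation with a time average. The following is a minimal sketch under assumptions not stated in the text: nonnegative lags only, sample-mean removal to enforce the zero-mean condition, and normalization by the full record length. The function name is hypothetical.

```python
import numpy as np

def third_order_cumulant(x, i, k):
    """Sample estimate of C_xxx(i, k) = E{x(t+i) x(t+k) x(t)} for lags i, k >= 0."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                    # enforce the zero-mean assumption
    n = len(x) - max(i, k)              # number of usable products
    return np.sum(x[i:i + n] * x[k:k + n] * x[:n]) / len(x)

# Illustrative comparison: a symmetric (Gaussian) record versus a skewed one.
rng = np.random.default_rng(1)
c_gauss = third_order_cumulant(rng.standard_normal(20000), 0, 0)
c_expo = third_order_cumulant(rng.exponential(1.0, 20000), 0, 0)
```

For the Gaussian record the estimate should be close to zero, while the skewed record gives a clearly nonzero value; this difference is what the Gaussianity tests below exploit.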


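If $\{x(t)\}$ obeys (16.1), then [12]

$$B_{xxx}(\omega_1,\omega_2) = \gamma_{3\epsilon}\,H(e^{j\omega_1})\,H(e^{j\omega_2})\,H^*(e^{j(\omega_1+\omega_2)}) \tag{16.10}$$

and

$$T_{xxxx}(\omega_1,\omega_2,\omega_3) = \gamma_{4\epsilon}\,H(e^{j\omega_1})\,H(e^{j\omega_2})\,H(e^{j\omega_3})\,H^*(e^{j(\omega_1+\omega_2+\omega_3)}) \tag{16.11}$$

where

$$\gamma_{3\epsilon} = C_{\epsilon\epsilon\epsilon}(0,0) \quad \text{and} \quad \gamma_{4\epsilon} = C_{\epsilon\epsilon\epsilon\epsilon}(0,0,0). \tag{16.12}$$

For Gaussian processes, $B_{xxx}(\omega_1,\omega_2) \equiv 0$ and $T_{xxxx}(\omega_1,\omega_2,\omega_3) \equiv 0$; equivalently, $C_{xxx}(i,k) \equiv 0$ and $C_{xxxx}(i,k,l) \equiv 0$. This forms a basis for testing Gaussianity of a given measurement record.

When $\{x(t)\}$ is linear (i.e., it obeys (16.1)), then using (16.5) and (16.10),

$$\frac{|B_{xxx}(\omega_1,\omega_2)|^2}{S_{xx}(\omega_1)\,S_{xx}(\omega_2)\,S_{xx}(\omega_1+\omega_2)} = \frac{\gamma_{3\epsilon}^2}{\sigma_\epsilon^6} = \text{constant} \quad \forall\ \omega_1, \omega_2, \tag{16.13}$$

and using (16.5) and (16.11),

$$\frac{|T_{xxxx}(\omega_1,\omega_2,\omega_3)|^2}{S_{xx}(\omega_1)\,S_{xx}(\omega_2)\,S_{xx}(\omega_3)\,S_{xx}(\omega_1+\omega_2+\omega_3)} = \frac{\gamma_{4\epsilon}^2}{\sigma_\epsilon^8} = \text{constant} \quad \forall\ \omega_1, \omega_2, \omega_3. \tag{16.14}$$

The above two relations form a basis for testing linearity of a given measurement record. How the tests are implemented depends upon the statistics of the estimators of the higher-order cumulant spectra as well as that of the power spectra of the given record.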
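The step from (16.5) and (16.10) to (16.13) can be written out explicitly. The derivation below is a sketch that assumes $S_{xx}(\omega) \neq 0$ at the three frequencies involved; $\gamma_{3\epsilon}$ denotes the zero-lag third-order cumulant of the driving sequence from (16.12) and $\sigma_\epsilon^2$ its variance.

```latex
\begin{aligned}
|B_{xxx}(\omega_1,\omega_2)|^2
  &= \gamma_{3\epsilon}^2\,|H(e^{j\omega_1})|^2\,|H(e^{j\omega_2})|^2\,|H(e^{j(\omega_1+\omega_2)})|^2, \\
S_{xx}(\omega_1)\,S_{xx}(\omega_2)\,S_{xx}(\omega_1+\omega_2)
  &= \sigma_\epsilon^6\,|H(e^{j\omega_1})|^2\,|H(e^{j\omega_2})|^2\,|H(e^{j(\omega_1+\omega_2)})|^2, \\
\frac{|B_{xxx}(\omega_1,\omega_2)|^2}{S_{xx}(\omega_1)\,S_{xx}(\omega_2)\,S_{xx}(\omega_1+\omega_2)}
  &= \frac{\gamma_{3\epsilon}^2}{\sigma_\epsilon^6}.
\end{aligned}
```

The transfer-function magnitudes cancel, so the ratio depends only on the statistics of the driving sequence. The same cancellation with four factors gives (16.14).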

16.2.1 Gaussianity Tests

Suppose that the given zero-mean measurement record is of length $N$, denoted by $\{x(t),\ t = 1, 2, \cdots, N\}$. Suppose that the given sample sequence of length $N$ is divided into $K$ nonoverlapping segments each of size $N_B$ samples so that $N = KN_B$. Let $X^{(i)}(\omega)$ denote the discrete Fourier transform (DFT) of the $i$th block $\{x(t+(i-1)N_B),\ 1 \le t \le N_B\}$ $(i = 1, 2, \cdots, K)$ given by

$$X^{(i)}(\omega_m) = \sum_{l=0}^{N_B-1} x(l+1+(i-1)N_B)\,\exp(-j\omega_m l) \tag{16.15}$$

where

$$\omega_m = \frac{2\pi}{N_B}\,m, \quad m = 0, 1, \cdots, N_B-1. \tag{16.16}$$

Denote the estimate of the bispectrum $B_{xxx}(\omega_m,\omega_n)$ at bifrequency $\left(\omega_m = \frac{2\pi}{N_B}m,\ \omega_n = \frac{2\pi}{N_B}n\right)$ as $\hat B_{xxx}(m,n)$, given by averaging over $K$ blocks

$$\hat B_{xxx}(m,n) = \frac{1}{K}\sum_{i=1}^{K}\frac{1}{N_B}\,X^{(i)}(\omega_m)\,X^{(i)}(\omega_n)\left[X^{(i)}(\omega_m+\omega_n)\right]^*, \tag{16.17}$$

where $X^*$ denotes the complex conjugate of $X$. A principal domain of $\hat B_{xxx}(m,n)$ is the triangular grid

$$D = \left\{(m,n)\ \middle|\ 0 \le m \le \frac{N_B}{2},\ 0 \le n \le m,\ 2m+n \le N_B\right\}. \tag{16.18}$$

Values of $\hat B_{xxx}(m,n)$ outside $D$ can be inferred from that in $D$.
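The block-averaged estimator (16.15) to (16.17) maps naturally onto an FFT per block. The sketch below is one possible implementation, assuming that any samples beyond $KN_B$ are discarded and that the frequency index $m+n$ wraps modulo $N_B$ (the DFT is periodic). The function name is hypothetical.

```python
import numpy as np

def bispectrum_estimate(x, n_b):
    """Block-averaged bispectrum estimate B_hat(m, n) of (16.17) on the full N_B x N_B grid."""
    x = np.asarray(x, dtype=float)
    k = len(x) // n_b                               # number of blocks K
    blocks = x[:k * n_b].reshape(k, n_b)            # leftover samples are dropped
    X = np.fft.fft(blocks, axis=1)                  # (16.15): row i is the DFT of block i
    idx = np.arange(n_b)
    sum_idx = (idx[:, None] + idx[None, :]) % n_b   # index of omega_m + omega_n
    B = np.zeros((n_b, n_b), dtype=complex)
    for Xi in X:
        B += Xi[:, None] * Xi[None, :] * np.conj(Xi[sum_idx])
    return B / (k * n_b)

# Illustrative use on a zero-mean skewed record (an assumption for demonstration).
rng = np.random.default_rng(2)
B_hat = bispectrum_estimate(rng.exponential(1.0, 1024) - 1.0, 32)
```

Only the entries with $(m,n)$ in the principal domain $D$ of (16.18) are needed by the tests that follow; the full grid is computed here purely for simplicity.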


FIGURE 16.2: Coarse and fine grids in the principal domain.

Select a coarse frequency grid $(m,n)$ in the principal domain $D$ as follows. Let $d$ denote the distance between two adjacent coarse frequency pairs such that $d = 2r+1$ with $r$ a positive integer. Set $n_0 = 2+r$ and $n = n_0, n_0+d, \cdots, n_0+(L_n-1)d$ where $L_n = \left\lfloor \frac{\lfloor N_B/3 \rfloor - 1}{d} \right\rfloor$. For a given $n$, set $m_{0,n} = \left\lfloor \frac{N_B-n}{2} \right\rfloor - r$ and $m = m_n = m_{0,n}, m_{0,n}-d, \cdots, m_{0,n}-(L_{m,n}-1)d$ where $L_{m,n} = \left\lfloor \frac{m_{0,n}-(n+r+1)}{d} \right\rfloor + 1$. Let $P$ denote the number of points on the coarse frequency grid as defined above so that $P = \sum_{n=1}^{L_n} L_{m,n}$. Suppose that $(m,n)$ is a coarse point; then select a fine grid $(m_{mi}, n_{nk})$ consisting of

$$m_{mi} = m+i,\ |i| \le r, \quad n_{nk} = n+k,\ |k| \le r, \tag{16.19}$$

for some integer $r > 0$ such that $(2r+1)^2 > P$; see also Fig. 16.2. Order the $L\ (= (2r+1)^2)$ estimates $\hat B_{xxx}(m_{mi}, n_{nk})$ on the fine grid around the bifrequency pair $(m,n)$ into an $L$-vector, which after relabeling may be denoted as $\nu_{ml}$, $l = 1, 2, \cdots, L$, $m = 1, 2, \cdots, P$, where $m$ indexes the coarse grid and $l$ indexes the fine grid. Define $P$-vectors

$$\boldsymbol{\Psi}_i = (\nu_{1i}, \nu_{2i}, \cdots, \nu_{Pi})^T \quad (i = 1, 2, \cdots, L). \tag{16.20}$$

Consider the estimates

$$\mathbf{M} = \frac{1}{L}\sum_{i=1}^{L}\boldsymbol{\Psi}_i \quad \text{and} \quad \boldsymbol{\Sigma} = \frac{1}{L}\sum_{i=1}^{L}\left(\boldsymbol{\Psi}_i - \mathbf{M}\right)\left(\boldsymbol{\Psi}_i - \mathbf{M}\right)^H. \tag{16.21}$$

Define

$$F_G = \frac{2(L-P)}{2P}\,\mathbf{M}^H\boldsymbol{\Sigma}^{-1}\mathbf{M}. \tag{16.22}$$

If $\{x(t)\}$ is Gaussian, then $F_G$ is distributed as a central $F$ (Fisher) with $(2P, 2(L-P))$ degrees of freedom. A statistical test for Gaussianity of $\{x(t)\}$ is to declare it to be a non-Gaussian sequence if $F_G > T_\alpha$, where $T_\alpha$ is selected to achieve a fixed probability of false alarm $\alpha$ ($= \Pr\{F_G > T_\alpha\}$ with $F_G$ distributed as a central $F$ with $(2P, 2(L-P))$ degrees of freedom). If $F_G \le T_\alpha$, then either $\{x(t)\}$ is Gaussian or it has zero bispectrum.
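The grid construction and the statistic of (16.22) can be sketched as follows. This is an illustration under assumptions: the scale factor $2(L-P)/(2P)$ is inferred from the stated $(2P, 2(L-P))$ degrees of freedom, the rows of `psi` are the vectors of (16.20), and the function names are hypothetical. The threshold $T_\alpha$ is left to the caller.

```python
import numpy as np

def coarse_grid(n_b, r):
    """Coarse bifrequency pairs (m, n) built as described above, with spacing d = 2r + 1."""
    d = 2 * r + 1
    n0 = 2 + r
    l_n = (n_b // 3 - 1) // d
    points = []
    for j in range(l_n):
        n = n0 + j * d
        m0 = (n_b - n) // 2 - r
        l_mn = (m0 - (n + r + 1)) // d + 1        # floor division matches the floor brackets
        for q in range(max(l_mn, 0)):
            points.append((m0 - q * d, n))
    return points

def gaussianity_statistic(psi):
    """F_G of (16.22); psi is an (L, P) complex array whose rows are the Psi_i of (16.20)."""
    L, P = psi.shape
    mean = psi.mean(axis=0)                        # M of (16.21)
    dev = psi - mean
    sigma = dev.T @ dev.conj() / L                 # (1/L) sum (Psi_i - M)(Psi_i - M)^H
    quad = np.real(mean.conj() @ np.linalg.solve(sigma, mean))
    return 2 * (L - P) / (2 * P) * quad

grid = coarse_grid(64, 1)                          # P = len(grid) coarse points
```

In practice $r$ and $N_B$ must be chosen together so that $(2r+1)^2 > P$; otherwise $\boldsymbol{\Sigma}$ is singular and the statistic is undefined.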

The above test is patterned after [3]. It treats the bispectral estimates on the “fine” bifrequency grid as a “data set” from a multivariate Gaussian distribution with unknown covariance matrix. Hinich [4] has simplified the test of [3] by using the known asymptotic expression for the covariance matrix involved, and his test is based upon $\chi^2$ distributions. Notice that $F_G \le T_\alpha$ does not necessarily imply that $\{x(t)\}$ is Gaussian; it may result from the fact that $\{x(t)\}$ is non-Gaussian with zero bispectrum. Therefore, a next logical step would be to test for vanishing trispectrum of the record. This has been done in [14] using the approach of [4]; extensions of [3] are too complicated. Computationally simpler tests using the “integrated polyspectrum” of the data have been proposed in [6]. The integrated polyspectrum (bispectrum or trispectrum) is computed as a cross-power spectrum and it is zero for Gaussian processes. Alternatively, one may test if $C_{xxx}(i,k) \equiv 0$ and $C_{xxxx}(i,k,l) \equiv 0$. This has been done in [8].

Other tests that do not rely on higher-order cumulant spectra of the record may be found in [13].

16.2.2 Linearity Tests

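Denote the estimate of the power spectral density $S_{xx}(\omega_m)$ of $\{x(t)\}$ at frequency $\omega_m = \frac{2\pi}{N_B}m$ as $\hat S_{xx}(m)$, given by

$$\hat S_{xx}(m) = \frac{1}{K}\sum_{i=1}^{K}\frac{1}{N_B}\,X^{(i)}(\omega_m)\left[X^{(i)}(\omega_m)\right]^*. \tag{16.23}$$

Consider

$$\hat\gamma_x(m,n) = \frac{|\hat B_{xxx}(m,n)|^2}{\hat S_{xx}(m)\,\hat S_{xx}(n)\,\hat S_{xx}(m+n)}. \tag{16.24}$$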

It turns out that $\hat\gamma_x(m,n)$ is a consistent estimator of the left side of (16.13), and it is asymptotically distributed as a Gaussian random variable, independent at distinct bifrequencies in the interior of $D$. These properties have been used by Subba Rao and Gabr [3] to design a test of linearity. Construct a coarse grid and a fine grid of bifrequencies in $D$ as before. Order the $L$ estimates $\hat\gamma_x(m_{mi}, n_{nk})$ on the fine grid around the bifrequency pair $(m,n)$ into an $L$-vector, which after relabeling may be denoted as $\beta_{ml}$, $l = 1, 2, \cdots, L$, $m = 1, 2, \cdots, P$, where $m$ indexes the coarse grid and $l$ indexes the fine grid. Define $P$-vectors

$$\boldsymbol{\Psi}_i = (\beta_{1i}, \beta_{2i}, \cdots, \beta_{Pi})^T \quad (i = 1, 2, \cdots, L). \tag{16.25}$$

Consider the estimates

$$\mathbf{M} = \frac{1}{L}\sum_{i=1}^{L}\boldsymbol{\Psi}_i \quad \text{and} \quad \boldsymbol{\Sigma} = \frac{1}{L}\sum_{i=1}^{L}\left(\boldsymbol{\Psi}_i - \mathbf{M}\right)\left(\boldsymbol{\Psi}_i - \mathbf{M}\right)^T. \tag{16.26}$$

Define a $(P-1) \times P$ matrix $B$ whose $ij$th element $B_{ij}$ is given by $B_{ij} = 1$ if $i = j$; $= -1$ if $j = i+1$; $= 0$ otherwise. Define

$$F_L = \frac{L-P+1}{P-1}\,(B\mathbf{M})^T\left(B\boldsymbol{\Sigma}B^T\right)^{-1}B\mathbf{M}. \tag{16.27}$$

If $\{x(t)\}$ is linear, then $F_L$ is distributed as a central $F$ with $(P-1, L-P+1)$ degrees of freedom. A statistical test for linearity of $\{x(t)\}$ is to declare it to be a nonlinear sequence if $F_L > T_\alpha$, where $T_\alpha$ is selected to achieve a fixed probability of false alarm $\alpha$ ($= \Pr\{F_L > T_\alpha\}$ with $F_L$ distributed as a central $F$ with $(P-1, L-P+1)$ degrees of freedom). If $F_L \le T_\alpha$, then either $\{x(t)\}$ is linear or it has zero bispectrum.
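The statistic (16.27) checks whether the mean normalized bispectrum is the same at all coarse-grid points: the matrix $B$ forms differences of adjacent components of the mean vector, which vanish when (16.13) holds. A minimal sketch follows, assuming `psi` holds the real-valued vectors of (16.25) as rows; the function name is hypothetical and the threshold is left to the caller.

```python
import numpy as np

def linearity_statistic(psi):
    """F_L of (16.27); psi is an (L, P) real array whose rows are the Psi_i of (16.25)."""
    L, P = psi.shape
    mean = psi.mean(axis=0)                          # M of (16.26)
    dev = psi - mean
    sigma = dev.T @ dev / L                          # Sigma of (16.26)
    B = np.eye(P - 1, P) - np.eye(P - 1, P, k=1)     # B_ij = 1 if i = j, -1 if j = i + 1
    bm = B @ mean                                    # adjacent differences of the mean
    quad = bm @ np.linalg.solve(B @ sigma @ B.T, bm)
    return (L - P + 1) / (P - 1) * quad

# Tiny hand-checkable example with L = 3 fine-grid samples and P = 2 coarse points.
f_l = linearity_statistic(np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 3.0]]))
```

A large value of `f_l` indicates that the normalized bispectrum varies across the coarse grid, which is evidence against the linear model (16.1).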

The above test is patterned after [3]. Hinich [4] has “simplified” the test of [3]. Notice that $F_L \le T_\alpha$ does not necessarily imply that $\{x(t)\}$ is linear; it may result from the fact that $\{x(t)\}$ is non-Gaussian with zero bispectrum. Therefore, a next logical step would be to test if (16.14) holds true. This has been done in [14] using the approach of [4]; extensions of [3] are too complicated. The approaches of [3] and [4] will fail if the data are noisy. A modification to [3] is presented in [7] when additive Gaussian noise is present. Finally, other tests that do not rely on higher-order cumulant spectra of the record may be found in [13].


16.2.3 Stationarity Tests

Various methods exist for testing whether a given measurement record may be regarded as a sample sequence of a stationary random sequence. A crude yet effective way to test for stationarity is to divide the record into several (at least two) nonoverlapping segments and then test for equivalency (or compatibility) of certain statistical properties (mean, mean-square value, power spectrum, etc.) computed from these segments. More sophisticated tests that do not require a priori segmentation of the record are also available.

Consider a record of length $N$ divided into two nonoverlapping segments each of length $N/2$. Let $KN_B = N/2$ and use estimators such as (16.23) to obtain the estimator $\hat S^{(l)}_{xx}(m)$ of the power spectrum $S^{(l)}_{xx}(\omega_m)$ of the $l$th segment ($l = 1, 2$), where $\omega_m$ is given by (16.16). Consider the test statistic

$$Y = \sqrt{\frac{2}{N_B-2}}\,\sqrt{\frac{K}{2}}\,\sum_{m=1}^{\frac{N_B}{2}-1}\left[\ln\hat S^{(1)}_{xx}(m) - \ln\hat S^{(2)}_{xx}(m)\right]. \tag{16.28}$$

Then, asymptotically, $Y$ is distributed as zero-mean, unit-variance Gaussian if $\{x(t)\}$ is stationary. Therefore, if $|Y| > T_\alpha$, then $\{x(t)\}$ is declared to be nonstationary, where the threshold $T_\alpha$ is chosen to achieve a false-alarm probability of $\alpha$ ($= \Pr\{|Y| > T_\alpha\}$ with $Y$ distributed as zero-mean, unit-variance Gaussian). If $|Y| \le T_\alpha$, then $\{x(t)\}$ is declared to be stationary. Notice that similar tests based upon higher-order cumulant spectra can also be devised.

The above test is patterned after [10]. More sophisticated tests involving two model comparisons as above but without prior segmentation of the record are available in [11] and references therein. A test utilizing the evolutionary power spectrum may be found in [9].
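The two-segment comparison can be sketched as below. The scale factor $\sqrt{K/(N_B-2)}$ used here is an assumption: it follows from approximating the variance of the logarithm of a $K$-block averaged periodogram by $1/K$, which gives the sum of $N_B/2-1$ log-differences a variance of about $(N_B-2)/K$. The function names are hypothetical and $N_B$ is assumed even.

```python
import numpy as np

def averaged_periodogram(x, n_b):
    """Block-averaged power spectrum estimate in the style of (16.23); returns (S_hat, K)."""
    x = np.asarray(x, dtype=float)
    k = len(x) // n_b
    X = np.fft.fft(x[:k * n_b].reshape(k, n_b), axis=1)
    return np.mean(np.abs(X) ** 2, axis=0) / n_b, k

def stationarity_statistic(x, n_b):
    """Normalized sum of log-spectral differences between the two halves of the record."""
    x = np.asarray(x, dtype=float)
    half = len(x) // 2
    s1, k = averaged_periodogram(x[:half], n_b)
    s2, _ = averaged_periodogram(x[half:2 * half], n_b)
    m = np.arange(1, n_b // 2)                     # m = 1, ..., N_B/2 - 1
    diff = np.log(s1[m]) - np.log(s2[m])
    return np.sqrt(k / (n_b - 2)) * diff.sum()     # assumed unit-variance scaling

rng = np.random.default_rng(0)
seg = rng.standard_normal(256)
y_same = stationarity_statistic(np.concatenate([seg, seg]), 16)          # identical halves
y_scaled = stationarity_statistic(np.concatenate([seg, 2.0 * seg]), 16)  # power jumps by 4
```

Identical halves give a statistic of zero, while a change in power between the halves drives $|Y|$ far from zero, which is the behavior the threshold test relies on.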

16.3 Order Selection, Model Validation, and Confidence Intervals

As noted earlier, one typically fits a model $H(q;\theta^{(M)})$ to the given data by estimating the $M$ unknown parameters through optimization of some cost function. A fundamental difficulty here is the choice of $M$. There are two basic philosophical approaches to this problem: one consists of an iterative process of model fitting and diagnostic checking (model validation), and the other utilizes a more “objective” approach of optimizing a cost w.r.t. $M$ (in addition to $\theta^{(M)}$).

16.3.1 Order Selection

Let $f_{\theta^{(M)}}(X)$ denote the probability density function of $X = [x(1), x(2), \cdots, x(N)]^T$ parameterized by the parameter vector $\theta^{(M)}$ of dimension $M$. A popular approach to model order selection in the context of linear Gaussian models is to compute the Akaike information criterion (AIC)

$$AIC(M) = -2\ln f_{\hat\theta^{(M)}}(X) + 2M \tag{16.29}$$

where $\hat\theta^{(M)}$ maximizes $f_{\theta^{(M)}}(X)$ given the measurement record $X$. Let $\bar M$ denote an upper bound on the true model order. Then the minimum AIC estimate (MAICE), the selected model order, is given by the minimizer of $AIC(M)$ over $M = 1, 2, \cdots, \bar M$. Clearly one needs to solve the problem of maximization of $\ln f_{\theta^{(M)}}(X)$ w.r.t. $\theta^{(M)}$ for each value of $M = 1, 2, \cdots, \bar M$. The second term on the right side of (16.29) penalizes overparametrization.

Rissanen’s minimum description length (MDL) criterion is given by

$$MDL(M) = -2\ln f_{\hat\theta^{(M)}}(X) + M\ln N. \tag{16.30}$$


It is known that if $\{x(t)\}$ is a Gaussian AR model, then AIC is an inconsistent estimator of the model order whereas MDL is consistent, i.e., MDL picks the correct model order with probability one as the data length tends to infinity, whereas there is a nonzero probability that AIC will not. Several other variations of these criteria exist [15].

Although the derivation of these order selection criteria is based upon the Gaussian distribution, they have frequently been used for non-Gaussian processes with success, provided attention is confined to the use of second-order statistics of the data. They may fail if one fits models using higher-order statistics.
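For a Gaussian AR model, the criteria (16.29) and (16.30) can be approximated from the residual variance of a least-squares fit. The sketch below assumes $-2\ln f_{\hat\theta^{(M)}}(X) \approx N\ln\hat\sigma^2_M$ up to a constant that does not depend on $M$; this conditional-likelihood approximation, the function names, and the simulated AR(2) record are illustrative assumptions.

```python
import numpy as np

def ar_residual_variance(x, order):
    """Least-squares AR(order) fit; returns the residual variance estimate."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Regress x(t) on x(t-1), ..., x(t-order) for t = order, ..., n-1.
    A = np.column_stack([x[order - i:n - i] for i in range(1, order + 1)])
    y = x[order:]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.mean((y - A @ coef) ** 2)

def order_criteria(x, max_order):
    """Approximate AIC(M) and MDL(M) for M = 1, ..., max_order (constants dropped)."""
    n = len(x)
    aic, mdl = {}, {}
    for m in range(1, max_order + 1):
        neg2loglik = n * np.log(ar_residual_variance(x, m))
        aic[m] = neg2loglik + 2 * m
        mdl[m] = neg2loglik + m * np.log(n)
    return aic, mdl

# Simulated AR(2) record: a_1 = -1.5, a_2 = 0.8 in the notation of (16.4).
rng = np.random.default_rng(0)
e = rng.standard_normal(1000)
x = np.zeros(1000)
for t in range(2, 1000):
    x[t] = 1.5 * x[t - 1] - 0.8 * x[t - 2] + e[t]

aic, mdl = order_criteria(x, 6)
best_aic = min(aic, key=aic.get)
best_mdl = min(mdl, key=mdl.get)
```

The two criteria differ only in the penalty, $2M$ versus $M\ln N$, so MDL penalizes extra parameters more heavily whenever $N > e^2$. This is consistent with the remark that AIC retains a nonzero probability of overestimating the order.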

16.3.2 Model Validation

Model validation involves testing to see if the fitted model is an appropriate representation of the underlying (true) system. It involves devising appropriate statistical tools to test the validity of the assumptions made in obtaining the fitted model. It is also known as model falsification, model verification, or diagnostic checking. It can also be used as a tool for model order selection. It is an essential part of any model fitting methodology.

Suppose that $\{x(t)\}$ obeys (16.1). Suppose that the fitted model corresponding to the estimated parameter $\hat\theta^{(M)}$ is $H(q;\hat\theta^{(M)})$. Assuming that the true model $H(q)$ is invertible, in the ideal case one should get $\epsilon(t) = H^{-1}(q)\,x(t)$, where $\{\epsilon(t)\}$ is zero-mean, i.i.d. (or at least white when using second-order statistics). Hence, if the fitted model $H(q;\hat\theta^{(M)})$ is a valid description of the underlying true system, one expects $\epsilon'(t) = H^{-1}(q;\hat\theta^{(M)})\,x(t)$ to be zero-mean, i.i.d. One of the diagnostic checks then is to test for whiteness or independence of the inverse filtered data (or the residuals or linear innovations, in case second-order statistics are used). If the fitted model is unable to “adequately” capture the underlying true system, one expects $\{\epsilon'(t)\}$ to deviate from an i.i.d. distribution. This is one of the most widely used and useful diagnostic checks for model validation.

A test for second-order whiteness of $\{\epsilon'(t)\}$ is as follows [15]. Construct the estimates of the covariance function as

$$\hat r_\epsilon(\tau) = N^{-1}\sum_{t=1}^{N-\tau}\epsilon'(t+\tau)\,\epsilon'(t) \quad (\tau \ge 0). \tag{16.31}$$

Consider the test statistic

$$R = \frac{N}{\hat r_\epsilon^2(0)}\sum_{i=1}^{m}\hat r_\epsilon^2(i) \tag{16.32}$$

where $m$ is some a priori choice of the maximum lag for whiteness testing. If $\{\epsilon'(t)\}$ is zero-mean white, then $R$ is distributed as $\chi^2(m)$ ($\chi^2$ with $m$ degrees of freedom). A statistical test for whiteness of $\{\epsilon'(t)\}$ is to declare it to be a nonwhite sequence (hence invalidate the model) if $R > T_\alpha$, where $T_\alpha$ is selected to achieve a fixed probability of false alarm $\alpha$ ($= \Pr\{R > T_\alpha\}$ with $R$ distributed as $\chi^2(m)$). If $R \le T_\alpha$, then $\{\epsilon'(t)\}$ is second-order white, hence the model is validated.
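The whiteness statistic of (16.31) and (16.32) is short enough to compute directly. The sketch below assumes the residual sequence is already available as an array; the function name is hypothetical and the comparison against the $\chi^2(m)$ threshold $T_\alpha$ is left to the caller.

```python
import numpy as np

def whiteness_statistic(e, m):
    """R of (16.32), built from the covariance estimates r_hat(tau) of (16.31)."""
    e = np.asarray(e, dtype=float)
    n = len(e)
    # r[tau] = (1/N) * sum_{t} e(t + tau) e(t), for tau = 0, ..., m
    r = np.array([np.sum(e[tau:] * e[:n - tau]) / n for tau in range(m + 1)])
    return n * np.sum(r[1:] ** 2) / r[0] ** 2

# A strongly correlated (alternating) sequence: r_hat(0) = 1 and r_hat(1) = -0.99.
alternating = np.array([1.0, -1.0] * 50)
r_alt = whiteness_statistic(alternating, 1)

# Illustrative residuals that should look white (an assumption for demonstration).
rng = np.random.default_rng(3)
r_white = whiteness_statistic(rng.standard_normal(1000), 10)
```

The alternating sequence yields a very large statistic, so the corresponding model would be rejected at any reasonable false-alarm level, whereas white residuals should give a value of the order of $m$.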

The above procedure only tests for second-order whiteness. In order to test for higher-order whiteness, one needs to examine either the higher-order cumulant functions or the higher-order cumulant spectra (or the integrated polyspectra) of the inverse-filtered data. A statistical test using the bispectrum is available in [5]. It is particularly useful if the model fitting is carried out using higher-order statistics. If $\{\epsilon'(t)\}$ is third-order white, then its bispectrum is a constant for all bifrequencies. Let $\hat B_{\epsilon'\epsilon'\epsilon'}(m,n)$ denote the estimate of the bispectrum $B_{\epsilon'\epsilon'\epsilon'}(\omega_m,\omega_n)$ mimicking (16.17). Construct a coarse grid and a fine grid of bifrequencies in $D$ as before. Order the $L$ estimates $\hat B_{\epsilon'\epsilon'\epsilon'}(m_{mi}, n_{nk})$ on the fine grid around the bifrequency pair $(m,n)$ into an $L$-vector, which after relabeling may be denoted as $\mu_{ml}$, $l = 1, 2, \cdots, L$, $m = 1, 2, \cdots, P$, where $m$ indexes the coarse grid and $l$ indexes the fine grid. Define $P$-vectors

$$\tilde{\boldsymbol{\Psi}}_i = (\mu_{1i}, \mu_{2i}, \cdots, \mu_{Pi})^T \quad (i = 1, 2, \cdots, L). \tag{16.33}$$

Consider the estimates

$$\tilde{\mathbf{M}} = \frac{1}{L}\sum_{i=1}^{L}\tilde{\boldsymbol{\Psi}}_i \quad \text{and} \quad \tilde{\boldsymbol{\Sigma}} = \frac{1}{L}\sum_{i=1}^{L}\left(\tilde{\boldsymbol{\Psi}}_i - \tilde{\mathbf{M}}\right)\left(\tilde{\boldsymbol{\Psi}}_i - \tilde{\mathbf{M}}\right)^H. \tag{16.34}$$

Define a $(P-1) \times P$ matrix $B$ whose $ij$th element $B_{ij}$ is given by $B_{ij} = 1$ if $i = j$; $= -1$ if $j = i+1$; $= 0$ otherwise. Define

$$F_W = \frac{2(L-P+1)}{2P-2}\,\left(B\tilde{\mathbf{M}}\right)^H\left(B\tilde{\boldsymbol{\Sigma}}B^T\right)^{-1}B\tilde{\mathbf{M}}. \tag{16.35}$$

If $\{\epsilon'(t)\}$ is third-order white, then $F_W$ is distributed as a central $F$ with $(2P-2, 2(L-P+1))$ degrees of freedom. A statistical test for third-order whiteness of $\{\epsilon'(t)\}$ is to declare it to be a nonwhite sequence if $F_W > T_\alpha$, where $T_\alpha$ is selected to achieve a fixed probability of false alarm $\alpha$ ($= \Pr\{F_W > T_\alpha\}$ with $F_W$ distributed as a central $F$ with $(2P-2, 2(L-P+1))$ degrees of freedom). If $F_W \le T_\alpha$, then either $\{\epsilon'(t)\}$ is third-order white or it has zero bispectrum.

The above model validation test can be used for model order selection. Fix an upper bound on the model orders. For every admissible model order, fit a linear model and test its validity. From among the validated models, select the “smallest” order as the correct order. It is easy to see that this procedure will work only so long as the various candidate orders are nested. Further details may be found in [5] and [15].

16.3.3 Confidence Intervals

Having settled upon a model order estimate $M$, let $\hat\theta_N^{(M)}$ be the parameter estimator obtained by minimizing a cost function $V_N(\theta^{(M)})$, given a record of length $N$, such that $V_\infty(\theta) := \lim_{N\to\infty} V_N(\theta)$ exists. For instance, using the notation of the section on order selection, one may take $V_N(\theta^{(M)}) = -N^{-1}\ln f_{\theta^{(M)}}(X)$. How reliable are these estimates? An assessment of this is provided by confidence intervals.

Under some general technical conditions, it usually follows that asymptotically (i.e., for large $N$), $\sqrt{N}\left(\hat\theta_N^{(M)} - \theta_0\right)$ is distributed as a Gaussian random vector with zero mean and covariance matrix $P$, where $\theta_0$ denotes the true value of $\theta^{(M)}$. A general expression for $P$ is given by [15]

$$P = \left[V''_\infty(\theta_0)\right]^{-1} P_\infty \left[V''_\infty(\theta_0)\right]^{-1} \tag{16.36}$$

where

$$P_\infty = \lim_{N\to\infty} E\left\{N\,V_N'^{\,T}(\theta_0)\,V_N'(\theta_0)\right\} \tag{16.37}$$

and $V'$ (a row vector) and $V''$ (a square matrix) denote the gradient and the Hessian, respectively, of $V$.

The above result can be used to evaluate the reliability of the parameter estimator. It follows from the above results that

$$\eta_N = N\left(\hat\theta_N^{(M)} - \theta_0\right)^T P^{-1}\left(\hat\theta_N^{(M)} - \theta_0\right) \tag{16.38}$$

is asymptotically $\chi^2(M)$. Define $\chi^2_\alpha(M)$ via $\Pr\{y > \chi^2_\alpha(M)\} = \alpha$, where $y$ is distributed as $\chi^2(M)$. For instance, $\chi^2_{0.05}(4) = 9.49$ so that $\Pr\{\eta_N > 9.49\} = 0.05$ for $M = 4$. The ellipsoid $\eta_N \le \chi^2_\alpha(M)$ then defines
