Djuric, P.M. & Kay, S.M. "Spectrum Estimation and Modeling." In Digital Signal Processing Handbook, ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999.
14.1 Introduction
14.2 Important Notions and Definitions
    Random Processes • Spectra of Deterministic Signals • Spectra of Random Processes
14.3 The Problem of Power Spectrum Estimation
14.4 Nonparametric Spectrum Estimation
    Periodogram • The Bartlett Method • The Welch Method • Blackman-Tukey Method • Minimum Variance Spectrum Estimator • Multiwindow Spectrum Estimator
14.5 Parametric Spectrum Estimation
    Spectrum Estimation Based on Autoregressive Models • Spectrum Estimation Based on Moving Average Models • Spectrum Estimation Based on Autoregressive Moving Average Models • Pisarenko Harmonic Decomposition Method • Multiple Signal Classification (MUSIC)
14.6 Recent Developments
References
14.1 Introduction
The main objective of spectrum estimation is the determination of the power spectral density (PSD) of a random process. The PSD is a function that plays a fundamental role in the analysis of stationary random processes in that it quantifies the distribution of the total power as a function of frequency. The estimation of the PSD is based on a set of observed data samples from the process. A necessary assumption is that the random process is at least wide-sense stationary, that is, its first and second order statistics do not change with time. The estimated PSD provides information about the structure of the random process, which can then be used for refined modeling, prediction, or filtering of the observed process.
Spectrum estimation has a long history, with beginnings in ancient times [17]. The first significant discoveries that laid the grounds for later developments, however, were made in the early years of the nineteenth century. They include one of the most important advances in the history of mathematics, Fourier's theory, according to which an arbitrary function can be represented by an infinite summation of sine and cosine functions. Later came the Sturm-Liouville spectral theory of differential equations, which was followed by the spectral representations in quantum and classical physics developed by John von Neumann and Norbert Wiener, respectively. The statistical theory of spectrum estimation started practically in 1949, when Tukey introduced a numerical method for computation of spectra from empirical data. A very important milestone for further development of the field was the reinvention of the fast Fourier transform (FFT) in 1965, which is an efficient algorithm for computation of the discrete Fourier transform. Shortly thereafter came the work of John Burg, who proposed a fundamentally new approach to spectrum estimation based on the principle of maximum entropy. In the past three decades his work has been followed up by many researchers, who have developed numerous new spectrum estimation procedures and applied them to various physical processes from diverse scientific fields. Today, spectrum estimation is a vital scientific discipline which plays a major role in many applied sciences such as radar, speech processing, underwater acoustics, biomedical signal processing, sonar, seismology, vibration analysis, control theory, and econometrics.
14.2 Important Notions and Definitions
14.2.1 Random Processes
The objects of interest of spectrum estimation are random processes. They represent time fluctuations of a certain quantity which cannot be fully described by deterministic functions. The voltage waveform of a speech signal, the bit stream of zeros and ones of a communication message, or the daily variations of the stock market index are examples of random processes. Formally, a random process is defined as a collection of random variables indexed by time. (The family of random variables may also be indexed by a different variable, for example space, but here we will consider only random time processes.) The index set is infinite and may be continuous or discrete. If the index set is continuous, the random process is known as a continuous-time random process, and if the set is discrete, it is known as a discrete-time random process. The speech waveform is an example of a continuous-time random process, and the sequence of zeros and ones of a communication message, a discrete-time one. We shall focus only on discrete-time processes where the index set is the set of integers.
A random process can be viewed as a collection of a possibly infinite number of functions, also called realizations. We shall denote the collection of realizations by $\{\tilde{x}[n]\}$ and an observed realization of it by $\{x[n]\}$. For fixed $n$, $\{\tilde{x}[n]\}$ represents a random variable, also denoted as $\tilde{x}[n]$, and $x[n]$ is the $n$-th sample of the realization $\{x[n]\}$. If the samples $x[n]$ are real, the random process is real, and if they are complex, the random process is complex. In the discussion to follow, we assume that $\{\tilde{x}[n]\}$ is a complex random process.
The random process $\{\tilde{x}[n]\}$ is fully described if, for any set of time indices $n_1, n_2, \ldots, n_m$, the joint probability density function of $\tilde{x}[n_1], \tilde{x}[n_2], \ldots$, and $\tilde{x}[n_m]$ is given. If the statistical properties of the process do not change with time, the random process is called stationary. This is always the case if, for any choice of random variables $\tilde{x}[n_1], \tilde{x}[n_2], \ldots$, and $\tilde{x}[n_m]$, their joint probability density function is identical to the joint probability density function of the random variables $\tilde{x}[n_1+k], \tilde{x}[n_2+k], \ldots$, and $\tilde{x}[n_m+k]$ for any $k$. Then we call the random process strictly stationary. For example, if the samples of the random process are independent and identically distributed random variables, it is straightforward to show that the process is strictly stationary. Strict stationarity, however, is a very severe requirement and is relaxed by introducing the concept of wide-sense stationarity. A random process is wide-sense stationary if the following two conditions are met:
$$E\left(\tilde{x}[n]\right) = \mu, \quad \text{for all } n \tag{14.1}$$
and
$$r[n, n+k] = E\left(\tilde{x}^*[n]\,\tilde{x}[n+k]\right) = r[k] \tag{14.2}$$
where $E(\cdot)$ is the expectation operator, $\tilde{x}^*[n]$ is the complex conjugate of $\tilde{x}[n]$, and $\{r[k]\}$ is the autocorrelation function of the process. Thus, if the process is wide-sense stationary, its mean value $\mu$ is constant over time, and the autocorrelation function depends only on the lag $k$ between the random variables. For example, if we consider the random process
$$\tilde{x}[n] = a\, e^{j(2\pi f_0 n + \tilde{\theta})} \tag{14.3}$$
where the amplitude $a$ and the frequency $f_0$ are constants, and the phase $\tilde{\theta}$ is a random variable that is uniformly distributed over the interval $(-\pi, \pi)$, one can show that
$$E\left(\tilde{x}[n]\right) = 0 \tag{14.4}$$
and
$$r[k] = a^2\, e^{j2\pi f_0 k} \tag{14.5}$$
Thus, Eq. (14.3) represents a wide-sense stationary random process.
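Wide-sense stationarity of this process is easy to check numerically. Below is a minimal Monte Carlo sketch in Python with NumPy (the parameter values are arbitrary choices) that estimates the ensemble mean and one autocorrelation value and compares them with Eqs. (14.4) and (14.5).

```python
# Monte Carlo check of Eqs. (14.4)-(14.5) for the random-phase exponential of
# Eq. (14.3). The parameter values (a, f0, N, trials) are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
a, f0, N, trials = 1.0, 0.1, 32, 50_000

theta = rng.uniform(-np.pi, np.pi, size=(trials, 1))  # one random phase per realization
n = np.arange(N)
x = a * np.exp(1j * (2 * np.pi * f0 * n + theta))     # trials x N array of realizations

print(np.abs(x.mean(axis=0)).max())                   # ensemble mean: close to 0 for every n
k = 5
r_k = (np.conj(x[:, 0]) * x[:, k]).mean()             # estimate of E{x*[n] x[n+k]} at n = 0
print(r_k, a**2 * np.exp(1j * 2 * np.pi * f0 * k))    # close to a^2 e^{j 2 pi f0 k}
```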
14.2.2 Spectra of Deterministic Signals
Before we define the concept of the spectrum of a random process, it will be useful to review the analogous concept for deterministic signals, which are signals whose future values can be exactly determined without any uncertainty. Besides their description in the time domain, deterministic signals have a very useful representation in terms of a superposition of sinusoids with various frequencies, which is given by the discrete-time Fourier transform (DTFT). If the observed signal is $\{g[n]\}$ and it is not periodic, its DTFT is the complex valued function $G(f)$ defined by
$$G(f) = \sum_{n=-\infty}^{\infty} g[n]\, e^{-j2\pi f n}$$
provided the sum converges, and the inverse transform is
$$g[n] = \int_{0}^{1} G(f)\, e^{j2\pi f n}\, df$$
which means that the signal $\{g[n]\}$ can be represented in terms of complex exponentials whose frequencies span the continuous interval $[0, 1)$.
The complex function $G(f)$ can alternatively be expressed as
$$G(f) = |G(f)|\, e^{j\phi(f)}$$
where $|G(f)|$ is called the amplitude spectrum of $\{g[n]\}$, and $\phi(f)$ the phase spectrum of $\{g[n]\}$.
For example, if the signal $\{g[n]\}$ is given by
$$g[n] = \delta[n - n_0]$$
then $G(f) = e^{-j2\pi f n_0}$, and the amplitude and phase spectra are
$$|G(f)| = 1, \qquad \phi(f) = -2\pi f n_0$$
The total energy of the signal satisfies Parseval's relation,
$$\sum_{n=-\infty}^{\infty} |g[n]|^2 = \int_{0}^{1} |G(f)|^2\, df \tag{14.15}$$
From Eq. (14.15), we deduce that $|G(f)|^2\, df$ is the contribution to the total energy of the signal from the frequency band $(f, f + df)$. Therefore, we say that $|G(f)|^2$ represents the energy density spectrum of the signal $\{g[n]\}$.
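As an illustration, the sketch below (the signal $g[n]$ is a hypothetical example of our choosing, not one from the text) evaluates the amplitude, phase, and energy density spectra on a dense frequency grid by sampling the DTFT with a zero-padded FFT, and verifies Eq. (14.15) numerically.

```python
# Amplitude, phase, and energy density spectra of a finite deterministic signal,
# with the DTFT sampled on a dense grid via a zero-padded FFT. The signal g[n]
# here is a hypothetical example (a truncated decaying exponential).
import numpy as np

g = 0.8 ** np.arange(5)            # example signal g[n], n = 0, ..., 4
Nfft = 1024                        # dense frequency grid over [0, 1)
f = np.arange(Nfft) / Nfft
G = np.fft.fft(g, Nfft)            # G(f_k) at f_k = k / Nfft

amplitude = np.abs(G)              # |G(f)|, the amplitude spectrum
phase = np.angle(G)                # phi(f), the phase spectrum
energy_density = amplitude ** 2    # |G(f)|^2, the energy density spectrum

# Numerical check of Eq. (14.15): the time-domain energy equals the integral of
# |G(f)|^2 over [0, 1), approximated here by the mean over the grid.
print(np.sum(np.abs(g) ** 2), energy_density.mean())
```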
When $\{g[n]\}$ is periodic with period $N$, that is,
$$g[n + N] = g[n]$$
for all $n$, where $N$ is the period of $\{g[n]\}$, we use the discrete Fourier transform (DFT) to express $\{g[n]\}$ in the frequency domain, that is,
$$G(f_k) = \sum_{n=0}^{N-1} g[n]\, e^{-j2\pi k n / N}, \quad f_k = \frac{k}{N}, \quad k \in \{0, 1, \cdots, N-1\}$$
The power of a periodic signal is distributed over the discrete frequencies $f_k = k/N$, with power density $P(f_k) = |G(f_k)|^2/N^2$. For example, for a periodic signal whose DFT coefficients all have unit magnitude, its PSD $P(f_k)$ is
$$P(f_k) = \frac{1}{N^2}, \quad f_k = \frac{k}{N}, \quad k \in \{0, 1, \cdots, N-1\} \tag{14.23}$$
Again, note that the PSD is defined for a discrete set of frequencies.
In summary, the spectra of deterministic aperiodic signals are energy densities defined on the continuous set of frequencies $C_f = [0, 1)$. On the other hand, the spectra of periodic signals are power densities defined on the discrete set of frequencies $D_f = \{0, 1/N, 2/N, \cdots, (N-1)/N\}$, where $N$ is the period of the signal.
14.2.3 Spectra of Random Processes
Suppose that we observe one realization of the random process $\{\tilde{x}[n]\}$, or $\{x[n]\}$. From the definition of the DTFT and the assumption of wide-sense stationarity of $\{\tilde{x}[n]\}$, it is obvious that we cannot use the DTFT to obtain $X(f)$ from $\{x[n]\}$, because Eq. (14.8) does not hold when we replace $g[n]$ by $x[n]$. And indeed, if $\{x[n]\}$ is a realization of a wide-sense stationary process, its energy is infinite. Its power, however, is finite, as was the case with the periodic signals. So if we observe $\{x[n]\}$ from $-N$ to $N$, denoted $\{x[n]\}_{-N}^{N}$, and assume that outside this interval the samples $x[n]$ are equal to zero, we can find its DTFT, $X_N(f)$, from
$$X_N(f) = \sum_{n=-N}^{N} x[n]\, e^{-j2\pi f n}$$
Then, according to Eq. (14.15), $|X_N(f)|^2\, df$ represents the energy of the truncated realization that is contributed by the components whose frequencies are between $f$ and $f + df$. The power due to these components is obtained by dividing this energy by the duration of the observation interval, $2N + 1$. Letting $N$ tend to infinity and averaging over the ensemble of realizations yields the power spectral density of the process,
$$P(f) = \lim_{N\to\infty} \frac{1}{2N+1}\, E\left( \left| \tilde{X}_N(f) \right|^2 \right) \tag{14.27}$$
where $\tilde{X}_N(f)$ is the DTFT of $\{\tilde{x}[n]\}_{-N}^{N}$. Clearly, $P(f)\, df$ is interpreted as the average contribution to the total power from the components of $\{\tilde{x}[n]\}$ whose frequencies are between $f$ and $f + df$.
There is a very important relationship between the PSD of a wide-sense stationary random process and its autocorrelation function. By Wold's theorem, which is the analogue of the Wiener-Khintchine theorem for continuous-time random processes, the PSD in Eq. (14.27) is the DTFT of the autocorrelation function of the process [15], that is,
$$P(f) = \sum_{k=-\infty}^{\infty} r[k]\, e^{-j2\pi f k} \tag{14.28}$$
where $r[k]$ is defined by Eq. (14.2).
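To connect Eqs. (14.5) and (14.28), the sketch below evaluates a truncated version of the sum in Eq. (14.28) for the autocorrelation of the random-phase exponential and shows that it concentrates its mass near $f_0$, as a line spectrum should (the truncation to finitely many lags is an approximation of our choosing).

```python
# Numerical illustration of Eq. (14.28): P(f) as the DTFT of r[k]. For
# r[k] = a^2 exp(j 2 pi f0 k) from Eq. (14.5), the truncated sum peaks at f0.
import numpy as np

a, f0, K = 1.0, 0.1, 256
k = np.arange(-K, K + 1)                          # lags -K, ..., K
r = a**2 * np.exp(1j * 2 * np.pi * f0 * k)        # autocorrelation sequence

f = np.arange(1024) / 1024                        # frequency grid over [0, 1)
P = (r[None, :] * np.exp(-1j * 2 * np.pi * f[:, None] * k)).sum(axis=1).real
print(f[np.argmax(P)])                            # approximately f0 = 0.1
```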
For all practical purposes, there are three different types of $P(f)$ [15]. If $P(f)$ is an absolutely continuous function of $f$, the random process has a purely continuous spectrum. If $P(f)$ is identically equal to zero for all $f$ except for frequencies $f = f_k$, $k = 1, 2, \ldots$, where it is infinite, the random process has a line spectrum. In this case, a useful representation of the spectrum is given by the Dirac $\delta$-functions,
$$P(f) = \sum_{k} P_k\, \delta(f - f_k)$$
where $P_k$ is the power associated with the $k$-th line component. Finally, the spectrum of a random process may be mixed if it is a combination of continuous and line spectra. Then $P(f)$ is a superposition of a continuous function of $f$ and $\delta$-functions.
14.3 The Problem of Power Spectrum Estimation
The problem of power spectrum estimation can be stated as follows: given a set of $N$ samples $\{x[0], x[1], \ldots, x[N-1]\}$ of a realization of the random process $\{\tilde{x}[n]\}$, denoted also by $\{x[n]\}_0^{N-1}$, estimate the PSD of the random process, $P(f)$. Obviously this task amounts to the estimation of a function, and it is distinct from the typical problem in elementary statistics where the goal is to estimate a finite set of parameters.
Spectrum estimation methods can be classified into two categories: nonparametric and parametric. The nonparametric approaches do not assume any specific parametric model for the PSD; they are based solely on the estimate of the autocorrelation sequence of the random process from the observed data. For the parametric approaches, on the other hand, we first postulate a model for the process of interest, where the model is described by a small number of parameters. Based on the model, the PSD of the process can be expressed in terms of the model parameters. Then the PSD estimate is obtained by substituting the estimated parameters of the model in the expression for the PSD. For example, if a random process $\{\tilde{x}[n]\}$ can be modeled by
$$\tilde{x}[n] = -a\, \tilde{x}[n-1] + \tilde{w}[n]$$
where $a$ is an unknown parameter and $\{\tilde{w}[n]\}$ is a zero-mean wide-sense stationary random process whose random variables are uncorrelated and have the same variance $\sigma^2$, it can be shown that the PSD of $\{\tilde{x}[n]\}$ is
$$P(f) = \frac{\sigma^2}{\left| 1 + a\, e^{-j2\pi f} \right|^2}$$
Thus, to find $P(f)$ it is sufficient to estimate $a$ and $\sigma^2$.
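A minimal sketch of this parametric recipe, assuming the first-order model above: the parameters are estimated from the data with the Yule-Walker relations and substituted into the closed-form PSD. The function name and the estimation details are illustrative, not from the text.

```python
# Parametric PSD estimate for the first-order model x[n] = -a x[n-1] + w[n]:
# estimate a and sigma^2 from the data, then substitute into the PSD formula.
import numpy as np

def ar1_psd(x, nfreq=512):
    x = np.asarray(x)
    r0 = np.mean(np.abs(x) ** 2)                   # autocorrelation estimate at lag 0
    r1 = np.mean(np.conj(x[:-1]) * x[1:])          # autocorrelation estimate at lag 1
    a_hat = -r1 / r0                               # Yule-Walker: r[1] = -a r[0]
    sigma2_hat = (r0 - np.abs(r1) ** 2 / r0).real  # from r[0] = |a|^2 r[0] + sigma^2
    f = np.arange(nfreq) / nfreq
    P = sigma2_hat / np.abs(1 + a_hat * np.exp(-1j * 2 * np.pi * f)) ** 2
    return f, P
```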
The performance of a PSD estimator is evaluated by several measures of goodness. One is the bias of the estimator, defined by
$$b(f) = E\left( \hat{P}(f) \right) - P(f)$$
where $\hat{P}(f)$ and $P(f)$ are the estimated and true PSD, respectively. If the bias $b(f)$ is identically equal to zero for all $f$, the estimator is said to be unbiased, which means that on average it yields the true PSD. Among the unbiased estimators, we search for the one that has minimal variability. The variability is measured by the variance of the estimator,
$$v(f) = E\left\{ \left( \hat{P}(f) - E\left( \hat{P}(f) \right) \right)^2 \right\}$$
The variability of a PSD estimator is also measured by the normalized variance [8],
$$\psi(f) = \frac{v(f)}{P^2(f)}$$
Finally, another important metric for comparison is the resolution of the PSD estimator. It corresponds to the ability of the estimator to provide the fine details of the PSD of the random process. For example, if the PSD of the random process has two peaks at frequencies $f_1$ and $f_2$, then the resolution of the estimator would be measured by the minimum separation of $f_1$ and $f_2$ for which the estimator still reproduces two peaks at $f_1$ and $f_2$.
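These measures can be evaluated empirically. The sketch below is a hedged example of our own: it measures $b(f)$ and $v(f)$ over many independent realizations of real white Gaussian noise, whose true PSD is flat and equal to $\sigma^2$, using the periodogram (introduced in Section 14.4.1) as the estimator under test.

```python
# Empirical bias b(f) and variance v(f) of a PSD estimator, measured over many
# independent realizations of real white Gaussian noise with P(f) = sigma^2.
import numpy as np

rng = np.random.default_rng(1)
sigma2, N, trials = 1.0, 256, 2000

est = np.empty((trials, N))
for i in range(trials):
    x = rng.normal(scale=np.sqrt(sigma2), size=N)
    est[i] = np.abs(np.fft.fft(x)) ** 2 / N        # periodogram at f_k = k/N

bias = est.mean(axis=0) - sigma2                   # b(f_k): close to 0 for all k
var = est.var(axis=0)                              # v(f_k): about sigma2^2 away from f = 0, 1/2
print(bias[1:5].round(2), var[1:5].round(2))
```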
14.4 Nonparametric Spectrum Estimation
When the method for PSD estimation is not based on any assumptions about the generation of the observed samples other than wide-sense stationarity, it is termed a nonparametric estimator. According to Eq. (14.28), $P(f)$ can be obtained by first estimating the autocorrelation sequence from the observed samples $x[0], x[1], \cdots, x[N-1]$, and then applying the DTFT to these estimates. One estimator of the autocorrelation is given by
$$\hat{r}[k] = \frac{1}{N} \sum_{n=0}^{N-1-k} x^*[n]\, x[n+k], \quad k = 0, 1, \ldots, N-1 \tag{14.36}$$
where the estimates for negative lags are obtained from $\hat{r}[-k] = \hat{r}^*[k]$, and those for $|k| \geq N$ are set equal to zero. This estimator, although biased, has been preferred over others. An important reason for favoring it is that it always yields nonnegative estimates of the PSD, which is not the case with the unbiased estimator.
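A direct implementation of Eq. (14.36) might look as follows (a sketch; the helper name autocorr_biased is ours):

```python
# Biased autocorrelation estimator of Eq. (14.36): divide by N rather than by
# N - k, and extend to negative lags via r_hat[-k] = r_hat*[k].
import numpy as np

def autocorr_biased(x):
    x = np.asarray(x)
    N = len(x)
    return np.array([np.sum(np.conj(x[:N - k]) * x[k:]) for k in range(N)]) / N

x = np.array([1.0, -0.5, 0.25, 0.1])
r = autocorr_biased(x)                              # r_hat[0], ..., r_hat[N-1]
r_full = np.concatenate([np.conj(r[:0:-1]), r])     # r_hat[-(N-1)], ..., r_hat[N-1]
```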
Many nonparametric estimators rely on using Eq. (14.36) and then transforming the obtained autocorrelation sequence to estimate the PSD. Other nonparametric methods, however, operate directly on the observed data.
14.4.1 Periodogram
The periodogram was introduced by Schuster in 1898 when he was searching for hidden periodicities while studying sunspot data [19]. To find the periodogram of the data $\{x[n]\}_0^{N-1}$, first we determine the autocorrelation sequence $\hat{r}[k]$ for $-(N-1) \leq k \leq N-1$ and then take the DTFT, i.e.,
$$\hat{P}_{\mathrm{PER}}(f) = \sum_{k=-(N-1)}^{N-1} \hat{r}[k]\, e^{-j2\pi f k} = \frac{1}{N} \left| \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi f n} \right|^2$$
Thus, the periodogram is proportional to the squared magnitude of the DTFT of the observed data. In practice, the periodogram is calculated by applying the FFT, which computes it at the discrete set of frequencies $D_f = \{f_k : f_k = k/N,\ k = 0, 1, 2, \cdots, N-1\}$. The periodogram is then expressed as
$$\hat{P}_{\mathrm{PER}}(f_k) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi k n / N} \right|^2, \quad f_k \in D_f$$
If the periodogram is desired on a finer grid of frequencies $D'_f = \{f_k : f_k = k/N',\ k = 0, 1, \cdots, N'-1\}$, where $N' > N$, the observed data are padded with $N' - N$ zeros, and
$$\hat{P}_{\mathrm{PER}}(f_k) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi k n / N'} \right|^2, \quad f_k \in D'_f \tag{14.42}$$
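In code, Eq. (14.42) is a single call to the FFT; a sketch with optional zero-padding (the function name is ours):

```python
# Periodogram via the FFT, Eq. (14.42). Passing nfft > N evaluates it on the
# finer grid D'_f by zero-padding; the normalization remains 1/N in either case.
import numpy as np

def periodogram(x, nfft=None):
    x = np.asarray(x)
    N = len(x)
    nfft = nfft or N
    X = np.fft.fft(x, nfft)             # np.fft.fft zero-pads x when nfft > N
    return np.arange(nfft) / nfft, np.abs(X) ** 2 / N
```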
A general property of good estimators is that they yield better estimates when the number of observed data samples increases. Theoretically, if the number of data samples tends to infinity, the estimates should converge to the true values of the estimated parameters. So, in the case of a PSD estimator, as we get more and more data samples, it is desirable that the estimated PSD tend to the true value of the PSD. In other words, if for a finite number of data samples the estimator is biased, the bias should tend to zero as $N \to \infty$, as should the variance of the estimate. If this is indeed the case, the estimator is called consistent. Although the periodogram is asymptotically unbiased, it can be shown that it is not a consistent estimator. For example, if $\{\tilde{x}[n]\}$ is real zero-mean white Gaussian noise, which is a process whose random variables are independent, Gaussian, and identically distributed with variance $\sigma^2$, the variance of $\hat{P}_{\mathrm{PER}}(f)$ is equal to $\sigma^4$ regardless of the length $N$ of the observed data sequence [12]. The performance of the periodogram does not improve as $N$ gets larger because as $N$ increases, so does the number of parameters that are estimated, $P(f_0), P(f_1), \ldots, P(f_{N-1})$. In general, for the variance of the periodogram, we can write [12]
$$\mathrm{var}\left( \hat{P}_{\mathrm{PER}}(f) \right) \simeq P^2(f)$$
where $P(f)$ is the true PSD.
Interesting insight can be gained if one writes the periodogram as follows:
$$\hat{P}_{\mathrm{PER}}(f) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi f n} \right|^2 = \frac{1}{N} \left| \sum_{n=-\infty}^{\infty} w_R[n]\, x[n]\, e^{-j2\pi f n} \right|^2 \tag{14.44}$$
where $w_R[n]$ is the rectangular window, equal to one for $0 \leq n \leq N-1$ and zero otherwise. From Eq. (14.44), the mean of the periodogram is found to be
$$E\left\{ \hat{P}_{\mathrm{PER}}(f) \right\} = \frac{1}{N} \int_{0}^{1} P(\alpha)\, \left| W_R(f - \alpha) \right|^2\, d\alpha$$
where $W_R(f)$ is the DTFT of the rectangular window. Hence, the mean value of the periodogram is a smeared version of the true PSD. Since the implementation of the periodogram as defined in Eq. (14.44) implies the use of a rectangular window, a question arises as to whether we could use a window of a different shape to reduce the variance of the periodogram. The answer is yes, and indeed many windows have been proposed which weight the data samples in the middle of the observed data more than those towards the ends of the observed data. Some frequently used alternatives to the rectangular window are the windows of Bartlett, Hanning, Hamming, and Blackman. The magnitude of the DTFT of a window provides two important characteristics about it: one is the width of the window's mainlobe, and the other is the strength of its sidelobes. A narrow mainlobe allows for better resolution, and low sidelobes improve the smoothing of the estimated spectrum. Unfortunately, the narrower the mainlobe, the higher the sidelobes, which is a typical trade-off in spectrum estimation. It turns out that the rectangular window allows for the best resolution but has the largest sidelobes.
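The mainlobe/sidelobe trade-off is easy to quantify numerically. The sketch below, using the window functions available in NumPy, locates the edge of the mainlobe and the peak sidelobe level for the windows just mentioned (the grid sizes are arbitrary choices).

```python
# Mainlobe width (location of the first minimum) and peak sidelobe level for
# several common data windows, from the magnitude of their zero-padded DTFTs.
import numpy as np

N, nfft = 64, 4096
windows = {
    "rectangular": np.ones(N),
    "Bartlett": np.bartlett(N),
    "Hanning": np.hanning(N),
    "Hamming": np.hamming(N),
    "Blackman": np.blackman(N),
}
for name, w in windows.items():
    W_db = 20 * np.log10(np.abs(np.fft.fft(w, nfft)) / np.sum(w) + 1e-12)
    k = 1
    while k + 1 < nfft // 2 and W_db[k + 1] < W_db[k]:
        k += 1                                   # walk down to the first local minimum
    sidelobe = W_db[k:nfft // 2].max()           # strongest sidelobe, in dB below the peak
    print(f"{name:12s} mainlobe edge at f = {k / nfft:.4f}, peak sidelobe = {sidelobe:6.1f} dB")
```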
14.4.2 The Bartlett Method
One approach to reduce the variance of the periodogram is to subdivide the observed data record into $K$ nonoverlapping segments, find the periodogram of each segment, and finally evaluate the average of the so-obtained periodograms. This spectrum estimator, also known as Bartlett's estimator, has a variance that is smaller than the variance of the periodogram.
Suppose that the number of data samples $N$ is equal to $KL$, where $K$ is the number of segments and $L$ is their length. If the $i$-th segment is denoted by $\{x_i[n]\}_0^{L-1}$, $i = 1, 2, \cdots, K$, where
$$x_i[n] = x[n + (i-1)L], \quad n \in \{0, 1, \cdots, L-1\} \tag{14.47}$$
and its periodogram by
$$\hat{P}_{\mathrm{PER}}^{(i)}(f) = \frac{1}{L} \left| \sum_{n=0}^{L-1} x_i[n]\, e^{-j2\pi f n} \right|^2 \tag{14.48}$$
then the Bartlett spectrum estimator is
$$\hat{P}_{\mathrm{B}}(f) = \frac{1}{K} \sum_{i=1}^{K} \hat{P}_{\mathrm{PER}}^{(i)}(f)$$
The variance of this estimator decreases by a factor of $K$ relative to the periodogram. Since the segments are $K$ times shorter than the whole data record, however, the estimator has a resolution $K$ times less than that of the periodogram. Thus, this estimator allows for a straightforward trading of resolution for variance.
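A compact implementation of the Bartlett estimator (a sketch; the names are ours):

```python
# Bartlett estimator: average the periodograms of K nonoverlapping length-L
# segments; variance drops roughly by K, resolution coarsens by the same factor.
import numpy as np

def bartlett_psd(x, L):
    x = np.asarray(x)
    K = len(x) // L                                     # number of complete segments
    segments = x[:K * L].reshape(K, L)                  # K x L array, Eq. (14.47)
    P = np.abs(np.fft.fft(segments, axis=1)) ** 2 / L   # per-segment periodograms
    return np.arange(L) / L, P.mean(axis=0)             # f_k = k/L and the estimate
```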
14.4.3 The Welch Method
The Welch method is another estimator that exploits the periodogram. It is based on the same idea as Bartlett's approach of splitting the data into segments and finding the average of their periodograms. The difference is that the segments are overlapped, where the overlaps are usually 50% or 75% large, and the data within a segment are windowed. Let the length of the segments be $L$, the $i$-th segment again be denoted by $\{x_i[n]\}_0^{L-1}$, and the offset of successive segments be $D$ samples. Then
$$N = L + (K - 1)D$$
where $N$ is the total number of observed samples and $K$ the total number of segments. Note that if there is no overlap, $K = N/L$, and if there is 50% overlap, $K = 2N/L - 1$. The $i$-th sequence is defined by
$$x_i[n] = x[n + (i-1)D], \quad n \in \{0, 1, \cdots, L-1\} \tag{14.51}$$
where $i = 1, 2, \cdots, K$, and its periodogram by
$$\hat{P}_{\mathrm{M}}^{(i)}(f) = \frac{1}{L} \left| \sum_{n=0}^{L-1} w[n]\, x_i[n]\, e^{-j2\pi f n} \right|^2$$
Here $\hat{P}_{\mathrm{M}}^{(i)}(f)$ is the modified periodogram of the data, because the samples $x[n]$ are weighted by a nonrectangular window $w[n]$. The Welch spectrum estimate is then given by
$$\hat{P}_{\mathrm{W}}(f) = \frac{1}{K} \sum_{i=1}^{K} \hat{P}_{\mathrm{M}}^{(i)}(f)$$
With the Welch method we can trade off variance and resolution in many more ways than with the Bartlett method. It can be shown that if the overlap is 50%, the variance of the Welch estimator is approximately 9/16 of the variance of the Bartlett estimator [8].
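A sketch of the Welch recipe follows (the names are ours). Library routines such as scipy.signal.welch implement the same idea but typically also normalize by the window power, so absolute scales may differ by a constant factor.

```python
# Welch estimator: average modified (windowed) periodograms of K segments of
# length L offset by D samples (D = L/2 gives 50% overlap).
import numpy as np

def welch_psd(x, L, D, window=np.hanning):
    x = np.asarray(x)
    w = window(L)
    K = (len(x) - L) // D + 1                      # number of segments that fit
    P = np.zeros(L)
    for i in range(K):
        segment = w * x[i * D:i * D + L]           # windowed i-th segment, Eq. (14.51)
        P += np.abs(np.fft.fft(segment)) ** 2 / L  # modified periodogram
    return np.arange(L) / L, P / K                 # average over the K segments
```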
14.4.4 The Blackman-Tukey Method

The autocorrelation estimates of Eq. (14.36) become increasingly unreliable as the lag $k$ approaches $N$, because they are computed from fewer and fewer terms. For example, $\hat{r}[N-1]$ has only the term $x^*[0]\, x[N-1]$, compared to the $N$ terms used in the computation of $\hat{r}[0]$. Therefore, the large variance of the periodogram can be ascribed to the large weight given to the poor autocorrelation estimates used in its evaluation.
Blackman and Tukey proposed to weight the autocorrelation sequence so that the autocorrelations with higher lags are weighted less [3]. Their estimator is given by
$$\hat{P}_{\mathrm{BT}}(f) = \sum_{k=-(M-1)}^{M-1} w[k]\, \hat{r}[k]\, e^{-j2\pi f k}$$
where $w[k]$ is a lag window with $w[0] = 1$, $w[k] = 0$ for $|k| \geq M$, and $M < N$.
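A sketch of the Blackman-Tukey estimator, reusing the autocorr_biased helper defined earlier in this section (the Bartlett lag window and grid size are arbitrary choices):

```python
# Blackman-Tukey estimator: apply a lag window of length 2M - 1 (M < N) to the
# biased autocorrelation estimates and take the DTFT on a frequency grid.
import numpy as np

def blackman_tukey_psd(x, M, nfft=1024, lag_window=np.bartlett):
    r = autocorr_biased(x)                                   # defined earlier in this section
    lags = np.concatenate([np.conj(r[M - 1:0:-1]), r[:M]])   # r_hat[-(M-1)], ..., r_hat[M-1]
    rw = lag_window(2 * M - 1) * lags                        # windowed autocorrelations
    k = np.arange(-(M - 1), M)
    f = np.arange(nfft) / nfft
    P = (rw[None, :] * np.exp(-1j * 2 * np.pi * f[:, None] * k)).sum(axis=1).real
    return f, P
```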