


Stoica, P.; Viberg, M.; Wong, M. & Wu, Q. "A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields." In Digital Signal Processing Handbook, ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999.


A Unified Instrumental Variable Approach to Direction Finding in Colored Noise Fields

P. Stoica, Uppsala University
M. Viberg, Chalmers University of Technology
M. Wong, McMaster University
Q. Wu, CELWAVE

64.1 Introduction
64.2 Problem Formulation
64.3 The IV-SSF Approach
64.4 The Optimal IV-SSF Method
64.5 Algorithm Summary
64.6 Numerical Examples
64.7 Concluding Remarks
References
Appendix A: Introduction to IV Methods

The main goal herein is to describe and analyze, in a unifying manner, the spatial and temporal IV-SSF approaches recently proposed for array signal processing in colored noise fields (the acronym IV-SSF stands for "Instrumental Variable - Signal Subspace Fitting"). Despite the generality of the approach taken herein, our analysis technique is simpler than those used in previous, more specialized publications. We derive a general, optimally weighted (optimal, for short) IV-SSF direction estimator and show that this estimator encompasses the UNCLE estimator of Wong and Wu, which is a spatial IV-SSF method, and the temporal IV-SSF estimator of Viberg, Stoica and Ottersten. The latter two estimators have seemingly different forms (among others, the first of them makes use of four weights, whereas the second one uses three weights "only"), and hence their asymptotic equivalence shown in this paper comes as a surprising unifying result. We hope that the present paper, along with the original works aforementioned, will stimulate interest in the IV-SSF approach to array signal processing, which is sufficiently flexible to handle colored noise fields, coherent signals, and indeed also situations where only some of the sensors in the array are calibrated.

1 This work was supported in part by the Swedish Research Council for Engineering Sciences (TFR).


64.1 Introduction

Most parametric methods for Direction-Of-Arrival (DOA) estimation require knowledge of the spatial (sensor-to-sensor) color of the background noise. If this information is unavailable, a serious degradation of the quality of the estimates can result, particularly at low Signal-to-Noise Ratio (SNR) [1,2,3]. A number of methods have been proposed over the recent years to alleviate the sensitivity to the noise color. If a parametric model of the covariance matrix of the noise is available, the parameters of the noise model can be estimated along with those of the signals of interest [4,5,6,7]. Such an approach is expected to perform well in situations where the noise can be accurately modeled with relatively few parameters. An alternative approach, which does not require a precise model of the noise, is based on the principle of Instrumental Variables (IV). See [8,9] for thorough treatments of IV methods (IVM) in the context of identification of linear time-invariant dynamical systems. A brief introduction is given in the appendix of this chapter. Computationally simple IVMs for array signal processing appeared in [10,11]. These methods perform poorly in difficult scenarios involving closely spaced DOAs and correlated signals.

More recently, the combined Instrumental Variable Signal Subspace Fitting (IV-SSF) technique has been proposed as a promising alternative for array signal processing in spatially colored noise fields [12,13,14,15]. The IV-SSF approach has a number of appealing advantages over other DOA estimation methods. These advantages include:

• IV-SSF can handle noises with arbitrary spatial correlation, under minor restrictions on the signals or the array. In addition, estimation of a noise model is avoided, which leads to statistical robustness and computational simplicity.

• The IV-SSF approach is applicable to both non-coherent and coherent signal scenarios.

• The spatial IV-SSF technique can make use of the information contained in the output of a completely uncalibrated subarray under certain weak conditions, which other methods cannot.

Depending on the type of "instrumental variables" used, two classes of IV methods have appeared in the literature:

1. Spatial IVM, for which the instrumental variables are derived from the output of a (possibly uncalibrated) subarray whose noise is uncorrelated with the noise in the main calibrated subarray under consideration (see [12,13]).

2. Temporal IVM, which obtains instrumental variables from delayed versions of the array output, under the assumption that the temporal-correlation length of the noise field is shorter than that of the signals (see [11,14]).

The previous literature on IV-SSF has treated and analyzed the above two classes of spatial and temporal methods separately, ignoring their common basis. In this contribution, we reveal the common roots of these two classes of DOA estimation methods and study them under the same umbrella. Additionally, we establish the statistical properties of a general (either spatial or temporal) weighted IV-SSF method and present the optimal weights that minimize the variance of the DOA estimation errors. In particular, we point out that the optimal four-weight spatial IV-SSF of [12,13] (called UNCLE there, and arrived at by using canonical correlation decomposition ideas) and the optimal three-weight temporal IV-SSF of [14] are asymptotically equivalent when used under the same conditions. This asymptotic equivalence property, which is a main result of the present section, is believed to be important as it shows the close ties that exist between two seemingly different DOA estimators.

This section is organized as follows. In Section 64.2 the data model and technical assumptions are introduced. Next, in Section 64.3 the IV-SSF method is presented in a fairly general setting. In Section 64.4, the statistical performance of the method is presented along with the optimal choices of certain user-specified quantities. The data requirements and the optimal IV-SSF (UNCLE) algorithm are summarized in Section 64.5. The anxious reader may wish to jump directly to this point to investigate the usefulness of the algorithm in a specific application. In Section 64.6, some numerical examples and computer simulations are presented to illustrate the performance. The conclusions are given in Section 64.7. In the appendix we give a brief introduction to IV methods. The reader who is not familiar with IV might be helped by reading the appendix before the rest of the paper. Background material on the subspace-based approach to DOA estimation can be found in Chapter 62 of this Handbook.

64.2 Problem Formulation

Consider a scenario in which n narrowband plane waves, generated by point sources, impinge on an array comprising m calibrated sensors. Assume, for simplicity, that the n sources and the array are situated in the same plane. Let a(θ) denote the complex array response to a unit-amplitude signal with DOA parameter equal to θ. Under these assumptions, the output of the array, y(t) ∈ C^{m×1}, can be described by the following well-known equation [16,17]:

y(t) = A x(t) + e(t)   (64.1)

where x(t) ∈ C^{n×1} denotes the signal vector, e(t) ∈ C^{m×1} is a noise term, and

A = [a(θ_1) · · · a(θ_n)]   (64.2)

Hereafter, θ_k denotes the kth DOA parameter.
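To make the data model (64.1) concrete, the following sketch simulates it for a uniform linear array with half-wavelength spacing and spatially colored, temporally white noise. The array geometry, the DOA values, the AR(1)-type signal and noise correlations, and all dimensions are illustrative assumptions, not prescribed by this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def ula_response(theta, m):
    """Steering vector a(theta) of an m-element half-wavelength ULA (illustrative choice)."""
    return np.exp(1j * np.pi * np.arange(m) * np.sin(theta))

m, n, N = 8, 2, 200                               # sensors, sources, snapshots
theta = np.deg2rad([-5.0, 10.0])                  # true DOAs (assumed values)
A = np.column_stack([ula_response(t, m) for t in theta])    # eq (64.2)

# zero-mean circularly symmetric Gaussian signals x(t), given some temporal
# correlation (AR(1) with coefficient 0.9) so that temporal IVs are informative
W = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
X = np.empty_like(W)
X[:, 0] = W[:, 0]
for t in range(1, N):
    X[:, t] = 0.9 * X[:, t - 1] + np.sqrt(1 - 0.9**2) * W[:, t]

# spatially colored, temporally white noise e(t) with covariance Q = L L^T
L = np.linalg.cholesky(0.5 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m))))
E = L @ (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)

Y = A @ X + E                                     # eq (64.1): y(t) = A x(t) + e(t)
```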

The following assumptions on the quantities in the array equation, (64.1), are considered to hold throughout this section:

A1 The signal vector x(t) is a normally distributed random variable with zero mean and a possibly singular covariance matrix. The signals may be temporally correlated; in fact, the temporal IV-SSF approach relies on the assumption that the signals exhibit some form of temporal correlation (see below for details).

A2 The noise e(t) is a random vector that is temporally white, uncorrelated with the signals, and circularly symmetric normally distributed with zero mean and unknown covariance matrix² Q > O,

E[e(t)e*(s)] = Q δ_{t,s};   E[e(t)e^T(s)] = O   (64.3)

A3 The manifold vectors {a(θ)}, corresponding to any set of m different values of θ, are linearly independent.

Note that assumption A1 above allows for coherent signals, and that in A2 the noise field is allowed to be arbitrarily spatially correlated with an unknown covariance matrix. Assumption A3 is a well-known condition that, under a weak restriction on m, guarantees DOA parameter identifiability in the case where Q is known (to within a multiplicative constant) [18]. When Q is completely unknown, DOA identifiability can only be achieved if further assumptions are made on the scenario under consideration. The following assumption is typical of the IV-SSF approach:

2 Henceforth, the superscript "∗" denotes the conjugate transpose, whereas the transpose is designated by a superscript "T". The notation A ≥ B, for two Hermitian matrices A and B, is used to mean that (A − B) is a nonnegative definite matrix. Also, O denotes a zero matrix of suitable dimension.


A4 There exists a vector z(t) ∈ C^{m̄×1}, which is normally distributed and satisfies

E[z(t)e*(s)] = O for t ≤ s   (64.4)
E[z(t)e^T(s)] = O for all t, s   (64.5)

Furthermore, denote

Γ = E[z(t)x*(t)]   (64.6)
n̄ = rank(Γ) ≤ m̄   (64.7)

It is assumed that no row of Γ is identically zero and that the inequality

n̄ > 2n − m   (64.8)

holds (note that a rank-one Γ matrix can satisfy the condition (64.8) if m is large enough, and hence the condition in question is rather weak). Owing to its (partial) uncorrelatedness with {e(t)}, the vector {z(t)} can be used to eliminate the noise from the array output equation (64.1), and for this reason {z(t)} is called an IV vector. Below, we briefly describe three possible ways to derive an IV vector from the available data measured with an array of sensors (for more details on this aspect, the reader should consult [12,13,14]).

EXAMPLE 64.1: Spatial IV

Assume that the n signals, which impinge on the main (sub)array under consideration, are also received by another (sub)array that is sufficiently distanced from the main one so that the noise vectors in the two subarrays are uncorrelated with one another. Then z(t) can be made from the outputs of the sensors in the second subarray (note that those sensors need not be calibrated) [12,13,15].

EXAMPLE 64.2: Temporal IV

When a second subarray, as described above, is not available but the signals are temporally correlated, one can obtain an IV vector by delaying the output vector: z(t) = [y^T(t−1) y^T(t−2) · · ·]^T. Clearly, such a vector z(t) satisfies (64.4) and (64.5), and it also satisfies (64.8) under weak conditions on the signal temporal correlation. This construction of an IV vector can be readily extended to cases where e(t) is temporally correlated, provided that the signal temporal correlation length is longer than that corresponding to the noise [11,14].
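A minimal sketch of the delayed-output construction in this example, assuming a snapshot matrix Y such as the one simulated earlier; the set of delays is an illustrative choice.

```python
import numpy as np

def temporal_iv(Y, delays=(1, 2)):
    """Build z(t) = [y^T(t-1) y^T(t-2) ...]^T from delayed array outputs.

    Y is the (m, N) snapshot matrix; the function returns Z together with the
    correspondingly trimmed Y, so that column j of Z and of the trimmed Y refer
    to the same time instant t.
    """
    m, N = Y.shape
    d = max(delays)
    Z = np.vstack([Y[:, d - k: N - k] for k in delays])   # rows: stacked y(t - k)
    return Z, Y[:, d:]

# usage: Z, Y_aligned = temporal_iv(Y, delays=(1, 2))
```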

In a sense, the above examples are both special cases of the following more general situation:

EXAMPLE 64.3: Reference Signal

In many systems a reference or pilot signal [19,20] z(t) (scalar or vector) is available. If the reference signal is sufficiently correlated with all signals of interest (in the sense of (64.8)) and uncorrelated with the noise, it can be used as an IV. Note that all signals that are not correlated with the reference will be treated as noise. Reference signals are commonly available in communication applications, for example a PN-code in spread spectrum communication [20] or a training signal used for synchronization and/or equalizer training [21]. A closely related possibility is the utilization of cyclo-stationarity (or self-coherence), a property exhibited by many man-made signals. The reference signal(s) can then consist, for example, of sinusoids of different frequencies [22,23]. In these techniques, the data is usually pre-processed by computing the auto-covariance function (or a higher-order statistic) before correlating with the reference signal.


The problem considered in this section concerns the estimation of the DOA vector

θ = [θ_1 · · · θ_n]^T   (64.9)

given N snapshots of the array output and of the IV vector, {y(t), z(t)}_{t=1}^N. The number of signals, n, and the rank of the covariance matrix Γ, n̄, are assumed to be given (for the estimation of these integer-valued parameters by means of IV/SSF-based methods, we refer to [24,25]).

64.3 The IV-SSF Approach

Let

R̂ = Ŵ_L [ (1/N) Σ_{t=1}^{N} z(t) y*(t) ] Ŵ_R    (m̄ × m)   (64.10)

where Ŵ_L and Ŵ_R are two nonsingular Hermitian weighting matrices, which are possibly data-dependent (as indicated by the fact that they are roofed). Under the assumptions made, as N → ∞, R̂ converges to the matrix

R = W_L Γ A* W_R   (64.11)

where W_L and W_R are the limiting weighting matrices (assumed to be bounded and nonsingular). Owing to assumptions A2 and A3,

rank(R) = n̄   (64.12)

Hence, the Singular Value Decomposition (SVD) [26] of R can be written as

R = [U  ?] [ Λ  O ; O  O ] [S  ?]* = U Λ S*   (64.13)

where U*U = S*S = I, Λ ∈ R^{n̄×n̄} is diagonal and nonsingular, and where the question marks stand for blocks that are of no importance for the present discussion.

The following key equality is obtained by comparing the two expressions for R in Eqs. (64.11) and (64.13) above:

S = W_R A C   (64.14)

where C = Γ* W_L U Λ^{-1} ∈ C^{n×n̄} has full column rank. For a given S, the true DOA vector can be obtained as the unique solution to Eq. (64.14) under the parameter identifiability condition (64.8) (see, e.g., [18]). In the more realistic case when S is unknown, one can make use of Eq. (64.14) to estimate the DOA vector in the following steps.

The IV step — Compute the pre- and post-weighted sample covariance matrix R̂ in Eq. (64.10), along with its SVD:

R̂ = [Û  ?] [ Λ̂  O ; O  ? ] [Ŝ  ?]*   (64.15)

where Λ̂ contains the n̄ largest singular values. Note that Û, Λ̂, and Ŝ are consistent estimates of U, Λ, and S in the SVD of R.
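A minimal sketch of the IV step, assuming time-aligned snapshot matrices Z and Y (for instance from the sketches above) and user-supplied weighting matrices; the function name and interface are illustrative, not part of the chapter.

```python
import numpy as np

def iv_step(Z, Y, W_L, W_R, n_bar):
    """IV step: weighted sample covariance of Eq. (64.10) and its truncated SVD, Eq. (64.15)."""
    N = Y.shape[1]
    R_hat = W_L @ (Z @ Y.conj().T / N) @ W_R           # eq (64.10)
    U, s, Vh = np.linalg.svd(R_hat)
    U_hat = U[:, :n_bar]                               # n_bar principal left singular vectors
    Lam_hat = np.diag(s[:n_bar])                       # n_bar largest singular values
    S_hat = Vh[:n_bar, :].conj().T                     # n_bar principal right singular vectors
    return U_hat, Lam_hat, S_hat

# e.g. with trivial weights:
# U_hat, Lam_hat, S_hat = iv_step(Z, Y, np.eye(Z.shape[0]), np.eye(Y.shape[0]), n_bar=2)
```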


The SSF step — Compute the DOA estimate as the minimizing argument of the following signal subspace fitting criterion:

min_θ { min_C [vec(Ŝ − Ŵ_R A C)]* V̂ [vec(Ŝ − Ŵ_R A C)] }   (64.16)

where V̂ is a positive definite weighting matrix, and "vec" is the vectorization operator³. Alternatively, one can estimate the DOA instead by minimizing the following criterion:

min_θ { [vec(B* Ŵ_R^{-1} Ŝ)]* Ŵ [vec(B* Ŵ_R^{-1} Ŝ)] }   (64.17)

where Ŵ is a positive definite weight, and B ∈ C^{m×(m−n)} is a matrix whose columns form a basis of the null-space of A* (hence, B*A = O and rank(B) = m − n). The alternative fitting criterion above is obtained from the simple observation that Eq. (64.14) along with the definition of B imply that

B* W_R^{-1} S = O   (64.18)

It can be shown [27] that the classes of DOA estimates derived from Eqs. (64.16) and (64.17), respectively, are asymptotically equivalent. More exactly, for any V̂ in Eq. (64.16) one can choose Ŵ in Eq. (64.17) so that the DOA estimates obtained by minimizing Eq. (64.16) and, respectively, Eq. (64.17) have the same asymptotic distribution, and vice versa.

In view of the previous result, in an asymptotic analysis it suffices to consider only one of the two criteria above. In the following, we focus on Eq. (64.17). Compared with Eq. (64.16), the criterion (64.17) has the advantage that it depends on the DOA only. On the other hand, for a general array there is no known closed-form parameterization of B in terms of θ. However, as shown in the following, this is no drawback because the optimally weighted criterion (which is the one to be used in applications) is an explicit function of θ.

64.4 The Optimal IV-SSF Method

In what follows, we deal with the essential problem of choosing the weights Ŵ, Ŵ_R, and Ŵ_L in the IV-SSF criterion (64.17) so as to maximize the DOA estimation accuracy. First, we optimize the accuracy with respect to Ŵ, and then with respect to Ŵ_R and Ŵ_L.

Optimal Selection of Ŵ

Define

g(θ) = vec(B* Ŵ_R^{-1} Ŝ)   (64.19)

and observe that the criterion function in Eq. (64.17) can be written as

g*(θ) Ŵ g(θ)   (64.20)

In [27] it is shown that g(θ) (evaluated at the true DOA vector) has, asymptotically in N, a circularly symmetric normal distribution with zero mean and the following covariance:

G(θ) = (1/N) [(W_L U Λ^{-1})* R_z (W_L U Λ^{-1})]^T ⊗ [B* R_y B]   (64.21)

3 If x_k is the kth column of a matrix X, then vec(X) = [x_1^T x_2^T · · ·]^T.


where ⊗ denotes the Kronecker matrix product [28]; and where, for a stationary signal s(t), we use the notation

R_s = E[s(t)s*(t)]   (64.22)

Then, it follows from the ABC (Asymptotically Best Consistent) theory of parameter estimation⁴ that the minimum variance estimate, in the class of estimates under discussion, is given by the minimizing argument of the criterion in Eq. (64.20) with Ŵ = Ĝ^{-1}(θ), that is

f(θ) = g*(θ) Ĝ^{-1}(θ) g(θ)   (64.23)

where

Ĝ(θ) = (1/N) [(Ŵ_L Û Λ̂^{-1})* R̂_z (Ŵ_L Û Λ̂^{-1})]^T ⊗ [B* R̂_y B]   (64.24)

and where R̂_z and R̂_y are the usual sample estimates of R_z and R_y. Furthermore, it is easily shown that the minimum variance estimate, obtained by minimizing Eq. (64.23), is asymptotically normally distributed with mean equal to the true parameter vector and the following covariance matrix:

H = {2 Re[J* G^{-1}(θ) J]}^{-1}   (64.25)

where

J = lim_{N→∞} ∂g(θ)/∂θ^T   (64.26)

The following more explicit formula for H is derived in [27]:

H = {2N Re[(D* R_y^{-1/2} Π⊥_{R_y^{-1/2} A} R_y^{-1/2} D) ⊙ Ψ^T]}^{-1}   (64.27)

where ⊙ denotes the Hadamard-Schur matrix product (elementwise multiplication) and

Ψ = C [(W_L U Λ^{-1})* R_z (W_L U Λ^{-1})]^{-1} C*   (64.28)

Furthermore, the notation Y^{-1/2} is used for a Hermitian (for notational convenience) square root of the inverse of a positive definite matrix Y, the matrix D is made from the direction vector derivatives,

D = [d_1 · · · d_n];   d_k = ∂a(θ_k)/∂θ_k

and, for a full column-rank matrix X, Π⊥_X defines the orthogonal projection onto the nullspace of X* as

Π⊥_X = I − Π_X;   Π_X = X(X*X)^{-1}X*   (64.29)

To summarize, for fixed Ŵ_R and Ŵ_L, the statistically optimal selection of Ŵ leads to DOA estimates with an asymptotic normal distribution with mean equal to the true DOA vector and covariance matrix given by Eq. (64.27).

4 For details on the ABC theory, which is an extension of the classical BLUE (Best Linear Unbiased Estimation)/Markov theory of linear regression to a class of nonlinear regressions with asymptotically vanishing residuals, the reader is referred to [9,29].


Optimal Selection of Ŵ_R and Ŵ_L

The optimal weights Ŵ_R and Ŵ_L are, by definition, those that minimize the limiting covariance matrix H of the DOA estimation errors. In the expression (64.27) of H, only Ψ depends on W_R and W_L (the dependence on W_R is implicit, via U). Since the matrix Γ has rank n̄, it can be factorized as follows:

Γ = Γ_1 Γ_2*   (64.30)

where both Γ_1 ∈ C^{m̄×n̄} and Γ_2 ∈ C^{n×n̄} have full column rank. Insertion of Eq. (64.30) into the equality W_L Γ A* W_R = U Λ S* yields the following equation, after a simple manipulation,

U = W_L Γ_1 T   (64.31)

where T = Γ_2* A* W_R S Λ^{-1} ∈ C^{n̄×n̄} is a nonsingular transformation matrix. By using Eq. (64.31) in Eq. (64.28), we obtain:

Ψ = Γ_2 (Γ_1* W_L^2 Γ_1)(Γ_1* W_L^2 R_z W_L^2 Γ_1)^{-1}(Γ_1* W_L^2 Γ_1) Γ_2*   (64.32)

Observe that Ψ does not actually depend on W_R. Hence, Ŵ_R can be arbitrarily selected, as any nonsingular Hermitian matrix, without affecting the asymptotics of the DOA parameter estimates!

Concerning the choice of Ŵ_L, it is easily verified that

Ψ ≤ Ψ|_{W_L = R_z^{-1/2}} = Γ_2 (Γ_1* R_z^{-1} Γ_1) Γ_2* = Γ* R_z^{-1} Γ   (64.33)

Indeed,

Γ* R_z^{-1} Γ − Ψ = Γ_2 [Γ_1* R_z^{-1} Γ_1 − (Γ_1* W_L^2 Γ_1)(Γ_1* W_L^2 R_z W_L^2 Γ_1)^{-1}(Γ_1* W_L^2 Γ_1)] Γ_2* = Γ* R_z^{-1/2} Π⊥_{R_z^{1/2} W_L^2 Γ_1} R_z^{-1/2} Γ   (64.34)

which is obviously a nonnegative definite matrix. Hence, W_L = R_z^{-1/2} maximizes Ψ. Then, it follows from the expression of the matrix H and the properties of the Hadamard-Schur product that this same choice of W_L minimizes H. The conclusion is that the optimal weight Ŵ_L, which yields the best limiting accuracy, is

Ŵ_L = R̂_z^{-1/2}   (64.35)

The (minimum) covariance matrix H_o, corresponding to the above choice, is given by

H_o = {2N Re[(D* R_y^{-1/2} Π⊥_{R_y^{-1/2} A} R_y^{-1/2} D) ⊙ (Γ* R_z^{-1} Γ)^T]}^{-1}   (64.36)

Remark: It is worth noting that H_o monotonically decreases as m̄ (the dimension of z(t)) increases. The proof of this claim is similar to the proof of the corresponding result in [9], Complement C8.5. Hence, as could be intuitively expected, one should use all available instruments (spatial and/or temporal) to obtain maximal theoretical accuracy. However, practice has shown that too large a dimension of the IV vector may in fact decrease the empirically observed accuracy. This phenomenon can be explained by the fact that increasing m̄ means that a longer data set is necessary for the asymptotic results to be valid.
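The optimal weight in Eq. (64.35), and the choice in Eq. (64.41) below, call for Hermitian square roots of inverted sample covariances (the Y^{-1/2} notation introduced after Eq. (64.27)). One common way to compute such a factor, sketched here under the assumption of a well-conditioned sample covariance, is via the eigendecomposition; this is an implementation choice rather than something prescribed by the chapter.

```python
import numpy as np

def herm_inv_sqrt(R):
    """Hermitian square root of the inverse of a positive definite matrix R (Y^{-1/2} notation)."""
    w, V = np.linalg.eigh(R)                  # R = V diag(w) V*, with w > 0 for positive definite R
    return (V * (1.0 / np.sqrt(w))) @ V.conj().T

# e.g., the optimal left weight of Eq. (64.35): W_L = herm_inv_sqrt(Z @ Z.conj().T / Z.shape[1])
```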

Optimal IV-SSF Criteria

Fortunately, the criterion (64.23)-(64.24) can be expressed in a functional form that depends on the indeterminate θ in an explicit way (recall that, for most cases, the dependence of B in Eq. (64.23) on θ is not available in explicit form). By using the following readily verified equality [28],

[vec(X)]* (A^T ⊗ B) [vec(Y)] = tr(A X* B Y)   (64.37)

which holds for any conformable matrices A, X, B, and Y, one can write Eq. (64.23) as:⁵

f(θ) = tr{[(Ŵ_L Û Λ̂^{-1})* R̂_z (Ŵ_L Û Λ̂^{-1})]^{-1} Ŝ* Ŵ_R^{-1} B (B* R̂_y B)^{-1} B* Ŵ_R^{-1} Ŝ}   (64.38)

However, observe that

B (B* R̂_y B)^{-1} B* = R̂_y^{-1/2} Π_{R̂_y^{1/2} B} R̂_y^{-1/2} = R̂_y^{-1/2} Π⊥_{R̂_y^{-1/2} A} R̂_y^{-1/2}   (64.39)

Inserting Eq. (64.39) into Eq. (64.38) yields:

f(θ) = tr[Λ̂ (Û* Ŵ_L R̂_z Ŵ_L Û)^{-1} Λ̂ Ŝ* Ŵ_R^{-1} R̂_y^{-1/2} Π⊥_{R̂_y^{-1/2} A} R̂_y^{-1/2} Ŵ_R^{-1} Ŝ]   (64.40)

which is an explicit function of θ. Insertion of the optimal choice of W_L into Eq. (64.40) leads to a further simplification of the criterion, as seen below.

Owing to the arbitrariness in the choice of Ŵ_R, there exists an infinite class of optimal IV-SSF criteria. In what follows, we consider two members of this class.

Let

Ŵ_R = R̂_y^{-1/2}   (64.41)

Insertion of Eq. (64.41), along with Eq. (64.35), into Eq. (64.40) yields the following criterion function:

f_WW(θ) = tr[Π⊥_{R̂_y^{-1/2} A} S̃ Λ̃^2 S̃*]   (64.42)

where S̃ and Λ̃ are made from the principal right singular vectors and singular values of the matrix

R̃ = R̂_z^{-1/2} R̂_zy R̂_y^{-1/2}   (64.43)

(with R̂_zy defined in an obvious way). The function (64.42) is the UNCLE (spatial IV-SSF) criterion of Wong and Wu [12,13].
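As an illustration, the following sketch evaluates the UNCLE criterion (64.42)-(64.43) on a one-dimensional DOA grid for a single source (n = n̄ = 1), assuming the ULA response, the inverse square-root helper, and time-aligned snapshot matrices Y and Z from the earlier sketches. For several sources, a multidimensional search or the numerical algorithms discussed in [17] would be used instead.

```python
import numpy as np

def proj_perp(X):
    """Orthogonal projector onto the null space of X*, cf. Eq. (64.29)."""
    return np.eye(X.shape[0]) - X @ np.linalg.pinv(X)

def uncle_criterion(A, Ry_inv_sqrt, S_t, Lam_t):
    """f_WW of Eq. (64.42) for a candidate steering matrix A (columns a(theta_k))."""
    P = proj_perp(Ry_inv_sqrt @ A)
    return np.real(np.trace(P @ S_t @ Lam_t**2 @ S_t.conj().T))

# single-source grid search; Y, Z, ula_response, herm_inv_sqrt assumed from earlier sketches
N = Y.shape[1]                                   # Z and Y assumed to have aligned columns
Ry, Rz, Rzy = Y @ Y.conj().T / N, Z @ Z.conj().T / N, Z @ Y.conj().T / N
Ry_is, Rz_is = herm_inv_sqrt(Ry), herm_inv_sqrt(Rz)
_, s, Vh = np.linalg.svd(Rz_is @ Rzy @ Ry_is)    # R-tilde of Eq. (64.43)
n_bar = 1
S_t, Lam_t = Vh[:n_bar, :].conj().T, np.diag(s[:n_bar])
grid = np.deg2rad(np.linspace(-30.0, 30.0, 601))
f = [uncle_criterion(ula_response(th, Ry.shape[0])[:, None], Ry_is, S_t, Lam_t) for th in grid]
theta_hat = np.rad2deg(grid[int(np.argmin(f))])  # estimated DOA in degrees
```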

Next, choose Ŵ_R as

Ŵ_R = I   (64.44)

The corresponding criterion function is

f_VSO(θ) = tr[Π⊥_{R̂_y^{-1/2} A} R̂_y^{-1/2} S̄ Λ̄^2 S̄* R̂_y^{-1/2}]   (64.45)

where S̄ and Λ̄ are made from the principal singular pairs of

R̄ = R̂_z^{-1/2} R̂_zy   (64.46)

The function (64.45) above is recognized as the optimal (temporal) IV-SSF criterion of Viberg et al. [14].

An important consequence of the previous discussion is that the DOA estimation methods of [12,13] and [14], respectively, which were derived in seemingly unrelated contexts and by means of somewhat different approaches, are in fact asymptotically equivalent when used under the same conditions. These two methods have very similar computational burdens, which can be seen by comparing Eqs. (64.42) and (64.43) with Eqs. (64.45) and (64.46). Also, their finite-sample properties appear to be rather similar, as demonstrated in the simulation examples. Numerical algorithms for the minimization of the type of criterion function associated with the optimal IV-SSF methods are discussed in [17]. Some suggestions are also given in the summary below.

5 To within a multiplicative constant.
