Independent component analysis P23


in more detail three particular applications of ICA or BSS techniques to CDMA data. These are a simplified complexity minimization approach for estimating fading channels, blind separation of convolutive mixtures using an extension of the natural gradient algorithm, and improvement of the performance of conventional CDMA receivers using complex-valued ICA. The ultimate goal in these applications is to detect the desired user's symbols, but to achieve this, intermediate quantities such as the fading channel or delays must usually be estimated first. At the end of the chapter, we give references to other communications applications of ICA and related blind techniques.

23.1 MULTIUSER DETECTION AND CDMA COMMUNICATIONS

In wireless communication systems, such as mobile phones, an essential issue is the division of the common transmission medium among several users. This calls for a multiple access communication scheme. A primary goal in designing multiple access systems is to enable each user of the system to communicate despite the fact that the other users occupy the same resources, possibly simultaneously. As the number of users in the system grows, it becomes necessary to use the common resources as efficiently as possible. These two requirements have given rise to a number of multiple access schemes.

Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja. Copyright © 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

Fig. 23.1 A schematic diagram of the multiple access schemes FDMA, TDMA, and CDMA [410, 382].

Figure 23.1 illustrates the most common multiple access schemes [378, 410, 444]. In frequency division multiple access (FDMA), each user is given a nonoverlapping frequency slot in which one and only one user is allowed to operate. This prevents interference from other users. In time division multiple access (TDMA), a similar idea is realized in the time domain, where each user is given a unique time period (or periods). One user can thus transmit and receive data only during his or her predetermined time interval, while the others are silent.

In CDMA [287, 378, 410, 444], there is no disjoint division in frequency or time, but all users occupy the same frequency band simultaneously. The users are now identified by their codes, which are unique to each user. Roughly speaking, each user applies his unique code to his information signal (data symbols) before transmitting it through a common medium. In transmission, different users' signals become mixed, because the same frequencies are used at the same time. Each user's transmitted signal can be identified from the mixture by applying his unique code at the receiver.

In its simplest form, the code is a pseudorandom sequence of $\pm 1$s, also called a chip sequence or spreading code. In this case we speak about direct sequence (DS) modulation [378], and call the multiple access method DS-CDMA. In DS-CDMA, each user's narrow-band data symbols (information bits) are spread in frequency before actual transmission via a common medium. The spreading is carried out by multiplying each user's data symbols (information bits) by his unique wide-band chip sequence (spreading code). The chip sequence varies much faster than the information bit sequence. In the frequency domain, this leads to spreading of the power spectrum of the transmitted signal. Such spread spectrum techniques are useful because they make the transmission more robust against disturbances caused by other signals transmitted simultaneously [444].

Example 23.1 Figure 23.2 shows an example of the formation of a CDMA signal. In the topmost subfigure, there are 4 user's symbols (information bits) $-1, +1, -1, +1$ to be transmitted. The middle subfigure shows the chip sequence (spreading code). It is now $-1, +1, -1, -1, +1$. Each symbol is multiplied by the chip sequence in a similar manner. This yields the modulated CDMA signal on the bottom row of Fig. 23.2, which is then transmitted. The bits in the spreading code change in this case 5 times faster than the symbols.
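The spreading step of Example 23.1 can be sketched in a few lines of NumPy. This is only an illustration: the symbol values $-1, +1, -1, +1$ and the 5-chip code $-1, +1, -1, -1, +1$ are assumed here, and the helper names `spread` and `despread` are hypothetical. The sketch also shows how correlating each symbol-length block with the same code recovers the symbols in the noise-free, single-user case.

```python
import numpy as np

# Values assumed for illustration (4 symbols, 5 chips per symbol).
symbols = np.array([-1, +1, -1, +1])      # information bits b_m
chips = np.array([-1, +1, -1, -1, +1])    # chip sequence (spreading code)

def spread(symbols, chips):
    """Multiply every symbol by the whole chip sequence (DS spreading)."""
    return np.kron(symbols, chips)

def despread(signal, chips):
    """Correlate each symbol-length block with the chip sequence."""
    blocks = signal.reshape(-1, len(chips))
    return np.sign(blocks @ chips)

tx = spread(symbols, chips)    # 20 chips: one 5-chip block per symbol
rx = despread(tx, chips)       # recovered symbols

print(tx)
print(rx)
```

Each block of `tx` is the code multiplied by one symbol, so the correlation with the code equals the symbol times the code length, and its sign gives the symbol back.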

Let us denote the $m$th data symbol (information bit) by $b_m$, and the chip sequence by $s(t)$. The time period of the chip sequence is $T$ (see Fig. 23.2), so that $s(t) \in \{-1, +1\}$ when $t \in [0, T)$, and $s(t) = 0$ when $t \notin [0, T)$. The length of the chip sequence is $C$ chips, and the time duration of each chip is $T_c = T/C$. The number of bits in the observation interval is denoted by $N$. In Fig. 23.2, the observation interval contains $N = 4$ symbols, and the length of the chip sequence is $C = 5$.

Using these notations, the CDMA signal $r(t)$ at time $t$ arising in this simple example can be written

$$r(t) = \sum_{m=1}^{N} b_m s(t - mT) \qquad (23.1)$$

is larger, and it degrades gradually with an increasing number of simultaneous users, who can be asynchronous [444]. CDMA technology is therefore a strong candidate for future global wireless communications systems. For example, it has already been chosen as the transmission technique for the European third generation mobile system UMTS [334, 182], which will provide useful new services, especially multimedia and high-bit-rate packet data.

In mobile communications systems, the required signal processing differs in the base station (uplink) from that in the mobile phone (downlink). In the base station, all the signals sent by different users must be detected, but there is also much more signal processing capacity available. The codes of all the users are known, but their time delays are unknown. For delay estimation, one can use for example the simple matched filter [378, 444], subspace approaches [44, 413], or the optimal but computationally highly demanding maximum likelihood method [378, 444]. When the delays have been estimated, one can estimate the other parameters such as the fading process and symbols [444].

In downlink (mobile phone) signal processing, each user knows only its own code, while the codes of the other users are unknown. There is less processing power than in the base station. Also the mathematical model of the signals differs slightly, since users share the same channel in downlink communications. The first two features of downlink processing in particular call for new, efficient, and simple solutions. ICA and BSS techniques provide a promising new approach to downlink signal processing in DS-CDMA systems that use short spreading codes.

Figure 23.3 shows a typical CDMA transmission situation in an urban environment. Signal 1 arrives directly from the base station to the mobile phone in the car. It has the smallest time delay and is the strongest signal, because it is not attenuated by the reflection coefficients of the obstacles in the path. Due to multipath propagation, the user in the car in Fig. 23.3 also receives weaker signals 2 and 3, which have longer time delays. The existence of multipath propagation allows the signal to interfere with itself. This phenomenon is known as intersymbol interference (ISI). Using spreading codes and suitable processing methods, multipath interference can be mitigated [444].


Fig. 23.3 An example of multipath propagation in an urban environment (axis label: time delay).

There are several other problems that complicate CDMA reception. One of the most serious is multiple access interference (MAI), which arises from the fact that the same frequency band is occupied simultaneously. MAI can be alleviated by increasing the length of the spreading code, but at a fixed chip rate, this decreases the data rate. In addition, the near–far problem arises when signals from near and far are received at the same time. If the received powers from different users become too different, a stronger user will seriously interfere with the weaker ones, even if there is only a small correlation between the users' spreading codes. In FDMA and TDMA systems, the near–far problem does not arise, because different users have nonoverlapping frequency or time slots.

The near–far problem in the base station can be mitigated by power control, or by multiuser detection. Efficient multiuser detection requires knowledge or estimation of many system parameters such as propagation delay, carrier frequency, and received power level. This is usually not possible in the downlink. However, blind multiuser detection techniques can then be applied, provided that the spreading codes are short enough [382].

Still other problems appearing in CDMA systems are power control, synchronization, and fading of channels, which is present in all mobile communications systems. Fading means variation of the signal power in mobile transmission caused, for example, by buildings and changing terrain. See [378, 444, 382] for more information on these topics.

23.2 CDMA SIGNAL MODEL AND ICA

In this section, we represent mathematically the CDMA signal model, which is studied in slightly varying forms in this chapter. Models of this type and the formation of the observed data in them are discussed in detail in [444, 287, 382].

It is straightforward to generalize the simple model (23.1) for $K$ users. The $m$th symbol of the $k$th user is denoted by $b_{km}$, and $s_k(\cdot)$ is the $k$th user's binary chip sequence (spreading code). For each user $k$, the spreading code is defined quite similarly as in Example 23.1. The combined signal of $K$ simultaneous users then becomes

$$r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km} s_k(t - mT) + n(t) \qquad (23.2)$$

where $n(t)$ denotes additive noise corrupting the observed signal.

The signal model (23.2) is not yet quite realistic, because it does not take into account the effect of multipath propagation and fading channels. Including these factors in (23.2) yields our desired downlink CDMA signal model for the observed data $r(t)$ at time $t$:

$$r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km} \sum_{l=1}^{L} a_{lm} s_k(t - mT - d_l) + n(t) \qquad (23.3)$$

Here $d_l$ denotes the delay of the $l$th path, which is assumed to be constant during the observation interval of $N$ symbol bits. Each of the $K$ simultaneous users has $L$ independent transmission paths. The term $a_{lm}$ is the fading factor of the $l$th path corresponding to the $m$th symbol.
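To make the roles of the delays $d_l$ and fading factors $a_{lm}$ concrete, the following is a minimal chip-rate simulation sketch of a downlink signal in which all $K$ users share the same $L$ paths. Everything numeric here is an assumption for illustration: the sizes, the random binary codes, path delays given in whole chips, complex gaussian fading drawn independently for every symbol, and the function name `downlink_signal`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: users, paths, symbols, chips per symbol.
K, L, N, C = 3, 2, 50, 31
delays = np.array([0, 4])                        # d_l in whole chips (assumed < C)
codes = rng.choice([-1.0, 1.0], size=(K, C))     # binary spreading codes s_k
bits = rng.choice([-1.0, 1.0], size=(K, N))      # symbols b_km
fading = (rng.standard_normal((L, N))
          + 1j * rng.standard_normal((L, N))) / np.sqrt(2)   # a_lm

def downlink_signal(codes, bits, fading, delays, noise_std=0.1, rng=rng):
    """Chip-rate samples of a multipath downlink CDMA signal plus noise."""
    K, C = codes.shape
    L, N = fading.shape
    r = np.zeros(N * C + delays.max(), dtype=complex)
    for m in range(N):
        # In the downlink all users travel through the same paths,
        # so their spread symbols can be summed before the channel.
        chip_block = bits[:, m] @ codes
        for l in range(L):
            start = m * C + delays[l]
            r[start:start + C] += fading[l, m] * chip_block
    noise = (rng.standard_normal(r.shape)
             + 1j * rng.standard_normal(r.shape)) / np.sqrt(2)
    return r + noise_std * noise

r = downlink_signal(codes, bits, fading, delays)

# Real part, cut into C-length vectors: one column per symbol interval.
R = r[:N * C].real.reshape(N, C).T
print(R.shape)
```

Because of the nonzero delay of the second path, each column of `R` contains contributions from two consecutive symbols, which is what makes the separation problem harder than in the synchronous single-path case.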

In general, the fading coefficients $a_{lm}$ are complex-valued. However, we can apply standard real-valued ICA methods to the data (23.3) by using only its real part. This is the case in the first two approaches to be discussed in the next two sections, while the last method in Section 23.5 directly uses complex data.

The continuous time data (23.3) is first sampled at the chip rate, so that $C$ equispaced samples per symbol are taken. From subsequent discretized equispaced data samples $r[n]$, $C$-length data vectors are then collected:

$$\mathbf{r}_m = (r[mC], r[mC+1], \ldots, r[(m+1)C-1])^T \qquad (23.4)$$


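be easily observed by shifting the spreading code to the right in Fig. 23.2.

The vector model (23.5) can be expressed in compact form as a matrix model. Define the data matrix

$$\mathbf{R} = [\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N]$$

consisting of the $N$ data vectors. It can be represented as

$$\mathbf{R} = \mathbf{G}\mathbf{F} + \mathbf{N} \qquad (23.9)$$

where $\mathbf{G}$ is the $C \times 2KL$ matrix of (23.10), $\mathbf{N}$ is the noise matrix, and the $2KL \times N$ matrix $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_N]$ contains the symbols and fading terms.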

To see the correspondence of (23.9) to ICA, let us write the noisy linear ICA model $\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{n}$ in the matrix form as

$$\mathbf{X} = \mathbf{A}\mathbf{S} + \mathbf{N} \qquad (23.12)$$

The data matrix $\mathbf{X}$ has as its columns the data vectors $\mathbf{x}(1), \mathbf{x}(2), \ldots$, and $\mathbf{S}$ and $\mathbf{N}$ are similarly compiled source and noise matrices whose columns consist of the source and noise vectors $\mathbf{s}(t)$ and $\mathbf{n}(t)$, respectively. Comparing the matrix CDMA signal model (23.9) with (23.12) shows that it has the same form as the noisy linear ICA model. Clearly, in the CDMA model (23.9) $\mathbf{F}$ is the matrix of source signals, $\mathbf{R}$ is the observed data matrix, and $\mathbf{G}$ is the unknown mixing matrix.
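The correspondence between the stacked chip-rate data and a linear mixing model can be checked numerically in the simplest special case. The sketch below assumes, purely for illustration, a synchronous system with a single path and no fading, so that the mixing matrix simply has the $K$ spreading codes as its columns and the source matrix holds the symbols; the full model of this section has a larger mixing matrix. All sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, C = 4, 200, 16                       # hypothetical sizes

G = rng.choice([-1.0, 1.0], size=(C, K))   # columns: spreading codes (mixing matrix)
F = rng.choice([-1.0, 1.0], size=(K, N))   # rows: users' symbol streams (sources)
noise = 0.05 * rng.standard_normal((C, N))

# Chip-rate signal built symbol by symbol, then cut into C-length data vectors.
r = np.concatenate([G @ F[:, m] for m in range(N)]) + noise.T.ravel()
R = r.reshape(N, C).T                      # data matrix, one column per symbol

# The stacked data obey the linear model "data = mixing x sources + noise".
print(np.allclose(R, G @ F + noise))

# Non-blind reference: with G known, least squares recovers the symbols.
F_hat = np.sign(np.linalg.pinv(G) @ R)
print(np.mean(F_hat == F))
```

In the blind setting discussed in this chapter neither the full mixing matrix nor the sources are known, which is where ICA and related techniques enter; the least-squares step above only serves as a reference point.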


For estimating the desired user's parameters and symbols, several techniques are available [287, 444]. The matched filter (correlator) [378, 444] is the simplest estimator, but it performs well only if different users' chip sequences are orthogonal or the users have equal powers. The matched filter suffers greatly from the near–far problem, rendering it unsuitable for CDMA reception without strict power control. The so-called RAKE detector [378] is a somewhat improved version of the basic matched filter which takes advantage of multiple propagation paths. The maximum likelihood (ML) method [378, 444] would be optimal, but it has a very high computational load, and requires knowledge of all the users' codes. However, in downlink reception, only the desired user's code is known. To remedy this problem while preserving acceptable performance, subspace approaches have been proposed, for example in [44]. But they are sensitive to noise, and fail when the signal subspace dimension exceeds the processing gain. This easily occurs even with moderate system load due to multipath propagation. Some other semiblind methods proposed for the CDMA problem, such as the minimum mean-square error (MMSE) estimator, are discussed later in this chapter and in [287, 382, 444].
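The sensitivity of the matched filter to unequal powers can be illustrated with a small simulation. The sketch below is a toy setting, not taken from the text: two synchronous users, a single path, hypothetical 8-chip codes with a small cross-correlation of 2 (out of 8), and a helper named `matched_filter_ber`.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 8-chip codes; their inner product is 2, so they are not orthogonal.
s1 = np.array([+1, -1, +1, +1, -1, +1, -1, -1], dtype=float)   # desired user
s2 = np.array([-1, +1, -1, +1, -1, +1, -1, -1], dtype=float)   # interfering user

def matched_filter_ber(a1, a2, n_sym=2000, noise_std=0.1):
    """Bit error rate of user 1 when correlating with its own code only."""
    b1 = rng.choice([-1.0, 1.0], size=n_sym)
    b2 = rng.choice([-1.0, 1.0], size=n_sym)
    # C x n_sym data matrix: both users transmit at the same time.
    R = a1 * np.outer(s1, b1) + a2 * np.outer(s2, b2)
    R += noise_std * rng.standard_normal(R.shape)
    b1_hat = np.sign(s1 @ R)               # matched filter decision
    return np.mean(b1_hat != b1)

ber_equal = matched_filter_ber(a1=1.0, a2=1.0)      # equal received amplitudes
ber_near_far = matched_filter_ber(a1=1.0, a2=10.0)  # interferer ten times stronger

print(ber_equal, ber_near_far)
```

With equal amplitudes the desired term (weight 8) dominates the interference term (weight 2). With a tenfold stronger interferer the interference term has weight 20, so the decision follows the interferer's bit and roughly half of the desired user's bits are wrong.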

It should be noted that the CDMA estimation problem is not completely blind, because there is some prior information available. In particular, the transmitted symbols are binary (more generally from a finite alphabet), and the spreading code (chip sequence) is known. On the other hand, multipath propagation, possibly fading channels, and time delays make separation of the desired user's symbols a very challenging estimation problem which is more complicated than the standard ICA problem.

23.3 ESTIMATING FADING CHANNELS

23.3.1 Minimization of complexity

Pajunen [342] has recently introduced a complexity minimization approach as a true generalization of standard ICA. In his method, temporal information contained in the source signals is also taken into account in addition to the spatial independence utilized by standard ICA. The goal is to optimally exploit all the available information in blind source separation. In the special case where the sources are temporally white (uncorrelated), complexity minimization reduces to standard ICA [342]. Complexity minimization has been discussed in more detail in Section 18.3.

Regrettably, the original method for minimizing the Kolmogoroff complexity measure is computationally highly demanding except for small scale problems. But if the source signals are assumed to be gaussian and nonwhite with significant time correlations, the minimization task becomes much simpler [344]. Complexity minimization then reduces to principal component analysis of temporal correlation matrices. This method is actually just another example of blind source separation approaches based on second-order temporal statistics, for example [424, 43], which were discussed earlier in Chapter 18.


In the following, we apply this simplified method to the estimation of the fading channel coefficients of the desired user in a CDMA system. Simulations with downlink data, propagated through a Rayleigh fading channel, show noticeable performance gains compared with blind minimum mean-square error channel estimation, which is currently a standard method for solving this problem. The material in this section is based on the original paper [98].

We thus assume that the fading process is gaussian and complex-valued. Then the amplitude of the fading process is Rayleigh distributed; this case is called Rayleigh fading (see [444, 378]). We also assume that a training sequence or a preamble is available for the desired user, although this may not always be the case in practice. Under these conditions, only the desired user's contribution in the sampled data is time correlated, which is then utilized. The proposed method has the advantage that it estimates code timing only implicitly, and hence it does not degrade the accuracy of channel estimation.
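The two properties used here, a Rayleigh distributed amplitude and time correlation, can be illustrated with a synthetic fading process. The sketch below assumes a first-order autoregressive model with lag-one correlation `rho` as a simple stand-in for a correlated complex gaussian process; the text does not specify a particular fading model, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, sigma = 100_000, 0.9, 1.0   # length, lag-one correlation, per-component std

# Complex white gaussian driving noise with std sigma per real/imaginary part.
w = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

# First-order autoregressive recursion keeps the process gaussian and stationary.
a = np.empty(n, dtype=complex)
a[0] = w[0]
for m in range(1, n):
    a[m] = rho * a[m - 1] + np.sqrt(1.0 - rho**2) * w[m]

amplitude = np.abs(a)
mean_theory = sigma * np.sqrt(np.pi / 2.0)    # mean of a Rayleigh variable
lag1 = np.real(np.mean(a[1:] * np.conj(a[:-1]))) / np.mean(amplitude**2)

print(amplitude.mean(), mean_theory, lag1)
```

The sample mean of the amplitude should be close to the Rayleigh value $\sigma\sqrt{\pi/2}$, and the estimated lag-one correlation close to `rho`; it is this temporal correlation that the second-order method of this section exploits.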

A standard method for separating the unknown source signals is based on minimization of the mutual information (see Chapter 10 and [197, 344]) of the separated signals $\mathbf{f}_m = [y_1(m), \ldots, y_{2KL}(m)]^T = \mathbf{y}$:

$$J(\mathbf{y}) = \sum_i H(y_i) + \log |\det \mathbf{G}| \qquad (23.13)$$

where $H(y_i)$ is the entropy of $y_i$ (see Chapter 5). But entropy has the interpretation that it represents the optimum averaged code length of a random variable. Hence mutual information can be expressed by using algorithmic complexity as [344]

$$J(\mathbf{y}) = \sum_i K(y_i) + \log |\det \mathbf{G}| \qquad (23.14)$$

where $K(\cdot)$ is the per-symbol Kolmogoroff complexity, given by the number of bits needed to describe $y_i$. By using prior information about the signals, the coding costs can be explicitly approximated. For instance, if the signals are gaussian, independence becomes equivalent to uncorrelatedness. Then the Kolmogoroff complexity can be replaced by the per-symbol differential entropy, which in this case depends on second-order statistics only.

For Rayleigh type fading transmission channels, the prior information can be formulated by considering that the mutually independent source signals $y_i(m)$ have zero-mean gaussian distributions. Suppose we want to estimate the channel coefficients of the transmission paths by sending a constant symbol sequence $b_{1m} = 1$ of a given length to the desired user. We consider the signals $y_i(m)$, $i = 1, \ldots, 2L$, with $i$ representing the indexes of the $2L$ sources corresponding to the first user. Then $y_i(m)$ will actually represent the channel coefficients of all the first user's paths. Since we assume that the channel is Rayleigh fading, these signals are gaussian and time correlated. In this case, blind separation of the sources can be achieved by using only second-order statistics. In fact, we can express the Kolmogoroff complexity by coding these signals using principal component analysis [344].

23.3.2 Channel estimation *

Let $\mathbf{y}_i(m) = [y_i(m), \ldots, y_i(m-D+1)]^T$ denote the vector consisting of the $D$ last samples of every such source signal $y_i(m)$, $i = 1, \ldots, 2L$. Here $D$ is the number of delayed terms, showing the range of time correlations taken into account when estimating the current symbol. The information contained in any of these sources can be approximated by the code length needed for representing the $D$ principal components, which have variances given by the eigenvalues of the temporal correlation matrix $\mathbf{C}_i = E\{\mathbf{y}_i(m)\mathbf{y}_i^T(m)\}$ [344]. Since we assume that the transmission paths are mutually independent, the overall entropy of the source is given by summing up the entropies of the principal components. Using the result that the entropy of a gaussian random variable is given by the logarithm of the variance, we get for the entropy of each source signal

is the separating matrix

The separating matrix $\mathbf{W}$ can be estimated by using a gradient descent approach for minimizing the cost function (23.16), leading to the update rule [344]

$$\Delta \mathbf{W} = -\mu \frac{\partial \log J(\mathbf{y})}{\partial \mathbf{W}} + \alpha \Delta \mathbf{W} \qquad (23.17)$$

where $\mu$ is the learning rate and $\alpha$ is the momentum term [172] that can be introduced to avoid getting trapped into a local minimum corresponding to a secondary path.

Let $\mathbf{w}_i^T$ denote the $i$th row vector of the separating matrix $\mathbf{W}$. Since only the correlation matrix $\mathbf{C}_i$ of the $i$th source depends on $\mathbf{w}_i$, we can express the gradient of the cost function by computing the partial derivatives


where $r_{kj}$ is the element $(k, j)$ of the observation matrix $\mathbf{R}$ defined earlier using formulas (23.4) and (23.9).

What is left to do now is to compute the gradient update part due to the mapping information. It can be written [344]

$$\log |\det \mathbf{W}| = \sum_{i=1}^{C} \log \|(\mathbf{I} - \mathbf{P}_i)\mathbf{w}_i\| \qquad (23.20)$$

where $\mathbf{P}_i = \mathbf{W}_i(\mathbf{W}_i^T\mathbf{W}_i)^{-1}\mathbf{W}_i^T$ is a projection matrix onto the subspace spanned by the column vectors of the matrix $\mathbf{W}_i = [\mathbf{w}_1, \ldots, \mathbf{w}_{i-1}]$. Now the cost function can be separated, and the different independent components can be found one by one, by taking into account the previously estimated components, contained in the subspace spanned by the columns of the matrix $\mathbf{W}_i$.
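The decomposition of the log-determinant into a sum over rows can be verified numerically. The sketch below uses a random square matrix as a stand-in for a separating matrix (an assumption made only for this check) and a hypothetical helper `projector`; it accumulates the logarithms of the norms of the residuals $(\mathbf{I} - \mathbf{P}_i)\mathbf{w}_i$ and compares the sum with the log absolute determinant.

```python
import numpy as np

def projector(W_prev):
    """Orthogonal projector onto the column space of W_prev (n x k, full column rank)."""
    n, k = W_prev.shape
    if k == 0:
        return np.zeros((n, n))        # nothing estimated yet: zero matrix
    return W_prev @ np.linalg.inv(W_prev.T @ W_prev) @ W_prev.T

rng = np.random.default_rng(4)
n = 5
W = rng.standard_normal((n, n))        # rows play the role of w_i^T (random stand-in)

total = 0.0
for i in range(n):
    W_i = W[:i].T                      # columns: previously handled vectors
    P_i = projector(W_i)
    residual = (np.eye(n) - P_i) @ W[i]
    total += np.log(np.linalg.norm(residual))

log_abs_det = np.linalg.slogdet(W)[1]
print(total, log_abs_det)
```

The agreement of the two numbers reflects the Gram-Schmidt interpretation: the absolute determinant is the product of the lengths of the successive residuals. For the first vector no previous components exist, so the projector is the zero matrix.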

Since our principal interest lies in the transmission path having the largest power, corresponding usually to the desired user, it is sufficient to estimate the first such independent component. In this case, the projection matrix $\mathbf{P}_1$ becomes a zero matrix. Then the overall gradient (23.17) for the first row $\mathbf{w}_1^T$ of the separating matrix can be written

$$\frac{\partial \log J(\mathbf{y})}{\partial \mathbf{w}_1^T} = 1 \cdots \qquad (23.21)$$

It suffices to consider the special case where only the two last samples are taken into account, so that the delay $D = 2$. First, second-order correlations are removed from the data $\mathbf{R}$ by whitening. This can be done easily in terms of standard principal component analysis as explained in Chapter 6. After whitening, the subsequent separating matrix will be orthogonal, and thus the second term in Eq. (23.16) disappears, yielding the cost function

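In this case, the separating vectors $\mathbf{w}_i^T$ can be found by maximizing sequentially $E\{y_i(m)y_i(m-1) + y_i(m-1)y_i(m)\}$, which is the first-order correlation coefficient of $y_i$. It follows that the function to be maximized becomes

$$J(\mathbf{w}) = \mathbf{w}^T E\{\mathbf{r}_m \mathbf{r}_{m-1}^T + \mathbf{r}_{m-1}\mathbf{r}_m^T\}\mathbf{w} \qquad (23.24)$$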

So the separating vector $\mathbf{w}_1^T$ corresponding to the most important path is given by the principal eigenvector of the matrix in Eq. (23.24). We have used a symmetric expression for the correlation coefficients in order to avoid asymmetry when the observed data set is finite. This usually improves the estimation accuracy. Finally, we separate the desired channel coefficients by computing the quantities

a 11
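The procedure of this subsection for $D = 2$ (whitening, a symmetrized lag-one correlation matrix, and its principal eigenvector) can be sketched on synthetic data. This is an illustrative sketch and not the authors' implementation: it uses real-valued signals, a random mixing matrix in place of the CDMA code matrix, one time-correlated autoregressive source standing in for the desired path's fading coefficient, and temporally white sources standing in for everything else. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n_src, n_obs, T = 4, 6, 20_000

# Sources: row 0 is time correlated (AR(1) with lag-one correlation rho),
# the remaining rows are temporally white.
S = rng.standard_normal((n_src, T))
rho = 0.95
for t in range(1, T):
    S[0, t] = rho * S[0, t - 1] + np.sqrt(1 - rho**2) * S[0, t]

A = rng.standard_normal((n_obs, n_src))              # unknown mixing matrix
R = A @ S + 0.05 * rng.standard_normal((n_obs, T))   # observed data

# 1) Whitening by PCA, keeping the n_src dominant principal directions.
R = R - R.mean(axis=1, keepdims=True)
eigval, eigvec = np.linalg.eigh(R @ R.T / T)
idx = np.argsort(eigval)[::-1][:n_src]
V = (eigvec[:, idx] / np.sqrt(eigval[idx])).T        # whitening matrix
Z = V @ R

# 2) Symmetrized lag-one correlation matrix of the whitened data.
C1 = Z[:, 1:] @ Z[:, :-1].T / (T - 1)
M = C1 + C1.T

# 3) Its principal eigenvector is the separating vector of the most
#    strongly time-correlated source (eigh sorts eigenvalues ascending).
w = np.linalg.eigh(M)[1][:, -1]
y = w @ Z

corr = abs(np.corrcoef(y, S[0])[0, 1])
print(corr)
```

The recovered signal `y` matches the correlated source only up to sign and scale, which is the usual indeterminacy of blind separation; in this toy setting the absolute correlation with the true source should be close to one.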

23.3.3 Comparisons and discussion

We have compared the method described and derived above to a well-performing standard method used in multiuser detection, namely the minimum mean-square error estimator [452, 287]. In the MMSE method, the desired signal is estimated (up to a scaling) from the formula

a column of the matrix $\mathbf{G}$ defined in (23.10) corresponding to the desired user's bit $b_m$, that is, either $\mathbf{g}$
