in more detail three particular applications of ICA or BSS techniques to CDMA data. These are a simplified complexity minimization approach for estimating fading channels, blind separation of convolutive mixtures using an extension of the natural gradient algorithm, and improvement of the performance of conventional CDMA receivers using complex-valued ICA. The ultimate goal in these applications is to detect the desired user's symbols, but to achieve this, intermediate quantities such as the fading channel or delays must usually be estimated first. At the end of the chapter, we give references to other communications applications of ICA and related blind techniques used in communications.
23.1 MULTIUSER DETECTION AND CDMA COMMUNICATIONS
In wireless communication systems, like mobile phones, an essential issue is division of the common transmission medium among several users. This calls for a multiple access communication scheme. A primary goal in designing multiple access systems is to enable each user of the system to communicate despite the fact that the other users occupy the same resources, possibly simultaneously. As the number of users in the system grows, it becomes necessary to use the common resources as efficiently as possible. These two requirements have given rise to a number of multiple access schemes.

Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja
Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

Fig. 23.1 A schematic diagram of the multiple access schemes FDMA, TDMA, and CDMA [410, 382].
Figure 23.1 illustrates the most common multiple access schemes [378, 410, 444]. In frequency division multiple access (FDMA), each user is given a nonoverlapping frequency slot in which one and only one user is allowed to operate. This prevents interference from other users. In time division multiple access (TDMA) a similar idea is realized in the time domain, where each user is given a unique time period (or periods). One user can thus transmit and receive data only during his or her predetermined time interval while the others are silent at the same time.

In CDMA [287, 378, 410, 444], there is no disjoint division in frequency and time spaces, but each user occupies the same frequency band simultaneously. The users are now identified by their codes, which are unique to each user. Roughly speaking, each user applies his unique code to his information signal (data symbols) before transmitting it through a common medium. In transmission different users' signals become mixed, because the same frequencies are used at the same time. Each user's transmitted signal can be identified from the mixture by applying his unique code at the receiver.
In its simplest form, the code is a pseudorandom sequence of $\pm 1$s, also called a chip sequence or spreading code. In this case we speak about direct sequence (DS) modulation [378], and call the multiple access method DS-CDMA. In DS-CDMA, each user's narrow-band data symbols (information bits) are spread in frequency before actual transmission via a common medium. The spreading is carried out by multiplying each user's data symbols (information bits) by his unique wide-band chip sequence (spreading code). The chip sequence varies much faster than the information bit sequence. In the frequency domain, this leads to spreading of the power spectrum of the transmitted signal. Such spread spectrum techniques are useful because they make the transmission more robust against disturbances caused by other signals transmitted simultaneously [444].
Example 23.1 Figure 23.2 shows an example of the formation of a CDMA signal. On the topmost subfigure, there are 4 user's symbols (information bits) $-1, +1, -1, +1$ to be transmitted. The middle subfigure shows the chip sequence (spreading code). It is now $-1, +1, -1, -1, +1$. Each symbol is multiplied by the chip sequence in a similar manner. This yields the modulated CDMA signal on the bottom row of Fig. 23.2, which is then transmitted. The bits in the spreading code change in this case 5 times faster than the symbols.
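The spreading step of Example 23.1 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the symbol and chip values are taken as $-1, +1, -1, +1$ and $-1, +1, -1, -1, +1$ (the signs are read from the example and may differ from the figure), and the helper names `spread` and `despread` are hypothetical.

```python
import numpy as np

# Values as read from Example 23.1 (the exact signs are an assumption).
symbols = np.array([-1, +1, -1, +1])        # N = 4 information bits
chips = np.array([-1, +1, -1, -1, +1])      # C = 5 chip spreading code

def spread(symbols, chips):
    """Multiply every symbol by the whole chip sequence (DS spreading)."""
    return np.kron(symbols, chips)

def despread(signal, chips):
    """Correlate each C-chip block with the code and take the sign."""
    blocks = signal.reshape(-1, len(chips))
    return np.sign(blocks @ chips).astype(int)

cdma_signal = spread(symbols, chips)        # N * C = 20 chips in total
recovered = despread(cdma_signal, chips)    # equals `symbols` in this noise-free case
```

In this noise-free, single-user setting, correlating with the known code recovers the symbols exactly; the rest of the chapter deals with what happens when other users, multipath propagation, and fading are present.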
Let us denote the $m$th data symbol (information bit) by $b_m$, and the chip sequence by $s(t)$. The time period of the chip sequence is $T$ (see Fig. 23.2), so that $s(t) \in \{-1, +1\}$ when $t \in [0, T)$, and $s(t) = 0$ when $t \notin [0, T)$. The length of the chip sequence is $C$ chips, and the time duration of each chip is $T_c = T/C$. The number of bits in the observation interval is denoted by $N$. In Fig. 23.2, the observation interval contains $N = 4$ symbols, and the length of the chip sequence is $C = 5$.
Using these notations, the CDMA signal $r(t)$ at time $t$ arising in this simple example can be written

$$ r(t) = \sum_{m=1}^{N} b_m s(t - mT) \tag{23.1} $$
is larger, and it degrades gradually with increasing number of simultaneous users who can be asynchronous [444]. CDMA technology is therefore a strong candidate for future global wireless communications systems. For example, it has already been chosen as the transmission technique for the European third generation mobile system UMTS [334, 182], which will provide useful new services, especially multimedia and high-bit-rate packet data.
In mobile communications systems, the required signal processing differs in the base station (uplink) from that in the mobile phone (downlink). In the base station, all the signals sent by different users must be detected, but there is also much more signal processing capacity available. The codes of all the users are known but their time delays are unknown. For delay estimation, one can use for example the simple matched filter [378, 444], subspace approaches [44, 413], or the optimal but computationally highly demanding maximum likelihood method [378, 444]. When the delays have been estimated, one can estimate the other parameters such as the fading process and symbols [444].

In downlink (mobile phone) signal processing, each user knows only its own code, while the codes of the other users are unknown. There is less processing power than in the base station. Also the mathematical model of the signals differs slightly, since users share the same channel in the downlink communications. Especially the first two features of downlink processing call for new, efficient and simple solutions. ICA and BSS techniques provide a promising new approach to the downlink signal processing using short spreading codes and DS-CDMA systems.
Figure 23.3 shows a typical CDMA transmission situation in an urban environment. Signal 1 arrives directly from the base station to the mobile phone in the car. It has the smallest time delay and is the strongest signal, because it is not attenuated by the reflection coefficients of the obstacles in the path. Due to multipath propagation, the user in the car in Fig. 23.3 also receives weaker signals 2 and 3, which have longer time delays. The existence of multipath propagation allows the signal to interfere with itself. This phenomenon is known as intersymbol interference (ISI). Using spreading codes and suitable processing methods, multipath interference can be mitigated [444].
Fig. 23.3 An example of multipath propagation in an urban environment (axis label: time delay).
There are several other problems that complicate CDMA reception. One of the most serious ones is multiple access interference (MAI), which arises from the fact that the same frequency band is occupied simultaneously. MAI can be alleviated by increasing the length of the spreading code, but at a fixed chip rate, this decreases the data rate. In addition, the near–far problem arises when signals from near and far are received at the same time. If the received powers from different users become too different, a stronger user will seriously interfere with the weaker ones, even if there is a small correlation between the users' spreading codes. In the FDMA and TDMA systems, the near–far problem does not arise because different users have nonoverlapping frequency or time slots.
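The near–far effect described above can be made concrete with a small noise-free sketch. Everything here is a hypothetical toy setup rather than data from the chapter: two synchronous users, 8-chip codes chosen to have a small normalized cross-correlation of 0.25, and a plain matched filter for the first user.

```python
import numpy as np

# Hypothetical 8-chip codes with a small cross-correlation (assumption).
s1 = np.array([+1, +1, +1, +1, -1, -1, -1, -1])   # desired user's code
s2 = np.array([-1, +1, +1, +1, +1, +1, -1, -1])   # interfering user's code
b1 = np.array([+1, -1, +1, -1])                   # desired user's symbols
b2 = np.array([-1, -1, +1, +1])                   # interferer's symbols

def matched_filter_errors(a1, a2):
    """Count symbol errors of user 1 for received amplitudes a1 and a2."""
    received = a1 * np.kron(b1, s1) + a2 * np.kron(b2, s2)
    # Matched filter: correlate each symbol interval with user 1's code.
    z = received.reshape(-1, len(s1)) @ s1 / len(s1)
    return int(np.sum(np.sign(z) != b1))

rho = s1 @ s2 / len(s1)                            # normalized cross-correlation
errors_equal_power = matched_filter_errors(1.0, 1.0)
errors_near_far = matched_filter_errors(1.0, 10.0)
```

With equal amplitudes the filter output is $b_{1m} + 0.25\, b_{2m}$ and every sign is correct; with a tenfold stronger interferer it becomes $b_{1m} + 2.5\, b_{2m}$, so the decision follows the interferer's symbols wherever they differ from the desired ones.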
The near–far problem in the base station can be mitigated by power control, or by multiuser detection. Efficient multiuser detection requires knowledge or estimation of many system parameters such as propagation delay, carrier frequency, and received power level. This is usually not possible in the downlink. However, blind multiuser detection techniques can then be applied, provided that the spreading codes are short enough [382].
Still other problems appearing in CDMA systems are power control, synchronization, and fading of channels, which is present in all mobile communications systems. Fading means variation of the signal power in mobile transmission caused for example by buildings and changing terrain. See [378, 444, 382] for more information on these topics.

23.2 CDMA SIGNAL MODEL AND ICA
In this section, we represent mathematically the CDMA signal model which is studied in slightly varying forms in this chapter. Models of this type and the formation of the observed data in them are discussed in detail in [444, 287, 382].
It is straightforward to generalize the simple model (23.1) for $K$ users. The $m$th symbol of the $k$th user is denoted by $b_{km}$, and $s_k(\cdot)$ is the $k$th user's binary chip sequence (spreading code). For each user $k$, the spreading code is defined quite similarly as in Example 23.1. The combined signal of $K$ simultaneous users then becomes

$$ r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km} s_k(t - mT) + n(t) \tag{23.2} $$

where $n(t)$ denotes additive noise corrupting the observed signal.

The signal model (23.2) is not yet quite realistic, because it does not take into account the effect of multipath propagation and fading channels. Including these factors in (23.2) yields our desired downlink CDMA signal model for the observed data $r(t)$ at time $t$:

$$ r(t) = \sum_{m=1}^{N} \sum_{k=1}^{K} b_{km} \sum_{l=1}^{L} a_{lm} s_k(t - mT - d_l) + n(t) \tag{23.3} $$

Here $d_l$ denotes the delay of the $l$th path, which is assumed to be constant during the observation interval of $N$ symbol bits. Each of the $K$ simultaneous users has $L$ independent transmission paths. The term $a_{lm}$ is the fading factor of the $l$th path corresponding to the $m$th symbol.
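To make the structure of the downlink model more tangible, the following is a minimal chip-rate simulation sketch. It rests on several simplifying assumptions that are not part of the chapter: path delays are whole numbers of chips, the fading factors are real-valued, symbols are indexed from zero, and the function name `simulate_downlink` and its argument layout are hypothetical.

```python
import numpy as np

def simulate_downlink(b, codes, a, delays, noise_std=0.0, rng=None):
    """Chip-rate samples of a multiuser, multipath DS-CDMA signal.

    b      : (K, N) array of symbols b[k, m]
    codes  : (K, C) array of chip sequences
    a      : (L, N) array of fading factors a[l, m], shared by all users
    delays : length-L sequence of integer path delays in chips (assumption)
    """
    K, N = b.shape
    C = codes.shape[1]
    r = np.zeros(N * C + max(delays))
    for m in range(N):
        for k in range(K):
            for l, d in enumerate(delays):
                start = m * C + d                  # symbol position plus path delay
                r[start:start + C] += b[k, m] * a[l, m] * codes[k]
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        r += noise_std * rng.standard_normal(r.shape)
    return r

# Tiny usage example: one user, two paths, the second delayed by 2 chips.
b = np.array([[1, -1, 1]])
codes = np.array([[1, -1, 1, 1]])
a = np.vstack([np.ones(3), 0.5 * np.ones(3)])
r = simulate_downlink(b, codes, a, delays=[0, 2])
```

Because all users share the same paths in the downlink, the fading array depends only on the path and symbol indices, which mirrors the statement that users share the same channel.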
In general, the fading coefficients $a_{lm}$ are complex-valued. However, we can apply standard real-valued ICA methods to the data (23.3) by using only the real part of it. This is the case in the first two approaches to be discussed in the next two sections, while the last method in Section 23.5 directly uses complex data.
The continuous time data (23.3) is first sampled using the chip rate, so that $C$ equispaced samples per symbol are taken. From subsequent discretized equispaced data samples $r[n]$, $C$-length data vectors are then collected:
be easily observed by shifting the spreading code to the right in Fig. 23.2.
The vector model (23.5) can be expressed in compact form as a matrix model. Define the data matrix
and the $2KL \times N$ matrix $\mathbf{F} = [\mathbf{f}_1, \ldots, \mathbf{f}_N]$ contains the symbols and fading terms.
To see the correspondence of (23.9) to ICA, let us write the noisy linear ICA model $\mathbf{x} = \mathbf{A}\mathbf{s} + \mathbf{n}$ in the matrix form as

$$ \mathbf{X} = \mathbf{A}\mathbf{S} + \mathbf{N} \tag{23.12} $$

The data matrix $\mathbf{X}$ has as its columns the data vectors $\mathbf{x}(1), \mathbf{x}(2), \ldots$, and $\mathbf{S}$ and $\mathbf{N}$ are similarly compiled source and noise matrices whose columns consist of the source and noise vectors $\mathbf{s}(t)$ and $\mathbf{n}(t)$, respectively. Comparing the matrix CDMA signal model (23.9) with (23.12) shows that it has the same form as the noisy linear ICA model. Clearly, in the CDMA model (23.9) $\mathbf{F}$ is the matrix of source signals, $\mathbf{R}$ is the observed data matrix, and $\mathbf{G}$ is the unknown mixing matrix.
For estimating the desired user's parameters and symbols, several techniques are available [287, 444]. The matched filter (correlator) [378, 444] is the simplest estimator, but it performs well only if different users' chip sequences are orthogonal or the users have equal powers. The matched filter suffers greatly from the near–far problem, rendering it unsuitable for CDMA reception without strict power control. The so-called RAKE detector [378] is a somewhat improved version of the basic matched filter which takes advantage of multiple propagation paths. The maximum likelihood (ML) method [378, 444] would be optimal, but it has a very high computational load, and requires knowledge of all the users' codes. However, in downlink reception, only the desired user's code is known. To remedy this problem while preserving acceptable performance, subspace approaches have been proposed for example in [44]. But they are sensitive to noise, and fail when the signal subspace dimension exceeds the processing gain. This easily occurs even with moderate system load due to the multipath propagation. Some other semiblind methods proposed for the CDMA problem such as the minimum mean-square estimator (MMSE) are discussed later in this chapter and in [287, 382, 444].

It should be noted that the CDMA estimation problem is not completely blind, because there is some prior information available. In particular, the transmitted symbols are binary (more generally from a finite alphabet), and the spreading code (chip sequence) is known. On the other hand, multipath propagation, possibly fading channels, and time delays make separation of the desired user's symbols a very challenging estimation problem which is more complicated than the standard ICA problem.
23.3 ESTIMATING FADING CHANNELS
23.3.1 Minimization of complexity
Pajunen [342] has recently introduced a complexity minimization approach as a true generalization of standard ICA. In his method, temporal information contained in the source signals is also taken into account in addition to the spatial independence utilized by standard ICA. The goal is to optimally exploit all the available information in blind source separation. In the special case where the sources are temporally white (uncorrelated), complexity minimization reduces to standard ICA [342]. Complexity minimization has been discussed in more detail in Section 18.3.
Regrettably, the original method for minimizing the Kolmogoroff complexity measure is computationally highly demanding except for small scale problems. But if the source signals are assumed to be gaussian and nonwhite with significant time correlations, the minimization task becomes much simpler [344]. Complexity minimization then reduces to principal component analysis of temporal correlation matrices. This method is actually just another example of blind source separation approaches based on second-order temporal statistics, for example [424, 43], which were discussed earlier in Chapter 18.
In the following, we apply this simplified method to the estimation of the fading channel coefficients of the desired user in a CDMA system. Simulations with downlink data, propagated through a Rayleigh fading channel, show noticeable performance gains compared with blind minimum mean-square error channel estimation, which is currently a standard method for solving this problem. The material in this section is based on the original paper [98].

We thus assume that the fading process is gaussian and complex-valued. Then the amplitude of the fading process is Rayleigh distributed; this case is called Rayleigh fading (see [444, 378]). We also assume that a training sequence or a preamble is available for the desired user, although this may not always be the case in practice. Under these conditions, only the desired user's contribution in the sampled data is time correlated, which is then utilized. The proposed method has the advantage that it estimates code timing only implicitly, and hence it does not degrade the accuracy of channel estimation.
A standard method for separating the unknown source signals is based on minimization of the mutual information (see Chapter 10 and [197, 344]) of the separated signals $\mathbf{f}_m = [y_1(m), \ldots, y_{2KL}(m)]^T = \mathbf{y}$:

$$ J(\mathbf{y}) = \sum_i H(y_i) + \log |\det \mathbf{G}| \tag{23.13} $$

where $H(y_i)$ is the entropy of $y_i$ (see Chapter 5). But entropy has the interpretation that it represents the optimum averaged code length of a random variable. Hence mutual information can be expressed by using algorithmic complexity as [344]

$$ J(\mathbf{y}) = \sum_i K(y_i) + \log |\det \mathbf{G}| \tag{23.14} $$

where $K(\cdot)$ is the per-symbol Kolmogoroff complexity, given by the number of bits needed to describe $y_i$. By using prior information about the signals, the coding costs can be explicitly approximated. For instance, if the signals are gaussian, independence becomes equivalent to uncorrelatedness. Then the Kolmogoroff complexity can be replaced by the per-symbol differential entropy, which in this case depends on second-order statistics only.
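As a reminder of why second-order statistics suffice in the gaussian case, the standard differential entropy formulas can be written out. These are textbook results, not taken from this chapter; the symbols $\sigma^2$, $\mathbf{C}$, and $\lambda_k$ (the eigenvalues of $\mathbf{C}$) are introduced here only for illustration.

```latex
% Differential entropy of a zero-mean gaussian scalar with variance \sigma^2
h(y) = \tfrac{1}{2} \log\left( 2 \pi e \, \sigma^2 \right)

% Differential entropy of a D-dimensional zero-mean gaussian vector
% with covariance matrix C and eigenvalues \lambda_1, \ldots, \lambda_D
h(\mathbf{y}) = \tfrac{1}{2} \log\left( (2 \pi e)^D \det \mathbf{C} \right)
              = \tfrac{1}{2} \sum_{k=1}^{D} \log\left( 2 \pi e \, \lambda_k \right)
```

Up to additive constants, the entropy is therefore determined by the logarithms of the variances along the principal directions, which is what makes a principal component coding of the signals natural here.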
For Rayleigh type fading transmission channels, the prior information can be formulated by considering that the mutually independent source signals $y_i(m)$ have zero-mean gaussian distributions. Suppose we want to estimate the channel coefficients of the transmission paths, by sending a given length constant $b_{1m} = 1$ symbol sequence to the desired user. We consider the signals $y_i(m)$, $i = 1, \ldots, 2L$, with $i$ representing the indexes of the $2L$ sources corresponding to the first user. Then $y_i(m)$ will actually represent the channel coefficients of all the first user's paths. Since we assume that the channel is Rayleigh fading, these signals are gaussian and time correlated. In this case, blind separation of the sources can be achieved by using only second-order statistics. In fact, we can express the Kolmogoroff complexity by coding these signals using principal component analysis [344].

23.3.2 Channel estimation *
Let $\mathbf{y}_i(m) = [y_i(m), \ldots, y_i(m - D + 1)]^T$ denote the vector consisting of the $D$ last samples of every such source signal $y_i(m)$, $i = 1, \ldots, 2L$. Here $D$ is the number of delayed terms, showing the range of time correlations taken into account when estimating the current symbol. The information contained in any of these sources can be approximated by the code length needed for representing the $D$ principal components, which have variances given by the eigenvalues of the temporal correlation matrix $\mathbf{C}_i = E\{\mathbf{y}_i(m)\mathbf{y}_i^T(m)\}$ [344]. Since we assume that the transmission paths are mutually independent, the overall entropy of the source is given by summing up the entropies of the principal components. Using the result that the entropy of a gaussian random variable is given by the logarithm of the variance, we get for the entropy of each source signal
is the separating matrix.
The separating matrix $\mathbf{W}$ can be estimated by using a gradient descent approach for minimizing the cost function (23.16), leading to the update rule [344]

$$ \Delta \mathbf{W} = -\mu \frac{\partial \log J(\mathbf{y})}{\partial \mathbf{W}} + \alpha \Delta \mathbf{W} \tag{23.17} $$

where $\mu$ is the learning rate and $\alpha$ is the momentum term [172] that can be introduced to avoid getting trapped into a local minimum corresponding to a secondary path. Let $\mathbf{w}_i^T$ denote the $i$th row vector of the separating matrix $\mathbf{W}$. Since only the correlation matrix $\mathbf{C}_i$ of the $i$th source depends on $\mathbf{w}_i$, we can express the gradient of the cost function by computing the partial derivatives
where $r_{kj}$ is the element $(k, j)$ of the observation matrix $\mathbf{R}$ defined earlier using formulas (23.4) and (23.9).
What is left to do now is to compute the gradient update part due to the mapping information. It can be written [344]

$$ \log |\det \mathbf{W}| = \sum_{i=1}^{C} \log \| (\mathbf{I} - \mathbf{P}_i) \mathbf{w}_i \| \tag{23.20} $$

where $\mathbf{P}_i = \mathbf{W}_i (\mathbf{W}_i^T \mathbf{W}_i)^{-1} \mathbf{W}_i^T$ is a projection matrix onto the subspace spanned by the column vectors of the matrix $\mathbf{W}_i = [\mathbf{w}_1, \ldots, \mathbf{w}_{i-1}]$. Now the cost function can be separated, and the different independent components can be found one by one, by taking into account the previously estimated components, contained in the subspace spanned by the columns of the matrix $\mathbf{W}_i$.
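The decomposition of $\log |\det \mathbf{W}|$ into projection residuals can be checked numerically. The sketch below is an illustration only: it uses a random square matrix, stores the vectors $\mathbf{w}_i$ as columns (which leaves the absolute determinant unchanged compared with storing them as rows), and the function name `log_abs_det_by_deflation` is hypothetical.

```python
import numpy as np

def log_abs_det_by_deflation(W):
    """Sum of log-norms of each column after removing its projection
    onto the span of the previous columns."""
    total = 0.0
    for i in range(W.shape[1]):
        w_i = W[:, i]
        if i == 0:
            residual = w_i                          # P_1 is the zero matrix
        else:
            W_i = W[:, :i]                          # previously treated vectors
            P_i = W_i @ np.linalg.solve(W_i.T @ W_i, W_i.T)
            residual = w_i - P_i @ w_i              # (I - P_i) w_i
        total += np.log(np.linalg.norm(residual))
    return total

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 5))
direct = np.linalg.slogdet(W)[1]                    # log |det W| computed directly
deflated = log_abs_det_by_deflation(W)
```

The two values agree up to rounding, since the residual norms are the diagonal magnitudes of a Gram-Schmidt (QR) factorization of the matrix. This is what allows the components to be handled one at a time.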
Since our principal interest lies in the transmission path having the largest power, corresponding usually to the desired user, it is sufficient to estimate the first such independent component. In this case, the projection matrix $\mathbf{P}_1$ becomes a zero matrix. Then the overall gradient (23.17) for the first row $\mathbf{w}_1^T$ of the separating matrix can be written

$$ \frac{\partial \log J(\mathbf{y})}{\partial \mathbf{w}_1^T} = \cdots \tag{23.21} $$
It suffices to consider the special case where only the two last samples are taken into account, so that the delay $D = 2$. First, second-order correlations are removed from the data $\mathbf{R}$ by whitening. This can be done easily in terms of standard principal component analysis as explained in Chapter 6. After whitening, the subsequent separating matrix will be orthogonal, and thus the second term in Eq. (23.16) disappears, yielding the cost function
In this case, the separating vectors $\mathbf{w}_i^T$ can be found by maximizing sequentially $E\{y_i(m) y_i(m-1) + y_i(m-1) y_i(m)\}$, which is the first-order correlation coefficient of $y_i$. It follows that the function to be maximized becomes

$$ J(\mathbf{w}) = \mathbf{w}^T E\{\mathbf{r}_m \mathbf{r}_{m-1}^T + \mathbf{r}_{m-1} \mathbf{r}_m^T\} \mathbf{w} \tag{23.24} $$

So the separating vector $\mathbf{w}_1^T$ corresponding to the most important path is given by the principal eigenvector of the matrix in Eq. (23.24). We have used a symmetric expression for the correlation coefficients in order to avoid asymmetry when the observed data set is finite. This usually improves the estimation accuracy. Finally, we separate the desired channel coefficients by computing the quantities
a 11
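The $D = 2$ procedure described above (whitening, then taking the principal eigenvector of the symmetrized lag-one correlation matrix) can be sketched as follows. This is a hedged illustration on synthetic real-valued data, not the simulation setup of the chapter: the function names, the mixing matrix, and the first-order autoregressive source standing in for a time-correlated fading process are all assumptions.

```python
import numpy as np

def whiten(R):
    """PCA whitening of a (channels x samples) data matrix."""
    R = R - R.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(R @ R.T / R.shape[1])
    V = eigvecs / np.sqrt(eigvals)           # scale each eigenvector column
    return V.T @ R

def most_correlated_component(R):
    """Component of the whitened data with maximal lag-1 correlation."""
    Z = whiten(R)
    lagged = Z[:, 1:] @ Z[:, :-1].T / (Z.shape[1] - 1)
    M = lagged + lagged.T                    # symmetric lag-1 correlation matrix
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    w = eigvecs[:, -1]                       # principal eigenvector
    return w @ Z

# Synthetic check: one time-correlated source mixed with two white sources.
rng = np.random.default_rng(1)
T = 2000
s = rng.standard_normal((3, T))
for t in range(1, T):                        # AR(1) process with coefficient 0.95
    s[0, t] = 0.95 * s[0, t - 1] + s[0, t]
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])              # hypothetical mixing matrix
y = most_correlated_component(A @ s)
corr = abs(np.corrcoef(y, s[0])[0, 1])       # close to 1 if separation worked
```

The estimate is determined only up to sign and scale, which is why the check uses the absolute correlation with the true source. Only the time-correlated source has a large lag-one correlation, so it dominates the principal eigenvector.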
23.3.3 Comparisons and discussion
We have compared the method described and derived above to a well-performing standard method used in multiuser detection, namely the minimum mean-square error estimator [452, 287]. In the MMSE method, the desired signal is estimated (up to a scaling) from the formula
a column of the matrix $\mathbf{G}$ defined in (23.10) corresponding to the desired user's bit $b_m$, that is, either $\mathbf{g}$