
Cross-layer Adaptive Transmission with Incomplete System State Information

Anh Tuan Hoang, Member, IEEE, and Mehul Motani, Member, IEEE

Abstract— We consider a point-to-point communication system in which data packets randomly arrive at a finite-length buffer and are subsequently transmitted to a receiver over a time-varying wireless channel. Data packets are subject to loss due to buffer overflow and transmission errors. We study the problem of adapting the transmit power and rate based on the buffer and channel conditions so that the system throughput is maximized, subject to an average transmit power constraint. Here, the system throughput is defined as the rate at which packets are successfully transmitted to the receiver. We consider this buffer/channel adaptive transmission when only incomplete system state information is available for making control decisions. Incomplete system state information includes delayed and/or imperfectly estimated channel gain and quantized buffer occupancy. We show that, when some delayed but error-free channel state information is available, optimal buffer/channel adaptive transmission policies can be obtained using Markov decision theory. When the channel state information is subject to errors and when the buffer occupancy is quantized, we discuss various buffer/channel adaptive heuristics that achieve good performance. In this paper, we also consider the tradeoff between packet loss due to buffer overflow and packet loss due to transmission errors. We show by simulation that exploiting this tradeoff leads to a significant gain in the system throughput.

Index Terms— Cross-layer design, adaptive transmission,

throughput maximization, partially observable Markov decision

processes.

In this paper, we study the problem of buffer and channel adaptive transmission in a point-to-point wireless communication scenario with the objective of maximizing the system throughput, subject to an average transmit power constraint. We term our adaptive transmission schemes cross-layer since transmission decisions at the physical layer take into account not only the channel condition but also the data arrival statistics and buffer occupancy, which are the parameters of higher network layers.

Our system model is depicted in Fig. 1. Time is divided into frames of equal length and, during each frame, data packets arrive at the transmitter buffer according to some known stochastic distribution. The buffer has a finite length and, when there is no space left, arriving packets are dropped.

Manuscript received November 08, 2006; revised May 31, 2007, and August 22, 2007.

A. T. Hoang is with the Department of Networking Protocols, Institute for Infocomm Research (I2R), 21 Heng Mui Keng Terrace, Singapore 119613. Previously, he was with the Department of Electrical and Computer Engineering, National University of Singapore. E-mail: athoang@i2r.a-star.edu.sg.

M. Motani is with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 119260. E-mail: motani@nus.edu.sg.

Data packets in the buffer are transmitted to a receiver over a discrete-time block-fading channel. The fading process is represented by a finite state Markov chain (FSMC) ([1], [2]). We define the system state during each time frame as the combination of the buffer occupancy and the channel state and assume that there is a signaling mechanism for the transmitter and receiver to exchange some system state information (SSI).

In our system model, data packets are subject to loss due to buffer overflow and transmission errors. We define the system throughput as the rate at which packets are successfully transmitted to the receiver. The control problem is to adapt the transmit power and rate according to some SSI so that the system throughput is maximized, subject to an average transmit power constraint. We are interested in scenarios in which only an incomplete observation of the instantaneous system state is available for making control decisions. Incomplete SSI includes delayed and/or imperfectly estimated channel state and quantized buffer occupancy. The case when control decisions can be made based on complete SSI is considered in our related work [3], where interesting structural properties of optimal adaptive transmission policies are studied.

In the context of adaptive transmission, our paper is related to the well-known works of Goldsmith in [4] and [5]. In these works, it is shown that when the channel state information (CSI) is available at both the transmitter and receiver, the optimal power allocation scheme that achieves the capacity of a time-varying wireless channel, subject to an average transmit power constraint, exhibits a water-filling structure over time. The insight is that the transmitter should transmit at a higher power and rate when the channel is good while reducing the transmit power in poorer channel conditions. However, data arrival statistics and buffer conditions are not of concern in [4] and [5].

In the context of cross-layer design, our paper is closely related to the works in [6]–[16], which consider similar problems of buffer/channel adaptive transmission. An early work of Collins and Cruz adapts transmit power and rate based on the queue length and channel condition in order to minimize the average transmit power, subject to an average delay constraint [6]. In [7], Berry and Gallager quantify the behavior of the power-delay tradeoff in the regime of asymptotically large delay. The same model is further studied in [8], [9], with some structural properties of the optimal policies identified. In [10], Rajan et al. consider a more generalized queueing model where packets can be dropped. They propose transmission policies that are near-optimal, in terms of minimizing packet loss subject to an average delay and an average power constraint. In [11], Karmokar et al. further extend the tradeoff to include average packet delay, average transmit power, and average packet dropping probability. They also propose a suboptimal policy that approximates the behavior of the optimal policies. In [12]–[16], the problem of cross-layer adaptive transmission is considered from a different angle, in which transmission is carried out given a fixed amount of energy and a limited amount of time. The authors adapt the transmit power and rate according to the amount of data remaining, the present time relative to the deadline, and the present channel state, in order to maximize the achievable throughput ([12]–[14]) or to maximize the probability of a data file being successfully transmitted ([15], [16]).

Fig. 1. System model (blocks: packets, buffer, transmitter, wireless channel, receiver, control signals). Data packets arrive at the buffer according to some stochastic distribution. The packets are then transmitted over a time-varying wireless channel. There are control signals for the transmitter and receiver to exchange buffer and channel state information.

We note that the works in [6]–[16] assume perfect knowledge of the instantaneous buffer occupancy and channel state. In [17], Karmokar et al. consider the problem of adapting the error control coding scheme based on some imperfect observations of traffic statistics and channel condition. In particular, the channel observations are in the form of NACK/ACK messages that are fed back from the receiver to the transmitter. Similar to our paper, the problem in [17] is formulated as a partially observable Markov decision process (POMDP). Even though the problem setup in [17] differs from that of our paper in several points, the authors come to a similar conclusion that, given partial observations, a heuristic called QMDP ([18]) achieves good performance.

An important contribution that differentiates our work from [6]–[16] is that we exploit the tradeoff between packet loss due to buffer overflow and packet loss due to transmission errors. Our results show that, by balancing these sources of packet loss, significant gain in the system throughput can be achieved. From the implementation point of view, when imperfect channel state information is considered, it is not possible to calculate transmit power to guarantee a target packet error rate. We note that the problem formulation in [10] and [16] allows for optimizing over both packet losses due to transmission failure and buffer overflow. However, their assumptions result in no packet losses due to transmission errors. Specifically, their policies never transmit above the Shannon capacity and they assume no transmission errors at rates below capacity. In their recent works ([19], [20]), Liu et al. do take into account both packet losses due to transmission errors and buffer overflows. Their definition of system throughput is also similar to ours. However, the policies considered in [19], [20] adapt to the channel state information only, not to the buffer and data arrival statistics.

The main contributions of this paper can be summarized as follows.

• We present tractable models of buffer/channel adaptive transmission given imperfect SSI.

• We exploit the tradeoff between packet loss due to buffer overflow and packet loss due to transmission errors. This tradeoff results in a performance gain in the overall system throughput.

• We show how buffer and channel adaptive transmission can be carried out given incomplete SSI. In particular, we show that optimal adaptive policies can be obtained for the cases when some delayed but error-free channel state information is available. When the channel state information is subject to errors and when the buffer occupancy is quantized, we present various buffer/channel adaptive heuristics that achieve good performance.

The rest of this paper is organized as follows. In Section II, we present our system model and discuss the approach that can be used to obtain optimal adaptive transmission policies when the transmit power and rate can be chosen based on perfect knowledge of the instantaneous system state. Next, in Section III, we discuss the situations in which the transmitter and receiver only have partial information about the current buffer and channel states. In Section IV, we show that optimal control policies can be obtained when some delayed but error-free channel states are available for making decisions. When this is not possible, we propose various heuristics to obtain policies with good performance in Section V. Numerical results and discussion are given in Section VI. Finally, we conclude the paper in Section VII.

II. THROUGHPUT MAXIMIZATION PROBLEM

A. System Model

The system model considered in this paper is depicted in Fig. 1. Time is divided into frames of equal length of $T_f$ seconds. During frame $i$, $A_i$ packets arrive at the transmitter buffer. We assume that $A_i$ is independent and identically distributed (i.i.d.) over time and follows a stationary distribution $p_A(a)$. Each data packet contains $L$ bits, the buffer can store up to $B$ packets and, when the buffer is full, all arriving packets are dropped. We further assume that arriving packets are only added to the buffer at the end of each time frame.

We consider a discrete-time block-fading channel with additive white Gaussian noise (AWGN). The fading process is represented by a stationary and ergodic $K$-state Markov chain, with the channel states numbered from $0$ to $K-1$. The power gain of channel state $g$, $g \in \{0, \ldots, K-1\}$, is denoted by $\gamma_g$. During each time frame, we assume that the channel remains in a single state. Between two consecutive frames, the probability of transitioning from channel state $g$ to channel state $g'$ is denoted by $P_G(g, g')$. The stationary probability of channel state $g$ is denoted by $p_G(g)$.

In general, a finite state Markov channel (FSMC) model is suitable for modeling a slowly varying flat-fading channel [1], [21]–[23]. An FSMC is constructed for a particular fading distribution, e.g., log-normal shadowing or Rayleigh fading, by first partitioning the range of the fading gain into a finite number of sections. Then each section of the gain value corresponds to a state in the Markov chain. Given knowledge of the fading process, the stationary distribution $p_G(g)$ as well as the channel state transition probabilities $P_G(g, g')$ can be derived. For more details, the reader is referred to [1], [21]–[23].
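As a rough illustration of how $p_G(g)$ follows from $P_G(g, g')$, the sketch below finds the stationary distribution of a small chain by repeatedly applying $p \leftarrow pP$. The function name and the 3-state transition matrix are hypothetical and are not taken from the paper or from Table I; the sketch also assumes the chain is ergodic, as the channel model requires.

```python
def stationary_distribution(P, tol=1e-12, max_iters=100_000):
    """Approximate the row vector p satisfying p = p P by power iteration.

    P[g][h] is the probability of moving from state g to state h.
    Assumes an ergodic chain, so the iteration converges.
    """
    K = len(P)
    p = [1.0 / K] * K  # start from the uniform distribution
    for _ in range(max_iters):
        nxt = [sum(p[g] * P[g][h] for g in range(K)) for h in range(K)]
        if max(abs(x - y) for x, y in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical 3-state chain that only moves between adjacent states,
# which is the usual structure of a slowly varying FSMC.
P_G = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]
p_G = stationary_distribution(P_G)
print([round(x, 4) for x in p_G])
```

For this particular matrix the balance equations give $(0.6, 0.3, 0.1)$, so the iteration should agree with that to within the tolerance.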

Let $B_i$ denote the number of packets in the buffer at the beginning of frame $i$ and $G_i$ denote the channel state throughout frame $i$. The system state at frame $i$ is $S_i \triangleq (B_i, G_i)$. For time frame $i$, let $P_i$ (Watts) and $U_i$ (packets/frame) denote the transmit power and rate, respectively. We have $0 \le U_i \le B_i$ and $P_i \in \mathcal{P}$, where $\mathcal{P}$ is the set of all power levels at which the transmitter can operate.

B. Buffer and Channel Adaptive Transmission

Given a particular system state $(b, g)$, where $b$ is the buffer occupancy and $g$ is the channel state ($0 \le b \le B$, $0 \le g < K$), each chosen pair of transmission rate and power $(u, P)$ results in some expected number of packets lost due to buffer overflow and transmission errors. We characterize these losses by two functions: $L_o(b, u)$ is the expected number of packets lost due to buffer overflow and $L_e(g, u, P)$ is the expected number of packets discarded due to transmission error. Note that in this paper, we do not consider retransmission of erroneous packets.

For our system model, when the data arrival process is fixed, maximizing the system throughput is equivalent to minimizing total packet loss due to buffer overflow and transmission errors. This is achieved by varying the transmission rate and power $(U_i, P_i)$ according to some knowledge of $S_i$. Note that there are various ways for the transmitter to change its transmission rate $U_i$. It can be done by changing the channel coding scheme [24], i.e., by encoding data bits in the buffer using different code rates while keeping the transmission rate for the coded bits fixed. $U_i$ can also be varied by keeping the symbol rate fixed and changing the signal constellation size of a modulator [5], [8], [25]. In existing communication standards such as IEEE 802.11 and IEEE 802.16, different transmission rates are achieved by combinations of different coding and modulation schemes.

C. Buffer Overflow and Transmission Error Tradeoff

At this point, let us point out an interesting tradeoff between the two sources of packet loss, i.e., buffer overflow and transmission errors. Consider a particular system state $(b, g)$ and a fixed transmit power $P$. If we increase the transmission rate $u$, the amount of buffer overflow is reduced. However, increasing $u$ when $P$ is fixed results in a greater number of packet transmission errors. The reverse is also true: for fixed $P$, the amount of packet transmission errors can be reduced by lowering the transmission rate $u$, but that will be at the cost of increasing the buffer overflow rate. This argument highlights the need to find a good tradeoff between packet transmission errors and buffer overflow when choosing transmit power and rate. In this paper, our control decision strives for an optimal tradeoff between these two sources of packet loss.
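The tradeoff can be made concrete with a small numerical sketch. The overflow term below assumes that a Poisson number of arrivals joins the buffer at the end of the frame and that anything beyond capacity $B$ is dropped, which matches the system model of Section II-A. The error term `toy_expected_errors` is a made-up stand-in, not the paper's $L_e(g, u, P)$; it is chosen only because it grows with $u$ and shrinks with gain times power. All names and parameter values are illustrative.

```python
import math


def poisson_pmf(a, mean):
    """Probability of exactly a arrivals in one frame."""
    return math.exp(-mean) * mean ** a / math.factorial(a)


def expected_overflow(b, u, B, mean_arrivals, a_max=60):
    """Assumed L_o(b, u): expected packets dropped when b - u + a exceeds B."""
    return sum(
        poisson_pmf(a, mean_arrivals) * max(0, b - u + a - B)
        for a in range(a_max + 1)
    )


def toy_expected_errors(u, gain, power, c=0.5):
    """Hypothetical L_e: each of u packets fails with prob. exp(-gain*power/(c*u))."""
    if u == 0:
        return 0.0
    return u * math.exp(-gain * power / (c * u))


B, b, mean_arrivals = 15, 14, 3.0   # nearly full buffer
gain, power = 0.8, 2.0              # fixed channel state and transmit power
rates = list(range(0, b + 1, 2))
overflow = [expected_overflow(b, u, B, mean_arrivals) for u in rates]
errors = [toy_expected_errors(u, gain, power) for u in rates]

for u, lo, le in zip(rates, overflow, errors):
    print(f"u={u:2d}  overflow={lo:.4f}  errors={le:.4f}  total={lo + le:.4f}")
```

Under these assumptions the overflow column falls and the error column rises as $u$ increases, so the total loss is minimized at some intermediate rate rather than at either extreme.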

D. Throughput Maximization with Complete SSI

Before considering buffer/channel adaptive transmission with incomplete SSI, let us briefly discuss how optimal buffer/channel adaptive transmission policies can be obtained for the case of complete SSI. With complete SSI, the throughput maximization problem can be reformulated as the problem of minimizing the weighted sum of the long-term packet loss rate and the average transmission power. In particular, consider the following problem of selecting transmission rate and power $(U_i, P_i)$:

$$
\arg\min_{U_i, P_i} \left\{ \limsup_{T \to \infty} \frac{1}{T} E\left\{ \sum_{i=0}^{T-1} C(B_i, G_i, U_i, P_i) \right\} \right\}, \tag{1}
$$

where

$$
C(b, g, u, P) = P + \beta \left( L_o(b, u) + L_e(g, u, P) \right). \tag{2}
$$

Here $\beta$ is a positive weighting factor that gives the priority of reducing packet loss over conserving power. When $\beta$ is increased, we tend to transmit at a higher rate in order to lower the packet loss rate at the expense of using higher transmit power. On the other hand, for smaller values of $\beta$, the average transmission power will be reduced at the cost of increasing the packet loss rate. If $P_\beta$ and $L_\beta$ are the average power and packet loss rate (due to buffer overflow and transmission errors) obtained when solving (1) for a particular value of $\beta$, then $L_\beta$ is also the minimum achievable loss rate given a power constraint of $P_\beta$.

For our system model in which the channel state $G_i$ evolves according to a stationary, ergodic Markov process, the optimization problem in (1) can be classified as an infinite-horizon, average-cost Markov decision process [26]. For such a problem, given complete SSI, there exists a stationary control policy that is optimal. Let $\pi$ be a stationary policy which maps system states into transmission rate and power for each frame $i$, i.e., $\pi(B_i, G_i) \triangleq (U_i, P_i)$. Defining

$$
J_{avr}(\pi) = \limsup_{T \to \infty} \frac{1}{T} E\left\{ \sum_{i=0}^{T-1} C(B_i, G_i, U_i, P_i) \,\Big|\, \pi \right\}, \tag{3}
$$

the optimization problem in (1) becomes

$$
\pi^* = \arg\min_{\pi} J_{avr}(\pi). \tag{4}
$$

The above infinite-horizon, average-cost Markov decision process (MDP) can be solved effectively using dynamic programming techniques such as policy iteration and value iteration [26, Chapter 6].

It is also useful to consider the discounted cost of using policy $\pi$ with initial system state $(b, g)$, i.e.,

$$
J_\alpha(b, g, \pi) = \lim_{T \to \infty} E\left\{ \sum_{i=0}^{T-1} \alpha^i C(B_i, G_i, U_i, P_i) \,\Big|\, B_0 = b, G_0 = g, \pi \right\}, \tag{5}
$$

where $0 < \alpha < 1$ is the discounting factor. As the immediate cost function $C(b, g, u, P)$ is bounded, the limit in (5) always exists. Correspondingly, we have the problem of finding a control policy that minimizes the discounted cost, i.e.,

$$
\pi^*_\alpha = \arg\min_{\pi} J_\alpha(b, g, \pi). \tag{6}
$$

It can be shown that $\pi^*_\alpha$ converges to $\pi^*$, which is the solution of (4), as $\alpha \to 1$ ([26, Chapter 6]). Moreover, letting $J^*_\alpha(b, g)$ be the minimum discounted cost when starting with initial state $(b, g)$, the solution of the discounted cost problem satisfies the simple Bellman equation ([26, Chapter 6]):

$$
J^*_\alpha(b, g) = \min_{(u, P)} \left\{ C(b, g, u, P) + \alpha \sum_{g'=0}^{K-1} \sum_{a=0}^{\infty} P_G(g, g')\, p_A(a)\, J^*_\alpha\big(\min\{b - u + a, B\}, g'\big) \right\}. \tag{7}
$$

The physical interpretation of (7) is that, for the discounted cost problem, at each stage of control, the optimal control action should minimize the sum of the immediate cost $C(\cdot)$ and the $\alpha$-weighted future cost, provided that in the subsequent future stages, optimal control actions are selected. This elegant Bellman equation is useful for analyzing the structural properties of optimal control policies. It is also the inspiration behind the effective QMDP heuristic ([18]) when only incomplete system state information is available for making control decisions. This is discussed in Section V-B.3.
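The Bellman equation (7) can be iterated directly on a toy instance, which gives a feel for how a discounted-cost policy is obtained. The following is a minimal value-iteration sketch, not the authors' implementation. It assumes a finite arrival distribution `p_A`, an overflow loss equal to the expected excess over capacity, and a hypothetical error-loss function `toy_error_loss`; every parameter value is invented for illustration.

```python
import math


def toy_error_loss(gain, u, power, c=0.5):
    """Hypothetical L_e: each of u packets fails with prob. exp(-gain*power/(c*u))."""
    if u == 0:
        return 0.0
    return u * math.exp(-gain * power / (c * u))


def value_iteration(B, gains, P_G, p_A, powers, beta, alpha, tol=1e-9):
    """Iterate the Bellman equation (7) until the values stop changing."""
    K = len(gains)

    def overflow(b, u):
        # Assumed L_o: arrivals join at the end of the frame, excess is dropped.
        return sum(pa * max(0, b - u + a - B) for a, pa in enumerate(p_A))

    J = [[0.0] * K for _ in range(B + 1)]
    while True:
        J_new = [[0.0] * K for _ in range(B + 1)]
        policy = [[None] * K for _ in range(B + 1)]
        for b in range(B + 1):
            for g in range(K):
                best = math.inf
                for u in range(b + 1):          # cannot send more than is queued
                    for P in powers:
                        cost = P + beta * (overflow(b, u)
                                           + toy_error_loss(gains[g], u, P))
                        future = sum(
                            P_G[g][g2] * pa * J[min(b - u + a, B)][g2]
                            for g2 in range(K)
                            for a, pa in enumerate(p_A)
                        )
                        q = cost + alpha * future
                        if q < best:
                            best, policy[b][g] = q, (u, P)
                J_new[b][g] = best
        change = max(abs(J_new[b][g] - J[b][g])
                     for b in range(B + 1) for g in range(K))
        J = J_new
        if change < tol:
            return J, policy


J_star, pi_star = value_iteration(
    B=4,
    gains=[0.2, 1.0],                 # bad and good channel state
    P_G=[[0.8, 0.2], [0.3, 0.7]],
    p_A=[0.3, 0.4, 0.3],              # 0, 1, or 2 arrivals per frame
    powers=[0.0, 1.0, 2.0],
    beta=5.0,
    alpha=0.9,
)
for b, row in enumerate(pi_star):
    print(b, row)
```

Because the update is a contraction with factor $\alpha$, the loop terminates for any $0 < \alpha < 1$. The returned table `pi_star[b][g]` plays the role of the fully observed policy that the later heuristics take as their starting point.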

III. INCOMPLETE SYSTEM STATE INFORMATION

Let us now consider the cases when only imperfect knowledge of the instantaneous system state is available for making control decisions. In these cases, the transmit power and rate are adapted based on a partially observed system state, which includes quantized buffer occupancy and delayed and/or imperfectly estimated channel state.

A. Quantized Buffer State Information

Although the transmitter usually knows the exact buffer occupancy, we may not want to adapt the transmission parameters to this exact value. Firstly, the buffer occupancy can change frequently; therefore, adapting to its exact value may require a significant amount of signaling from the transmitter to the receiver. Secondly, apart from the signaling issue, we may want to quantize the buffer occupancy in order to reduce the complexity in obtaining and implementing the buffer/channel adaptive policies. Given that the buffer capacity is $B$ and the number of channel states is $K$, using the exact buffer occupancy results in a total of $(B + 1)K$ system states. When $B$ and $K$ are large, by quantizing the buffer occupancy using a small number of levels, we can significantly reduce the number of system states and consequently reduce the complexity of obtaining and implementing the adaptive transmission policies.

We can quantize the buffer occupancy using a small number of thresholds and only update the transmit power and rate when there is a threshold crossing. In this paper, the buffer occupancy is quantized using $M + 1$ thresholds, i.e., $0 = b_0 < b_1 < \cdots < b_M = B + 1$. The buffer is said to be in state $k$, $0 \le k < M$, if the number of packets currently queueing, $b$, satisfies $b_k \le b < b_{k+1}$. Denoting the quantized buffer occupancy at time $i$ by $\bar{B}_i$ (the overbar distinguishes it from the exact occupancy $B_i$), we have

$$
\bar{B}_i = b_k, \quad \text{where } k \text{ satisfies } b_k \le B_i < b_{k+1}. \tag{8}
$$
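The quantization rule in (8) amounts to a sorted-list lookup. A minimal sketch follows, assuming the thresholds are supplied as an increasing list `[b_0, ..., b_M]` with `b_0 = 0` and `b_M = B + 1`; the function name and the example thresholds are hypothetical.

```python
from bisect import bisect_right


def quantize_buffer(b, thresholds):
    """Return (k, b_k) such that b_k <= b < b_{k+1}, as in (8).

    thresholds is the increasing list [b_0, b_1, ..., b_M].
    """
    if not thresholds[0] <= b < thresholds[-1]:
        raise ValueError("buffer occupancy outside the threshold range")
    k = bisect_right(thresholds, b) - 1
    return k, thresholds[k]


# Hypothetical thresholds for a buffer of B = 15 packets and M = 4 levels.
thresholds = [0, 4, 8, 12, 16]
for b in (0, 3, 4, 7, 15):
    print(b, quantize_buffer(b, thresholds))
```

With these thresholds the policy only needs to react when the occupancy crosses 4, 8, or 12 packets, so the number of buffer states drops from 16 to 4.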

B. Delayed Imperfect Channel Estimates

We assume that the channel gain is first estimated at the receiver, then quantized into one of the possible values $\{\gamma_0, \gamma_1, \ldots, \gamma_{K-1}\}$, and finally the estimated channel index is fed back to the transmitter. This process introduces both delay and errors in the transmitter's knowledge of the channel state.

If we take into account the effects of both delay and errors, then at time $i$, what is available at the transmitter is a sequence of delayed imperfect estimates of the channel states up to time $i - m$, i.e., $\{\hat{G}_0, \ldots, \hat{G}_{i-m}\}$, $i \ge m \ge 0$. Note that $mT_f$ is the total estimation and feedback delay. We account for the fact that $\hat{G}_i$ can be erroneous by the following function:

$$
P_E(g, \hat{g}) = \Pr(\hat{G}_i = \hat{g} \mid G_i = g), \tag{9}
$$

which gives the probability of estimating channel state $g$ as channel state $\hat{g}$ (an estimation error whenever $\hat{g} \ne g$). Note that $P_E(g, \hat{g})$ depends on the specific channel estimation technique employed at the receiver.

In this paper, we assume that the channel estimation error does not depend on the chosen transmission parameters and is i.i.d. over time. We also assume that $P_E(g, \hat{g})$ is known at the transmitter for all pairs $(g, \hat{g})$.

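As an example, let us assume that if the actual channel state is $g$, then the estimated channel gain prior to quantization is of the form:

$$
\hat{\gamma} = \gamma_g + v, \tag{10}
$$

where $v$ is a Gaussian random variable with zero mean and variance $\sigma^2$. Quantizing $\hat{\gamma}$ to the closest value in the set $\{\gamma_0, \gamma_1, \ldots, \gamma_{K-1}\}$ to obtain the estimated channel index $\hat{g}$, we have:

$$
P_E(g, \hat{g}) = \frac{1}{2} \left[ \operatorname{erf}\!\left( \frac{\gamma_{\hat{g}} + \gamma_{\hat{g}+1} - 2\gamma_g}{2\sqrt{2}\sigma} \right) - \operatorname{erf}\!\left( \frac{\gamma_{\hat{g}} + \gamma_{\hat{g}-1} - 2\gamma_g}{2\sqrt{2}\sigma} \right) \right], \quad 0 < \hat{g} < K - 1, \tag{11}
$$

and

$$
P_E(g, 0) = \frac{1}{2} \left[ 1 + \operatorname{erf}\!\left( \frac{\gamma_0 + \gamma_1 - 2\gamma_g}{2\sqrt{2}\sigma} \right) \right], \tag{12}
$$

$$
P_E(g, K - 1) = \frac{1}{2} \left[ 1 - \operatorname{erf}\!\left( \frac{\gamma_{K-2} + \gamma_{K-1} - 2\gamma_g}{2\sqrt{2}\sigma} \right) \right], \tag{13}
$$

where $\operatorname{erf}(\cdot)$ is the standard error function.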
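Equations (11) to (13) can be evaluated with the standard library's `math.erf`. The sketch below builds the full $K \times K$ matrix $P_E$ for a hypothetical set of gains; it assumes the gains are listed in increasing order, so that the decision boundaries are the midpoints between neighboring gains. The function name and the numerical values are illustrative only.

```python
import math


def estimation_error_matrix(gains, sigma):
    """PE[g][g_hat] = Pr(estimate is g_hat | true state is g), from (11)-(13).

    gains must be sorted in increasing order; sigma is the std. dev. of the
    Gaussian estimation noise v.
    """
    K = len(gains)
    scale = 2.0 * math.sqrt(2.0) * sigma
    PE = [[0.0] * K for _ in range(K)]
    for g in range(K):
        for gh in range(K):
            # Upper boundary term; the top state extends to +infinity.
            upper = 1.0 if gh == K - 1 else math.erf(
                (gains[gh] + gains[gh + 1] - 2.0 * gains[g]) / scale)
            # Lower boundary term; the bottom state extends to -infinity.
            lower = -1.0 if gh == 0 else math.erf(
                (gains[gh] + gains[gh - 1] - 2.0 * gains[g]) / scale)
            PE[g][gh] = 0.5 * (upper - lower)
    return PE


# Hypothetical gains and noise level (not the values of Table I).
gains = [0.1, 0.5, 1.0, 2.0]
P_E = estimation_error_matrix(gains, sigma=0.1)
for row in P_E:
    print([round(x, 4) for x in row])
```

Each row telescopes to one, since the upper term of one estimate is the lower term of the next, which is a convenient sanity check on any implementation of these formulas.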

IV. OPTIMAL ADAPTIVE TRANSMISSION POLICIES GIVEN

In this section, we consider a special case in which the channel information for choosing the transmit power and rate at time frame $i$ is of the form $\{G_0, \ldots, G_{i-m-n}, \hat{G}_{i-m-n+1}, \ldots, \hat{G}_{i-m}\}$, $i \ge m + n$, $m \ge 0$, $n \ge 0$. This means that, at time $i$, in addition to the imperfect channel estimates $\{\hat{G}_{i-m-n+1}, \ldots, \hat{G}_{i-m}\}$, the transmitter knows all the exact channel states up to time $i - m - n$. This assumption can be justified by the fact that the accuracy of the channel estimation process may be improved if the receiver is given extra time and information to do processing [5]. For example, when a certain estimation delay is permitted, the receiver can interpolate between past and future estimates to obtain more accurate predictions. Therefore, our assumption corresponds to the case when the delay $(m + n)T_f$ is long enough so that the receiver can obtain a near perfect channel estimate.

Due to the Markov property of the channel model, it is enough to only maintain a truncated sequence of the channel observation history, which can be represented by the following channel observation vector:

$$
H_i = (G_{i-m-n}, \hat{G}_{i-m-n+1}, \ldots, \hat{G}_{i-m}). \tag{14}
$$

As there are $K$ possible channel states, the number of all possible channel observation vectors $H_i$ is $K^{n+1}$. The important point to note is that even though the channel state information is incomplete, the number of possible values for $H_i$ is still finite. This allows the problem of minimizing a weighted sum of the long-term packet loss rate and average transmit power to be formulated as a finite-state MDP, with the actual channel state $G_i$ being replaced by the channel observation vector $H_i$. In order to fully specify the MDP, we need to derive the dynamics of $H_i$, together with the cost functions associated with choosing transmission rate and power $(u, P)$ in state $(B_i, H_i)$.

A. When $H_i = (G_{i-1}, \hat{G}_i)$

To simplify the derivations, we consider the case when $H_i = (G_{i-1}, \hat{G}_i)$. Physically, this means that at time $i$, the transmitter knows the exact previous channel state $G_{i-1}$ and has an estimate of the current channel state $\hat{G}_i$. This corresponds to setting $m = 0$ and $n = 1$ in (14). We note that the subsequent derivations can be extended for general values of $m$ and $n$.

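At time $i$, given the channel observation vector $H_i = (G_{i-1}, \hat{G}_i)$, we can derive the conditional probability distribution of the channel state $G_i$, writing $\tilde{g}$ for a candidate value of $G_i$, as:

$$
\begin{aligned}
\rho_G(\tilde{g}, g, \hat{g}) &\triangleq \Pr\big(G_i = \tilde{g} \mid H_i = (g, \hat{g})\big) \\
&= \Pr(G_i = \tilde{g} \mid G_{i-1} = g, \hat{G}_i = \hat{g}) \\
&= \frac{\Pr(G_i = \tilde{g}, G_{i-1} = g, \hat{G}_i = \hat{g})}{\Pr(G_{i-1} = g, \hat{G}_i = \hat{g})} \\
&= \frac{\Pr(G_i = \tilde{g}, \hat{G}_i = \hat{g} \mid G_{i-1} = g)\Pr(G_{i-1} = g)}{\Pr(\hat{G}_i = \hat{g} \mid G_{i-1} = g)\Pr(G_{i-1} = g)} \\
&= \frac{\Pr(G_i = \tilde{g}, \hat{G}_i = \hat{g} \mid G_{i-1} = g)}{\Pr(\hat{G}_i = \hat{g} \mid G_{i-1} = g)} \\
&= \frac{P_G(g, \tilde{g})\, P_E(\tilde{g}, \hat{g})}{\sum_{g'=0}^{K-1} P_G(g, g')\, P_E(g', \hat{g})}.
\end{aligned} \tag{15}
$$

Based on (15), the dynamics of $H_i$ can be written as:

$$
\begin{aligned}
P_H(g, \hat{g}, g', \hat{g}') &\triangleq \Pr\big(H_{i+1} = (g', \hat{g}') \mid H_i = (g, \hat{g})\big) \\
&= \Pr\big(G_i = g', \hat{G}_{i+1} = \hat{g}' \mid H_i = (g, \hat{g})\big) \\
&= \Pr\big(G_i = g' \mid H_i = (g, \hat{g})\big) \Pr(\hat{G}_{i+1} = \hat{g}' \mid G_i = g') \\
&= \rho_G(g', g, \hat{g}) \times \sum_{k=0}^{K-1} P_G(g', k)\, P_E(k, \hat{g}').
\end{aligned} \tag{16}
$$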

At time $i$, given that the buffer occupancy is $B_i = b$ and the channel observation vector is $H_i = (g, \hat{g})$, if the transmission rate and power are set to $u$ and $P$ respectively, the average number of packets lost due to buffer overflow is still given by $L_o(b, u)$, while the expected number of packets lost due to transmission error is

$$
L^H_e(g, \hat{g}, u, P) = \sum_{\tilde{g}=0}^{K-1} \rho_G(\tilde{g}, g, \hat{g})\, L_e(\tilde{g}, u, P). \tag{17}
$$

Knowing the dynamics of $H_i$ together with the cost of a transmission action in each state $(B_i, H_i)$, an MDP can be readily formulated, i.e., similar to that given in Section II-D, to minimize the weighted sum of the long-term packet loss rate and average transmit power.
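The quantities in (15) and (16) are small matrix computations once $P_G$ and $P_E$ are known. The sketch below computes the posterior over the current channel state and the transition probability of the observation vector for a hypothetical two-state channel; the function names and the matrices are assumptions made for illustration.

```python
def posterior_current_state(g_prev, g_hat, P_G, P_E):
    """Posterior over G_i given G_{i-1} = g_prev and estimate g_hat, as in (15)."""
    K = len(P_G)
    joint = [P_G[g_prev][g] * P_E[g][g_hat] for g in range(K)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observation has zero probability")
    return [x / total for x in joint]


def observation_transition(g_prev, g_hat, g_next, g_hat_next, P_G, P_E):
    """P_H of (16): Pr(H_{i+1} = (g_next, g_hat_next) | H_i = (g_prev, g_hat))."""
    K = len(P_G)
    rho = posterior_current_state(g_prev, g_hat, P_G, P_E)
    next_estimate = sum(P_G[g_next][k] * P_E[k][g_hat_next] for k in range(K))
    return rho[g_next] * next_estimate


# Hypothetical two-state channel and estimator.
P_G = [[0.9, 0.1], [0.2, 0.8]]
P_E = [[0.8, 0.2], [0.3, 0.7]]

rho = posterior_current_state(g_prev=0, g_hat=1, P_G=P_G, P_E=P_E)
print(rho)

row_sum = sum(
    observation_transition(0, 1, g2, gh2, P_G, P_E)
    for g2 in range(2) for gh2 in range(2)
)
print(row_sum)
```

In this example the previous state was 0 and the estimate says 1, yet the posterior still favors state 0, because the chain rarely leaves state 0 and the estimator is only moderately reliable. The error loss in (17) is then the posterior-weighted average of $L_e$ over the candidate states.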

B. When $H_i = G_{i-m}$

In the special case when $H_i = G_{i-m}$, i.e., the transmission decisions at time $i$ can be made based on perfect knowledge of the channel state at time $i - m$, the number of possible values for $H_i$ is $K$. As a result, the size of the newly formed MDP is the same as the size of the MDP for the case of complete channel state information.

Now, we consider the situation when no delayed error-free channel estimate is available for choosing transmit power and rate. At time $i$, the transmitter knows a sequence of imperfect channel estimates which can be represented by the following channel observation vector:

$$
I_i = (\hat{G}_0, \ldots, \hat{G}_{i-m}). \tag{18}
$$

A. Optimal Control Policy Given Delayed Imperfect Channel Estimates With i.i.d. Channel Model

In the special case when the channel states are i.i.d. over time, there is no extra information gained by keeping estimates of past channel states. Suppose that during frame $i$, the transmitter knows the estimate $\hat{G}_i$ of the current channel state. The channel observation vector $I_i$ in (18) then simplifies to

$$
I_i = \hat{G}_i. \tag{19}
$$

The dynamics of $I_i$ can be derived as:

$$
P_I(\hat{g}, \hat{g}') \triangleq \Pr(I_{i+1} = \hat{g}' \mid I_i = \hat{g}) = \Pr(\hat{G}_{i+1} = \hat{g}' \mid \hat{G}_i = \hat{g}) = \sum_{g=0}^{K-1} P_E(g, \hat{g}')\, p_G(g). \tag{20}
$$

Also, during time frame $i$, given that the channel estimate is $I_i = \hat{g}$, we can derive the probability distribution of the current channel state as

$$
\phi_G(g, \hat{g}) \triangleq \Pr(G_i = g \mid I_i = \hat{g}) = \Pr(G_i = g \mid \hat{G}_i = \hat{g}) = \frac{P_E(g, \hat{g})\, p_G(g)}{\sum_{g'=0}^{K-1} P_E(g', \hat{g})\, p_G(g')}. \tag{21}
$$


At time $i$, given that the buffer occupancy is $B_i = b$ and the channel observation is $I_i = \hat{g}$, if the transmission rate and power are set to $u$ and $P$ respectively, the average number of packets lost due to buffer overflow is still given by $L_o(b, u)$, while the expected number of packets lost due to transmission error is

$$
L^I_e(\hat{g}, u, P) = \sum_{g=0}^{K-1} \phi_G(g, \hat{g})\, L_e(g, u, P). \tag{22}
$$

Note that the number of possible values for $I_i$ is $K$. Knowing the dynamics of $I_i$ together with the cost of a transmission action in each state $(B_i, I_i)$, an MDP can be readily formulated, i.e., similar to that given in Section II-D, to minimize the weighted sum of the long-term packet loss rate and average transmit power.

B. Suboptimal Control Policies Given Imperfect Channel Estimates

Now let us consider the case when the channel states are correlated over time and at time $i$, the transmitter knows only a sequence of delayed imperfect channel estimates $I_i = (\hat{G}_0, \ldots, \hat{G}_{i-m})$. To simplify the notation, we further assume that $m = 0$; when $m > 0$, the analysis is similar.

The control problem in this situation can be modeled as a partially observable Markov decision process (POMDP). For a POMDP in which the system states are correlated over time, in order to make an optimal control decision, the controller needs to keep track of the entire observation history. That means, for our control problem, the transmitter needs to record the entire channel estimation history, i.e., $I_i$, in order to select optimal transmit power and rate. Instead of remembering the entire observation history, the controller in a POMDP can keep track of the so-called belief state, which is the probability distribution of the system state, conditioned on the observation history. For our particular problem, we can define $\Psi_i$ as the belief channel state at time $i$, i.e.,

$$
\Psi_i(g) = \Pr(G_i = g \mid \Psi_0, \hat{G}_0, \ldots, \hat{G}_i), \tag{23}
$$

where the initial probability distribution $\Psi_0$ is assumed known. In case $\Psi_0$ is not given, it can be set to $\Psi_0(g) = p_G(g)$, i.e., the stationary distribution of the channel states. The advantage of keeping a belief state for every time frame is that it contains all relevant information for making control actions [26]. Furthermore, in the next time frame, given a new channel estimate $\hat{G}_{i+1} = \hat{g}$, the new belief state can be readily derived from

$$
\begin{aligned}
\Psi_{i+1}(g) &= \Pr(G_{i+1} = g \mid \Psi_0, \hat{G}_0, \ldots, \hat{G}_i, \hat{G}_{i+1} = \hat{g}) \\
&= \Pr(G_{i+1} = g \mid \Psi_i, \hat{G}_{i+1} = \hat{g}) \\
&= \frac{\Pr(G_{i+1} = g, \hat{G}_{i+1} = \hat{g} \mid \Psi_i)}{\Pr(\hat{G}_{i+1} = \hat{g} \mid \Psi_i)} \\
&= \frac{P_E(g, \hat{g}) \sum_{g'=0}^{K-1} \Psi_i(g')\, P_G(g', g)}{\sum_{g'=0}^{K-1} P_E(g', \hat{g}) \sum_{g''=0}^{K-1} \Psi_i(g'')\, P_G(g'', g')}.
\end{aligned} \tag{24}
$$

Unfortunately, the numbers of possible channel observation vectors $I_i$ and possible belief channel states $\Psi_i$ are infinite. Because of this, it is essentially impossible to obtain an optimal adaptive policy based on either $I_i$ or $\Psi_i$, as doing so may require infinite time and memory. Therefore, instead of aiming for an optimal control policy, let us look at some approaches that can be used to approximate it. All of these approximations start with the assumption that we have already obtained the MDP policy $\pi^*$, i.e., an optimal policy when the system state is fully observable.
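The belief recursion in (24) is a predict-then-correct step: the previous belief is pushed through the channel transition matrix, weighted by the likelihood of the new estimate, and renormalized. A minimal sketch follows, with a hypothetical function name and a made-up two-state channel and estimator.

```python
def update_belief(belief, g_hat, P_G, P_E):
    """One step of (24): new belief over G_{i+1} after observing estimate g_hat."""
    K = len(belief)
    # Predict: distribution of G_{i+1} before seeing the new estimate.
    predicted = [sum(belief[h] * P_G[h][g] for h in range(K)) for g in range(K)]
    # Correct: weight each state by the likelihood of the observed estimate.
    unnormalized = [P_E[g][g_hat] * predicted[g] for g in range(K)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [x / total for x in unnormalized]


# Hypothetical two-state channel and estimator.
P_G = [[0.9, 0.1], [0.2, 0.8]]
P_E = [[0.8, 0.2], [0.3, 0.7]]

belief = [0.5, 0.5]                 # e.g. an uninformative initial belief
belief = update_belief(belief, g_hat=0, P_G=P_G, P_E=P_E)
print(belief)
```

Only the current belief vector has to be stored between frames, which is why it can replace the full estimate history $I_i$.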

1) Employing the MDP Policy $\pi^*$: The most straightforward approach is to ignore the partial observability of the channel states and just employ policy $\pi^*$. In other words, at time $i$, given the channel estimate $\hat{G}_i$ and buffer occupancy $B_i$, the transmission parameters are set as:

$$
(U_i, P_i) = \pi^*(B_i, \hat{G}_i). \tag{25}
$$

2) The Most Likely State Heuristic: In this approach, we first determine the state that the channel is most likely in, i.e.,

$$
G^{MLS}_i = \arg\max_{g \in \{0, \ldots, K-1\}} \{\Psi_i(g)\}. \tag{26}
$$

Note that $\Psi_i$ is the belief channel state at time $i$ and is calculated using (24). Then the transmission parameters are set as:

$$
(U_i, P_i) = \pi^*(B_i, G^{MLS}_i). \tag{27}
$$

This approach, which is usually termed the Most Likely State (MLS) approach, was proposed in [27].

3) The QMDP Heuristic: This approach relates to the discounted cost problem defined in (6). Let the Q function be defined as:

$$Q(b, g, u, P) = C(b, g, u, P) + \alpha \sum_{g'=0}^{K-1} \sum_{a=0}^{\infty} P_G(g, g') \, p_A(a) \, J_\alpha^*\big(\min\{b - u + a, B\}, g'\big). \tag{28}$$

From the Bellman equation (7), when the system state is fully observed, $Q(b, g, u, P)$ represents the cost of taking action $(u, P)$ in state $(b, g)$ and then acting optimally afterward. Based on this, the popular QMDP heuristic takes into account the belief state for one step and then assumes that the state is entirely known [18]. Applied to our control problem, at time $i$, given the buffer occupancy $B_i$ and the belief channel state $\Psi_i$, the transmission rate and power are chosen according to:

$$(U_i, P_i) = \arg\min_{u \in \{0, \ldots, B_i\},\, P \in \mathcal{P}} \Big\{ \sum_{g=0}^{K-1} \Psi_i(g) \, Q(B_i, g, u, P) \Big\}. \tag{29}$$

For a deeper discussion of different approaches to approximating an optimal solution for POMDPs, please refer to [28].
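The selection rule in (29) amounts to a belief-weighted average of the Q table followed by a minimization over the feasible actions. A minimal sketch, assuming the Q values have already been computed from (28) and stored in a hypothetical array `Q[b, g, u, p_idx]` with a finite power set `powers` (the random table below is only a stand-in for real Q values):

```python
import numpy as np

def qmdp_action(Q, b, belief, powers):
    """QMDP rule (29): minimize the belief-averaged Q value.

    Q      : array of shape (B+1, K, B+1, len(powers)), Q[b, g, u, p_idx]
    b      : current buffer occupancy B_i
    belief : length-K belief vector Psi_i
    powers : list of admissible transmit powers
    """
    # expected[u, p_idx] = sum_g belief[g] * Q[b, g, u, p_idx]
    expected = np.tensordot(belief, Q[b], axes=(0, 0))
    expected = expected[: b + 1]  # rate u is restricted to {0, ..., b}
    u, p_idx = np.unravel_index(np.argmin(expected), expected.shape)
    return int(u), powers[int(p_idx)]

# Stand-in Q table with random entries, for illustration only.
rng = np.random.default_rng(0)
B, K = 3, 3
powers = [0.5, 1.0]
Q = rng.random((B + 1, K, B + 1, len(powers)))
belief = np.array([0.2, 0.5, 0.3])
u_star, P_star = qmdp_action(Q, b=2, belief=belief, powers=powers)
```

Passing a table of immediate costs $C(b, g, u, P)$ in place of `Q` turns the same function into the MIC rule in (30).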

4) The Minimum Immediate Cost Heuristic: Finally, to assess the effectiveness of the MDP, MLS, and QMDP approaches, which are all MDP-based, we introduce a non-MDP heuristic called the Minimum Immediate Cost (MIC) approach. In the MIC approach, at time frame $i$, given the belief state $\Psi_i$, the transmission parameters are selected so that the expected immediate cost is minimized, i.e.,

$$(U_i, P_i) = \arg\min_{u \in \{0, \ldots, B_i\},\, P \in \mathcal{P}} \Big\{ \sum_{g=0}^{K-1} \Psi_i(g) \, C(B_i, g, u, P) \Big\}. \tag{30}$$

VI. NUMERICAL RESULTS AND DISCUSSION

A. System Parameters

The system for our numerical study is as follows. Packets arrive to the buffer according to a Poisson distribution with rate $\lambda = 3 \times 10^3$ packets/second. All packets have the same length of $L = 100$ bits. The buffer length is $B = 15$ packets. The channel bandwidth is $W = 100$ kHz and the AWGN noise power density is $N_o/2 = 10^{-5}$ Watt/Hz. We consider two 8-state FSMCs as described in Table I, where the channel model in Scenario 1 is obtained by quantizing the fading range of a Rayleigh fading channel that has average gain $\gamma = 0.8$ and Doppler frequency $f_D = 10$ Hz, and the channel model in Scenario 2 corresponds to $f_D = 20$ Hz.

Adaptive transmission is based on a variable-rate, variable-power M-ary quadrature amplitude modulation (MQAM) scheme similar to that described in [5]. Let $T_s$ be the symbol period of the MQAM modulator and assume that a Nyquist signaling pulse, $\mathrm{sinc}(t/T_s)$, is used so that the value of $T_s$ is fixed at $1/W$ seconds. When the symbol period $T_s$ is kept unchanged, varying the signal constellation size of the modulator gives us different data transmission rates. As specified in Section II, the power and rate adaptation are carried out on a frame-by-frame basis. Each frame contains $F$ modulated symbols and therefore $T_f = F T_s$. Here we set $F = L = 100$ so that when a signal constellation of size $M = 2^u$ is used, exactly $u$ packets are transmitted from the buffer during each time frame.

Given a particular system state $(b, g)$, a control action $(u, P)$, and a Poisson arrival with rate $\lambda$, the expected number of packets lost due to buffer overflow is

$$L_o(b, u) = (\lambda T_f)\Big(1 - \sum_{a=0}^{B-b+u-1} p_A(a)\Big) - (B - b + u)\Big(1 - \sum_{a=0}^{B-b+u} p_A(a)\Big), \tag{31}$$

where

$$p_A(a) = \frac{\exp(-\lambda T_f)(\lambda T_f)^a}{a!}. \tag{32}$$
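With the parameters of this section, $T_s = 1/W = 10\ \mu$s, $T_f = F T_s = 1$ ms, and the mean number of arrivals per frame is $\lambda T_f = 3$ packets. The sketch below evaluates (31) with the Poisson probabilities in (32) and checks it against a direct, truncated evaluation of $E[\max(A - (B - b + u), 0)]$. Function names and the truncation point `a_max` are choices made for this illustration, and the code assumes $u \le b$ as in the paper's action set.

```python
import math

def poisson_pmf(a, mean):
    """p_A(a) from (32), evaluated in log space to avoid overflow."""
    return math.exp(-mean + a * math.log(mean) - math.lgamma(a + 1))

def expected_overflow(b, u, B, mean):
    """Closed form (31): expected number of packets dropped in one frame."""
    n = B - b + u  # free buffer slots once u packets have been sent
    cdf_below_n = sum(poisson_pmf(a, mean) for a in range(n))  # a = 0..n-1
    cdf_up_to_n = cdf_below_n + poisson_pmf(n, mean)           # a = 0..n
    return mean * (1.0 - cdf_below_n) - n * (1.0 - cdf_up_to_n)

def expected_overflow_direct(b, u, B, mean, a_max=200):
    """Direct sum of (a - n) p_A(a) over a > n, truncated at a_max."""
    n = B - b + u
    return sum((a - n) * poisson_pmf(a, mean) for a in range(n + 1, a_max))

# Parameters from Section VI-A.
W = 100e3            # channel bandwidth in Hz
F = 100              # symbols per frame
lam = 3e3            # packet arrival rate in packets/second
T_s = 1.0 / W        # symbol period in seconds
T_f = F * T_s        # frame duration in seconds
mean_arrivals = lam * T_f
```

For a full buffer with nothing transmitted ($b = B$, $u = 0$), every arrival is dropped, so `expected_overflow(15, 0, 15, mean_arrivals)` reduces to the mean arrival count per frame.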

We assume that a transmitted packet is in error if at least $V$ out of the $L$ bits in the packet are in error. The expected number of packets discarded due to transmission errors can be calculated by

$$L_e(g, u, P) = u \sum_{j=V}^{L} \binom{L}{j} \big(P_b(g, u, P)\big)^j \big(1 - P_b(g, u, P)\big)^{L-j}, \tag{33}$$

where $P_b(g, u, P)$ is the (uncoded) bit error rate when using transmit power $P$ and rate $u$ on channel state $g$.

[Fig. 2: plot not reproduced. Panel title: Correlated Channel Model. Horizontal axis: Power (dB). Curves: OCPI with fixed BER = $10^{-3}$, $10^{-4}$, $10^{-5}$, $10^{-6}$, and OCPI without BER constraint.]
Fig. 2. ... without a BER constraint. Channel model is given in Table I, Scenario 1.

$P_b(g, u, P)$ can be approximated by [5]:

$$P_b(g, u, P) = 0.2 \exp\left(\frac{-1.5 \, P \gamma_g}{W N_o (2^u - 1)}\right).$$
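To make the two loss terms concrete, the following sketch evaluates the BER approximation above and the packet error expression (33). It is only an illustration: the function names are hypothetical, the channel gain passed in is an arbitrary example value rather than an entry of Table I, and returning a BER of zero for $u = 0$ (nothing transmitted) is a convention assumed here to avoid dividing by $2^0 - 1 = 0$.

```python
import math

def bit_error_rate(P, gain, u, W, N_o):
    """MQAM BER approximation: 0.2 * exp(-1.5 * P * gain / (W * N_o * (2**u - 1)))."""
    if u == 0:
        return 0.0  # assumed convention: no bits are sent, so no bit errors
    snr = P * gain / (W * N_o)
    return 0.2 * math.exp(-1.5 * snr / (2 ** u - 1))

def expected_packet_errors(pb, u, L, V):
    """Equation (33): u times Pr(at least V of the L bits are in error)."""
    tail = sum(
        math.comb(L, j) * pb ** j * (1.0 - pb) ** (L - j)
        for j in range(V, L + 1)
    )
    return u * tail

# Parameters from Section VI-A; N_o / 2 = 1e-5 W/Hz gives N_o = 2e-5 W/Hz.
W = 100e3
N_o = 2e-5
L = 100
pb = bit_error_rate(P=10.0, gain=0.8, u=2, W=W, N_o=N_o)
loss_any_bit = expected_packet_errors(pb, u=2, L=L, V=1)       # V = 1
loss_eleven_bits = expected_packet_errors(pb, u=2, L=L, V=11)  # V = 11
```

With $V = 1$ the sum in (33) collapses to $1 - (1 - P_b)^L$, which gives a quick check of the implementation.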

We consider the performance of the different approaches discussed in Sections IV and V. When the packet arrival rate is fixed, maximizing the system throughput is equivalent to minimizing the total packet loss due to buffer overflow and transmission errors. Therefore, for each scheme, the long-term packet loss rate versus average transmit power is plotted.

B. Performance with Buffer Overflow and Transmission Error Tradeoff

In Fig. 2, we plot the performance of the optimal buffer/channel adaptive transmission policies with and without a BER constraint. Here, we assume that the system state information is perfect and consider optimal control policies (termed OCPI). We also assume that a packet is in error if any bit in the packet is corrupted, i.e., $V = 1$ in (33); the same assumption is used for the results plotted in Figs. 3 and 4. The OCPI policies without any BER constraint are obtained by solving the MDP in (4). The OCPI policies with a BER constraint are obtained by solving a similar MDP described in [7]–[9].

As can be seen, when the BER constraint is relaxed, a significant gain can be achieved. When the fixed BER is set to relatively high values, i.e., $10^{-3}$ and $10^{-4}$, the adaptive policies perform well in the low range of transmit power but become much worse than the policies without a BER constraint when the power is high. On the other hand, when the fixed BER is set to a relatively low value, i.e., $10^{-6}$, the performance of the adaptive policies is much worse than that of the policies without a BER constraint in the low power range.

To further understand the tradeoff between buffer overflow and transmission errors, in Fig. 3, we separately plot the packet loss due to buffer overflow and the packet loss due to transmission errors for optimal buffer/channel adaptive policies with and without a BER constraint. It is clear that, without a BER constraint, an optimal policy varies the transmission error rate dynamically according to the available transmit power. In particular, at low power, a greater number of transmission errors can be tolerated in order to reduce buffer overflow. On the other hand, when plenty of transmit power is available, a good adaptive policy should transmit at a high rate and high power to minimize both transmission errors and buffer overflow. This argument is further illustrated in Fig. 4, where we plot the ratio between packet loss due to buffer overflow and packet loss due to transmission errors.

[Fig. 3: plot not reproduced. Horizontal axis: Power (dB). Vertical axis: logarithmic, $10^{-5}$ to $10^{0}$. Curves: Overflow Rate and Error Rate for BER = $10^{-3}$, for BER = $10^{-6}$, and with no BER constraint.]
Fig. 3. Packet loss due to buffer overflow and transmission errors of optimal buffer/channel adaptive scheme with and without a BER constraint. Channel model is given in Table I, Scenario 1.

TABLE I
CHANNEL STATES AND TRANSITION PROBABILITIES (AN 8-STATE FSMC OBTAINED BY QUANTIZING A RAYLEIGH FADING CHANNEL WITH AVERAGE GAIN 0.8 AND DOPPLER FREQUENCY 10 HZ IN SCENARIO 1 AND 20 HZ IN SCENARIO 2).

State k                  0       1       2       3       4       5       6       7
Scenario 1  P_{k,k+1}    0.0641  0.0807  0.0859  0.0835  0.0745  0.0590  0.0361  0
Scenario 1  P_{k,k-1}    0       0.0641  0.0807  0.0859  0.0835  0.0745  0.0590  0.0361
Scenario 2  P_{k,k+1}    0.1282  0.1613  0.1718  0.1670  0.1489  0.1181  0.0723  0
Scenario 2  P_{k,k-1}    0       0.1282  0.1613  0.1718  0.1670  0.1489  0.1181  0.0723
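For readers who want to experiment with the channel models of Table I, the sketch below assembles a full transition matrix from the listed probabilities. It rests on two assumptions that are not stated in this section: transitions occur only between adjacent states, and the self-transition probability is whatever remains after the up and down moves. The downward probabilities are taken as $P_{k,k-1} = P_{k-1,k}$, which is what the table rows show. The function and variable names are hypothetical.

```python
import numpy as np

# P_{k,k+1} rows of Table I for states k = 0..7.
UP_SCENARIO_1 = [0.0641, 0.0807, 0.0859, 0.0835, 0.0745, 0.0590, 0.0361, 0.0]
UP_SCENARIO_2 = [0.1282, 0.1613, 0.1718, 0.1670, 0.1489, 0.1181, 0.0723, 0.0]

def fsmc_matrix(up):
    """Build a K x K transition matrix P_G[g, g'] from the P_{k,k+1} row."""
    K = len(up)
    down = [0.0] + list(up[:-1])  # Table I: P_{k,k-1} equals P_{k-1,k}
    P = np.zeros((K, K))
    for k in range(K):
        if k + 1 < K:
            P[k, k + 1] = up[k]
        if k > 0:
            P[k, k - 1] = down[k]
        P[k, k] = 1.0 - up[k] - down[k]  # assumed: the remainder stays in state k
    return P

P_G_1 = fsmc_matrix(UP_SCENARIO_1)  # Doppler frequency 10 Hz
P_G_2 = fsmc_matrix(UP_SCENARIO_2)  # Doppler frequency 20 Hz
```

Under these assumptions each matrix is symmetric, and the Scenario 2 off-diagonal entries are about twice those of Scenario 1, in line with the doubled Doppler frequency.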

C. Performance Under Quantized Buffer Occupancy

First, let us look at the performance of the buffer/channel adaptive transmission approach when the buffer occupancy is quantized. In this case, the performance of policy $\pi^*$ (obtained by solving (4)) depends on two factors: the number of quantized buffer states and the selected quantization thresholds. Clearly, the greater the number of quantized states, the closer the performance is to the optimal. At the same time, given a fixed number of quantized states, the performance depends on the set of selected thresholds. An intuitive way to select good quantization thresholds is to divide the range of buffer occupancy more finely where the occupancy probability is high. For example, if we know that the buffer occupancy is low most of the time, then a greater number of thresholds should be set at low values.

[Fig. 4: plot not reproduced. Horizontal axis: Power (dB). Vertical axis: logarithmic, $10^{0}$ to $10^{5}$. Curves: Overflow/Errors for BER = $10^{-3}$, for BER = $10^{-6}$, and with no BER constraint.]
Fig. 4. Ratio between packet loss due to buffer overflow and packet loss due to transmission errors of optimal adaptive scheme with and without a BER constraint. Channel model is given in Table I, Scenario 1.

In Fig. 5, we plot the performance of $\pi^*$, in terms of total long-term packet loss rate versus average transmit power, for different buffer quantization schemes. The number of quantized buffer states is increased from two to four. In particular, in the first quantization scheme, we set a single threshold at 7. When the buffer occupancy is less than 7, it is quantized to 0; otherwise, it is quantized to 7. Similarly, for the case of three quantized buffer states, we set the two thresholds at 4 and 9, and for the case of four quantized buffer states, we set the three thresholds at 3, 6, and 10. For the results in Fig. 5, as well as in Figs. 6–9, we assume that a packet is in error if more than ten out of 100 bits in the packet are corrupted, i.e., $V = 11$ in (33). As can be seen, when only two quantized states are used, there is a significant loss compared to the case of adapting to the exact buffer occupancy. However, the performance loss is reduced significantly when the number of quantized buffer states is increased to three and four. When four quantized buffer states are used, the performance is very near optimal. This suggests that we can often quantize the buffer occupancy in order to reduce the complexity of the adaptive transmission policy without suffering significant performance degradation.

[Fig. 5: plot not reproduced. Horizontal axis: Power (dB), 10 to 24. Vertical axis: 0.05 to 0.3. Curves: 2 quantized buffer states (threshold = 7); 3 quantized buffer states (thresholds = 4, 9); 4 quantized buffer states (thresholds = 3, 6, 10); using exact buffer occupancy (16 states).]
Fig. 5. ... performance is in terms of normalized packet loss rate versus average transmit power. System parameters are given in Section VI-A. Channel model is given in Table I, Scenario 2.
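The threshold quantization used for Fig. 5 can be sketched in a few lines. The text only spells out the single-threshold case (occupancy below 7 maps to 0, otherwise to 7); the sketch below assumes that the multi-threshold schemes follow the same pattern, mapping each occupancy to the largest threshold not exceeding it. The function name is hypothetical.

```python
from bisect import bisect_right

def quantize_buffer(b, thresholds):
    """Map occupancy b to 0 or to the largest threshold that is <= b.

    thresholds must be sorted in increasing order.
    """
    idx = bisect_right(thresholds, b)  # number of thresholds <= b
    return 0 if idx == 0 else thresholds[idx - 1]

# The three quantization schemes of Fig. 5, for a buffer of B = 15 packets.
two_states = [quantize_buffer(b, [7]) for b in range(16)]
three_states = [quantize_buffer(b, [4, 9]) for b in range(16)]
four_states = [quantize_buffer(b, [3, 6, 10]) for b in range(16)]
```

A policy designed for the quantized buffer then needs one entry per quantized level and channel state, rather than one per exact occupancy and channel state.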

D. Performance of Different Approaches Given Delayed Error-free Channel State

Let us look at the performance of different buffer/channel adaptive transmission schemes when a delayed error-free channel state and an accurate buffer occupancy are available for making control decisions. We consider two scenarios. In the first scenario, at time frame $i$, the transmitter knows the exact channel state at time $i-1$, i.e., $G_{i-1}$. In the second scenario, in addition to knowing $G_{i-1}$, the transmitter also has an estimate of the channel state at time $i$, i.e., $\hat{G}_i$. Note that both of these scenarios have been discussed in Section IV. In both cases, we have shown that optimal transmission policies, which maximize the system throughput given incomplete channel state information, can be obtained. To facilitate the discussion, we term the optimal adaptive policies under the first and second scenarios OCDI 1 and OCDI 2 (Optimal Control under Delay Information 1 and 2). In addition, we also look at the approach of blindly employing policy $\pi^*$ with delayed information. This approach is termed BCDI (Blind Control under Delay Information).

We plot the packet loss rate versus average transmit power for each scheme. Here, the packet loss rate is normalized by the average packet arrival rate. Clearly, the packet loss rates of all schemes are lower-bounded by the packet loss rate when optimal adaptive policies are employed with perfect system state information, that is, the OCPI curve. The performance of the OCDI 1, OCDI 2, BCDI, and OCPI schemes is given in Figs. 6 and 7. Fig. 6 corresponds to the channel model in Table I, Scenario 2, while Fig. 7 is for the channel model in Scenario 1.

In Figs. 6 and 7, we observe, as expected, that the packet loss rate of every scheme under delayed channel state information is lower-bounded by that of the optimal transmission scheme with perfect channel knowledge. More importantly, the performance degradation increases when the channel changes faster (Fig. 6). This is expected because, when the channel changes faster, the delayed channel state contains less information about the current channel state.

[Fig. 6: plot not reproduced. Horizontal axis: Power (dB). Vertical axis: 0.05 to 0.3. Curves: BCDI; OCDI_1; OCDI_2 ($\sigma = 0.1$); OCDI_2 ($\sigma = 0.05$); OCPI.]
Fig. 6. Performance, i.e., normalized packet loss rate versus average transmit power, for different adaptive transmission schemes given delayed error-free channel state information. System parameters are given in Section VI-A. Channel model is in Table I, Scenario 2.

[Fig. 7: plot not reproduced. Horizontal axis: Power (dB). Vertical axis: 0.1 to 0.3. Curves: BCDI; OCDI_1; OCDI_2 ($\sigma = 0.1$); OCPI.]
Fig. 7. Performance, i.e., normalized packet loss rate versus average transmit power, for different adaptive transmission schemes given delayed channel state information. System parameters are given in Section VI-A. Channel model is in Table I, Scenario 1.

The second observation that we can make from Figs. 6 and 7 is that the more information an adaptive scheme has, the better it performs. In particular, the OCDI 1 scheme performs better than the BCDI scheme, and the OCDI 2 scheme performs better than OCDI 1. The performance of OCDI 2 improves when the quality of the channel estimate $\hat{G}_i$ improves. For example, when $\sigma = 0.05$, the performance of OCDI 2 is quite close to that of the optimal scheme under perfect system state information (SSI). When the channel estimate $\hat{G}_i$ has a high error probability ($\sigma = 0.1$), the performance of OCDI 2 approaches that of OCDI 1. However, the performance gain of OCDI 2 comes at a cost of higher complexity. In particular, the number of internal channel states for OCDI 2 is $K^2$, while it is $K$ for OCDI 1.

[Fig. 8: plot not reproduced. Horizontal axis: Power (dB), 10 to 22. Vertical axis: 0.05 to 0.3. Curves: MIC; BCEI; MLS; QMDP; OCPI.]
Fig. 8. Performance, i.e., normalized packet loss rate versus average transmit power, for different adaptive transmission schemes given imperfect channel estimate. System parameters are given in Section VI-A. Channel model is in Table I, Scenario 2. The standard deviation of channel estimating noise is $\sigma = 0.05$.

E. Performance of Different Approaches Given Imperfect Channel Estimates

Now let us look at the performance of different buffer and channel adaptive transmission schemes when no error-free channel state information is available at the transmitter. In particular, during time slot $i$, the transmitter only has an estimate of the channel state, i.e., $\hat{G}_i$. For this numerical study, we assume that the estimation error for the channel gain has a Gaussian distribution with zero mean and variance $\sigma^2$. The estimation statistics can be computed using equations (11)–(13).

As discussed in Section V-B, for the general case of a correlated channel model, when no perfect channel estimate is available at the transmitter, it is not practical to look for optimal adaptive transmission policies. Instead, there are various approaches that can approximate optimal control policies at lower complexity. These approaches, namely BCEI, MLS, and QMDP, have been discussed in Section V-B. Note that BCEI is the approach that blindly employs policy $\pi^*$ with erroneous channel state information. Again, we plot the performance of the different adaptive schemes in terms of normalized packet loss rate versus average transmit power. The performance of all schemes is compared to the case when an optimal scheme is employed under perfect SSI, that is, the OCPI curve. The performance of the different classes of adaptive policies is given in Figs. 8 and 9. Fig. 8 is obtained for the case when $\sigma = 0.05$ and Fig. 9 is for the case when $\sigma = 0.1$. In both Figs. 8 and 9, the channel model in Table I, Scenario 2, is used.

As can be seen, the MIC approach, which only tries to minimize the immediate cost during each time frame and does not take the dynamics of the system into account, has the worst performance. A significant performance gain can be achieved by using the BCEI, MLS, and QMDP approaches. This shows the importance of structuring the problem as a partially observable Markov decision process.

[Fig. 9: plot not reproduced. Horizontal axis: Power (dB). Vertical axis: 0.05 to 0.4. Curves: MIC; BCEI; MLS; QMDP; OCPI.]
Fig. 9. Performance, i.e., normalized packet loss rate versus average transmit power, for different adaptive transmission schemes given imperfect channel estimate. System parameters are given in Section VI-A. Channel model is in Table I, Scenario 2. The standard deviation of channel estimating noise is $\sigma = 0.1$.

Among the three approaches BCEI, MLS, and QMDP, QMDP appears to perform best. We note that there is no significant extra complexity when using QMDP instead of BCEI or MLS; therefore, QMDP is a good choice for coping with imperfectly estimated channel state information. Between BCEI and MLS, MLS tends to perform better in the low power range, while BCEI achieves better results in the higher power range. However, the difference in performance between BCEI and MLS is not significant; therefore, the simpler approach, i.e., BCEI, is preferable.

VII. CONCLUSION

In this paper, we consider the problem of buffer and channel adaptive transmission for maximizing the throughput of a transmission over a wireless fading channel, subject to an average transmit power constraint. We consider scenarios in which the system state information for making control decisions is incomplete. This includes delayed and/or imperfectly estimated channel state and quantized buffer occupancy. We also exploit the tradeoff between packet loss due to transmission errors and packet loss due to buffer overflow, and obtain a significant throughput improvement.

This paper shows the importance of cross-layer design in achieving good performance for wireless data communication systems. This paper also demonstrates that, even when the system state is not fully observable, buffer and channel adaptive transmission can still be implemented in an effective manner.
