

Chapter 6 ATM Switching with Minimum-Depth Blocking Networks

Architectures and performance of interconnection networks for ATM switching based on the adoption of banyan networks are described in this chapter. The interconnection networks presented now have the common feature of a minimum-depth routing network, that is the path(s) from each inlet to every outlet crosses the minimum number of routing stages required to guarantee full accessibility in the interconnection network and to exploit the self-routing property. According to our usual notations this number n is given by $n = \log_b N$ for an $N \times N$ network built out of $b \times b$ switching elements. Note that a packet can cross more than n stages where switching takes place, when distribution stages are adopted between the switch inlets and the n routing stages. Nevertheless, in all these structures the switching result performed in any of these additional stages does not affect in any way the self-routing operation taking place in the last n stages of the interconnection network. These structures are inherently blocking as each interstage link is shared by several I/O paths. Thus packet loss takes place if more than one packet requires the same outlet of the switching element (SE), unless a proper storage capability is provided in the SE itself.

Unbuffered banyan networks are the simplest self-routing structure we can imagine. Nevertheless, they offer a poor traffic performance. Several approaches can be considered to improve the performance of banyan-based interconnection networks:

1. Replicating a banyan network into a set of parallel networks in order to divide the offered load among the networks;

2. Providing a certain multiplicity of interstage links, so as to allow several packets to share the interstage connection;

3. Providing each SE with internal buffers, which can be associated either with the SE inlets or with the SE outlets or can be shared by all the SE inlets and outlets;

4. Defining handshake protocols between adjacent SEs in order to avoid packet loss in a buffered SE;


5. Providing external queueing when replicating unbuffered banyan networks, so that multiple packets addressing the same destination can be concurrently switched with success.

Section 6.1 describes the performance of the unbuffered banyan networks and describes networks designed according to criteria 1 and 2; therefore networks built of a single banyan plane or parallel banyan planes are studied. Criteria 3 and 4 are exploited in Section 6.2, which provides a thorough discussion of banyan architectures suitable to ATM switching in which each switching element is provided with an internal queueing capability. Section 6.3 discusses how a set of internally unbuffered networks can be used for ATM switching if queueing is available at switch outlets with an optional queueing capacity associated with network inlets according to criterion 5. Some final remarks concerning the switch performance under offered traffic patterns other than random and other architectures of ATM switches based on minimum-depth routing networks are finally given in Section 6.4.

6.1 Unbuffered Networks

The class of unbuffered networks is described now so as to provide the background necessary for a satisfactory understanding of the ATM switching architectures to be investigated in the next sections. The structure of the basic banyan network and its traffic performance are first discussed in relation to the behavior of the crossbar network. Then improved structures using the banyan network as the basic building block are examined: multiple banyan planes and multiple interstage links are considered.

6.1.1 Crossbar and basic banyan networks

The terminology and basic concepts of crossbar and banyan networks are here recalled and the corresponding traffic performance parameters are evaluated.

6.1.1.1 Basic structures

In principle, we would like any interconnection network (IN) to provide an optimum performance, that is maximum throughput and minimum packet loss probability. Packets are lost in general for two different reasons in unbuffered networks: conflicts for an internal IN resource, or internal conflicts, and conflicts for the same IN outlet, or external conflicts. The loss due to external conflicts is independent of the particular network structure and is unavoidable in an unbuffered network. Thus, the “ideal” unbuffered structure is the crossbar network (see Section 2.1), which is free from internal conflicts since each of the $N^2$ crosspoints is dedicated to a specific I/O couple.

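An $N \times N$ banyan network built out of $b \times b$ SEs includes n stages of $N/b$ SEs in which $n = \log_b N$. An example of a banyan network with Baseline topology and size $N = 16$ is given in Figure 6.1a and in Figure 6.1b for two different sizes b of the SE. As already explained in Section 2.3.1, internal conflicts can occur in banyan networks due to the link commonality of different I/O paths. Therefore the crossbar network can provide an upper bound on the throughput of unbuffered networks. For an $N \times N$ crossbar network in which each inlet receives a packet in a slot with probability p and each packet addresses the outlets uniformly at random, the throughput is

$$\rho = 1 - \left(1 - \frac{p}{N}\right)^N \tag{6.1}$$

Once the switch throughput is known, the packet loss probability is simply obtained as

$$\pi = 1 - \frac{\rho}{p}$$

Thus, for an asymptotically large switch ($N \to \infty$), the throughput is $1 - e^{-p}$ with a switch capacity ($p = 1$) of $\rho_{max} = 1 - e^{-1} \approx 0.632$.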

Owing to the random traffic assumption and to their single I/O path feature, banyan networks with different topologies are all characterized by the same performance. The traffic performance of unbuffered banyan networks was initially studied by Patel [Pat81], who expressed the throughput as a quadratic recurrence relation. An asymptotic solution was then provided for this relation by Kruskal and Snir [Kru83]. A closer bound of the banyan network throughput was found by Kumar and Jump [Kum86], who also give the analysis of replicated and dilated banyan networks to be described next. Further extensions of these results are reported by Szymanski and Hamacker [Szy87].

Figure 6.1 Example of banyan networks with Baseline topology

The analysis given here, which summarizes the main results provided in these papers, relies on a simplifying assumption, that is the statistical independence of the events of packet arrivals at SEs of different stages. Such a hypothesis means overestimating the offered load stage by stage, especially for high loads [Yoo90].

The throughput and loss performance of the basic unbuffered $b^n \times b^n$ banyan network, which thus includes n stages of $b \times b$ SEs, can be evaluated by recursive analysis of the load on adjacent stages of the network. Let $p_i$ $(i = 1, \ldots, n)$ indicate the probability that a generic outlet of an SE in stage i is “busy”, that is transmits a packet ($p_0 = p$ denotes the external load offered to the network). Since the probability that a packet is addressed to a given SE outlet is $1/b$, we can easily write

$$p_i = 1 - \left(1 - \frac{p_{i-1}}{b}\right)^b \qquad (i = 1, \ldots, n) \tag{6.2}$$

Thus, throughput and loss are given by

$$\rho = p_n, \qquad \pi = 1 - \frac{p_n}{p_0}$$
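The stage-by-stage recursion lends itself to a direct numerical check. The following is a minimal Python sketch, not taken from the original text: it assumes the recursion $p_i = 1 - (1 - p_{i-1}/b)^b$ with $p_0 = p$ for the banyan network and the expression $1 - (1 - p/N)^N$ for the crossbar throughput; all function names are hypothetical.

```python
def crossbar_throughput(p, N):
    """Throughput of an N x N crossbar under uniform random load p (assumed form)."""
    return 1.0 - (1.0 - p / N) ** N


def banyan_stage_loads(p, b, n):
    """Return [p_0, p_1, ..., p_n] for an unbuffered banyan network with b x b SEs."""
    loads = [p]
    for _ in range(n):
        # An SE outlet is busy unless none of the b inlets sends it a packet.
        loads.append(1.0 - (1.0 - loads[-1] / b) ** b)
    return loads


def banyan_throughput(p, b, n):
    """Throughput equals the load on the outlets of the last stage."""
    return banyan_stage_loads(p, b, n)[-1]


def loss_probability(p, rho):
    """Fraction of offered packets that are not delivered."""
    return 1.0 - rho / p


if __name__ == "__main__":
    b, p = 2, 1.0
    for n in (1, 2, 5, 10):
        N = b ** n
        print(N, banyan_throughput(p, b, n), crossbar_throughput(p, N))
```

For a single stage the two structures coincide, while for more stages the banyan value falls below the crossbar value because of internal conflicts.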

Figure 6.2 Switch capacity of a banyan network

The switch capacity, $\rho_{max}$, of a banyan network (Equation 6.2) with different sizes b of the basic switching element is compared in Figure 6.2 with that provided by a crossbar network (Equation 6.1) of the same size. The maximum throughput of the banyan network decreases as the switch size grows, since there are more packet conflicts due to the larger number of network stages. For a given switch size a better performance is given by a banyan network with a larger SE: as the basic $b \times b$ SE grows, fewer stages are needed to build a banyan network with a given size N.

An asymptotic estimate of the banyan network throughput is computed in [Kru83]

$$\rho \cong \frac{2b}{(b-1)\,n + \dfrac{2b}{p}}$$

which provides an upper bound of the real network throughput and whose accuracy is larger for moderate loads and large networks. Figure 6.3 shows the accuracy of this simple bound for a banyan network loaded by three different traffic levels. The bound overestimates the real network throughput and the accuracy increases as the offered load p is lowered, roughly independently of the switch size.

Figure 6.3 Switch capacity of a banyan network (b = 2; crossbar, analysis and bound curves versus switch size N for p = 0.5, 0.75 and 1.0)

It is also interesting to express π as a function of the loss probability $\pi_i = 1 - p_i/p_{i-1}$ $(i = 1, \ldots, n)$ occurring in the single stages. Since packets can be lost in general at any stage due to conflicts for the same SE outlet, it follows that

$$\pi = 1 - \prod_{i=1}^{n} (1 - \pi_i)$$

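or equivalently by applying the theorem of total probability

$$\pi = \pi_1 + \sum_{i=2}^{n} \pi_i \prod_{h=1}^{i-1} (1 - \pi_h)$$

Therefore, since $1 - \pi_h = p_h/p_{h-1}$, the loss probability can be expressed as a function of the link load stage by stage as

$$\pi = \sum_{i=1}^{n} \left(1 - \frac{p_i}{p_{i-1}}\right) \frac{p_{i-1}}{p_0} = \sum_{i=1}^{n} \frac{p_{i-1} - p_i}{p_0} \tag{6.3}$$

For the case of $b = 2$ the stage load given by Equation 6.2 assumes an expression that is worth discussion, that is

$$p_i = 1 - \left(1 - \frac{p_{i-1}}{2}\right)^2 = p_{i-1} - \frac{1}{4}\,p_{i-1}^2 \tag{6.4}$$

Equation 6.4 says that the probability of a busy link in stage i is given by the probability of a busy link in the previous stage decreased by the probability that both the SE inlets are receiving a packet ($p_{i-1}^2$) and both packets address the same SE outlet. So, the loss probability with $2 \times 2$ SEs given by Equation 6.3 becomes

$$\pi = \sum_{i=1}^{n} \frac{p_{i-1}^2}{4\,p_0} \tag{6.5}$$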
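The equivalence of the different loss expressions, and the behavior of the asymptotic estimate of [Kru83], can be checked numerically. The sketch below is illustrative only: it assumes the per-stage loss $\pi_i = 1 - p_i/p_{i-1}$, the estimate $2b/((b-1)n + 2b/p)$ and the $b = 2$ form $\sum_i p_{i-1}^2/(4p_0)$; the function names are hypothetical.

```python
from math import prod


def stage_loads(p, b, n):
    """Loads p_0..p_n from the recursion p_i = 1 - (1 - p_{i-1}/b)^b."""
    loads = [p]
    for _ in range(n):
        loads.append(1.0 - (1.0 - loads[-1] / b) ** b)
    return loads


def stage_losses(loads):
    """Per-stage loss pi_i = 1 - p_i / p_{i-1} for i = 1..n."""
    return [1.0 - loads[i] / loads[i - 1] for i in range(1, len(loads))]


def loss_product_form(losses):
    """A packet survives only if it survives every stage."""
    return 1.0 - prod(1.0 - x for x in losses)


def loss_total_probability_form(losses):
    """Lost at stage i given that it was not lost in stages 1..i-1."""
    total, survive = 0.0, 1.0
    for x in losses:
        total += x * survive
        survive *= 1.0 - x
    return total


def loss_b2_form(loads):
    """Closed form for 2 x 2 SEs: sum of p_{i-1}^2 / (4 p_0)."""
    return sum(q * q for q in loads[:-1]) / (4.0 * loads[0])


def asymptotic_throughput(p, b, n):
    """Asymptotic estimate of the banyan throughput (assumed form)."""
    return 2.0 * b / ((b - 1) * n + 2.0 * b / p)


if __name__ == "__main__":
    loads = stage_loads(1.0, 2, 5)
    losses = stage_losses(loads)
    print(loss_product_form(losses), loss_total_probability_form(losses),
          loss_b2_form(loads), 1.0 - loads[-1] / loads[0])
    print(loads[-1], asymptotic_throughput(1.0, 2, 5))
```

All four loss values printed on the first line should agree up to rounding, and the estimate on the second line should lie above the throughput obtained from the recursion.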

6.1.2 Enhanced banyan networks

Interconnection networks based on the use of banyan networks are now introduced and their traffic performance is evaluated.

6.1.2.1 Structures

Improved structures of banyan interconnection networks were proposed [Kum86] whose basic idea is to have multiple internal paths per inlet/outlet pair. These structures either adopt multiple banyan networks in parallel or replace the interstage links by multiple parallel links.

An $N \times N$ interconnection network can be built using K parallel $N \times N$ networks (planes) interconnected to a set of N $1 \times K$ splitters and a set of N $K \times 1$ combiners through suitable input and output interconnection patterns, respectively, as shown in Figure 6.4. These structures are referred to as replicated banyan networks (RBN), as the topology in each plane is banyan or derivable from a banyan structure. The splitters can distribute the incoming traffic in different modes to the banyan networks; the main techniques are:

• random loading (RL),

• multiple loading (ML),

• selective loading (SL).

RBNs with random and multiple loading are characterized by full banyan networks, the same input and output interconnection patterns, and different operations of the splitters, whereas selective loading uses “truncated” banyan networks and two different types of interconnection pattern. In all these cases each combiner that receives more than one packet in a slot discards all but one of these packets.

A replicated banyan network operating with RL or ML is represented in Figure 6.5: both interconnection patterns are of the EGS type (see Section 2.1). With random loading each splitter transmits the received packet to a randomly chosen plane out of the K planes with even probability $1/K$. The aim is to reduce the load per banyan network so as to increase the probability that conflicts between packets for interstage links do not occur. With multiple loading each received packet is broadcast concurrently to all the K planes. The purpose is to increase the probability that at least one copy of the packet successfully reaches its destination.

Selective loading is based on dividing the outlets into K disjoint subsets and dedicating each banyan network, suitably truncated, to one of these sets. Therefore one EGS pattern of size $NK$ connects the splitters to the K banyan networks, whereas K suitable patterns (one per banyan network) of size N must be used to guarantee full access to all the combiners from every banyan inlet. The splitters selectively load the planes with the traffic addressing their respective outlets. In order to guarantee full connectivity in the interconnection network, if each banyan network includes $n - k$ stages ($k = \log_b K$), the splitters transmit each packet to the proper plane using the first k digits (in base b) of the routing tag. The example in Figure 6.6 shows a case in which the truncated banyan network has the reverse Baseline topology with the last stage removed. Note that the connection between each banyan network and its combiners is a perfect shuffle (or EGS) pattern. The target of this technique is to reduce the number of packet conflicts by jointly reducing the offered load per plane and the number of conflict opportunities.

Figure 6.4 Replicated Banyan Network

Providing multiple paths per I/O port, and hence reducing the packet loss due to conflicts for interstage links, can also be achieved by adopting a multiplicity K of physical links for each “logical” interstage link of a banyan network (see Figure 4.10). Now up to K packets can be concurrently exchanged between two SEs in adjacent stages. These networks are referred to as dilated banyan networks (DBN). Such a solution makes the SE, whose physical size is now $bK \times bK$, much more complex than the basic $b \times b$ SE. In order to drop all but one of the packets received by the last-stage SEs and addressing a specific output, $K \times 1$ combiners can be used that concentrate the K physical links of a logical outlet at stage n onto one interconnection network output. However, unlike replicated networks, this concentration function could also be performed directly by each SE in the last stage.

Figure 6.5 RBN with random or multiple loading

6.1.2.2 Performance

Analysis of replicated and dilated banyan networks follows directly from the analysis of a single banyan network. Operating a random loading of the K planes means evenly partitioning the offered load into K flows. The above recursive analysis can be applied again considering that the offered load per plane is now

$$p_0 = \frac{p}{K}$$

Throughput and loss in this case are

$$\rho = 1 - (1 - p_n)^K \tag{6.6}$$

$$\pi = 1 - \frac{\rho}{p} = 1 - \frac{1 - (1 - p_n)^K}{p} \tag{6.7}$$

For multiple loading it is difficult to provide simple expressions for throughput and delay. However, based on the results given in [Kum86], its performance is substantially the same as the random loading. This fact can be explained considering that replicating a packet on all planes increases the probability that at least one copy reaches the addressed output, as the choice for packet discarding is random in each plane. This advantage is compensated by the drawback of a higher load in each plane, which implies an increased number of collision (and loss) events.

Figure 6.6 Example of RBN with selective loading

With selective loading, packet loss events occur only in $n - k$ stages of each plane and the offered load per plane is still $p_0 = p/K$. The packet loss probability is again given by $\pi = 1 - \rho/p$ with the switch throughput provided by

$$\rho = 1 - (1 - p_{n-k})^K$$

since each combiner can receive up to K packets from the plane it is attached to.
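The throughput of replicated banyan networks under random and selective loading can be sketched numerically as follows. This is an illustrative Python sketch under stated assumptions, not the original authors' code: each plane is taken to be offered the load $p/K$, a combiner output is taken to be busy with probability $1 - (1 - q)^K$ where q is the load on a last-stage outlet of a plane, and with selective loading each plane is taken to have $n - k$ stages with $K = b^k$. All function names are hypothetical.

```python
def stage_load(p0, b, stages):
    """Load on a stage outlet after the given number of b x b SE stages."""
    q = p0
    for _ in range(stages):
        q = 1.0 - (1.0 - q / b) ** b
    return q


def digits_for_planes(K, b):
    """Return k such that b**k == K, or raise if K is not a power of b."""
    k, size = 0, 1
    while size < K:
        size *= b
        k += 1
    if size != K:
        raise ValueError("K must be a power of b for selective loading")
    return k


def rbn_random_throughput(p, b, n, K):
    """Random loading: K full planes, each offered p/K."""
    q = stage_load(p / K, b, n)
    return 1.0 - (1.0 - q) ** K


def rbn_selective_throughput(p, b, n, K):
    """Selective loading: K truncated planes with n - k stages each."""
    k = digits_for_planes(K, b)
    q = stage_load(p / K, b, n - k)
    return 1.0 - (1.0 - q) ** K


if __name__ == "__main__":
    for K in (1, 2, 4):
        print(K, rbn_random_throughput(1.0, 2, 5, K),
              rbn_selective_throughput(1.0, 2, 5, K))
```

With $K = 1$ both functions reduce to the single-plane banyan throughput, which gives a simple consistency check.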

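In dilated networks each SE has size $bK \times bK$, but not all physical links are active, that is enabled to receive packets. SEs have 1 active inlet and b active outlets per logical port at stage 1, b active inlets and $b^2$ active outlets at stage 2, K active inlets and K active outlets from stage k onwards ($k = \lceil \log_b K \rceil + 1$). The same recursive load computation as described for the basic banyan network can be adopted here taking into account that each SE has $bK$ physical inlets and b logical outlets, and that not all the physical SE inlets are active in stages 1 through $k - 1$. The event of m packets transmitted on a tagged link of an SE in stage i $(1 \le i \le n)$, whose probability is $p_i(m)$, occurs when $j \ge m$ packets are received by the SE from its b upstream SEs and m of these packets address the tagged logical outlet. If $p_0(m)$ denotes the probability that m packets are received on a tagged inlet of an SE in stage 1, we can write

$$p_0(m) = \begin{cases} 1 - p & m = 0 \\ p & m = 1 \\ 0 & m > 1 \end{cases}$$

$$p_i(m) = \begin{cases} \displaystyle \sum_{j=m}^{bK} \binom{j}{m} \left(\frac{1}{b}\right)^{m} \left(1 - \frac{1}{b}\right)^{j-m} \sum_{m_1 + \cdots + m_b = j} \; \prod_{h=1}^{b} p_{i-1}(m_h) & m < K \\[3ex] \displaystyle \sum_{j=K}^{bK} \sum_{h=K}^{j} \binom{j}{h} \left(\frac{1}{b}\right)^{h} \left(1 - \frac{1}{b}\right)^{j-h} \sum_{m_1 + \cdots + m_b = j} \; \prod_{l=1}^{b} p_{i-1}(m_l) & m = K \end{cases}$$

The packet loss probability is given as usual by $\pi = 1 - \rho/p$ with the throughput provided by

$$\rho = 1 - p_n(0)$$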
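The recursive load computation for dilated banyan networks can be sketched in a few lines of Python. This is only an illustration built on assumptions that go beyond what the text states explicitly: the b logical inlets of an SE are treated as independent, each received packet addresses the tagged logical outlet with probability $1/b$, at most K packets are forwarded on a logical link (the excess is dropped), and the throughput is taken as the probability that a last-stage logical link carries at least one packet. Function names are hypothetical.

```python
from math import comb


def convolve(x, y):
    """Distribution of the sum of two independent packet counts."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def dbn_stage(prev, b, K):
    """Map the per-link distribution of stage i-1 onto that of stage i."""
    # Total number of packets entering the SE from its b upstream logical links.
    arrivals = [1.0]
    for _ in range(b):
        arrivals = convolve(arrivals, prev)
    nxt = [0.0] * (K + 1)
    for j, prob_j in enumerate(arrivals):
        for m in range(j + 1):
            # m of the j received packets address the tagged logical outlet.
            weight = comb(j, m) * (1.0 / b) ** m * (1.0 - 1.0 / b) ** (j - m)
            # No more than K packets fit on the K physical links.
            nxt[min(m, K)] += prob_j * weight
    return nxt


def dbn_throughput(p, b, n, K):
    """Throughput of an n-stage dilated banyan network with dilation K."""
    dist = [1.0 - p, p] + [0.0] * (K - 1)  # one active physical inlet at stage 1
    for _ in range(n):
        dist = dbn_stage(dist, b, K)
    return 1.0 - dist[0]


if __name__ == "__main__":
    for K in (1, 2, 4):
        print(K, dbn_throughput(1.0, 2, 5, K))
```

With $K = 1$ the sketch reduces to the recursion of the basic unbuffered banyan network, which is a useful sanity check before trying larger dilation factors.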

The switch capacity, $\rho_{max}$, of different configurations of banyan networks is shown in Figure 6.7 in comparison with the crossbar network capacity. RBNs with random and selective loading and dilated banyan networks have been considered for different replication and link dilation factors. RBNs with random and selective loading give a comparable throughput performance, the latter behaving a little better. A dilated banyan network with a given dilation factor behaves much better than an RBN with the same replication factor. With a suitable dilation factor the dilated banyan network gives a switch throughput very close to crossbar network capacity. Nevertheless the overall complexity of a dilated banyan network is much higher than in RBNs (more complex SEs are required). For example a network with size $N = 32$ and $K = K_d = 2$ includes 160 SEs of size $2 \times 2$ in an RBN and 80 SEs of size $4 \times 4$ in a DBN. Since we have no queueing in unbuffered banyan networks, the packet delay figure is not of interest here.
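The SE counts quoted above follow from simple arithmetic: a banyan plane has $n = \log_b N$ stages of $N/b$ SEs each, an RBN has K such planes, and a DBN has a single plane whose SEs have physical size $bK \times bK$. A small Python sketch, with hypothetical function names and with $N = 32$, $b = 2$ and $K = 2$ taken as the values shown with Figure 6.7:

```python
def number_of_stages(N, b):
    """Return n such that b**n == N, or raise if N is not a power of b."""
    n, size = 0, 1
    while size < N:
        size *= b
        n += 1
    if size != N:
        raise ValueError("N must be a power of b")
    return n


def rbn_se_count(N, b, K):
    """K planes, each with n stages of N/b SEs of size b x b."""
    return K * (N // b) * number_of_stages(N, b)


def dbn_se_count(N, b):
    """One plane with n stages of N/b SEs, each of physical size bK x bK."""
    return (N // b) * number_of_stages(N, b)


if __name__ == "__main__":
    print(rbn_se_count(32, 2, 2), dbn_se_count(32, 2))
```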

So far we have studied the traffic performance of unbuffered banyan networks with random offered traffic in which both internal and external conflicts among packets contribute to determine the packet loss probability. A different pattern of offered traffic consists in a set of packets that does not cause external conflicts, that is each outlet is addressed by at most one packet. A performance study of these patterns, referred to as permutations, is reported in [Szy87].

6.2 Networks with a Single Plane and Internal Queueing

In general the I/O paths along which two packets are transmitted are not link-independent in a banyan network. Thus, if two or more ATM cells address the same outlet of an SE (that is the same interstage link or the same switch output), only one of them can actually be transmitted to the requested outlet. The other cells are lost, unless a storage capability is available in the SE. We assume that each switching element with size $b \times b$ is provided with a queueing capacity of B cells per port and we will examine here three different types of arrangements of this memory in the SE: input queueing, output queueing and shared queueing. With input and output queueing b physical queues are available in the SE, whereas only one is available with shared queueing. In this latter case the buffer is said to include b logical queues, each holding the packets addressing a specific SE outlet. In all the buffered SE structures considered here we assume a FIFO cell scheduling, as suggested by simplicity requirements for hardware implementation.

Figure 6.7 Switch capacity for different banyan networks

Various internal protocols are considered in our study, depending on the absence or presence of signalling between adjacent stages to enable the downstream transmission of a packet by an SE. In particular we define the following internal protocols:

• backpressure (BP): signals are exchanged between switching elements in adjacent stages so that the generic SE can grant a packet transmission to its upstream SEs only within the current idle buffer capacity. The upstream SEs enabled to transmit are selected according to the acknowledgment or grant mode, whereas the number of idle buffer positions is determined based on the type of backpressure used, which can be either global (GBP) or local (LBP). These operations are defined as follows:

   - acknowledgment (ack): the generic SE in stage i issues as many requests as the number of SE outlets addressed by head-of-line (HOL) packets, each transmitted to the requested downstream SE. In response, each downstream SE enables the transmission by means of acknowledgments to all the requesting upstream SEs, if their number does not exceed its idle buffer positions, determined according to the GBP or LBP protocol; otherwise the number of enabled upstream SEs is limited to those needed to saturate the buffer;

   - grant (gr): without receiving any requests, the generic SE in stage i grants the transmission to all the upstream SEs, if it has at least b idle buffer positions; otherwise only as many upstream SEs as there are idle buffer positions are enabled to transmit; unlike the BP-ack protocol, with the BP-gr operations the SE can grant an upstream SE whose corresponding physical or logical queue is empty;

   - local backpressure (LBP): the number of buffer places that can be filled in the generic SE in stage i at slot t by upstream SEs is simply given by the number of idle positions at the end of the slot $t - 1$;

   - global backpressure (GBP): the number of buffer places that can be filled in the generic SE in stage i at slot t by upstream SEs is given by the number of idle positions at the end of the slot $t - 1$ increased by the number of packets that are going to be transmitted by the SE in the slot t;

• queue loss (QL): there is no exchange of signalling information within the network, so that a packet per non-empty physical or logical queue is always transmitted downstream by each SE, independent of the current buffer status of the destination SE; packet storage in the SE takes place as long as there are enough idle buffer positions, whereas packets are lost when the buffer is full.
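The way the backpressure options combine can be summarized by a small decision function. The Python sketch below is a simplified reading of the definitions above, not an implementation from the original text: it only computes how many upstream SEs a buffer would enable in one slot, and the function and parameter names are hypothetical.

```python
def enabled_upstream(requests, idle_prev, departures, b,
                     backpressure="LBP", signalling="ack"):
    """Number of upstream SEs enabled to transmit to a buffer in the current slot.

    requests     -- upstream SEs with a HOL packet for this buffer (ack mode only)
    idle_prev    -- idle buffer positions at the end of the preceding slot
    departures   -- packets this SE is going to transmit in the current slot
    b            -- number of upstream SEs feeding the buffer
    """
    if backpressure == "LBP":
        fillable = idle_prev                 # local: only what was idle before
    elif backpressure == "GBP":
        fillable = idle_prev + departures    # global: also positions being freed
    else:
        raise ValueError("backpressure must be 'LBP' or 'GBP'")

    if signalling == "ack":
        candidates = requests                # only requesting SEs can be enabled
    elif signalling == "gr":
        candidates = b                       # grants are issued without requests
    else:
        raise ValueError("signalling must be 'ack' or 'gr'")

    return min(candidates, fillable)


if __name__ == "__main__":
    # Three requests, two idle positions, one packet leaving in this slot.
    print(enabled_upstream(3, 2, 1, 4, "LBP", "ack"))
    print(enabled_upstream(3, 2, 1, 4, "GBP", "ack"))
    print(enabled_upstream(3, 5, 0, 4, "LBP", "gr"))
```

Under the QL protocol no such computation takes place: every non-empty queue transmits, and packets arriving at a full buffer are lost.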

From the above description it is worth noting that LBP and GBP, as well as BP-ack and BP-gr, result in the same number of upstream acknowledgment/grant signals by an SE if at least b positions are idle in its buffer at the end of the preceding slot. Moreover, packets can be lost for queue overflow only at the first stage in the BP protocols and at any stage in the QL protocol. In our model the selection of packets to be backpressured in the upstream SE (BP) or to be lost (QL) in case of buffer saturation is always random among all the packets competing for the access to the same buffer. Note that such a general description of the internal protocols, when applied to a specific type of queueing, can make some cases meaningless.

The implementation of the internal backpressure requires additional internal resources to be deployed compared to the absence of internal protocols (QL). Two different solutions can be devised for accomplishing interstage backpressure, that is in the space domain or in the time domain. In the former case additional internal links must connect any couple of SEs interfaced by interstage links. In the latter case the interstage links can be used on a time division base to transfer both the signalling information and the ATM cells. Therefore an internal bit rate, $C_i$, higher than the link external rate, C (bit/s), is required. With the acknowledgment BP we have a two-phase signalling: the arbitration phase where all the SEs concurrently transmit their requests downstream and the enable phase where each SE can signal upstream the enabling signal to a suitable number of requesting SEs. The enable phase can be accomplished concurrently by all SEs with the local backpressure, whereas it has to be a sequential operation with global backpressure. In this last case an SE needs to know how many packets it is going to transmit in the current slot to determine how many enable signals can be transmitted upstream, but such information must be first received from the downstream SEs. Thus the enable phase of the BP-ack protocol is started by SEs in stage n and ends with the receipt of the enable signal by SEs in stage 1. Let $l_d$ and $l_u$ (bit) be the size of each downstream and upstream signalling packet, respectively, and $l_c$ (bit) the length of an information packet (cell). Then the internal bit rate is $C_i = C$ for the QL protocol and $C_i = (1 + \eta)C$ for the BP protocol, where η denotes the switching overhead. This factor in the BP protocol with acknowledgment is given by

$$\eta = \begin{cases} \dfrac{l_d + (n-1)\,l_u}{l_c} & \text{GBP} \\[2ex] \dfrac{l_d + l_u}{l_c} & \text{LBP} \end{cases} \tag{6.8}$$

In the BP protocol with grant we do not have any request phase and the only signalling is represented by the enable phase that is performed as in the case of the BP-ack protocol. Thus the internal rate of the BP-gr protocol is given by Equation 6.8 setting $l_d = 0$.

The network is assumed to be loaded by purely random and uniform traffic; that is at stage 1:

1. A packet is received with the same probability in each time slot;

2. Each packet is given an outlet address that uniformly loads all the network outlets;

3. Packet arrival events at different inlets in the same time slot are mutually independent;

4. Packet arrival events at an inlet or at different inlets in different time slots are mutually independent.

Even if we do not provide any formal proof, assumption 2 is likely to be true at every stage, because of general considerations about flow conservation across stages. The independence assumption 3 holds for every network stage in the QL mode, since the paths leading to the different inlets of an SE in stage i cross different SEs in stage $i - 1$ (recall that one path through the network connects each network inlet to each network outlet). Owing to the memory device in each SE, assumption 4, as well as assumption 3 for the BP protocol, no longer holds in stages other than the first. For simplicity, assumption 3 is supposed to be always true in all the stages in the analytical models to be developed later. In spite of the correlation in packet arrival events at a generic SE inlet in stages 2 through n, our models assume independence of the state of SEs in different stages. Such a correlation could be taken into account by suitably modelling the upstream traffic source loading each SE inlet. Nevertheless, in order to describe simple models, each upstream source will be represented here by means of only one parameter, the average load.

We assume independence between the states of SEs in the same stage, so that one SE per stage is representative of the behavior of all the elements in the same stage ($SE_i$ will denote such an element for stage i). For this reason the topology of the network, that is the specific kind of banyan network, does not affect in any way the result that we are going to obtain. As usual we consider $N \times N$ banyan networks with $b \times b$ switching elements, thus including $n = \log_b N$ stages.

Buffered banyan networks were initially analyzed by Dias and Jump [Dia81], who only considered asymptotic loads, and by Jenq [Jen83], who analyzed the case of single-buffered input-queued banyan networks loaded by a variable traffic level. The analysis of buffered banyan networks was extended by Kumar and Jump [Kum84], so as to include replicated and dilated buffered structures. A more general analysis of buffered banyan networks was presented by Szymanski and Shiakh [Szy89], who give both separate and combined evaluation of different SE structures, such as SE input queueing, SE output queueing, link dilation. The analysis given in this section for networks adopting SEs with input queueing or output queueing is based on this last paper and takes into account the modifications and improvements described in [Pat91], mainly directed to improve the computational precision of network throughput and cell loss. In particular, the throughput is only computed as a function of the cell loss probability and not vice versa.

As far as networks with shared-queued SEs are concerned, some contributions initially appeared in the technical literature [Hlu88, Sak90, Pet90], basically aiming at the study of a single-stage network (one switching element). Convolutional approaches are often used that assume mutual independence of the packet flows addressing different destinations. Analytical models for multistage structures with shared-buffered SEs have been later developed in [Tur93] and [Mon92]. Turner [Tur93] proposed a simple model in which the destinations of the packets in the buffer were assumed mutually independent. Monterosso and Pattavina [Mon92] developed an exact Markovian model of the switching element, by introducing modelling approximation only in the interstage traffic. The former model gave very inaccurate results, whereas the latter showed severe limitation in the dimensions of the networks under study. The model described here is the simplest of the three models described in [Gia94] in which the SE state is always represented as a two-state variable. The other two more complex models therein, not developed here, take into account the correlation of the traffic received at any stage other than the first.

6.2.1 Input queueing

The functional structure of a $2 \times 2$ SE with input queueing, shown in Figure 6.8 in the solution with additional interstage links for signalling purposes, includes two (local) queues, each with capacity B cells, and a controller. Each of the local queues, which interface directly the upstream SEs, performs a single read and write operation per slot. The controller receives signals from the (remote) queues of the downstream SEs and from the local queues when performing the BP protocol. With this kind of queueing there is no need for an arbitration phase with downstream signalling, since each queue is fed by only one upstream SE. Thus the BP protocol can only be of the grant type. Nevertheless, arbitration must take place slot by slot by the SE controller to resolve possible conflicts arising when more than one HOL cell of the local queues addresses the same SE outlet.

Packet transmissions to downstream SEs (or network outlets) and packet receipt from upstream SEs (or network inlets) take place concurrently in the SE at each time slot. For the sake of better understanding the protocols QL and GBP, we can well imagine for an SE that packet transmissions occur in the first half of the slot, whereas packet receipts take place in the second half of the slot based on the empty buffer space at the end of the first phase. With the LBP protocol there is no need for such decomposition as the amount of packets to be received is independent of the packets to be transmitted in the slot. In such a way we can define a virtual half of each time slot that separates transmissions from receipts.

In order to develop analytical models for the network, it turns out useful to define the following probability distributions to characterize the dynamic of the generic input queue of the SE, the tagged queue:

• the probability that the tagged queue at stage i at time t contains m packets;

• the probability that the tagged queue at stage i at time t contains m packets if we consider to be removed those packets that are going to be transmitted in the slot t;

• the probability that an SE at stage i at time t offers a packet to a queue at stage $i + 1$; at the network inlets this probability is the external offered load p;

• the probability that a packet offered by a queue at stage i at time t is actually transmitted by the queue;

• the probability that a packet offered by a queue at stage i at time t is selected for transmission.

Note that the second of these distributions denotes the probability distribution of the tagged queue at the half-time slot if transmission and receipt of packets occur sequentially in the slot. The LBP protocol does not require the definition of this distribution, as the ack/grant signals depend only on the idle buffer space at the end of the last slot. Moreover, for the sake of simplicity, the following notation is used:

In the following, time-dependent variables without the subscript t indicate the steady-state value assumed by the variable.

The one-step transition equations for the protocols QL and GBP describing the dynamic of the tagged queue due first to cell transmissions and then to the cell receipts are easily obtained:

The analogous equations for the LBP protocol with are


which for reduce to

and for to

Based on the independence assumption of packet arrivals at each stage, the distribution probability of is immediately obtained:

(6.9)

with the boundary condition

Since the probability that a HOL packet is selected to be transmitted to the downstream SE is

the distribution probability of is given by

An iterative approach is used to solve this set of equations in which we compute all the state variables from stage 1 to stage n using the values obtained in the preceding iteration for the unknowns. A steady state is reached when the relative variation in the value assumed by the variables is small enough. Assuming that a suitable and consistent initial value for these variables is assigned, we are thus able to evaluate the overall network performance.
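The structure of such an iterative solution can be sketched independently of the specific queue equations. The Python sketch below is a generic fixed-point loop with a stopping rule on the relative variation; the `toy_sweep` update is purely a stand-in chosen for illustration (it is not the queueing model of this section), in which each stage value depends on the upstream value from the current sweep and on the downstream value from the preceding iteration. All names are hypothetical.

```python
def solve_steady_state(update, initial, tol=1e-9, max_iter=10000):
    """Iterate state <- update(state) until the relative variation is below tol."""
    state = list(initial)
    for iteration in range(1, max_iter + 1):
        new_state = update(state)
        # Largest relative change over all state variables.
        variation = max(
            abs(new - old) / max(abs(new), 1e-300)
            for new, old in zip(new_state, state)
        )
        state = new_state
        if variation < tol:
            return state, iteration
    raise RuntimeError("no steady state reached within max_iter iterations")


def toy_sweep(load, n):
    """Stand-in update: one sweep from stage 1 to stage n (illustrative only)."""
    def update(old):
        new = []
        for i in range(n):
            upstream = load if i == 0 else new[i - 1]     # current sweep
            downstream = old[i + 1] if i + 1 < n else 0.0  # preceding iteration
            new.append(0.5 * upstream + 0.25 * downstream)
        return new
    return update


if __name__ == "__main__":
    state, iterations = solve_steady_state(toy_sweep(0.8, 4), [0.0] * 4)
    print(iterations, state)
```

In an actual model the `update` function would recompute the queue distributions and transmission probabilities of every stage; the stopping rule and the sweep order stay the same.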


Packet losses take place only at stage 1 with backpressure, whereas in the QL mode a packet is lost at stage i only if it is not lost in stages 1 through i−1, that is

Moreover, the switch throughput, ρ, is the traffic carried by the last stage
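As a hedged illustration of the QL statement above (a packet is lost at stage i only if it survived stages 1 through i−1), the overall loss can be composed from per-stage loss probabilities as follows. The name `stage_losses` is hypothetical, and each entry is assumed to be the loss probability at a stage conditioned on the packet having reached that stage.

```python
from typing import Sequence


def overall_loss(stage_losses: Sequence[float]) -> float:
    """Compose conditional per-stage loss probabilities into a network loss."""
    survive = 1.0  # probability of not being lost in the stages seen so far
    total = 0.0
    for loss in stage_losses:
        total += survive * loss   # lost exactly at this stage
        survive *= 1.0 - loss     # survives this stage as well
    return total


example_loss = overall_loss([0.1, 0.2])  # 0.1 + 0.9 * 0.2
```

The same value is obtained as one minus the product of the per-stage survival probabilities, which is a quick consistency check for any numerical implementation.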

if low offered loads are considered, whereas the model for LBP and QL turns out to be less accurate.

The loss performance given by the analytical model for the three protocols GBP, LBP and QL for the same buffer size is shown in Figure 6.12. As one might expect, the GBP protocol gives the best performance and behaves significantly better than the other two protocols, especially for small buffers. Apparently, if the buffer is quite large, the performance improvement enabled by exploiting the buffer positions (at most one with IQ) being emptied in the same slot (GBP over LBP) becomes rather marginal.

6.2.2 Output queueing

With output queueing, the (local) queues of the SE, each with a capacity of $B_o$ cells, face the SE outlets, as represented in Figure 6.13 for a SE in the space division solution for the inter-stage signalling. Now switching precedes rather than follows queueing, so that each queue must be able to perform up to b write and 1 read operations per slot. The SE controller exchanges information with the SEs in the adjacent stages and with the local queues when the BP protocol is operated. In case of possible saturation of any local queue, it is a task of the SE controller to select the upstream SEs enabled to transmit a packet without overflowing the local queue capacity. Note that now there is no need of arbitration by the SE controller in the downstream packet transmission, as each local queue feeds only one downstream SE.

Figure 6.9 Loss performance with IQ and GBP

Figure 6.10 Loss performance with IQ and LBP

Figure 6.11 Loss performance with IQ and QL

Figure 6.12 Loss performance with IQ and different protocols

In general, output-queued SEs are expected to perform better than input-queued SEs. In fact, in the latter structure HOL blocking can take place, that is, a HOL cell is not transmitted owing to a conflict with the HOL cell of other local queue(s) for the same SE outlet, thus reducing the SE throughput. With output-queued SEs each local queue has exclusive access to a SE outlet, and possible multiple cell arrivals from upstream SEs are handled through suitable hardware solutions. Thus, SEs with output queueing are much more complex than SEs with input queueing.

With output queueing, the one-step transition equations for the protocols QL and GBP describing the dynamics of the tagged output queue due to packet transmissions are then given by

A different behavior characterizes the SE dynamics under the BP protocol with acknowledgment or grant when the number of idle places in the buffer is less than the SE size b. Therefore the evolution of the tagged output queue due to packet receipt under QL or BP-ack is described by

Figure 6.13 SE with output queueing


and under the BP-gr protocol by

After defining the function

which represents the probability that a queue holding h packets transmits a packet, the one-step transition equations in the case of the LBP-ack protocol are

The analogous equations for the LBP-gr protocol are obtained by simply replacing b with

when b appears as first parameter in the function and as superior edge in a


Note that $\min(1, x/y)$ denotes the probability that the HOL packet of the tagged queue in stage i is actually granted the transmission, given that x positions are available in the downstream buffer and y SEs in stage i compete for them. These y elements are the tagged queue together with the other non-empty queues addressing the same SE outlet in stage i+1 under the acknowledgment protocol, or just the b SEs in stage i interfacing the same SE of stage i+1 as the tagged queue under the grant protocol. This probability value becomes 1 for x ≥ y, since all the contending packets, including the HOL packet in the tagged queue, are accepted downstream.

The analogous equations for the LBP protocol are obtained by simply replacing with , whereas for the QL mode we obviously have

After applying the iterative computation of these equations already described in the preceding Section, the steady-state network performance measures are obtained. Throughput and delay figures are expressed as in the case of input queueing, so that the throughput value is given by Equation 6.10 and the delay by Equation 6.11. The packet loss probability is now

where is the loss probability at stage i and $\theta_i(x)$ represents the probability that a packet offered to a memory with x idle positions in stage i is refused. These variables are obtained as

$$b_{i,t} = \sum_{h} d'_{i+1,t}(h) \sum_{j} \beta\left(b-1,\, j,\, \frac{1-d_{i,t}(0)}{b}\right) \min\left(1,\, \frac{B_o-h}{j+1}\right)$$

As with IQ, we now assess the accuracy of the analytical model by considering a network with and with a total buffer capacity per SE in the range of 4–32 cells. Now the GBP protocol with acknowledgment gives a very good matching with simulation data, as with input queueing (Figure 6.14), whereas the same is no longer true when grant is used (Figure 6.15). The degree of accuracy in evaluating loss probabilities by the GBP-gr protocol applies also to the LBP protocols, in both acknowledgment and grant versions. In the case of the QL protocol, the model accuracy with output queueing is comparable with that shown in Figure 6.11 for input queueing.

The packet loss probability of the five protocols with output queueing given by the analytical model is plotted in Figure 6.16. As with IQ, the GBP significantly improves the performance of the LBP only for small buffers. The same reasoning applies to the behavior of the acknowledgment protocols compared to the grant protocols. In both cases the better usage of the buffer enabled by GBP and by ack when the idle positions are less than b is appreciable only when the buffer size is not much larger than b.

Figure 6.14 Loss performance with OQ and GBP-ack

$$\theta_i(x) = \sum_{r=x}^{b-1} \beta\left(b-1,\, r,\, \frac{a_{i-1,t}}{b}\right) \frac{r+1-x}{r+1}$$

Figure 6.15 Loss performance with OQ and GBP-gr

Figure 6.16 Loss performance with OQ and different protocols


6.2.3 Shared queueing

An SE with internal shared queueing is provided with a total buffer capacity of (cells) that is shared among all the SE inlets and outlets (see Figure 6.17, in which additional interstage links have been used for signalling purposes). The buffer is said to include b logical queues, each holding the packets addressing a specific SE outlet. On the SE inlet side, the cells offered by the upstream SEs are stored concurrently in the buffer, which holds all the cells independently of the individual destinations. According to our FIFO assumption, the controller must store the received packets sequentially in each logical queue, so as to be able to transmit one packet per non-empty logical queue in each slot. Thus the queue must be able to perform up to b write and b read operations per slot. As in the case of SEs with output queueing, HOL blocking cannot take place, since there is no contention among different queues for the same SE outlets. The SE controller, which exchanges information with the SEs in adjacent stages and with the local queue, performs the arbitration for the concurrent access to the local queue by the upstream switching elements when buffer overflow is going to occur.

Based on the model assumptions defined at the beginning of Section 6.2, and analogously to the two previous queueing models, only one SE per stage, the tagged SE, is studied as representative of the behavior of all the elements in the same stage ( will denote such an element for stage i). Let us define the following state variables for the tagged SE:

• : state of an SE in stage i ( ) at time t;

• : state that at time t would assume if the packets to be transmitted during slot t are removed from its buffer (that is, the state assumed by the SE at half time slot if transmission and receipt of packets are considered to occur sequentially in the slot);

whose probability distributions are

• = Pr [the state of at time t is ];

Figure 6.17 SE with shared queueing

Note that the LBP protocol does not require the definition of the variable and the corresponding distribution as the ack/grant signals depend only on the idle buffer space at the end of the last slot. Observe that now the state variable is much more general than the scalar representing the queue state with the previous queueing types. Now the following additional variables are needed:

• = Pr [the buffer of at time t holds h packets];

• = Pr [the buffer of at time t holds h packets, if the packets to be transmitted during slot t are removed from the buffer];

For the sake of convenience we redefine here also the variable describing the interstage traffic, that is

• = Pr [a link outgoing from offers a packet at time t]; denotes the external offered load;

• = Pr [a packet offered by at time t is actually transmitted by ]

If S denotes the set of all the SE states, the dynamics can be expressed as

(6.12)

where

• = Pr [a transition occurs from state to state ];

• = Pr [a transition occurs from state to state ]

Different approaches have been proposed in the technical literature. We simply recall here the main assumptions and characteristics of two basic models, referring to the original papers in the literature for the analytical derivations of the performance results. In the first proposal by Turner [Tur93], which here will be referred to as a scalar model, the state simply represents the number of packets in the buffer at time t. With this model the destinations of packets in the buffer are assumed to be mutually independent. In the second proposal by Monterosso and Pattavina [Mon92], which here will be called a vectorial model, the independence assumption of the addresses of packets sitting in the same buffer is removed. The state s is a vector of b components , in which represents the number of packets addressing one specific SE output and the different components are sorted in decreasing order. The shortcomings of these two models are very poor results for the former and a large state space growth with buffer or SE size in the latter.
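To give a feeling for the state space growth of the vectorial model, the following sketch counts the vectors of b non-negative components, sorted in non-increasing order, whose sum does not exceed the shared buffer capacity. The function name and the parameters `buffer_size` and `num_queues` are hypothetical labels introduced here for the buffer capacity and the SE size b; the counting rule is an assumption drawn from the description of the state given above, not a formula from the cited papers.

```python
from functools import lru_cache


def count_vector_states(buffer_size: int, num_queues: int) -> int:
    """Count sorted occupancy vectors with at most buffer_size packets overall."""

    @lru_cache(maxsize=None)
    def count(remaining: int, slots: int, cap: int) -> int:
        # Non-increasing sequences of length `slots`, entries <= cap,
        # whose sum does not exceed `remaining`.
        if slots == 0:
            return 1
        return sum(
            count(remaining - first, slots - 1, first)
            for first in range(min(cap, remaining) + 1)
        )

    return count(buffer_size, num_queues, buffer_size)


# By comparison, a scalar model needs only buffer_size + 1 states.
growth = {b: count_vector_states(16, b) for b in (2, 4, 8)}
```

Evaluating `growth` for a few SE sizes shows how quickly the number of states increases with b for a fixed buffer, which is the shortcoming mentioned for the vectorial model.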

Our approach to make the analysis feasible and accurate is to represent the buffer content by means of only two variables, one being the content of a specific logical queue (the tagged queue) and the other being the cumulative content of the other logical queues [Gia94].

Thus, if S denotes the set of the SE states, the buffer content in the generic SE state will be represented by the two state variables

• : number of packets in the tagged logical queue when the SE state is m;

• : cumulative number of packets in the other logical queues when the SE state is m;

with the total buffer content indicated by and the obvious boundary conditions

This model is called here a bidimensional model, since only two variables characterize the SE. Two other more complex (and accurate) models are also described in [Gia94], which use one (tridimensional model) or two (four-dimensional model) additional state variables to take into account the correlation in the interstage traffic. A different kind of bidimensional model is described in [Bia93]. In our bidimensional model this traffic is assumed to be strictly random and thus characterized by only one parameter. The two distributions and , as well as and , can now be related by

(6.13)

(note that m is a state index, while h is an integer).
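The relation between a distribution over SE states and the distribution of the total buffer content can be sketched as a marginalization: the probability that the buffer holds h packets is the sum of the probabilities of all states whose two components add up to h. The code below is a minimal sketch under assumptions: states are represented as pairs (tagged queue content, cumulative content of the other queues), the only boundary condition enforced is that their sum does not exceed the buffer capacity, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]  # (tagged queue content, content of the other queues)


def bidimensional_states(buffer_size: int) -> List[State]:
    """Enumerate the pairs whose total content fits in the shared buffer."""
    return [
        (tagged, others)
        for tagged in range(buffer_size + 1)
        for others in range(buffer_size + 1 - tagged)
    ]


def occupancy_distribution(
    state_probs: Dict[State, float], buffer_size: int
) -> List[float]:
    """Sum state probabilities over all states with the same total content h."""
    occupancy = [0.0] * (buffer_size + 1)
    for (tagged, others), prob in state_probs.items():
        occupancy[tagged + others] += prob
    return occupancy


states = bidimensional_states(2)
uniform = {state: 1.0 / len(states) for state in states}
occupancy = occupancy_distribution(uniform, 2)
```

With this representation the number of states grows only quadratically with the buffer capacity, independently of the SE size b. The actual boundary conditions of the model may restrict the state set further.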

In order to solve Equation 6.12 it is useful to split into two factors:

(6.14)

where

• = Pr [ at time t receives the number of packets necessary to reach state m from state j];

• = Pr [the transition from state j to state m takes place, given that the SE has received the number of packets required by this transition]

The two factors of are given by

(6.15)

where the average interstage load is

(6.16)

with the usual boundary condition

The function describing the packet transmission process by is given by

(6.17)

in which represents the probability that v non-tagged logical queues of at time t hold a packet ready to be transmitted, given that the buffer holds l packets addressed to the non-tagged outlets. We can compute this function as follows:

with

where indicates the number of packets in the j-th (non-tagged) logical queue of at time t.

In order to calculate the joint distribution

some approximations are introduced. We assume all the logical queues to be mutually independent and their distribution to be equal to that of the tagged logical queue. Therefore we have
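Under the independence approximation just stated, the probability that v non-tagged logical queues are non-empty, given that together they hold l packets, can be computed by combining identical per-queue distributions. The sketch below is one possible way to do this and is not the book's formula: the names `marginal` (the distribution of a single logical queue) and `num_other_queues` (the b − 1 non-tagged queues) are hypothetical, and the buffer capacity limit on the total content is not enforced here.

```python
from typing import Dict, Sequence, Tuple


def nonempty_given_total(
    marginal: Sequence[float], num_other_queues: int
) -> Dict[Tuple[int, int], float]:
    """Return Pr[v queues non-empty | l packets in total], keyed by (v, l)."""
    # joint[(v, l)] = Pr[v queues non-empty and total content equal to l]
    joint: Dict[Tuple[int, int], float] = {(0, 0): 1.0}
    for _ in range(num_other_queues):
        updated: Dict[Tuple[int, int], float] = {}
        for (v, l), prob in joint.items():
            for k, q_k in enumerate(marginal):
                if q_k == 0.0:
                    continue
                key = (v + (1 if k > 0 else 0), l + k)
                updated[key] = updated.get(key, 0.0) + prob * q_k
        joint = updated
    # Condition on the total content l.
    totals: Dict[int, float] = {}
    for (v, l), prob in joint.items():
        totals[l] = totals.get(l, 0.0) + prob
    return {(v, l): prob / totals[l] for (v, l), prob in joint.items()}


conditional = nonempty_given_total([0.5, 0.25, 0.25], 2)
```

In the example, two queues that are each empty with probability 0.5 and hold one or two packets with probability 0.25 give a conditional probability of 0.8 that exactly one queue is non-empty when the total content is two packets.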
