ATM Switching with Non-Blocking Multiple-Queueing Networks
outcome of the contention phase carried by field GR: GR < K means a granted request,
In the data phase the PCs that either lose contention in the request phase or issue an idle packet REQ transmit an idle packet DATA, whereas the contention winners transmit a packet DATA carrying their HOL cell (see the example in Figure 8.8, which refers to solution c' for the implementation of the routing network described in Section 3.2.3). The sorting–routing structure is K-non-blocking, so that all the non-idle packets DATA reach the addressed OPC.
which two are denied: one addressing outlet 1 owing to queue saturation and one addressing outlet 5 owing to a number of requests larger than K. The same packet configuration of this example, applied to an IOQ switch without backpressure, has been shown in Figures 8.4–8.5: in that case five packets were successfully switched, rather than four as in the BP switch, owing to the absence of queue saturation control.
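The effect of backpressure on the number of granted requests can be illustrated with a minimal sketch. The request pattern, the speed-up K = 2 and the idle positions per output queue used below are hypothetical values, chosen only so that the totals agree with the four and five switched packets quoted above; the function name is ours and the code is not a model of the three-phase hardware.

```python
from collections import Counter

def granted_per_outlet(requests, speedup, free_positions=None):
    """Return {outlet: number of granted requests} for one slot.

    requests       -- destination outlet of each HOL packet
    speedup        -- K, maximum number of packets an outlet accepts per slot
    free_positions -- {outlet: idle output-queue positions}; None means that
                      no backpressure is applied (queue status is ignored)
    """
    grants = {}
    for outlet, count in Counter(requests).items():
        limit = speedup
        if free_positions is not None:
            # With backpressure, queue saturation also caps the grants.
            limit = min(limit, free_positions.get(outlet, speedup))
        grants[outlet] = min(count, limit)
    return grants

# Hypothetical pattern: six requests, K = 2 (assumed values).
requests = [1, 1, 5, 5, 5, 7]
free = {1: 1, 5: 2, 7: 2}   # outlet 1 has a single idle position

with_bp = granted_per_outlet(requests, 2, free)
without_bp = granted_per_outlet(requests, 2)
print(sum(with_bp.values()), sum(without_bp.values()))
```

With backpressure one request for outlet 1 is denied by queue saturation and one request for outlet 5 by the speed-up limit; without backpressure only the latter denial occurs.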
Hardware structure. In the above IOQ Three-Phase switch with BP protocol and size N × N, the K-non-blocking network is the minimum-cost implementation described in Section 3.2.3, with a Batcher sorting network cascaded through an EGS pattern onto K banyan networks. As already discussed, the self-routing tag DA alone suffices in such a K-non-blocking structure to make the packet self-routing.
The merge network MN receives a packet bitonic sequence, that is the juxtaposition of an address-ascending sequence of packets QUE on inlets and an address-descending
The allocation network AN is a variation of the running sum adder network already described for the channel grouping application in an IQ switch (see Section 7.1.3.1) that sums
Figure 8.8 Example of packet switching (Phase III) in the BP IOQ Three-Phase switch
on partitioned sets of its inlets. Here the network size is doubled, since the queue status information is also used in the running sum. Now the allocation network has size and
Figure 8.9. The component with inlets c_{i−1} and c_i, outlets reset_i and info_i, which is referred to as an i-component, has the function of identifying the partition edges, that is the inlets with the smallest and largest address carrying packets REQ with the same address DA. The network AN drops the fields AC and DA, while transferring transparently all the other fields of packets REQ and QUE except their last fields GR and QS, respectively. As long as this transparent transfer takes place, the adders are disabled, so that they start counting when they receive the first bit of GR and QS. If 0^(b) and 1^(b) denote the binary numbers 0 and 1, each with b bits, the outlets of the i-component assume the following status:
The 0-component always transfers transparently on the field QS or GR received on
Figure 8.9 also shows the output of the i-components and adders at each stage, when the running sum is actually performed with the same pattern of packets of Figure 8.7.
The concentration network CN is required to route N packets from N inlets
packets REQ represent a monotonic sequence of increasing addresses, as is guaranteed by the
pattern of indices written in field CI by SN and by the topological mapping between adjacent networks. Therefore it can easily be shown that CN can be implemented as a reverse n-cube network with n = log2 N stages. In fact, if we look at the operations of the concentrator by means of a mirror (inlets become outlets and vice versa, and the reverse n-cube network becomes an n-cube network), the sequence of packets to be routed to the outlets of the mirror network is compact and monotone (CM sequence), and thus an n-cube network, which is functionally equivalent to an Omega network, accomplishes this task (see Section 3.2.2).
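The mirror argument can be checked numerically with a small sketch. It assumes the usual shuffle-exchange model of an Omega network with destination-tag routing (most significant bit first), and it takes the mirrored concentrator as a set of packets entering on the compact inlets 0, 1, ..., m−1 and addressing the busy inlets of CN in increasing order; the function names are ours.

```python
from itertools import combinations

def omega_conflict_free(pairs, n):
    """Check that (inlet, outlet) pairs self-route through an n-stage
    Omega network without two packets sharing an interstage link.

    After stage k a packet sits on the link whose n-bit label is made of
    the low n-k bits of its inlet followed by the high k bits of its outlet.
    """
    size = 1 << n
    for k in range(1, n + 1):
        seen = set()
        for inlet, outlet in pairs:
            link = ((inlet << k) & (size - 1)) | (outlet >> (n - k))
            if link in seen:
                return False
            seen.add(link)
    return True

def mirrored_concentration(busy_inlets):
    """Mirror view of the concentrator: the packet of rank j (compact
    inlet j) addresses the j-th busy inlet, a monotone sequence."""
    return [(rank, inlet) for rank, inlet in enumerate(sorted(busy_inlets))]

n = 3
size = 1 << n
all_routable = all(
    omega_conflict_free(mirrored_concentration(busy), n)
    for m in range(size + 1)
    for busy in combinations(range(size), m)
)
print(all_routable)
```

The exhaustive check covers every subset of busy inlets for N = 8; a sequence that is not compact, such as inlets 0 and 4 addressing outlets 0 and 1, collides already at the first stage.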
As in all the other applications of the three-phase algorithm, an internal channel rate higher than the external rate C bit/s must be used, since the request, acknowledgment and data phases share the same hardware. The switching overhead factor is computed assuming the same hypotheses (e.g., each sorting/switching/adding stage accounts for a 1-bit latency) and following the same procedure as for the IQ switch (Section 7.1.1.1).
acknowledgment phase. Since the length of the six fields in a packet REQ sums up to
bit, then the total duration of the first two phases is
c_{i−1}, c_i, reset_i, info_i; 0^(b), 1^(b); 1^(b) if PI(c_i) = 1, PI(c_{i−1}) = 0
Combined Input–Output Queueing
phase lasts bit times). Such an overhead, which is higher than in a multichannel switch (see Section 7.1.3.1), can be reduced by pipelining as much as possible the transmission of the different packets through the interconnection network. According to the procedure explained in [Pat90], a reduced duration of the first two phases is obtained,
8.1.2 Performance analysis
The analysis of a non-blocking switch with input and output queueing (IOQ) is now performed under the random traffic assumption, with p denoting the average value of the offered load. The analysis relies, as in the case of pure input queueing, on the concept of virtual queue, defined for each outlet i as the set of HOL positions in the different input queues holding a cell addressed to outlet i. A cell with outlet address i entering the HOL position also enters the virtual queue of outlet i. So, the capacity of each virtual queue is N (cells). Each virtual queue feeds the corresponding output queue, whose server is the switch output. A graphical representation of the interaction among these queues is given in Figure 8.10.
The analysis always assumes a FIFO service in the input, virtual and output queues and, unless stated otherwise, an infinite switch size is considered. Thus the number of cells entering the virtual queues in a slot approaches infinity and the queue joined by each such cell is independently and randomly selected. Furthermore, since the arrival process from individual inlets to a virtual queue is asymptotically negligible, the interarrival time from an input queue to a virtual queue becomes sufficiently long. Therefore, the virtual queues, the output queues, as well as the input queues, are mutually independent discrete-time systems. Owing to the random traffic assumption, the analysis will be referred to the behavior of a generic “tagged” input, virtual and output queue, as representative of any other queue of the
Figure 8.10 Queue model for the non-blocking IOQ switch
same type. Let p_i, p_v and p_o denote the probability that a packet is offered to the tagged input queue, that a packet leaves the tagged input queue and that a packet leaves the tagged output queue, respectively. If the external offered load is p, then p_i = p.
As usual, the three main performance parameters will be computed, that is the switch capacity, the average packet delay T and the packet loss probability π. These latter two measures will be computed as
(8.1)
(8.2)
where η denotes a waiting time, δ a queueing time (waiting plus service) and the subscripts i, v and o indicate the input, virtual and output queue, respectively. Note that η_i denotes the time it takes just to enter the HOL position in the input queue, as the waiting time before winning the output contention is represented by η_v.
The analysis will be first developed for the case of an output queue size equal to the speed-up factor and infinite input queues, and then extended to the case of arbitrary capacities for input and output queues.
8.1.2.1 Constrained output queue capacity
Our aim here is to analyze the IOQ switch with backpressure, output queue capacity equal to the switch speed-up K and infinite input queues [Ili90]. Note that the selected output queue capacity is the minimum possible value, since it would make no sense to enable the switching of K packets and to have an output queue capacity less than K. Rather than directly providing the more general analysis, this model, in which the output queues have the minimum capacity compatible with the switch speed-up, has the advantage of emphasizing the impact of the speed-up factor alone on the performance improvements obtained over an IQ switch.
Let us study first the tagged input queue, which is fed by a Bernoulli process. Therefore it can be modelled as a Geom/G/1 queue, whose service time is given by the waiting time η_v it takes for the tagged packet to begin service in the tagged virtual queue plus the packet transmission time, that is η_v + 1. Using a well-known result [Mei58] about this queue, we get
(8.3)
in which the first and second moment of the waiting time in the virtual queue are obtained from the study of the tagged virtual and output queue.
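As a hedged illustration of how the two moments enter the input queue delay, the sketch below uses the standard mean waiting time of a discrete-time queue with Bernoulli arrivals and general service time S, namely p E[S(S − 1)] / (2(1 − p E[S])), with S taken as the virtual queue waiting time plus one slot. We assume, without being able to verify it against the missing display, that Equation 8.3 has this form; the function and argument names are ours.

```python
def geom_g1_mean_wait(p_i, m1, m2):
    """Mean waiting time (slots) in a Geom/G/1 queue with arrival
    probability p_i and service time eta_v + 1 slots, where
    m1 = E[eta_v] and m2 = E[eta_v**2] (assumed inputs).
    """
    load = p_i * (m1 + 1.0)   # arrival probability times mean service time
    if load >= 1.0:
        raise ValueError("input queue is unstable: p_i * (E[eta_v] + 1) >= 1")
    # E[S(S - 1)] with S = eta_v + 1 equals E[eta_v**2] + E[eta_v].
    return p_i * (m2 + m1) / (2.0 * (1.0 - load))

# No waiting in the virtual queue: service takes one slot, so no queueing.
print(geom_g1_mean_wait(0.5, 0.0, 0.0))
# Deterministic one-slot wait in the virtual queue (service time of 2 slots).
print(geom_g1_mean_wait(0.25, 1.0, 1.0))
```

The stability check also shows where the switch capacity comes from in this reading: the input queue saturates when p_i (E[η_v] + 1) reaches 1.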
Let us introduce the following notations:
• A_m: number of packets entering the tagged virtual queue at the beginning of slot m;
• R_m: number of packets in the tagged virtual queue after the packet switch taking place in slot m;
• number of packets switched from the tagged virtual queue to the tagged output queue in slot m;
• number of packets in the tagged output queue at the end of slot m, that is after the packet switch taking place in slot m and after the possible packet transmission to the corresponding switch outlet.
Based on the operations of the switch in the BP mode, the state equations for the system can be written as follows:
(8.4)
These equations take into account that the packet currently transmitted by the output queue to the switch outlet occupies a position of the output queue until the end of its transmission. Then, the cumulative number of packets in the tagged virtual queue and tagged output queue is given by
(8.5)
Note that according to this equation, if one packet enters an empty tagged virtual queue and the output queue is also empty, then the cumulative number of packets is equal to 1. In fact, the packet is immediately switched from the tagged virtual queue to the tagged output queue, but the packet spends one slot in the tagged output queue (that is the packet transmission time).
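The slot-by-slot evolution described above can be sketched as follows. This is our reading of the BP operation with output queue capacity equal to the speed-up: a non-empty output queue releases one packet per slot, and the virtual queue may fill only the positions left idle. The function and variable names are hypothetical and the arrival sequence is arbitrary.

```python
def bp_step(r_prev, q_prev, arrivals, b_o):
    """One slot of the tagged virtual/output queue pair under backpressure.

    r_prev, q_prev -- virtual and output queue contents at the end of the
                      previous slot
    arrivals       -- packets entering the virtual queue in this slot
    b_o            -- output queue capacity (assumed equal to the speed-up)
    Returns (r, q, switched).
    """
    backlog = r_prev + arrivals             # packets waiting to be switched
    q_left = max(0, q_prev - 1)             # a non-empty queue sends one packet
    switched = min(b_o - q_left, backlog)   # only idle positions can be filled
    return backlog - switched, q_left + switched, switched

r, q = 0, 0
history = []
for a in [1, 0, 4, 2, 0, 0, 0, 0, 0]:
    r, q, _ = bp_step(r, q, a, b_o=2)
    history.append((r, q))
print(history)
```

In every slot of this trace the virtual queue is non-empty only when the output queue is full, and a single packet entering an empty system yields a total content of one, in agreement with the remark above.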
As with pure input queueing, the procedure described in [Kar87] enables us to prove that the random variable A_m, also representing the number of packets becoming HOL in their input queue and addressing the tagged output queue, has a Poisson distribution as N → ∞. Since Equation 8.6 describes the evolution of a single-server queue with deterministic server (see Equation A.1), the system composed by the cascade of the virtual queue and tagged output queue behaves as an M/D/1 queue with an infinite waiting list. Note that the cumulative number of packets represents here the total number of users in the system including the user currently being served.
In order to compute the cumulative delay spent in the tagged virtual queue and tagged output queue, we can say that a packet experiences two kinds of delay: the delay for transmitting the packets still in the virtual queue that arrived before the tagged packet, and the time it takes for the tagged packet to be chosen in the set of packets with the same age, i.e. arriving in the same slot. By still relying on the approach developed in [Kar87], it is possible to show that the cumulative average delay in the tagged virtual queue and tagged output queue is given by the average delay (waiting time plus service time) of an M/D/1 queue with arrival rate p, that is
(8.7)
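The M/D/1 behavior of the cascade can be checked numerically with a short sketch. It assumes that the total content T of the two queues evolves as T_m = max(0, T_{m−1} − 1) + A_m with Poisson arrivals of mean p, and it compares the delay obtained from Little's formula with the classical M/D/1 value 1 + p / (2(1 − p)); the truncation level and the number of sweeps are arbitrary choices of ours.

```python
import math

def poisson_pmf(rate, size):
    """First `size` probabilities of a Poisson distribution."""
    pmf = [math.exp(-rate)]
    for i in range(1, size):
        pmf.append(pmf[-1] * rate / i)
    return pmf

def total_occupancy_distribution(p, max_state=40, sweeps=400):
    """Stationary distribution of T_m = max(0, T_{m-1} - 1) + A_m,
    with A_m ~ Poisson(p), truncated at max_state, by power iteration."""
    arrivals = poisson_pmf(p, max_state + 1)
    dist = [1.0] + [0.0] * max_state
    for _ in range(sweeps):
        new = [0.0] * (max_state + 1)
        for t, prob in enumerate(dist):
            if prob == 0.0:
                continue
            base = max(0, t - 1)   # one departure if the system is busy
            for a, pa in enumerate(arrivals):
                new[min(max_state, base + a)] += prob * pa
        total = sum(new)
        dist = [x / total for x in new]
    return dist

p = 0.5
dist = total_occupancy_distribution(p)
mean_content = sum(t * x for t, x in enumerate(dist))
delay_little = mean_content / p               # Little's formula
delay_md1 = 1.0 + p / (2.0 * (1.0 - p))       # classical M/D/1 delay
print(round(delay_little, 6), delay_md1)
```

The probability of an empty system returned by the iteration is 1 − p, as expected for a single-server queue with unit service time.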
However, a more detailed description of the virtual queue operation is needed, since the average waiting time in the input queue is a function of the first two moments of the waiting time in the virtual queue (see Equation 8.3). Since it has been proved [Ili90] that
we simply have to compute the first two moments of the virtual queue content. The assumption B_o = K greatly simplifies the computation of these two moments, since the occurrence of at least one empty position in the tagged output queue implies that the tagged virtual queue is empty. So, a single state variable fully describes the content of the two queues, which can be seen as a single queue with the first K positions representing the output queue and the other N positions modelling the virtual queue (Figure 8.12). Therefore
Figure 8.13 Switch capacity of a BP IOQ switch when B_o = K
Figure 8.14 Delay performance of a BP IOQ switch when B_o = K
In the case of infinite input queue, the previous analysis of a Geom/G/1 queue applies here too, so that the average waiting time is given by Equation 8.3, in which the first two moments of the waiting time in the virtual queue are given by definition as
(8.10)
Unlike the previous case of a constrained output queue size, now the distribution function of the waiting time in the virtual queue is needed explicitly, since the condition B_o > K makes it possible that the tagged output queue is not full and the tagged virtual queue is not empty in the same slot. The distribution of this random variable is computed later in this section.
When the input queue has a finite capacity B_i, the tagged input queue behaves as a Geom/G/1/B_i queue with a probability p_i of a packet arrival in a slot. Also in this case the service time distribution is general and the tagged virtual queue receives a load that can still be
input queue. An iterative approach is used to solve the Geom/G/1/B_i queue, where B_i represents the total number of users admitted in the queue (including the user being served), which, starting from an initial admissible value for p_v, consists in
• evaluating the distribution function of the service time in the input queue, which is given by the distribution function of the waiting time in the virtual queue, computed in the following section as a function of the current p_v value;
• finding the cell loss probability in the input queue according to the procedure described in the Appendix.
These steps are iterated until the new value of p_v differs from the preceding value by less than a predefined threshold. The interaction among different input queues, which occurs when different HOL packets address the same output queue, is taken into account in the evaluation of the distribution function of the waiting time in the virtual queue, as will be shown later.
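The iteration above is a fixed-point search on the load offered to the virtual queue, and can be sketched as follows. The update rule p_v = p_i (1 − loss), the function names and the toy loss function are assumptions of ours: in the actual procedure the loss would come from the service time distribution and from the finite input queue analysis of the Appendix.

```python
def solve_virtual_queue_load(p_i, input_loss, tol=1e-9, max_iter=1000):
    """Iterate on the virtual queue load p_v until it stabilizes.

    p_i        -- probability of a packet arrival at the input queue
    input_loss -- callable returning the input queue loss probability for
                  a given p_v (hypothetical stand-in for the two steps of
                  the procedure: service time distribution, then loss)
    Returns (p_v, loss).
    """
    p_v = p_i                          # initial admissible value
    for _ in range(max_iter):
        loss = input_loss(p_v)
        p_new = p_i * (1.0 - loss)     # load carried by the input queue
        if abs(p_new - p_v) < tol:     # stop when consecutive values agree
            return p_new, loss
        p_v = p_new
    raise RuntimeError("no convergence within max_iter iterations")

# Toy loss model, for illustration only: loss grows linearly with p_v.
p_v, loss = solve_virtual_queue_load(0.8, lambda load: 0.1 * load)
print(round(p_v, 6), round(loss, 6))
```

For the toy model the fixed point can be found in closed form, p_v = p_i / (1 + 0.1 p_i), which gives a convenient check of the loop.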
The packet loss probability is found by observing in each slot the process (n, j), where n is the number of packets in the input queue and j is the number of time slots needed to complete the service of the HOL packet starting from the current slot. Note that the service time for a new packet entering the input queue is given by the above-mentioned distribution
the packet loss probability
The analysis of virtual queue and output queue will be developed separately for the two internal protocols BP and QL.
The BP protocol. The operation of the switch model with backpressure is illustrated by the
describing the system evolution still apply to the contents of the virtual queue and of the output queue, whereas the one describing the number of packets switched must be modified to take into account that the speed-up too can limit the packet transfer from the virtual queue to the output queue, that is
The cumulative number of packets in the tagged virtual queue and tagged output queue and the total average delay spent in the two queues are still expressed by Equations 8.6 and 8.7.
As previously discussed, the evaluation of the delay and loss performance in the input queue requires the knowledge of the distribution function of the service time of the HOL cell in the input queue, which is obtained through the distribution function of the waiting time in the virtual queue.
Let the couple (i, j) denote the generic state of the system, that is the tagged virtual queue and the tagged output queue hold at the end of the m-th slot i and j packets, respectively, and consider the probability of the state (i, j). In order to compute
being served in the tagged virtual queue, a deterministic function F_{n,j} is introduced: it represents the number of cells that can be switched from the tagged virtual queue to the tagged output queue in n time slots starting from an initial state of j cells in the output queue. If (i, j) is the system state found by the tagged packet entering the HOL position in the tagged input queue and given that it is selected as k-th among those packets entering the same virtual queue in the same slot, then
η_{v,n} = Pr[F_{n,j} < i + k ≤ F_{n+1,j}]    (8.11)
In fact, i + k − 1 packets must be switched before the tagged packet.
Figure 8.15 Example of tagged VQ-OQ switching for BP operation
The number F_{n,j} of packets that can be switched in n slots from the tagged virtual queue to the tagged output queue can be evaluated by computing the number of packets that can be switched in the generic slot l of these n slots. It is simple to
show that this number is given by the minimum between the speed-up K and the number of idle positions in the output queue, that is
Since by definition
through a simple substitution we obtain the recursive relations for F_{n,j}:
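The recursion can be sketched by accumulating, slot by slot, the minimum between the speed-up and the idle output queue positions. The code assumes that a non-empty output queue releases one packet per slot before the idle positions are counted, and that the virtual queue always holds enough packets during the interval; the function names are ours, and virtual_wait applies the condition F_{n,j} < i + k ≤ F_{n+1,j} to find the waiting time.

```python
def switchable(n, j, speedup, b_o):
    """Packets that can move from the virtual queue to the output queue
    in n slots, starting from j packets in the output queue (BP mode)."""
    total, q = 0, j
    for _ in range(n):
        q_left = max(0, q - 1)               # one packet is transmitted per slot
        moved = min(speedup, b_o - q_left)   # min(speed-up, idle positions)
        total += moved
        q = q_left + moved
    return total

def virtual_wait(i, k, j, speedup, b_o):
    """Smallest n such that F(n, j) < i + k <= F(n + 1, j), i.e. the slots
    waited by the k-th packet of a batch that finds i packets ahead."""
    n = 0
    while switchable(n + 1, j, speedup, b_o) < i + k:
        n += 1
    return n

# K = 2, B_o = 3 (hypothetical values).
print([switchable(n, 0, 2, 3) for n in range(4)])   # empty output queue
print([switchable(n, 3, 2, 3) for n in range(4)])   # saturated output queue
print(virtual_wait(2, 1, 3, 2, 3))
```

Starting from a saturated output queue only one position per slot becomes idle, so the speed-up brings no benefit until the output queue has drained.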
The probability distribution of the waiting time in the virtual queue is then given by Equation 8.11, considering the range of variation of k and summing over all the possible i and j values that give the state (i, j). That is
(8.12)
in which one factor is the probability that the tagged packet is the k-th among the set of packets entering the virtual queue in the same slot. This factor is obtained as the probability that the tagged packet arrives in a set of i packets times the probability of being selected as k-th in the set of i packets. The former probability is again found in [Kar87] as a function of the probability of i arrivals in one slot, which is given by the Poisson distribution (we are assuming an infinite switch size).
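A minimal sketch of this factor follows. It assumes the usual batch argument: a tagged packet belongs to a batch of i packets with probability i a_i / p, where a_i is the Poisson probability of i arrivals with mean p, and it is the k-th of that batch with probability 1/i, so that the factor reduces to Pr{at least k arrivals} / p. The symbol names and the function name are ours.

```python
import math

def position_probabilities(p, k_max):
    """Return [b_1, ..., b_kmax], with b_k the probability that the tagged
    packet is the k-th of its batch, for Poisson batches of mean p."""
    a = [math.exp(-p)]                   # a[i] = Pr{i arrivals in a slot}
    for i in range(1, k_max):
        a.append(a[-1] * p / i)
    b, cdf = [], 0.0
    for k in range(1, k_max + 1):
        cdf += a[k - 1]                  # Pr{fewer than k arrivals}
        # sum over i >= k of (i * a_i / p) * (1 / i) = Pr{A >= k} / p
        b.append(max(0.0, 1.0 - cdf) / p)
    return b

b = position_probabilities(0.8, 30)
print(round(b[0], 6), round(sum(b), 6))
```

The probabilities add up to one, since the sum over k of Pr{A ≥ k} equals the mean batch size p.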
In order to determine the state probabilities, we can write the equations expressing the Markov chain of the system (virtual queue, output queue). Four cases are distinguished:
Here is evaluated by considering all the possible state transitions leading to state
virtual queue for each k from 0 to m. Since the tagged virtual queue will be empty at the end of the slot, all the k packets already in the tagged virtual queue together with the new
packets are switched to the output queue, whose final state is
(recall that one packet per slot is transmitted to the switch outlet). Thus we obtain
(8.13)
in which
Note that the upper bound of the first sum comes from the condition that the index m must be less than the speed-up factor in order for the virtual queue to be empty at the end of the slot. The second term takes into account that a transition from state (0,0) to state
is possible only when , by also considering that no packet is transmitted by the empty output queue.
2. Non-empty tagged virtual queue and not more than K−1 packets in the tagged output queue
(8.14)
since the tagged virtual queue is non-empty only if the output queue holds at least K packets.
3. Non-empty tagged virtual queue and more than K−1 packets in a non-saturated tagged output queue
(8.15)
4. Saturated tagged output queue. Through considerations similar to those in Case 1, the state probability is given by
(8.16)
In fact, the total number of packets in the virtual queue after the new arrivals becomes
Only of these packets can be switched to occupy the residual idle capacity of the output queue, whereas the other i packets remain stored in the virtual queue. The last term gives a non-null contribution only for architectures with a speed-up equal to the output queue capacity (remember that our assumption is B_o ≥ K).
Some performance results given by the analytical model just described are now provided, and their accuracy is assessed against the results obtained through computer simulation of the switch architecture. The switch size selected in the simulation model is such that the mismatch between the finite size of the simulated switch and the infinite switch size of the model can be neglected. In fact, it is well known (see, e.g., [Kar87], [Pat90]) that the maximum throughput of a non-blocking switching structure with input (or input and output) queueing converges quite rapidly to the asymptotic throughput as the switch size increases. Unless specified otherwise, solid lines in the graphs represent analytical results, whereas plots are simulation data.
The switch capacity is obtained as in the case of constrained output queue capacity, that is as the limiting load of an infinite capacity input queue. Therefore the maximum throughput is again given by the root of the function
(see Equation 8.3). Note that now a completely different procedure has been applied to compute the moments of the waiting time in the virtual queue, starting from its probability distribution function (Equation 8.12). Figure 8.16 shows the asymptotic throughput performance of the switch in the BP mode for increasing output buffer sizes starting with
and different speed-up values. The throughput increases with the output buffer size
assuming a speed-up provides almost the same performance for small buffer sizes
, but the asymptotic throughput is now 0.996. Thus a speed-up gives an asymptotic throughput very close to the theoretical maximum.
The average packet delay is given by Equation 8.1: the input queue component is provided by Equation 8.3 with the moments of the waiting time computed by the iterative procedure explained above, whereas the virtual and output queue component is given by Equation 8.7.
The performance figure T is plotted in Figure 8.17 as a function of the offered load p = p_i for K = 2 with B_i = ∞. With the BP mode the average delay decreases for a given offered load as the output buffer size is increased. This is clearly due to the increase of the asymptotic throughput with the output buffer size, which implies a better performance for an increasing
Figure 8.16 Switch capacity of a BP IOQ switch
p = p_i, K = 2, B_i = ∞
An example of packet switching taking place in slot m is shown in Figure 8.19 for
The state equations are now given by
(8.17)
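A minimal sketch of one slot in the QL mode is given below, under our reading of the protocol: the transfer from the virtual queue is limited by the speed-up only, a non-empty output queue releases one packet per slot, and the packets exceeding the output queue capacity are lost. Names and numerical values are hypothetical.

```python
def ql_step(r_prev, q_prev, arrivals, speedup, b_o):
    """One slot of the tagged virtual/output queue pair in the QL mode.
    Returns (r, q, lost)."""
    backlog = r_prev + arrivals
    switched = min(speedup, backlog)          # no backpressure: speed-up limit only
    q_left = max(0, q_prev - 1)               # a non-empty queue sends one packet
    lost = max(0, q_left + switched - b_o)    # overflow at the output queue
    return backlog - switched, q_left + switched - lost, lost

# K = 2, B_o = 3: a saturated output queue receives two packets, one is lost.
print(ql_step(0, 3, 3, 2, 3))
# An empty system receiving one packet loses nothing.
print(ql_step(0, 0, 1, 2, 3))
```

Unlike the BP mode, the virtual queue content here evolves independently of the output queue state, which is why it can be analyzed on its own.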
Figure 8.18 Loss performance of a BP IOQ switch
Figure 8.19 Example of tagged VQ-OQ switching for QL operation
Unlike the BP protocol, in this case the cumulative time spent by the tagged packet in the virtual queue and output queue cannot be expressed through an M/D/1 system, because of the difference between p_v and p_o. However, now the number of packets in the virtual queue, expressed as
(8.18)
may be computed separately from the output queue content. Thus, from Equation 8.18 we obtain the equations for the computation of the probability of i packets in the virtual queue, that is
The distribution function of the waiting time in the virtual queue is then computed similarly to Equation 8.12, considering that in the QL mode F_{n,j} = nK, that is
(8.19)
Note that the distribution is not needed to compute
In order to obtain the distribution function of the waiting time in the output queue, the joint statistics of the virtual queue and output queue have to be evaluated also in this case. In fact, even if there is no backpressure mechanism, the output queue state is still dependent on the virtual queue state in the preceding time slot. The equations expressing the joint probability of the state are the same as in the BP mode for a non-saturated output queue (cases 1–3), as the backpressure mechanism would operate only with a saturated output queue. The equation of Case 4 of the BP mode (Equation 8.16) must be replaced by
(8.20)
The distribution function of the number of packets in the output queue is easily obtained
as the marginal distribution of
and by applying Little’s formula the average waiting time in the output queue is obtained, that is
are lost if the residual output queue capacity is packets (remember that one packet is always transmitted by the non-empty output queue). For , the residual queue capacity must be , owing to the limiting speed-up factor. Thus, the output queue must hold
packets in order for exactly l packets to be lost.
The Markov chain representing the state can be solved iteratively by setting a suitable threshold for the state probability value assumed in two consecutive iterations. However, in the BP mode the bidimensional Markov chain can also be solved without iterations by: (a) determining the distribution of the total number of cells in the tagged virtual and output queues by means of a monodimensional Markov chain; (b) using this distribution and Equations 8.13–8.16 in such a way that each unknown probability can be expressed as a function of already computed probabilities. In the QL mode a similar approach does not yield the same simplifying procedure. Thus, the solution of the bidimensional Markov chain becomes critical in the QL mode as the state space increases, that is when the product becomes large. To cope with these situations an approximate analysis of the QL mode can be carried out in which the output queue state is assumed to be independent of the virtual queue state. By using Equations 8.13–8.15, 8.20, replacing by and summing over the virtual queue state index i, we obtain the distribution function of the packets in the output queue:
(see Equation 8.3). Now the moments of the waiting time in the virtual queue are computed by means of Equations 8.10 and 8.19. Figure 8.20 shows the asymptotic throughput of the switch in the QL mode for output buffer sizes ranging from
to . The results of the exact model, which owing to the mentioned computation limits are limited up to , are plotted by a solid line, whereas a dashed line gives the approximate model (as usual, plots represent simulation results). A comparison of these results with those under backpressure operation (Figure 8.16) shows that the QL mode provides better performance than the BP mode. The reason is that the cumulative number of packets lost by all the output queues is randomly redistributed among all the virtual queues in the former case (in saturation conditions the total number of packets in the virtual queues is constant). Such a random redistribution of the packet destinations reduces the HOL blocking that occurs in the BP mode, where the HOL packets are held as long as they are not switched. According to our expectations, the asymptotic throughput of the QL and BP modes is the same for very large output buffer sizes, since a very large output buffer makes the backpressure mechanism between output queue and virtual queue useless. A very good matching is found between analytical and simulation results for the exact model of QL, whereas the approximate model overestimates the maximum throughput for small output queue sizes. The approximation lies in the independence of the arrival process to the output queue from the output queue status in the model, so that a smoother traffic is offered to the output queue (in the approximate model the output queue can well be empty even if the virtual queue is not empty).
The average packet delay T for the QL mode is given by Equation 8.1: the output queue component is given by Equation 8.21, the virtual queue component by Equation 8.19, whose expression is used in the iterative computation of the input queue component as provided by Equation 8.3. The parameter T is plotted in Figure 8.21 as a function of the offered load. Unlike the BP mode, with the QL operation all the delay curves have the same asymptotic load value independently of the actual output buffer size. In fact the packet delay approaches infinity when the input queues saturate, and the saturation conditions are functions only of the speed-up and are independent of the output buffer size. Such asymptotic offered load coincides with the asymptotic switch throughput given by an infinite output buffer size. The relatively small increase of packet delay for increasing output buffer sizes
is only due to the larger amount of time spent in the output queue. All the delay curves move to the right with higher speed-up values K.
In the QL mode cells can be lost both at input queues and at output queues. The former loss probability is evaluated by means of the analysis of the tagged input queue, whereas the latter loss probability is computed by Equation 8.22. These two loss performance figures are given in Figure 8.22 and in Figure 8.23, respectively, for a fixed speed-up and different output queue sizes under a variable offered load. In general we note that a very low loss value under a given offered load requires larger buffers at outputs than at inputs. The model is very accurate in evaluating delay and loss performance figures, as shown in Figures 8.21–8.23 (plots are again simulation results). If the speed-up is increased, much smaller input queues are required. For example, gives a loss probability below
As for the comparison of the total cell loss between the BP and QL modes as a function of the output buffer size, for different input queue sizes, Figure 8.24 shows the cell loss probability given by the analytical model alone. The loss curves for the two modes approach the same asymptotic level for large output buffer sizes, as the absence of output buffer saturation makes the backpressure mechanism ineffective, so that the two modes operate in the same way. Clearly, smaller cell loss figures are obtained as the input buffer size is increased. Below a certain size of the output buffer the two modes provide a different cell loss performance that is a function of the offered load and input buffer size. For smaller input queues the QL mode provides lower loss figures, whereas for larger input queues the BP mode performs better. Such behavior, already observed above, occurs for loads up to a certain value, since for load values close to the maximum throughput the QL mode always gives the best loss results. This phenomenon can be explained by considering that the backpressure
Figure 8.22 Input loss performance of a QL IOQ switch