in ATM networks. Chapter 16 discusses an alternative approach for provisioning for long-range-dependent (LRD) traffic. See also the work of Heyman and Lakshman (Chapter 12) and Li and Li (Chapter 13).
19.1.2 Correlated Traffic and Its Implications
Studies on a range of video applications indicate that there exists a slowly decaying autocorrelation structure in the underlying stochastic processes [3, 7, 8, 12, 19, 21].
Self-Similar Network Traffic and Performance Evaluation, Edited by Kihong Park and Walter Willinger. ISBN 0-471-31974-0. Copyright © 2000 by John Wiley & Sons, Inc.
Copyright © 2000 by John Wiley & Sons, Inc. Print ISBN 0-471-31974-0. Electronic ISBN 0-471-20644-X.
Providing guarantees on maximum-delay, delay-jitter, and cell-loss probabilities (or some other measure of cell loss) in the presence of such traffic is nontrivial, especially if the coefficient of variation of the marginal distribution (or the distribution tail) is large. This is because such traffic significantly increases queue length statistics at a multiplexer [2, 5, 8, 16-18, 20].
As an illustration, consider the performance of a finite-buffer queue. Let the arrival process be a fractionally differenced autoregressive moving average process:
$$(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t,$$
where
$\phi_1$ ($0 < \phi_1 < 1$) represents an exponentially decaying autocorrelation component;
$d$ ($0 < d < \tfrac{1}{2}$) represents a hyperbolically decaying autocorrelation component;
$\{\epsilon_t\}$ is white noise, that is, uncorrelated; and
$B$ is the backward shift operator, that is, $B X_t = X_{t-1}$.
The correlation structure of $X_t$ can be controlled by changing $\phi_1$ and $d$. Figure 19.1 shows the mean cell loss versus number of frame buffers for three input correlation structures with approximately the same coefficient of variation (0.24).
A frame buffer is the maximum number of cells that can be transmitted by the output channel in a given time interval (frame time).
Fig. 19.1 Mean number of cells dropped for different dependency structures in the workload. (a) Mean utilization = 0.9; (b) mean utilization = 0.8. The model form used is a fractionally differenced ARIMA(1, d, 0) process [13]: $(1 - \phi_1 B)\,\Delta^d X_t = \epsilon_t$.
The solid line showing a slow decay is for a long-memory input traffic sequence. The dashed line in the middle is for traffic with short memory. The line with the smallest mean cell loss corresponds to a near white noise stream. Note that mean cell loss decays very slowly with increasing buffer size for traffic with a slowly decaying autocorrelation function. Qualitatively similar observations have been reported elsewhere [2, 5, 8, 16-18, 20].
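The experiment behind Fig. 19.1 can be approximated with a short simulation. The sketch below is not the authors' code. It assumes that $\Delta = 1 - B$ (the usual convention for fractionally differenced ARIMA models), truncates the infinite moving-average expansion of $(1 - B)^{-d}$ to a finite number of terms, and uses a simple discrete-time queue served once per frame. All function and parameter names are hypothetical.

```python
import random


def frac_diff_weights(d, n_terms):
    """First n_terms coefficients psi_k of (1 - B)^(-d) = sum_k psi_k B^k."""
    psi = [1.0]
    for k in range(1, n_terms):
        # Standard recursion: psi_k = psi_{k-1} * (k - 1 + d) / k
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


def farima_1_d_0(n, phi1, d, n_terms=200, seed=0):
    """Approximate path of (1 - phi1*B) Delta^d X_t = eps_t.

    Assumption: Delta = 1 - B, and the long-memory filter is truncated
    to n_terms lags, so the result is only an approximation.
    """
    rng = random.Random(seed)
    psi = frac_diff_weights(d, n_terms)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + n_terms)]
    path, x_prev = [], 0.0
    for t in range(n_terms, n + n_terms):
        # Hyperbolic component: y_t = sum_k psi_k * eps_{t-k}
        y = sum(psi[k] * eps[t - k] for k in range(n_terms))
        # Exponential component: x_t = phi1 * x_{t-1} + y_t
        x_prev = phi1 * x_prev + y
        path.append(x_prev)
    return path


def mean_cell_loss(arrivals, capacity, n_frame_buffers):
    """Mean cells dropped per frame for a finite-buffer queue.

    capacity is the number of cells served per frame time; the buffer
    holds capacity * n_frame_buffers cells (one frame buffer = capacity).
    """
    buffer_cells = capacity * n_frame_buffers
    queue, lost = 0, 0
    for cells in arrivals:
        queue = max(0, queue + cells - capacity)  # serve up to capacity
        overflow = max(0, queue - buffer_cells)   # cells that do not fit
        lost += overflow
        queue -= overflow
    return lost / len(arrivals)
```

To drive the queue, the real-valued path has to be mapped to nonnegative cell counts, for example by scaling and shifting it to a chosen mean and coefficient of variation and clipping at zero. That mapping is a modeling choice not specified here.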
19.1.3 Summary of the Proposed Architecture
This chapter introduces a per-virtual-circuit (per-VC) framing structure and a pseudo-earliest-due-date cell dispatcher to provide guaranteed delay-jitter bounds. Heterogeneous jitter bounds are supported through software-controlled frame sizes, which may be set independently for each VC. The framing structure is a generalization of per-link framing introduced by Golestani. The proposed framing structure eliminates the need to correct for phase mismatches between incoming frames and outgoing frames, which is necessary in per-link framing. This results in a reduction in the end-to-end delay bound and buffer requirements, and a simpler implementation.
Strong autocorrelations typically seen in video traffic make equivalent bandwidth computations for heterogeneous cell-loss bounds intractable. To address this, the framing strategy is combined with an active cell-discard mechanism with prioritized cell-dropping, the latter utilizing the history of dropped cells and target cell-loss bounds for each VC. Upper bounds on the equivalent bandwidth needed to support a given workload with a target quality of service are developed. These are validated through numerical and simulation results from variable bit rate MPEG-I video traces.
A high-level view of the proposed architecture is as follows.
1. A Framing Structure on a Per-VC Basis. To provide heterogeneous delay-jitter bounds, a framing structure is induced on VCs, similar to that in Golestani [9-11]. (Differences between the two approaches are described later.) Consider a virtual circuit (VC), $i$, with a desired delay-jitter bound $M_i$. The frame structure splits time for this VC into juxtaposed intervals of length $M_i$ at each multiplexing point. Cells from VC $i$ that arrive in a given frame at a multiplexer are buffered, and not transmitted until the beginning of the next frame time. If sufficient capacity is available to transmit these cells in the next interval, all cells arrive at the next hop within an interval of length $M_i$. This way, they are guaranteed to meet a delay-jitter bound of $M_i$. Also, if $H_i$ is the number of hops for VC $i$, and $D_i$ is a bound on the one-way propagation and processing delay, all cells that make it to the receiver are guaranteed an end-to-end delay bound of $H_i M_i + D_i$.
2. Priority Scheduling. Cells at an output queue that are ready to go contend for bandwidth and have competing delay-jitter bounds and cell-loss probability bounds. A priority scheduler addresses these concerns. For delay-jitter bounds, the scheduler follows an earliest due-date principle with modifications to enhance algorithmic efficiency. For cell-loss bounds, it uses a minimum guaranteed capacity, $C_i$ cells/frame for VC $i$, with the rest of the cells, if any, scheduled on a nonguaranteed basis. $C_i$ is based on (i) the marginal distribution of #cells/frame, (ii) the maximum acceptable probability of cell loss in a frame, and (iii) the equivalent bandwidth of all VCs in this jitter class. The $C_i$ are computed by the equivalent-bandwidth unit described in Item 4 below.
3. An Active Cell-Discard Unit. If there are excess cells left over from a frame at the multiplexer after the corresponding frame time is over, it is likely that this is due to persistence in the arrival process, as suggested by the solid lines in Fig. 19.1. These cells are likely to cause increased delay for cells in successive frames. Since buffering does not reduce cell loss significantly for persistent traffic, we may elect either to toss them right away or to mark them as low-priority cells and discard them on demand. The active cell-discard unit reclaims (or marks as old) cells that do not get transmitted in their frame time. It is activated at the end of each frame.
An important side effect of using the active cell-discard unit is that it simplifies computation of equivalent bandwidth for correlated traffic (especially for heterogeneous cell-loss bounds), while achieving high utilization through statistical multiplexing.
4. Equivalent Bandwidth Computations. Algorithms for computing upper bounds on equivalent bandwidths are developed. They address heterogeneous cell-loss probabilities and heterogeneous jitter classes.
The computation decomposes traffic by jitter requirements. All connections requiring a given delay-jitter are grouped into a class. For each jitter class, let $\epsilon_k$ ($k = 0, 1, 2, \ldots$) be the desired mean cell-loss ratio of a subset of connections $k$. An iterative algorithm approximates the total capacity needed to meet $\{\epsilon_k\}$. Also, virtual capacities $C_k$ ($k = 0, 1, 2, \ldots$) are computed. All groups of connections specifying $\epsilon_k$ are guaranteed a bandwidth of $C_k$ in every frame time if they need it. However, unused portions of virtual capacities are available to other connections.
The frame size for each VC is software settable, so delay-jitter bounds may be negotiated over a continuum (at the granularity of cell transmission time). Also, unlike per-link framing (see Section 19.1.4), the frame size of a given VC is not constrained by frame sizes of other active VCs.
protection from misbehaving or malfunctioning VCs
A call admission unit will use the equivalent bandwidth algorithms to determine if a specific call can be admitted without violating quality-of-service guarantees of other calls (or, if an important call must be admitted, which calls to disconnect). The call-admission unit is beyond the scope of this chapter.
19.1.4 Relationship with Stop-and-Go Queueing
Per-VC framing has been derived from Stop-and-Go Queueing described in Golestani [9-11]. The primary enhancements are as follows.
1. Framing is induced on a per-VC basis instead of a per-output-link basis; see Figs. 19.2 and 19.3. Per-VC framing eliminates the need for correcting for phase mismatches between incoming frames and outgoing frames at a multiplexer and significantly simplifies its implementation. As we shall see in Section 19.3, per-VC framing also reduces the maximum queueing delay by half and cuts buffer requirements by one-third at a switch, while retaining the same delay-jitter bound as per-link framing.
Fig. 19.2 Arriving frames and departing frames when framing is induced on a per-link basis. The phase mismatch between arriving and departing frames is corrected through delay circuits.
Fig. 19.3 Arriving and departing frames for VC $i$ when framing is induced on a per-VC basis. Frames of different VCs need not be synchronized.
2. Once cells from a frame become active (i.e., not dormant, waiting for their next frame time), they compete with active cells from other VCs for the output link. The algorithms that decide on which active cells to transmit and when, and which cells to drop, are necessitated by the need to meet heterogeneous cell-loss bounds and heterogeneous delay-jitter bounds simultaneously. They also provide a firewall across connections (protection from misbehaving sources). These algorithms are new.
In Golestani [9], the objective was to support no-loss transmission with heterogeneous delay-jitter bounds. The latter were integral multiples of the smallest jitter bound supported. Golestani showed that a preemptive priority scheduler with highest priority to the smallest jitter class could meet all jitter bounds if sufficient capacity was available. Golestani [11] also presented a solution that allowed for cell losses for a single jitter class (fixed delay-jitter bound).
In the general case of meeting heterogeneous delay-jitter bounds with potential cell losses, however, the scheduler needs to follow (i) an earliest due-date principle and (ii) a cell-drop policy that takes into account current observations on dropped cells per VC, and heterogeneous cell-loss bounds across VCs. See Section 19.2.3.
3. The original Stop-and-Go Queueing requirement that a traffic stream declare its $(r, T)$-smooth¹ parameters is dropped. This gives up a lossless network in exchange for higher utilization. For a long-memory input stream, the average rate over a small interval, $T$, can be significantly higher (or lower) than its overall average rate, so $r$ would need to be the peak rate for lossless transmission, which would result in significantly lower utilization. In the current proposal, cell losses, while allowed, will be reduced through statistical multiplexing across virtual circuits and controlled through equivalent bandwidth computations.
4. No-loss transmission can be guaranteed in the proposed architecture if desired; see Section 19.4.3. However, the emphasis is on efficient statistical multiplexing that can also guarantee specified cell-loss bounds.
19.1.5 Outline
The rest of this chapter is organized as follows. Section 19.2 presents the proposed architecture. It includes (1) per-VC framing with active cell-discard and (2) cell dispatching to meet the heterogeneous delay-jitter and cell-loss guarantees for heterogeneous VCs (with heterogeneous marginal distributions and autocorrelation structures). Section 19.3 presents maximum-delay bound, delay-jitter bound, and buffer requirements for per-VC framing and compares the results with per-link framing. Section 19.4 addresses upper bounds on equivalent bandwidth needed to meet heterogeneous delay-jitter requirements and heterogeneous cell-loss probability bounds, presents numerical and simulation examples, and shows that loss-free transmission may be achieved for desired VCs. Section 19.5 presents related work. Section 19.6 presents our conclusions.
¹ An $(r, T)$-smooth stream was defined as one where the average bit rate over a time interval $T$ did not exceed $r$. Equivalently, the number of bits over the interval from $nT$ to $(n + 1)T$ did not exceed $rT$, for all integer $n$.
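The footnote's definition is easy to check numerically, and doing so shows why a small $T$ pushes $r$ toward the peak rate for bursty traffic. The sketch below is only an illustration. It assumes time is divided into unit slots, that $T$ is a whole number of slots, and that frames are aligned at multiples of $T$. The function names are hypothetical.

```python
def frame_totals(bits_per_slot, slots_per_frame):
    """Total bits in each consecutive frame of slots_per_frame slots."""
    return [sum(bits_per_slot[s:s + slots_per_frame])
            for s in range(0, len(bits_per_slot), slots_per_frame)]


def is_rt_smooth(bits_per_slot, slots_per_frame, r):
    """True if no frame of length T carries more than r * T bits."""
    limit = r * slots_per_frame
    return all(total <= limit
               for total in frame_totals(bits_per_slot, slots_per_frame))


def smallest_smooth_rate(bits_per_slot, slots_per_frame):
    """Smallest r (bits per slot) for which the stream is (r, T)-smooth."""
    return max(frame_totals(bits_per_slot, slots_per_frame)) / slots_per_frame
```

For a stream such as `[0, 0, 0, 12, 12, 0, 0, 0]` the long-run average is 3 bits per slot, but `smallest_smooth_rate` returns 12 for a frame of one slot and 6 for a frame of two slots. A lossless reservation at small $T$ therefore has to be made well above the average rate.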
19.2 PROPOSED ARCHITECTURE
19.2.1 Framing on a Per-Virtual-Circuit Basis Versus Per-Link Basis
Enforcing framing on a per-link basis [9-11] results in a phase mismatch at a switch between arriving frames on input links and departing frames on output links. This phase mismatch is due to different propagation delays on different input links. As shown in Fig. 19.2, the arriving frames on input link 1 and departing frames on the output link have a phase mismatch of $\theta_{1d}$, while the arriving frames on input link 2 have a phase mismatch with respect to the output link of $\theta_{2d}$.
To correct for a phase mismatch, additional delay circuitry is necessary. Also, the admissible set of frame sizes is constrained. For example, all frame sizes are considered integer multiples of a base frame size in Golestani [9]. A simpler approach is to adopt per-VC framing, without concern for what the frame sizes are, and whether or not the frames from different VCs are synchronized with respect to each other; see Fig. 19.3. As we will show in Section 19.3, per-VC framing, in conjunction with active cell-discard and an appropriate scheduler, retains the advantages of per-link framing, while improving on performance bounds and functional flexibility. For example, if VC $i$'s frame size is $M_i$, and the number of hops is $H_i$, per-VC framing provides the same delay-jitter bound, $M_i$, as per-link framing, a reduction in maximum-delay bound by an amount $H_i M_i$, and a maximum buffer requirement that is one-third lower. Also, in conjunction with the cell dispatcher described in Section 19.2.3, it guarantees heterogeneous cell-loss bounds for correlated traffic. Functional flexibility includes the ability to set and modify admissible jitter classes at run time, and not be constrained to an integer multiple of a base frame size.
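As a quick arithmetic check of these bounds, the sketch below combines the per-VC end-to-end bound $H_i M_i + D_i$ from Section 19.1.3 with the stated reduction of $H_i M_i$. The per-link figure is inferred from those two statements rather than taken from a formula in this section, so treat it as an assumption. The function names are hypothetical.

```python
def per_vc_delay_bound(hops, frame_time, prop_delay):
    """End-to-end delay bound H_i * M_i + D_i for per-VC framing."""
    return hops * frame_time + prop_delay


def per_link_delay_bound(hops, frame_time, prop_delay):
    """Inferred per-link bound: the per-VC bound plus the H_i * M_i
    reduction quoted in the text (assumption, not a quoted formula)."""
    return per_vc_delay_bound(hops, frame_time, prop_delay) + hops * frame_time
```

With 5 hops, a 2 ms frame, and 10 ms of propagation and processing delay, the per-VC bound is 20 ms and the inferred per-link bound is 30 ms. The queueing component drops from 20 ms to 10 ms, which matches the halving of maximum queueing delay claimed in Section 19.1.4.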
Hardware support for efficient and flexible implementation of per-VC framing with active cell-discard is discussed next.
19.2.2 Implementation of Per-Virtual-Circuit Framing with Active Cell Discard
The objective is to induce a framing structure on top of cells of a given VC, and for the multiplexer to actively discard (or mark as old) cells that are not served during their assigned frame time.
In order to allow for flexibility of application-specified jitter bounds, the frame time should be software settable (e.g., it may be negotiated during connection-open). It should then be set to the connection's delay-jitter tolerance. One may allow for adjusting the frame time during the lifetime of a VC, if desired.
Issues that need to be addressed for per-VC framing and active cell-discard are as follows.
1. Frame Identification Across Nodes (where Nodes Refer to Switches and End Points). Cells transmitted during the $t$th frame ($t = 0, 1, 2, \ldots$) by a node must be recognized as belonging to frame $t$ by the next downstream node. An alternating bit sequence number distinguishing cells in adjacent frames is sufficient if the sequence number is generated at the transmitter. Old cells, if implemented, will be marked by the first multiplexer where a jitter deadline is missed.
2. Frame-clock Generation. For the $i$th VC, one needs a step-down counter, initialized under software control, to the maximum number of cells that constitutes its frame time. Let this number be $M_i$. The counter is to be fed with a clock that runs at the speed of cell transmission at the output link. On each clock cycle (at cell granularity), the counter must count down one tick until it hits zero. At this point, it will need to generate a frame-clock signal and reset itself to $M_i$.
3. Cell Tagging. A cell arriving during frame $t$ for VC $i$ will not be eligible for service until frame $t + 1$ for the same VC. It is, therefore, assigned a state, dormant, on arrival. See Fig. 19.4(a). When the next frame-clock signal arrives, the cell is ready to be transmitted, so its state needs to be changed to active. If it still remains in the queue when the following frame-clock signal arrives, it is old, and now there are two possibilities. One strategy is simply to discard the cell and reclaim its buffer. A second strategy is to change its state to old and keep it eligible for transmission on a best-effort basis. In ATM networks, cells need to be delivered in sequence, so it might be simplest to discard the old cells.
To simplify the discussion of what happens next, let us assume that active cells that are not transmitted in their frame time are discarded. Then, at any given time, cells belonging to frames $t$ and $t + 2$ will never be simultaneously present at the multiplexer output queue, and all that is necessary is to distinguish between cells of frames $t$ and $t + 1$. A single bit, therefore, suffices to distinguish between active and dormant cells.
Fig. 19.4 Implementing per-VC framing with active cell-discard.
Assume that during frame $t$, dormant cells are represented by a 0 and active cells have been marked 1 in the previous cycle. On a new cell arrival, the multiplexer needs to attach to it a tag identifying its VC and its frame number (in this case 0), set its valid bit to 1, and forward it to the output queue. See Fig. 19.4(b). The valid bit's function is to help discard cells, similar to the action of flushing a cache memory on a context switch. In a fast cell-switched, VC network, a switch would implement a tagging scheme for VC identifiers anyway, so the additional circuitry needed is small.
On the next frame clock, the entire output queue would be fed with two logical signals, one to deactivate the active cells that did not get transmitted during their allotted frame time (due to lack of available capacity), and one to activate the dormant cells. See Fig. 19.4(c). Both of these can be achieved by associatively matching cell tags with an identifier representing the appropriate VC and its state. The primary difference between this and off-the-shelf content-addressable memories is that more than one match is likely, especially for dormant cells. On a match, active cells mark themselves invalid by setting their valid bits to 0; the dormant cells move to the active state and are ready to be transmitted. At this point, they move under the control of the cell dispatcher, which must decide on a strategy that is consistent with the overall goals of delay-jitter and statistical cell-loss bounds.
A convenient model for the buffer memory organization is to view it as a set of logical queues, one per VC, with a sequence number distinguishing active and dormant cells. All old cells may potentially be grouped into one logical queue, as discussed below.
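The frame-clock counter and the dormant/active/invalid transitions described above can be modeled in software as follows. This is a behavioral sketch under simplifying assumptions, not a description of the hardware. It uses an explicit state field instead of the single alternating bit, it implements only the discard strategy (no old state), and the sequential loop stands in for the associative match. All class and function names are hypothetical.

```python
DORMANT, ACTIVE = "dormant", "active"


class FrameClock:
    """Step-down counter for one VC, reloaded with M_i on reaching zero."""

    def __init__(self, frame_cells):
        self.frame_cells = frame_cells  # M_i, in cell transmission times
        self.count = frame_cells

    def tick(self):
        """Advance one cell time; return True when a frame clock fires."""
        self.count -= 1
        if self.count == 0:
            self.count = self.frame_cells
            return True
        return False


class Cell:
    """A queued cell tagged with its VC, state, and valid bit."""

    def __init__(self, vc):
        self.vc = vc
        self.state = DORMANT  # cells are dormant on arrival
        self.valid = True


def on_frame_clock(output_queue, vc):
    """Apply both frame-clock signals to the cells of one VC."""
    for cell in output_queue:
        if cell.vc != vc or not cell.valid:
            continue
        if cell.state == ACTIVE:
            cell.valid = False   # missed its frame time: discard
        else:
            cell.state = ACTIVE  # dormant cell becomes eligible
```

A cell that arrives in frame $t$ becomes active at the first frame clock and, if it is still queued, is invalidated at the second, so at most two frames' worth of cells per VC are ever present.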
19.2.3 Cell Dispatcher
The cell dispatcher is responsible for (1) scheduling and (2) transmitting active and (potentially) old cells. Dormant cells are not within its purview. From the dispatcher's perspective, the active cells for each VC are assumed to be logically organized as a queue (see Fig. 19.5(a)). The old cells (implemented optionally) are organized either as separate queues or as a single queue. In either case, they are served on a best-effort basis and may be reclaimed before they are served to accommodate new cell arrivals.
The dispatcher consists of two concurrent units, a scheduler and a transmitter. The scheduler allocates cell times to active cells of individual VCs and decides which cells are to be dropped if contentions for capacity arise. The transmitter transmits them (and old cells if all active queues are empty and old cells are waiting). The scheduler will guarantee transmission of at least $C_i$ cells/frame for connection $i$ ($i = 1, \ldots, K$), where $K$ is the number of active VCs at the multiplexer. The computation of the $C_i$ is based on cell-loss requirements for different VCs and their marginal distributions, and is presented in Section 19.4. The scheduler and the transmitter share a circular buffer that represents channel allocations in the future. This circular buffer is presented below as a linear array for convenience of exposition. Let this data structure be called channel_image. channel_image[n] records the ID of one VC. If channel_image[n] equals $i$, the transmitter will transmit from the head of the active queue corresponding to VC $i$ at time $n$. This will be modified below after the basic algorithm is presented.
The scheduler is activated on every new frame activation, that is, on a frame clock. Let the new frame activation be at time $n$. (See Fig. 19.5(b).) Let the corresponding VC be $i$, the frame length (jitter bound) be $M_i$, the number of cells in the current active frame be $m_i$, and the minimum number of cells guaranteed to be transmitted from this VC in this frame be $C_i$. The channel_image array records the action to be taken by the transmitter in future slots. The scheduler either marks the slots in channel_image with a VC identifier or leaves them empty. If it does mark a slot, it also records whether the transmission is to be guaranteed or not-guaranteed. If a slot is marked not-guaranteed, it may be reclaimed at some point in the future to serve a different VC (as described in Section 19.2.3.1).
The scheduler's task is as follows. Assume that it is activated on VC $i$'s frame clock. The time window in the future over which the $m_i$ active cells need to be transmitted is $[n + 1, n + M_i]$. (The $n$th slot is kept aside for the transmitter to begin transmitting.) The scheduler follows the following algorithm.
(a) Beginning with $n + M_i$, going down to $n + 1$, the scheduler attempts to find the largest $k_i \le m_i$ slots that are empty in channel_image[n + M_i] through channel_image[n + 1], and marks each with the current VC, $i$.
(b) If (ki Ci) {
(c) If $k_i = m_i$, the scheduler's allocation task for this frame is done. Else, it needs to pick at most $m_i - k_i$ slots and at least $C_i - k_i$ slots in channel_image[n + 1] through channel_image[n + M_i] that are marked, but not necessarily guaranteed, and overwrite them with $i$. Each slot overwritten represents a cell loss for the corresponding VC. Since this would give priority to some VCs over others when the number of active cells exceeds the capacity available (for meeting deadlines), the policy used must balance fairness and cell-loss commitments. The dropping policy is described below.
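Steps (a) and (c) can be sketched as below. Step (b) is incomplete in this text, so guaranteed versus not-guaranteed marking is deliberately left out. The sketch assumes channel_image is a Python list used as a circular buffer with `None` for an empty slot, and that the frame length is smaller than the buffer size. Clamping the lower bound at zero and at $m_i$ is an added assumption. The function names are hypothetical.

```python
EMPTY = None


def reserve_empty_slots(channel_image, n, vc, frame_len, active_cells):
    """Step (a): scan slots n + M_i down to n + 1 and claim up to m_i
    empty ones for vc. Returns the claimed slot times (k_i = len)."""
    size = len(channel_image)
    claimed = []
    for slot in range(n + frame_len, n, -1):
        if len(claimed) == active_cells:
            break
        idx = slot % size  # circular-buffer index
        if channel_image[idx] is EMPTY:
            channel_image[idx] = vc
            claimed.append(slot)
    return claimed


def overwrite_bounds(k, m, c):
    """Step (c): (minimum, maximum) number of already-marked slots to
    overwrite, given k claimed slots, m active cells, guarantee c.

    Assumption: the minimum is clamped so it is never negative and
    never exceeds the cells actually waiting.
    """
    return max(0, min(c, m) - k), m - k
```

For example, with $m_i = 4$, $C_i = 2$, and only one empty slot found ($k_i = 1$), `overwrite_bounds(1, 4, 2)` gives `(1, 3)`: at least one slot must be taken from another VC to honor the guarantee, and up to three may be taken if the dropping policy allows it.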
19.2.3.1 Dropping Policy. Let the negotiated mean cell-loss ratio of VC $i$ be $\epsilon_i$, and the estimated cell-loss ratio at time $n$ be $\hat{\epsilon}_i(n)$. Let $S_i(n) = \epsilon_i / \hat{\epsilon}_i(n)$. Yang and Pan [24] have proposed a dropping policy where, if an incoming cell arrives to a full buffer (in our case, a full channel_image), the scheduler will search the buffer for the VC $j$ that has the largest $S_j(n)$ and discard one of its cells. If the arriving cell belongs to VC $j$ itself, then that cell will be dropped. The authors show that using the largest $S_j(n)$ is optimal in bandwidth utilization among all stationary, space-conserving loss-scheduling schemes. It is also optimal among all stationary scheduling strategies when cell-loss requirements of all VCs are equal (see Yang and Pan [24]).
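The selection rule just described can be sketched as follows. This is a simplified reading of the policy, not the algorithm of Yang and Pan [24] in full. It assumes the loss estimate is the running ratio of lost cells to cells seen, treats a zero estimate as infinite slack, and breaks ties in favor of the lowest VC identifier. None of these details come from the text, and the function names are hypothetical.

```python
def loss_slack(target_loss, cells_lost, cells_seen):
    """S = negotiated loss ratio / estimated loss ratio for one VC.

    Assumption: the estimate is cells_lost / cells_seen; a VC with no
    observed loss has unbounded slack.
    """
    if cells_seen == 0 or cells_lost == 0:
        return float("inf")
    return target_loss / (cells_lost / cells_seen)


def pick_victim(buffered_counts, slack, arriving_vc):
    """Choose the VC that loses a cell when the buffer is full.

    buffered_counts maps VC id -> cells currently in the buffer;
    slack maps VC id -> current S value. Returns (victim_vc,
    drop_arriving), where drop_arriving is True when the arriving
    cell belongs to the victim and is itself dropped.
    """
    candidates = [vc for vc, count in buffered_counts.items() if count > 0]
    # Largest S first; ties go to the lowest VC id (assumption).
    victim = max(candidates, key=lambda vc: (slack[vc], -vc))
    return victim, victim == arriving_vc
```

A large $S_j(n)$ means that VC $j$ has so far lost far fewer cells than its negotiated ratio allows, so it can best afford the next loss.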
We modify this strategy to guarantee a minimum $C_i$ cells/frame for individual VCs and replace negotiated mean cell-loss ratio with negotiated overflow probability. Continuing from the point just after (c) in the cell-dispatcher's algorithm:
(d) If $k_i < C_i$, the scheduler needs to overwrite at least $C_i - k_i$ cells belonging to VCs $j$ with (not_guaranteed[j] > 0), starting with the largest $S_j(n)$, and updating $S_j(n)$ as described in Yang and Pan [24]. In all cases, the number of erased cells is subtracted from not_guaranteed[j]. guaranteed_cells[i] is incremented by the corresponding amount. The latter should be equal to $C_i$ at the end of this step. This is because the equivalent bandwidth algorithm described in Section 19.4.1.2 ensures that $\sum_i C_i$ is always less than or equal to available capacity. Therefore, if a cell is to be dropped, $\sum_{j \neq i} \text{not\_guaranteed}[j] \ge C_i - k_i$.
(e) If $m_i > C_i$, then from the previous step, $C_i$ cells have been scheduled. An additional $l_i$ cells ($0 \le l_i \le m_i - C_i$) may be scheduled, in which case
cells only if there exists a $j$ such that (not_guaranteed[j] > 0) and $S_j(n) > S_i(n)$.
(f) $S_i(n)$ is updated with the number of dropped cells in this frame, if any, that is, with $m_i - C_i - l_i$.
The above cell dispatcher guarantees a minimum capacity $C_i$ in each frame for VC $i$. Extra capacity, if available, will be used by other VCs without interference from the dispatcher's dropping policy, until the sum of arrivals fills up channel_image, that is, exceeds the total output link capacity. This way, if the number of arrivals in a frame for some VC is less than the allocated capacity for it, the extra capacity may be used by other VCs on a nonguaranteed basis.
If the output link capacity is exceeded, however, the dropping policy prioritizes the nonguaranteed cells of different VCs in accordance with their current $S_j$ values. With upper-bound computations of equivalent bandwidths (Section 19.4), and the typically low cell-loss requirements expected, the dropping policy is not expected to be called upon too frequently. Simulation results comparing it with one that drops nonguaranteed cells on a last-in-first-out (LIFO) basis (i.e., newly active nonguaranteed cells that find a full channel_image) are given in Section 19.4.1.3.
19.2.3.2 Transmitter. The transmitter works as follows. At time $n$,
1. If channel_image[n] is not empty, let $i$ = channel_image[n]. The transmitter transmits an active cell from the queue corresponding to VC $i$, and goes to Step 3.
2. If channel_image[n] is empty, however, the transmitter proceeds to the next nonempty slot $n' > n$. If no such $n'$ exists, it proceeds to Step 4. Else it transmits an active cell from the queue corresponding to the VC recorded in channel_image[n'] and marks this slot empty. Let this VC identifier
Assume that the current time is $n$. The scheduler reserves slots beginning with time $n + M_i$ down to $n + 1$. This ensures that slots near time $n$ are available for VCs activated at some time in the future with deadlines earlier than $n + M_i$, that is, having smaller delay-jitter bounds. Since the objective is to ensure that active cells of VC $i$ are transmitted at or before time $n + M_i$, this strategy would meet more deadlines than if its cells were allocated from time $n + 1$ upward.
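A two-VC example makes the benefit of reserving from the deadline downward concrete. The scenario below is invented for illustration: VC "A" is activated at time 0 with $M = 5$ and two cells, and VC "B" is activated at time 1 with $M = 2$ and two cells. Only empty-slot allocation is modeled, with no overwriting. The function names are hypothetical.

```python
def allocate(channel_image, slots, vc, cells):
    """Claim up to `cells` empty slots, visiting `slots` in order."""
    claimed = 0
    for slot in slots:
        if claimed == cells:
            break
        if channel_image[slot] is None:
            channel_image[slot] = vc
            claimed += 1
    return claimed


def cells_scheduled(order):
    """Total cells placed within their windows for the toy scenario.

    order is "down" (from n + M toward n + 1) or "up" (from n + 1).
    """
    image = [None] * 8
    total = 0
    # (vc, activation time n, frame length M, active cells m)
    for vc, n, frame_len, cells in [("A", 0, 5, 2), ("B", 1, 2, 2)]:
        window = range(n + 1, n + frame_len + 1)
        slots = reversed(window) if order == "down" else window
        total += allocate(image, slots, vc, cells)
    return total
```

Allocating downward, A takes slots 5 and 4 and B still finds slots 3 and 2 free, so all four cells meet their deadlines. Allocating upward, A takes slots 1 and 2, leaving B only slot 3 inside its window, so one of B's cells cannot be placed.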
The transmitter scans channel_image[] in ascending slot order (making necessary modulo arithmetic corrections for a circular buffer implementation). If slot $n$ is nonempty, it transmits a cell from the corresponding VC recorded in