LONG-RANGE DEPENDENCE AND QUEUEING EFFECTS FOR VBR VIDEO
be allocated to the connection to ensure adequate quality of service. A model of the bandwidth that the connection will try to consume is required for this task. In addition to providing a good description of the bandwidth requirements, the source model should be usable in the connection-acceptance decision model.
Other chapters that contain related material are Chapters 9, 13, 16, and 17.
12.1.1 Special Properties of Video
There are some physical reasons why traces from video sources are special. Video is a succession of regularly spaced still pictures, called frames. Each still picture is represented in digital form by a coding algorithm, and then compressed to save
bandwidth. See, for example, Netravali and Haskell [24] for full information about video coding. A common way to save bandwidth is to send a reference frame, and then send the differences of successive frames. This is called interframe coding. Since adjacent pictures cannot be too different from each other (because most motion is continuous), this generates substantial autocorrelation in the sizes of frames that are near to each other. To protect against transmission errors, a full frame is sent periodically. Furthermore, when there is a scene change the frames no longer depend on the past frames, so functional correlation ends; this may also end the statistical correlation in the frame sizes. Scene changes require that a complete new picture be transmitted, so the scene lengths have an effect on the trace. For these and several other reasons that are too difficult to describe here, video traffic is different from broadband data traffic, and so the models and conclusions described in this chapter may not apply to other types of traffic.
Video quality degrades when information is lost during transmission or when the interarrival times of frames are either large or very variable. The latter is controlled by limiting buffer sizes; frames that arrive late might as well not arrive at all. Video engineers often describe the size of a buffer by the length of time it takes to empty it (which is the maximum delay a frame can incur). Current design objectives are for a maximum delay of between 100 and 200 ms. Since several buffers may be encountered from source to destination and there are other sources of delay (e.g., propagation time), some studies use 10 ms as the maximum buffer size.
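For concreteness, here is the conversion from a drain-time buffer specification to a cell count; a minimal sketch, assuming the 45 Mb/s drain rate used later in Section 12.2.2.1 and 53-byte ATM cells (the cell size is our assumption; the chapter does not specify one):

```python
# Convert a buffer drain time into a buffer size in cells.
drain_rate_bps = 45e6       # drain rate used in Section 12.2.2.1 (45 Mb/s)
max_delay_s = 0.100         # 100 ms design objective for maximum delay
cell_size_bits = 53 * 8     # assumed ATM cell size (53 bytes)

buffer_cells = drain_rate_bps * max_delay_s / cell_size_bits
print(f"A 100 ms buffer holds about {buffer_cells:.0f} cells")  # ~10,613
```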
Frames are transmitted in fixed-size units that we call cells. The rate of information loss is the cell-loss rate, or CLR. We are interested in situations where the cell losses occur because of buffer overflow. We consider models of a single station, where the buffer size and buffer drain rate are given. Under these conditions, the CLR is controlled by keeping the traffic intensity small enough to achieve a performance goal. A typical performance goal is to keep the CLR no larger than 10^-k, where k is usually between 3 and 6.
These constraints on buffer size and CLR (and indirectly on traffic intensity) give rise to a practical region of operation where the constraints are satisfied. The notion of "high" traffic intensity is related to these constraints. When the design parameters (e.g., buffer size and processing speed) are specified, the traffic intensity is high when the constraints are just barely satisfied.
12.1.2 Source Modeling
The central problem of source modeling is to choose how to represent data traces by statistical models. A source model is sought for a purpose, which is usually as an input process to a performance model. We think that a source model is acceptable if it "adequately" describes the trace in the performance model at hand. By adequate we mean that when the source model is used in the performance model, the values of the operating characteristics of interest produced are "close enough" to the values produced by the trace. The definition of close enough may depend on the use to which the performance model will be put. For example, long-range network planning typically requires less accuracy for delay statistics and loss rates than equipment engineering does.
We don't regard good source models for a given trace to be unique. Different purposes may best be served by different models. For example, the DAR and GBAR models described in Sections 12.2.2 and 12.2.3 are designed for different purposes. A consequence of our emphasis on testing source models by how well they emulate the behavior of the trace they model in a performance model is that the confidence intervals we emphasize are on the operating characteristics of the performance models.
12.1.3 Outline
We divide VBR video into two classes, video conferences and entertainment video. Section 12.2 contains two models for video conferences, and Section 12.3 contains a model for entertainment video. The models are vetted by comparing the performance measures they induce in a simulation to the performance measures induced by data traces. All of these models are Markov chains, so they are short-range dependent (SRD). Hurst parameter estimates for the time series these models describe indicate the presence of long-range dependence. The reasons that short-range dependent models can provide good models for time series that exhibit long-range dependence are given in detail in Section 12.4. Our results are summarized in the last section.

12.2 VIDEO CONFERENCES
Video conferences show talking heads and may be the easiest type of video to model. The models developed for them will be expanded to describe entertainment video in Section 12.3.
12.2.1 Source Data
We have data from three different coders and four video teleconferences of about one-half hour in length. The data consists of the size of each still picture, that is, of each frame. All of the teleconferences show a head-and-shoulders scene with moderate motion and scene changes, and with little camera zoom or pan. All of the coders use a version of the H.261 video coding standard. The key differences in the sequences are that sequence A was recorded by a coder that uses neither discrete-cosine transform (DCT) nor motion compensation, sequence B was recorded by a coder that uses both DCT and motion compensation, and sequences C and D were recorded by a coder that used DCT but not motion compensation. The graphs in Figs. 12.1 and 12.2 show that the details (presence or absence of DCT or motion compensation) do not have a significant effect on the statistics of interest to us here. The summary statistics of these sequences are given in Table 12.1.

TABLE 12.1 Summary Statistics of Data Sequences
All of these sequences are adequately described by negative-binomial marginal distributions and geometric autocorrelation functions. Figure 12.1 shows Q-Q plots of the marginal distributions, which have been divided by their means; the fit is excellent for sequences C and D, good for sequence A, and adequate for sequence B. The negative-binomial distribution is the discrete analog of the gamma distribution, and a discretized version of the latter can be used when it is more convenient to do so.
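A Q-Q plot like those in Fig. 12.1 can be produced along the following lines; this is only a sketch, with a synthetic negative-binomial series standing in for a real trace (the parameters n = 20, p = 0.1 are placeholders) and the continuous gamma analog used for the theoretical quantiles:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-in for a frame-size trace with negative-binomial marginals.
frames = rng.negative_binomial(20, 0.1, size=5000).astype(float)
frames /= frames.mean()                    # divide by the mean, as in Fig. 12.1

# Fit the gamma analog by the method of moments: shape = mean^2 / variance.
shape = frames.mean() ** 2 / frames.var()
stats.probplot(frames, sparams=(shape,), dist=stats.gamma, plot=plt)
plt.title("Normalized frame sizes vs. fitted gamma quantiles")
plt.show()
```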
Figure 12.2 shows the autocorrelation functions. The ordinate has a log scale, so geometric functions will appear as straight lines. The geometric property holds for at least 100 lags (2.5 seconds) for sequences B, C, and D, and for 50 lags for sequence A. For lags larger than 250, the geometric function underestimates the autocorrelation function. We examined sequences A, B, and C and concluded they possess long-range dependence. Since the autocorrelation functions shown in Fig. 12.2 are so large for small lags, it seems intuitive (to us, at least) that the short-range correlations should be the important ones to capture in a source model. We propose using the geometric function r^k for the autocorrelation function.

Since the negative-binomial and gamma distributions are specified by two parameters, these parameters can easily be estimated from the mean and the variance of the number of cells per frame by the method of moments. Only those two moments and the correlation coefficient (r) are needed to specify the key properties of VBR teleconference traffic. The correlation coefficient can be estimated from the geometric portion of the autocorrelation function by taking logarithms and doing a linear regression.
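The estimation procedure just described takes only a few lines; a sketch, where the trace is synthetic (generated by the repeat-or-resample mechanism of Section 12.2.2) and the parameter values and 100-lag cutoff are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic trace: repeat the last frame w.p. rho, else draw a fresh
# negative-binomial frame size (the DAR mechanism of Section 12.2.2).
rho_true, n_true, p_true = 0.98, 20, 0.1
x = np.empty(45000)
x[0] = rng.negative_binomial(n_true, p_true)
for i in range(1, x.size):
    x[i] = x[i-1] if rng.random() < rho_true else rng.negative_binomial(n_true, p_true)

# Method of moments: mean = n(1-p)/p and variance = n(1-p)/p^2,
# so p = mean/variance and n = mean^2 / (variance - mean).
m, v = x.mean(), x.var()
p_hat = m / v
n_hat = m * m / (v - m)

# Estimate r from the geometric part of the ACF: log r(k) = k log r.
def acf(x, k):
    d = x - x.mean()
    return np.dot(d[:-k], d[k:]) / np.dot(d, d)

lags = np.arange(1, 101)                     # geometric out to ~100 lags
r = np.array([acf(x, k) for k in lags])
rho_hat = np.exp(np.polyfit(lags, np.log(r), 1)[0])
print(n_hat, p_hat, rho_hat)                 # should be near 20, 0.1, 0.98
```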
12.2.2 The DAR Model
Our first investigations of these sequences with Tabatabai [16] and Heeke [18] focused on multiplexing issues. First, we established that the time series were stationary. This was done by examining plots of smoothed versions of the time series and boxplots of many partitions of the time series. Next, we showed a Markov chain provided a good description of the time series. This was done via simulations as described in Section 12.2.2.1. This means that the marginal distributions of the time series can be viewed as the steady-state distributions of the Markov chain. A Markov chain that has a geometric autocorrelation function and whose steady-state distribution can be specified is the DAR(1) process introduced by Jacobs and Lewis [20]. The only member of the DAR(k) family that is used here is the DAR(1), so the (1) will be deleted. The transition matrix is given by

P = rI + (1 - r)Q,    (12.1)

where r is the lag-one autocorrelation coefficient, I is the identity matrix, and each row of Q consists of the steady-state probabilities. In our case the steady-state probabilities are the negative-binomial probabilities described above, truncated at some convenient value at least as large as the peak rate (the missing probability is added to the last probability kept).
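In code, Eq. (12.1) is just a mixture of the identity matrix with a matrix whose every row is the truncated steady-state vector; a minimal sketch with illustrative parameter values:

```python
import numpy as np
from scipy import stats

rho, n, p = 0.98, 20, 0.1      # illustrative DAR parameters
K = 600                        # truncation point, at least the peak rate

pi = stats.nbinom.pmf(np.arange(K + 1), n, p)   # negative-binomial probabilities
pi[-1] += 1.0 - pi.sum()                        # missing mass goes to the last state

Q = np.tile(pi, (K + 1, 1))                     # every row of Q is the steady state
P = rho * np.eye(K + 1) + (1.0 - rho) * Q       # Eq. (12.1)
assert np.allclose(P.sum(axis=1), 1.0)          # rows are probability vectors
```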
Equation (12.1) is convenient for analytical work, but it masks the simplicity of the DAR model. Let X_n be the size (in bits, bytes, or cells as appropriate) of the nth frame and r be as above; the DAR model is

X_n = V_n X_{n-1} + (1 - V_n) Z_n,    (12.2)

where the V_n are independent Bernoulli random variables with P(V_n = 1) = r, and the Z_n are independent draws from the negative-binomial marginal distribution. In words, each frame repeats the previous frame with probability r and otherwise takes a fresh sample from the marginal distribution. Since r is close to 1, sample paths of a single DAR source stay constant for long stretches and then jump; frame-size traces of a single source do not look like this, which is the reason the GBAR model described in Section 12.2.3 was introduced. This difference between the sample paths of the model and the data trace is mitigated when several sources are multiplexed. The probability that X_n = X_{n-1} occurs by chance, rather than by repetition, is small enough to be ignored in the following calculation. When k sources are multiplexed,
X_n = X_{n-1} for every source simultaneously with probability r^k, so the mean time between potential cell-rate changes with r = 0.98 and k = 16 is 1/(1 - 0.98^16), about 3.6 frame times. Consequently, sample paths of the multiplexed cell streams from 16 sources are not constant for long intervals.

12.2.2.1 Validating the DAR Model

We validate the DAR model by looking at performance models for multiplexing gain and connection admission control. To estimate statistical multiplexing gain, we use cell-loss probabilities [16] from a simple model of a switch. The switch model is a FIFO buffer that is drained at
45 Mb/s. The length of the buffer is expressed as the time to drain a full buffer; this is the maximum possible delay. The results of ten simulations of the DAR model for sequence C are given by 95% confidence intervals and are shown in Table 12.2.
TABLE 12.2 Cell-Loss Rates for Trace and 95% Confidence Intervals for DAR Model
The results of these simulations show that the DAR model does a good job of estimating the cell-loss rate when 16 sources are multiplexed. Similar results were obtained for the other sequences [18].
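A simulation in the spirit of Table 12.2 can be sketched as follows; the negative-binomial parameters, buffer size, and traffic intensity are illustrative stand-ins (we do not reproduce sequence C's fitted values), and the buffer is advanced one frame time per step:

```python
import numpy as np

def dar_trace(n_frames, rho, nb_n, nb_p, rng):
    """DAR(1) sample path: repeat with probability rho, else resample."""
    x = np.empty(n_frames)
    x[0] = rng.negative_binomial(nb_n, nb_p)
    for i in range(1, n_frames):
        x[i] = x[i-1] if rng.random() < rho else rng.negative_binomial(nb_n, nb_p)
    return x

def cell_loss_rate(arrivals, buffer_cells, drain_per_frame):
    """FIFO buffer drained at a constant rate; returns lost cells / offered cells."""
    q, lost = 0.0, 0.0
    for a in arrivals:
        q = max(q + a - drain_per_frame, 0.0)
        if q > buffer_cells:              # overflow: the excess cells are dropped
            lost += q - buffer_cells
            q = buffer_cells
    return lost / arrivals.sum()

rng = np.random.default_rng(2)
k = 16                                    # number of multiplexed sources
total = sum(dar_trace(20000, 0.98, 20, 0.1, rng) for _ in range(k))
drain = total.mean() / 0.9                # drain rate giving traffic intensity 0.9
print(cell_loss_rate(total, buffer_cells=1000, drain_per_frame=drain))
```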
Now we consider connection admission control (CAC). Since the DAR model is a Markov chain model of the source, it conforms to one of the sets of conditions a source model must satisfy for the effective bandwidth (EBW) theory of Elwalid and Mitra [8]. Moreover, the DAR model is a reversible Markov chain, and so it inspired a powerful extension of the EBW method, called the Chernoff-dominant eigenvalue (CDE) method [7]. Suppose we have a switch that can process at rate C (Mb/s) and has a buffer of size B (ms). We want to find the maximum number of statistically homogeneous sources that can be admitted while keeping the cell-loss rate no larger than 10^-6. The CDE method gives an approximate analytic solution with known error bounds; this solution is denoted by K_CDE. Another way to obtain the solution is to test candidate values by evaluating the cell-loss rate by simulation; we treat this as the exact solution and denote it by K_sim. Table 12.3 compares the results of the CDE method to the CAC found from simulations. The number admitted by the CDE method is a very close approximation to the "true" value obtained by simulation. This implies that the DAR model captures enough of the statistical properties of the trace to produce good admission decisions.

TABLE 12.3 CAC Performance for Video Conference A and Video Conference C
12.2.3 The GBAR Model
The DAR model may not be suitable for a single source (by a single source we mean a source that does not interact with other sources), as described above. Lucantoni et al. [22] give three areas where single-source models are useful: studying what types of traffic descriptors make sense for parameter negotiation with the network at call setup, testing rate control algorithms, and predicting the quality-of-service degradation caused by congestion on an access link. For this reason, Heyman [11] proposed the GBAR model. Lakshman et al. [21] use the GBAR model to predict frame sizes in a rate control algorithm.
Lucantoni et al. [22] propose a Markov-renewal process model to describe a single source. This model has the advantage of being very general, and the disadvantage that it is not parameterized by some simple summary statistics of the data trace. The GBAR model exploits the properties enjoyed by teleconferencing traffic described in Section 12.2.1, the geometrically decaying autocorrelation function and the negative-binomial (or gamma) marginal distributions, to produce a simple model based on the three parameters that describe these features.
The GBAR(1) process was introduced by McKenzie [23], along with some other interesting autoregressive processes. (As with the DAR model, we will drop the argument (1).) Two inherent features of this process are that the marginal distribution is gamma and the autocorrelation function is geometric.
Toward defining the GBAR model, let Ga(β, λ) denote a random variable with a gamma distribution with shape parameter β and scale parameter λ; that is, the density function is

f_G(t) = [Γ(β)]^{-1} λ(λt)^{β-1} e^{-λt},    t > 0.    (12.3)

Similarly, let Be(p, q) denote a random variable with a beta distribution with parameters p and q; that is, with density function

f_B(t) = [Γ(p)Γ(q)]^{-1} Γ(p + q) t^{p-1} (1 - t)^{q-1},    0 < t < 1,    (12.4)

where p and q are both larger than 0. The GBAR model is based on two well-known results: the sum of independent Ga(α, λ) and Ga(β, λ) random variables is a Ga(α + β, λ) random variable, and the product of independent Be(α, β - α) and Ga(β, λ) random variables is a Ga(α, λ) random variable. Thus, if X_{n-1} is Ga(β, λ), A_n is Be(α, β - α), and B_n is Ga(β - α, λ), and these three are mutually independent, then

X_n = A_n X_{n-1} + B_n    (12.5)

is a Ga(β, λ) random variable.
A possible physical interpretation of Eq. (12.5) is the following. Interpret A_n as the fraction of frame n - 1 that is used in the predictor of frame n, so the first term on the right of Eq. (12.5) is the contribution of interframe prediction. In hybrid PCM/DPCM coding [24], for example, resistance to transmission error is accomplished by periodically setting some differential predictor coefficients to zero and sending a PCM value. We can think of B_n as the number of cells to do that. If the distributional and independence assumptions listed above Eq. (12.5) are valid, then the GBAR process will be formed.
Simulating the GBAR process only requires the ability to simulate independent and identically distributed gamma and beta random variables. This is easily done; for example, algorithms and Fortran programs are presented in Bratley et al. [3]. The GBAR process is used as a source model by generating noninteger values from Eq. (12.5) and then rounding to the nearest integer. It would be cleaner if a discrete process with negative-binomial marginals could be generated in the first place. McKenzie describes such a process (his Eq. (3.6)). Unfortunately, that process requires much more computation to simulate, and the extra effort does not appear to be worthwhile.
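A sketch of such a simulation, using numpy's gamma and beta generators in place of the Fortran routines of Bratley et al. [3]; the parameter values are placeholders, and taking α = rβ makes the lag-one autocorrelation equal to r:

```python
import numpy as np

def gbar_trace(n_frames, beta, lam, r, rng):
    """GBAR(1) of Eq. (12.5): X_n = A_n X_{n-1} + B_n, Ga(beta, lam) marginals.

    A_n ~ Be(alpha, beta - alpha) and B_n ~ Ga(beta - alpha, lam); choosing
    alpha = r * beta makes the lag-one autocorrelation equal to r.
    """
    alpha = r * beta
    x = np.empty(n_frames)
    x[0] = rng.gamma(beta, 1.0 / lam)          # numpy's scale is 1/lambda
    for i in range(1, n_frames):
        a_n = rng.beta(alpha, beta - alpha)
        b_n = rng.gamma(beta - alpha, 1.0 / lam)
        x[i] = a_n * x[i-1] + b_n
    return np.rint(x)                          # round to whole cells

rng = np.random.default_rng(3)
trace = gbar_trace(45000, beta=10.0, lam=0.05, r=0.98, rng=rng)
print(trace.mean(), trace.var())               # near beta/lam and beta/lam**2
```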
12.2.3.1 Validating the GBAR Model

Ten sample paths of the GBAR process were generated and used as the arrival process (number of cells per frame with a fixed interframe time) in a simulation of a service system with a finite buffer and a constant-rate server. The cell-loss rates from these paths were averaged to obtain a point estimate for the GBAR model. The traffic intensity is varied by changing the service rate. The points produced by the simulations are denoted by an asterisk. In Fig. 12.3, we see that cell-loss rates computed from the GBAR model are close to the cell-loss rates computed from the data. Note that for each traffic intensity, the decrease in the cell-loss rate as the buffer size increases is very slight, for both the model and the data. This confirms the prediction of Hwang and Li [19] of buffer ineffectiveness. Since the GBAR model has only short-range dependence, this effect is not caused by long-range dependence here.
Fig. 12.3 Cell-loss rates for sequence A
Figure 12.4 shows that the mean queue lengths in an infinite buffer computed from five GBAR paths are similar to the mean queue lengths computed using the data. In Fig. 12.4 the vertical axis on the left shows the mean queue length in cells. Video quality is poor when the cell delays are large; 100 ms is an upper bound on the acceptable delay at a node in a network that provides video services. Two buffer drain times are also shown; the practical region for the maximum is below and to the left of the 100 ms line. The range of the mean queue lengths shown exceeds the practical region for maximum queue length. In the practical region, the model and the trace give very similar mean delays. The differences between the mean queue lengths from the GBAR model and the mean queue lengths from the data would be even smaller if a finite buffer were imposed. (This is the truncating effect of finite buffers that is described in Section 12.4.2.) The comparisons for sequences B, C, and D are qualitatively the same as for sequence A.
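The infinite-buffer mean queue lengths of Fig. 12.4 come from a Lindley-type recursion applied frame by frame; a minimal sketch (the arrival trace here is an uncorrelated stand-in, so it will understate the queueing that a correlated trace produces):

```python
import numpy as np

def mean_queue_length(frame_cells, drain_per_frame):
    """Infinite buffer: q_n = max(q_{n-1} + x_n - c, 0); returns the mean of q."""
    q, total = 0.0, 0.0
    for x in frame_cells:
        q = max(q + x - drain_per_frame, 0.0)
        total += q
    return total / len(frame_cells)

rng = np.random.default_rng(4)
cells = rng.negative_binomial(20, 0.1, size=45000).astype(float)  # stand-in trace
for intensity in (0.7, 0.8, 0.9):
    c = cells.mean() / intensity        # drain rate giving this traffic intensity
    print(intensity, mean_queue_length(cells, c))
```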
12.3 BROADCAST VIDEO
Now we turn to more dynamic sequences, such as films, news, sports, and entertainment television. Since the main purpose of the models is to aid in network performance evaluations, we are particularly interested in using the models to predict cell-loss rates.
Broadcast VBR-coded video has different bit-rate characteristics than VBR-coded video conferences. Video conference sequences consist of head-and-shoulders pictures with little or no panning, while broadcast video is characterized by a succession of scenes. With interframe coding, it is clear that scene changes require more bits than intrascene frames, so broadcast video will differ from video conferences in at least this respect; this was demonstrated by Yasuda et al. [32]. There are some other differences too, as demonstrated by Verbiest et al. [31] and further amplified by Verbiest and Pinnoo [30]. In these papers, a DPCM-based coding algorithm is used. In the latter paper, it is shown that the number of bits per frame has a different autocorrelation function for broadcast video than for video conferences or video telephony. The autocorrelation functions for the last two are similar to each other and decay geometrically to zero. For broadcast video, the autocorrelation function does not decay to zero. Moreover, the first frame after a scene change has significantly more bits than other frames in the scene. Ramamurthy and Sengupta [26] observe that the correlation function declines more rapidly at small lags than at large lags, and that the time series can be described by a semi-Markov process that has states identified by the bit rates for different types of scenes (and a state for scene changes). We build on this idea [12]; the simple DAR and GBAR models that described video conferences are not sufficient for broadcast video, although the DAR model is used as a building block in a more complex model.

12.3.1 Modeling Broadcast Video
We obtained several data sets giving the number of bits per frame for sequences encoded by an intrafield/interframe DPCM coding scheme without use of DCT or motion compensation. We did not have access to the actual video sequences. Hence, a visual identification of scene-change frames was not possible. Our modeling strategy was to first develop a way to identify scene changes, then construct models for the lengths of the scenes and the number of cells in a scene-change frame. Finally, models for the number of cells per frame for frames within scenes were developed.
12.3.1.1 Preliminary Data Analysis

Before describing the statistical models, we report some elementary statistics about the sequences we examined. Figure 12.5 shows the peak and mean bit rates, and their ratios.
The peak-to-mean ratios vary from 1.3 to 2.4. By way of comparison, the peak-to-mean ratio for the video-conference sequence with this codec (sequence A) is 3.2. Note that the larger peak-to-mean ratios are associated with the lower mean bit rates. The sequences divers, film, Isaura 1, and Isaura 2, which have a low mean rate and high peak-to-mean ratios, were different TV programs recorded from a cable TV network (and designated as normal-quality broadcast video). The sequences with low peak-to-mean ratios (such as football, sport, news, etc.) were taken directly from the TV studios (and designated high-quality broadcast video).
12.3.2 Identifying Scene Changes
Figure 12.6 shows two segments of the trace of film, and it can be seen that there are several spikes that are possibly due to scene changes. Since these spikes may be a dominant cause of cell losses, we need to model both their spacing and the magnitude. If we use merely fixed spacing at the correct rate but do not model the distribution of their spacing, multiplexed sources with nonidentical starting points will not have coincident spikes from time to time. This will underestimate cell losses.
Since we do not have a video record of the sequences, we will assume that a scene change occurs when a frame contains an abnormally large number of cells compared to its neighbors. We make this notion quantitative in the following way. Let X_i be the number of cells in frame i. At a scene change, the second difference

(X_{i+1} - X_i) - (X_i - X_{i-1})

will be large in magnitude and negative in sign. To quantify what we mean by large, we divide the second difference by the average of the past few frames. We found that using 25 frames (1 second) in the average was about the same as using 6 frames (about 1/4 second), and the latter was adopted. We chose -0.5 as the critical value; this choice is entirely subjective. It would be nice if the subjectivity could be replaced by an objective criterion. We examined the statistical theory of outlier identification for guidance and concluded that this is wishful thinking, because objective tests need to have "outlier" specified externally.
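The detection rule translates directly into code; a sketch on a synthetic trace, using the 6-frame averaging window and -0.5 critical value chosen above:

```python
import numpy as np

def scene_changes(x, window=6, critical=-0.5):
    """Flag frame i when its scaled second difference falls below `critical`.

    The second difference (x[i+1] - x[i]) - (x[i] - x[i-1]) is scaled by
    the average of the `window` frames preceding frame i.
    """
    flagged = []
    for i in range(window, len(x) - 1):
        second_diff = (x[i+1] - x[i]) - (x[i] - x[i-1])
        baseline = x[i-window:i].mean()
        if second_diff / baseline < critical:
            flagged.append(i)
    return flagged

rng = np.random.default_rng(5)
bits = rng.normal(2000.0, 100.0, size=1000)   # synthetic intrascene bit rates
bits[[200, 450, 700]] *= 2.5                  # inject scene-change spikes
print(scene_changes(bits))                    # -> [200, 450, 700]
```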
Fig. 12.5 Peak-to-mean ratios

To see if our criterion identified scene changes accurately, we looked at the time series X_i and the corresponding values of the scaled second differences. Some of the data are shown in Fig. 12.6. The first 1000 frames are shown on the left. The test values identify the scene changes with no false positives. The choice of -0.5 as the critical value is not significant. Frames 9001 through 10,000 are shown on the right. This is a more active (in terms of bit-rate fluctuations) subsequence, and changing the critical value will affect the number of frames identified as scene changes. There are 317 scene changes when -0.5 is the critical value (the mean scene length is 6.5 seconds); there are 374 when -0.4 is used, and 283 when -0.6 is used. The density functions of the scene lengths produced by these critical values are shown in Fig. 12.7. We observe that the critical value does not have a large effect on the density function.
12.3.2.1 Scene Lengths

Plots of the autocorrelation function showed that scene lengths are uncorrelated, so the main modeling issue is to characterize the distribution of the number of frames in a scene. The shape of the density in Fig. 12.7 was observed in all sequences except for news. (This is a news broadcast. There are 134 scene changes, and 75 of them are 3 frames long. Moreover, most of these occur consecutively. We do not have an explanation for this. The 3-frame scenes were deleted and the remainder are called news.d.)

This unimodal and long-tailed shape is characteristic of distributions used in reliability and insurance-loss models. We will use the following three distributions as candidates for describing scene lengths (and the number of cells per frame in the sequel).
Fig. 12.6 Bit rates and scaled second differences
The Gamma Distribution. The density function is given by Eq. (12.3).

The Weibull Distribution. It has the complementary distribution function (the probability that the random variable is larger than x)

F(x) = e^{-(λx)^β},    x > 0.

The Pareto Distribution. Its complementary distribution function has a power-law tail with shape parameter k.
Let g1 = σ/μ be the coefficient of variation, where μ and σ are the mean and standard deviation of the distribution, and let g3 = μ3/σ^3, where μ3 is the third central moment; it is the coefficient of skewness. For the distributions described above (and some others not mentioned here), g1 and g3 do not depend on the scale parameter λ. Plotting g3 versus g1 gives a
Fig. 12.7 Density functions of scene lengths
curve for the gamma and Weibull distributions, and a curve for each value of k for the Pareto distribution. Figure 12.8 shows these curves, and points for the sample values of the scene-length moments from our video sequences. (We excluded bingo from this and all subsequent figures because it contained only 12 scenes. We included data from a software intraframe coding [10] of the film "Star Wars.") Except for sequences news and news.d, the moments of all the sequences plotted fall near the four curves traced out by the distributions; the distances from the curves are within the deviations we observed when samples from a true gamma distribution were compared with the gamma curves. We conclude that usually, but not always, scene lengths will follow a unimodal distribution that fits one of the common failure-time distributions.
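The theoretical (g1, g3) curves can be traced parametrically, since both ratios depend only on the shape parameter; a sketch for the gamma and Weibull families (overlaying the sample moments of each sequence then reproduces a plot like Fig. 12.8):

```python
import numpy as np
from scipy.special import gamma as G

def gamma_g1_g3(shape):
    """Gamma(shape): coefficient of variation 1/sqrt(shape), skewness 2/sqrt(shape)."""
    return 1.0 / np.sqrt(shape), 2.0 / np.sqrt(shape)

def weibull_g1_g3(shape):
    """Weibull(shape); the scale parameter cancels out of both ratios."""
    m1, m2, m3 = (G(1.0 + j / shape) for j in (1.0, 2.0, 3.0))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * var - m1 ** 3) / var ** 1.5   # third central moment / sigma^3
    return np.sqrt(var) / m1, skew

shapes = np.linspace(0.5, 10.0, 100)
gamma_curve = np.array([gamma_g1_g3(s) for s in shapes])
weibull_curve = np.array([weibull_g1_g3(s) for s in shapes])
# Plot g3 (second column) against g1 (first column) for each family and add
# the (g1, g3) points computed from the scene lengths of each sequence.
```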
12.3.3 Scene-Change Frames
Now we attempt to model the number of cells in the scene-change frames. It is clear that the frames that start a scene will have stochastically more cells than other frames, because most of the picture has to be constructed afresh. Different models will be needed for the scene-change frames than for the intrascene frames. Plots of the autocorrelation function showed no correlation among frame sizes, so obtaining the distribution is the main modeling issue.
The distributions we use are on the interval (0, ∞). The minimum number of cells in a scene-change frame is a few thousand, so we shifted the data by subtracting the smallest value. In other words, we model the amount larger than the minimum value in the data.

Fig. 12.8 Moments of scene lengths
Figure 12.9 shows how the empirical values of g1 and g3 compare to the curves for the gamma and Weibull distributions. Except for fysics and football, the data are far from the theoretical curves. Since the third moment has a large sampling error for long-tailed distributions [18, p. 26], the deviation from the theoretical curves may be due to sampling error. Based on Q-Q plots, two of the sequences (Isaura 1 and Isaura 2) had good Weibull fits, and two (dutch and star wars) had good gamma fits. The lognormal distribution fits the film data well. We could not fit a model to the other sequences. We conclude that a known distribution may not always be a good fit to the number of cells in a scene-change frame, and that the same distribution is unlikely to be a good fit to all sequences.
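Shifting by the minimum and fitting candidate distributions might look as follows; a sketch with synthetic stand-in data (floc=0 pins the location so only shape and scale are estimated, and the zero excess at the minimum is dropped to keep the likelihood finite):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
# Synthetic stand-in for scene-change frame sizes (minimum of a few thousand).
cells = 3000.0 + 4000.0 * rng.weibull(1.5, size=300)

excess = cells - cells.min()          # model the amount above the minimum
excess = excess[excess > 0]           # drop the minimum itself (zero excess)

wb_shape, _, wb_scale = stats.weibull_min.fit(excess, floc=0)
ga_shape, _, ga_scale = stats.gamma.fit(excess, floc=0)
print("Weibull:", wb_shape, wb_scale, "  gamma:", ga_shape, ga_scale)
```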
12.3.4 Intrascene Frames
A close look at the bit-rate plot in Fig. 12.6 revealed that the effect of a scene change appears to last for two frames; the first frame after the scene change is also extra large. This was also observed by Ramamurthy and Sengupta [26] by looking at Fig. 12 in Verbiest and Pinnoo [30]. Since our data were produced by the same codec as the data in Verbiest and Pinnoo [30], this is not surprising. We examined film, Isaura 1, and Isaura 2 and found that linear regression provided a good model for the
Fig. 12.9 Number of cells in scene-change frames
number of cells in the first frame after a scene change. Letting Z_n be the number of cells in the nth scene-change frame, and Y_n be the number of cells in the succeeding frame, we have the linear regression

Y_n = a + b Z_n + e_n,

where e_n is a random error term. Within individual scenes, the frame-size autocorrelation function decays with lag, much as it does for the video conference scenes. We obtain similar functions when all of the intrascene frames (excluding the first two frames of each scene) are treated as a single time series. The left side of Fig. 12.10 shows this function for film and Isaura 1. This method of estimating the autocorrelation function ignores scene boundaries. For example, the last frame of a scene and the first (counted) frame of the subsequent scene contribute to the lag-one term in the autocorrelation function. For film, this causes the autocorrelation function to have periodic pulses, as shown in Fig. 12.10(c). These pulses are not present in the autocorrelation functions of the first six scenes and are just an artifact of the aggregation across scenes. A more refined estimate is obtained by dividing the data into scenes, calculating the empirical autocorrelation function for each scene, and then averaging over the scenes (a sketch of this computation follows Fig. 12.10). To be
Fig. 12.10 Autocorrelation within scenes
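The refined per-scene estimate described above might be computed as follows; a sketch, where scene_bounds would come from the detection rule of Section 12.3.2 (synthetic boundaries and frame sizes are used here) and the first two frames of each scene are excluded as in the text:

```python
import numpy as np

def empirical_acf(x, max_lag):
    d = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(d, d)
    return np.array([np.dot(d[:-k], d[k:]) / denom for k in range(1, max_lag + 1)])

def within_scene_acf(frames, scene_bounds, max_lag=20):
    """Average the empirical ACF over scenes, skipping each scene's first two
    frames (the scene-change frame and the extra-large frame after it)."""
    per_scene = [empirical_acf(frames[a + 2:b], max_lag)
                 for a, b in scene_bounds if (b - a - 2) > max_lag + 1]
    return np.mean(per_scene, axis=0)

rng = np.random.default_rng(7)
frames = rng.normal(2000.0, 100.0, size=5000)            # stand-in frame sizes
bounds = [(i, i + 250) for i in range(0, 5000, 250)]     # synthetic scene bounds
print(within_scene_acf(frames, bounds)[:5])
```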