1 Introduction
1.1 MOTIVATION
Information processing system designers need methods for the quantification of system design factors such as performance and reliability. Modern computer, communication, and production line systems process complex workloads with random service demands. Probabilistic and statistical methods are commonly employed for the purpose of performance and reliability evaluation. The purpose of this book is to explore major probabilistic modeling techniques for the performance analysis of information processing systems. Statistical methods are also of great importance, but we refer the reader to other sources [Jain91, Triv82] for this topic. Although we concentrate on performance analysis, we occasionally consider reliability, availability, and combined performance and reliability analysis. Performance measures that are commonly of interest include throughput, resource utilization, loss probability, and delay (or response time).
The most direct method for performance evaluation is based on actual measurement of the system under study. However, during the design phase, the system is not available for such experiments, and yet the performance of a given design needs to be predicted to verify that it meets design requirements and to carry out necessary trade-offs. Hence, abstract models are necessary for performance prediction of designs. The most popular models are based on discrete-event simulation (DES). DES can be applied to almost all problems of interest, and system details can be captured in such simulation models to the desired degree. Furthermore, many software packages are available that facilitate the construction and execution of DES models.
Gunter Bolch, Stefan Greiner, Hermann de Meer, Kishor S. Trivedi
Copyright © 1998 John Wiley & Sons, Inc. Print ISBN 0-471-19366-6; Online ISBN 0-471-20058-1
The principal drawback of DES models, however, is the time taken to run such models for large, realistic systems, particularly when results with high accuracy (i.e., narrow confidence intervals) are desired. A cost-effective alternative to DES models, analytic models can provide relatively quick answers to "what if" questions and can provide more insight into the system being studied. However, analytic models are often plagued by unrealistic assumptions that need to be made in order to make them tractable. Recent advances in stochastic models and numerical solution techniques, the availability of software packages, and easy access to workstations with large computational capabilities have extended the capabilities of analytic models to more complex systems.
Analytic models can be broadly classified into state space models and non-state space models. The most commonly used state space models are Markov chains. First introduced by A. A. Markov in 1907, Markov chains have been in use in performance analysis since around 1950. In the past decade, considerable advances have been made in numerical solution techniques, methods of automated state space generation, and the availability of software packages. These advances have resulted in the extensive use of Markov chains in performance and reliability analysis. A Markov chain consists of a set of states and a set of labeled transitions between the states. A state of the Markov chain can model various conditions of interest in the system being studied. These could be the number of jobs of various types waiting to use each resource, the number of resources of each type that have failed, the number of concurrent tasks of a given job being executed, and so on. After a sojourn in a state, the Markov chain will make a transition to another state. Such transitions are labeled either with probabilities of transition (in the case of discrete-time Markov chains) or rates of transition (in the case of continuous-time Markov chains).
Long-run (steady-state) dynamics of Markov chains can be studied using a system of linear equations, with one equation for each state. Transient (or time-dependent) behavior of a continuous-time Markov chain gives rise to a system of first-order, linear, ordinary differential equations. Solution of these equations yields the state probabilities of the Markov chain, from which the desired performance measures can be easily obtained. The number of states in a Markov chain of a complex system can become very large and, hence, automated generation and efficient numerical solution methods for the underlying equations are desired. A number of concise notations (based on queueing networks and stochastic Petri nets) have evolved, and software packages that automatically generate the underlying state space of the Markov chain are now available. These packages also carry out efficient solution of the steady-state and transient behavior of Markov chains. In spite of these advances, there is a continuing need to deal with larger Markov chains, and much research is being devoted to this topic.
If the Markov chain has nice structure, it is often possible to avoid the generation and solution of the underlying (large) state space. For a class of queueing networks, known as product-form queueing networks (PFQN), it is possible to derive steady-state performance measures without resorting to the underlying state space. Such models are therefore called non-state space models. Other examples of non-state space models are directed acyclic task precedence graphs [SaTr87] and fault trees [STP96]. Other examples of methods exploiting Markov chains with "nice" structure are matrix-geometric methods [Neut81]. We do not discuss these model types in this book due to space limitations.
Relatively large PFQN can be solved by means of a small number of simpler equations. However, practical queueing networks can often get so large that approximate methods are needed to solve such PFQN. Furthermore, many practical queueing networks (so-called non-product-form queueing networks, NPFQN) do not satisfy the restrictions implied by product form. In such cases, it is often possible to obtain accurate approximations using variations of the algorithms used for PFQN. Other approximation techniques, using hierarchical and fixed-point iterative methods, are also used.
The flowchart shown in Fig. 1.1 gives the organization of this book. The rest of this chapter, Section 1.2, covers the basics of probability and statistics. In Chapter 2, Markov chain basics are presented together with generation methods for them. Exact steady-state solution techniques for Markov chains are given in Chapter 3, and their aggregation/disaggregation counterparts in Chapter 4. These aggregation/disaggregation solution techniques are useful for practical Markov chain models with very large state spaces. Transient solution techniques for Markov chains are introduced in Chapter 5.
Chapter 6 deals with the description and computation of performance measures for single-station queueing systems in steady state. A general description of queueing networks is given in Chapter 7. Exact solution methods for PFQN are described in detail in Chapter 8, while approximate solution techniques for PFQN are described in Chapter 9. Solution algorithms for different types of NPFQN (such as networks with priorities, non-exponential service times, blocking, or parallel processing) are presented in Chapter 10.
The solution algorithms introduced in this book can also be used for optimization problems, as described in Chapter 11. For the practical use of the modeling techniques described in this book, software packages (tools) are needed. Chapter 12 is devoted to the introduction of a queueing network tool, a stochastic Petri net tool, a tool based on Markov chains, and a toolkit with many model types; the facility for hierarchical modeling is also introduced. Each tool is described in some detail together with a simple example. Throughout the book we have provided many example applications of the different algorithms introduced. Finally, Chapter 13 is devoted to several large real-life applications of the modeling techniques presented in the book.
Fig. 1.1 Flowchart describing how to find the appropriate chapter for a given performance problem.
1.2 BASICS OF PROBABILITY AND STATISTICS
We begin by giving a brief overview of the more important definitions and results of probability theory. The reader can find additional details in books such as [Alle90, Fell68, Koba78, Triv82]. We assume that the reader is familiar with the basic properties and notations of probability theory.
1.2.1 Random Variables
A random variable is a function that reflects the result of a random experiment. For example, the result of the experiment "toss a single die" can be described by a random variable that can assume the values one through six. The number of requests that arrive at an airline reservation system in one hour or the number of jobs that arrive at a computer system are also examples of a random variable. So is the time interval between the arrivals of two consecutive jobs at a computer system, or the throughput in such a system. The latter two examples can assume continuous values, whereas the first two only assume discrete values. Therefore, we have to distinguish between continuous and discrete random variables.
1.2.1.1 Discrete Random Variables A random variable that can only assume discrete values is called a discrete random variable, where the discrete values are often non-negative integers. The random variable is described by the possible values that it can assume and by the probabilities for each of these values. The set of these probabilities is called the probability mass function (pmf) of this random variable. Thus, if the possible values of a random variable X are the non-negative integers, then the pmf is given by the probabilities:

p_k = P(X = k), k = 0, 1, 2, …, (1.1)

the probability that the random variable X assumes the value k.

The following is required:

p_k ≥ 0 and Σ_{k=0}^{∞} p_k = 1.
The following are other examples of discrete random variables:

• Bernoulli random variable: Consider a random experiment that has two possible outcomes, such as tossing a coin (k = 0, 1). The pmf of the random variable X is given by:

P(X = 0) = 1 − p and P(X = 1) = p, with 0 < p < 1. (1.2)
• Binomial random variable: The experiment with two possible outcomes is carried out n times, where successive trials are independent. The random variable X is now the number of times the outcome 1 occurs. The pmf of X is given by:

P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k}, k = 0, 1, …, n.
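As a numerical sanity check, the binomial pmf can be evaluated directly; the sketch below (the values n = 10, p = 0.3 are arbitrary illustration choices, not from the text) verifies that the probabilities sum to one and that the mean equals np:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable: C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(probs)                              # should equal 1
mean = sum(k * pk for k, pk in enumerate(probs))  # should equal n*p = 3
```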
• Geometric random variable: The experiment with two possible outcomes is carried out several times, where the random variable X now represents the number of trials it takes for the outcome 1 to occur (the current trial included). The pmf of X is given by:

P(X = k) = (1 − p)^{k−1} p, k = 1, 2, ….

• Poisson random variable: A discrete random variable X with pmf

P(X = k) = (α^k / k!) e^{−α}, k = 0, 1, 2, …, α > 0,

is called a Poisson random variable with parameter α.

The Poisson and geometric random variables are very important to our topic; we will encounter them very often. Several important parameters can be derived from the pmf of a discrete random variable:
• Mean value or expected value:

X̄ = E[X] = Σ_k k·p_k.

• Variance:

var(X) = E[(X − X̄)²] = Σ_k (k − X̄)²·p_k,

where σ_X = √var(X) is called the standard deviation.

• The coefficient of variation is the normalized standard deviation:

c_X = σ_X / X̄.

Information on the average deviation of a random variable from its expected value is provided by c_X, σ_X, and var(X). If c_X = σ_X = var(X) = 0, then the random variable assumes a fixed value with probability one.
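These parameters can be computed directly from any pmf; the sketch below does so for the geometric pmf with p = 0.5 (the truncation point K is an implementation choice; the neglected tail is vanishingly small):

```python
p = 0.5
K = 200  # truncation point; for p = 0.5 the tail beyond K is negligible
pmf = {k: (1 - p)**(k - 1) * p for k in range(1, K + 1)}

mean = sum(k * pk for k, pk in pmf.items())             # E[X] = 1/p = 2
var = sum((k - mean)**2 * pk for k, pk in pmf.items())  # var(X) = (1-p)/p^2 = 2
std = var**0.5
cv = std / mean                                         # coefficient of variation
```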
1.2.1.2 Continuous Random Variables A random variable that can assume all values in the interval [a, b], where −∞ ≤ a < b ≤ +∞, is called a continuous random variable. It is described by its distribution function (also called CDF or cumulative distribution function):

F_X(x) = P(X ≤ x). (1.12)
The probability density function (pdf) f_X(x) can be used instead of the distribution function, provided the latter is differentiable:

f_X(x) = dF_X(x)/dx.
• Mean value or expected value:

X̄ = E[X] = ∫_{−∞}^{+∞} x·f_X(x) dx. (1.14)
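Eq. (1.14) can be checked numerically; the sketch below integrates x·f_X(x) with the trapezoidal rule for an exponential pdf with rate λ = 2 (an arbitrary illustration choice) and recovers the mean 1/λ:

```python
from math import exp

lam = 2.0  # arbitrary rate for illustration

def f(x):
    """pdf of an exponential random variable with rate lam."""
    return lam * exp(-lam * x)

# Trapezoidal approximation of E[X] = ∫ x f(x) dx over [0, X];
# the tail beyond X = 20 is negligible for lam = 2.
N, X = 20000, 20.0
h = X / N
mean = h * sum(i * h * f(i * h) for i in range(1, N))  # endpoint terms are ~0
# mean is close to 1/lam = 0.5
```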
A very well known and important continuous distribution function is the normal distribution. The CDF of a normally distributed random variable X is the integral of its pdf; for the standard normal random variable (mean 0, variance 1) the pdf is given by:

f_X(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < +∞.

Fig. 1.2 pdf of the standard normal random variable

A plot of the preceding pdf is shown in Fig. 1.2.
For an arbitrary normal distribution with parameters μ and σ, the mean and variance are given by E[X] = μ and var(X) = σ², respectively.
Other important continuous random variables are described as follows.

(a) Exponential Distribution

The exponential distribution is the most important and also the easiest to use distribution in queueing theory. Interarrival times and service times can often be represented exactly or approximately using the exponential distribution. The CDF of an exponentially distributed random variable X is given by Eq. (1.22):
F_X(x) = 1 − e^{−λx}, 0 ≤ x < ∞ (and F_X(x) = 0 otherwise), (1.22)

with

λ = 1/X̄ if X represents interarrival times, or μ = 1/X̄ if X represents service times.

Here λ or μ denotes the parameter of the random variable. In addition, for an exponentially distributed random variable with parameter λ the following relations hold:

X̄ = E[X] = 1/λ, var(X) = 1/λ², c_X = 1.
The importance of the exponential distribution is based on the fact that it is the only continuous distribution that possesses the memoryless property:

P(X ≤ t + u | X > u) = P(X ≤ t) = 1 − e^{−λt}, t, u ≥ 0. (1.23)
As an example of an application of Eq. (1.23), consider a bus stop with the following schedule: buses arrive with exponentially distributed interarrival times and identical mean X̄. Now if you have already been waiting in vain for u units of time for the bus to come, the probability of a bus arrival within the next t units of time is the same as if you had just shown up at the bus stop; that is, you can forget about the past or about the time already spent waiting.
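The memoryless property can be verified directly from the CDF; in the sketch below the rate λ = 0.5 and the times t, u are arbitrary illustration choices:

```python
from math import exp

lam = 0.5  # arbitrary rate

def F(t):
    """CDF of the exponential distribution with rate lam."""
    return 1 - exp(-lam * t)

t, u = 2.0, 7.0
# P(X <= t + u | X > u) = (F(t+u) - F(u)) / (1 - F(u))
conditional = (F(t + u) - F(u)) / (1 - F(u))
unconditional = F(t)  # the memoryless property says the two agree
```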
Another important property of the exponential distribution is its relation to the discrete Poisson random variable. If the interarrival times are exponentially distributed and successive interarrival times are independent with identical mean X̄, then the random variable that represents the number of buses that arrive in a fixed interval of time [0, t) has a Poisson distribution with parameter α = t/X̄.
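This relation can be illustrated by simulation: generating exponential interarrival times and counting arrivals in [0, t) yields counts whose average approaches α = t/X̄. The sketch below uses arbitrary parameter values:

```python
import random

random.seed(1)
mean_interarrival = 2.0        # X̄; arbitrary
t = 10.0                       # observation window [0, t)
alpha = t / mean_interarrival  # Poisson parameter, here 5

def count_arrivals():
    """Count arrivals in [0, t) when interarrival times are exponential."""
    clock, n = 0.0, 0
    while True:
        clock += random.expovariate(1.0 / mean_interarrival)
        if clock >= t:
            return n
        n += 1

counts = [count_arrivals() for _ in range(20000)]
empirical_mean = sum(counts) / len(counts)  # close to alpha
```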
Two additional properties of the exponential distribution can be derived from the Poisson property:
1. If we merge n Poisson processes with distributions for the interarrival times 1 − e^{−λ_i t}, 1 ≤ i ≤ n, into one single process, then the result is a Poisson process for which the interarrival times have the distribution 1 − e^{−λt} with λ = Σ_{i=1}^{n} λ_i (see Fig. 1.3).
Fig. 1.3 Merging of Poisson processes
2. If a Poisson process with interarrival time distribution 1 − e^{−λt} is split into n processes so that the probability that an arriving job is assigned to the ith process is q_i, 1 ≤ i ≤ n, then the ith subprocess has an interarrival time distribution of 1 − e^{−q_i λ t}, i.e., n Poisson processes have been created, as shown in Fig. 1.4.
Fig. 1.4 Splitting of a Poisson process
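Both properties can be checked numerically; the sketch below verifies merging through the survival function of the minimum interarrival time, and splitting by thinning a simulated Poisson stream (the rates and routing probability are arbitrary illustration choices):

```python
import random
from math import exp, prod

# Property 1 (merging): the time to the next arrival of the merged process is
# min_i X_i; its survival function factors as prod_i e^{-lambda_i t} = e^{-lambda t}.
rates = [0.5, 1.2, 2.0]  # lambda_i, arbitrary
lam = sum(rates)
t = 0.7
surv_componentwise = prod(exp(-r * t) for r in rates)
surv_merged = exp(-lam * t)

# Property 2 (splitting): keep each arrival of a rate-lam process with
# probability q; the kept arrivals form a Poisson process of rate q*lam.
random.seed(2)
q = 0.3

def thinned_interarrival():
    """Time until the next arrival that is routed to the chosen subprocess."""
    clock = 0.0
    while True:
        clock += random.expovariate(lam)
        if random.random() < q:
            return clock

samples = [thinned_interarrival() for _ in range(20000)]
mean_est = sum(samples) / len(samples)  # close to 1/(q*lam)
```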
The exponential distribution has many useful properties with analytic tractability, but it is not always a good approximation to the observed distribution; experiments have shown deviations. For example, the coefficient of variation of the service time of a processor is often greater than one, whereas for a peripheral device it is usually less than one. This observed behavior leads directly to the need to consider the following other distributions.
(b) Hyperexponential Distribution, H_k

This distribution can be used to approximate empirical distributions with a coefficient of variation larger than one. Here k is the number of phases.
Fig. 1.5 A random variable with H_k distribution
Figure 1.5 shows a model with a hyperexponentially distributed time span. The model is obtained by arranging k phases with exponentially distributed times and rates μ_1, μ_2, …, μ_k in parallel. The probability that the time span is given by the jth phase is q_j, where Σ_{j=1}^{k} q_j = 1. However, only one phase can be occupied at any time. The resulting CDF is given by:

F_X(x) = Σ_{j=1}^{k} q_j (1 − e^{−μ_j x}), x ≥ 0.

The parameters can be chosen to approximate an unknown distribution with given mean X̄ and sample coefficient of variation c_X. (One standard choice for k = 2, matching the first two moments with balanced means, is q_1 = ½(1 + √((c_X² − 1)/(c_X² + 1))), q_2 = 1 − q_1, μ_1 = 2q_1/X̄, μ_2 = 2q_2/X̄.)
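A small sketch of the two-phase case: the fit below uses the standard "balanced means" moment match (a common choice, not necessarily identical in form to the fitting equations referenced above) and verifies that the fitted H_2 reproduces the prescribed mean and coefficient of variation:

```python
from math import sqrt

def fit_h2(mean, cv):
    """Balanced-means fit of a two-phase hyperexponential (assumes cv > 1)."""
    q1 = 0.5 * (1 + sqrt((cv**2 - 1) / (cv**2 + 1)))
    q2 = 1 - q1
    mu1, mu2 = 2 * q1 / mean, 2 * q2 / mean
    return q1, q2, mu1, mu2

q1, q2, mu1, mu2 = fit_h2(1.0, 2.0)
# Moments of the fitted H2 mixture:
m1 = q1 / mu1 + q2 / mu2                # mean, should be 1.0
m2 = 2 * q1 / mu1**2 + 2 * q2 / mu2**2  # second moment
cv_fit = sqrt(m2 - m1**2) / m1          # coefficient of variation, should be 2.0
```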
(c) Erlang-k Distribution, E_k

This distribution can be used to approximate empirical distributions with a coefficient of variation less than one. A random variable with Erlang-k distribution can be modeled as a series of k phases with exponentially distributed times, all having the same rate kμ (see Fig. 1.6); the mean of each phase is X̄/k, where X̄ denotes the mean of the whole time span.
Fig. 1.6 A random variable with E_k distribution
If the interarrival times of some arrival process, like that at our bus stop, are identically exponentially distributed, it follows that the time between the first arrival and the (k + 1)th arrival is Erlang-k distributed.
The CDF is given by:

F_X(x) = 1 − e^{−kμx} Σ_{j=0}^{k−1} (kμx)^j / j!, x ≥ 0,

with mean X̄ = E[X] = 1/μ and coefficient of variation c_X = 1/√k ≤ 1.
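The CDF formula can be cross-checked against simulation, since an Erlang-k random variable is a sum of k independent exponential phases (the parameter values below are arbitrary illustration choices):

```python
import random
from math import exp, factorial

random.seed(4)
k, mu = 3, 0.5       # 3 phases, overall rate parameter mu
phase_rate = k * mu  # each phase has rate k*mu

def erlang_cdf(x):
    """F_X(x) = 1 - e^{-k mu x} * sum_{j=0}^{k-1} (k mu x)^j / j!"""
    s = sum((phase_rate * x)**j / factorial(j) for j in range(k))
    return 1 - exp(-phase_rate * x) * s

# Empirical check: an Erlang-k sample is a sum of k exponential phase times.
samples = [sum(random.expovariate(phase_rate) for _ in range(k))
           for _ in range(20000)]
x0 = 2.0
empirical = sum(s <= x0 for s in samples) / len(samples)
analytic = erlang_cdf(x0)
```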
If the sample mean X̄ and the sample coefficient of variation c_X are given, then the parameters k and μ of the corresponding Erlang distribution are estimated by:

k = ⌈1/c_X²⌉ (1.28)

and:

μ = 1/X̄.
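A minimal sketch of this estimation procedure; the helper name fit_erlang is ours, and the check confirms that the fitted E_k attains a coefficient of variation 1/√k close to the target:

```python
from math import ceil, sqrt

def fit_erlang(mean, cv):
    """Moment-matching fit of an Erlang-k distribution (assumes 0 < cv <= 1)."""
    k = ceil(1 / cv**2)  # number of phases
    mu = 1 / mean        # overall rate parameter; each phase has rate k*mu
    return k, mu

k, mu = fit_erlang(2.0, 0.5)  # expect k = 4, mu = 0.5
cv_fit = 1 / sqrt(k)          # achieved coefficient of variation
```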
(d) Hypoexponential Distribution
The hypoexponential distribution arises when the individual phases of the E_k distribution are allowed to have different rates. Thus, the Erlang distribution is a special case of the hypoexponential distribution.
For a hypoexponentially distributed random variable X with two phases and rates μ_1 and μ_2 (μ_1 ≠ μ_2), we get the CDF as:

F_X(x) = 1 − μ_2/(μ_2 − μ_1) · e^{−μ_1 x} + μ_1/(μ_2 − μ_1) · e^{−μ_2 x}, x ≥ 0.
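This CDF can be cross-checked against simulation, since the two-phase hypoexponential random variable is the sum of two independent exponential phase times (the rates below are arbitrary illustration choices):

```python
import random
from math import exp

random.seed(5)
mu1, mu2 = 1.0, 3.0  # arbitrary, distinct rates

def hypo_cdf(x):
    """CDF of the two-phase hypoexponential distribution."""
    return 1 - mu2 / (mu2 - mu1) * exp(-mu1 * x) + mu1 / (mu2 - mu1) * exp(-mu2 * x)

# Empirical check: sample the sum of the two exponential phases.
samples = [random.expovariate(mu1) + random.expovariate(mu2)
           for _ in range(20000)]
x0 = 1.5
empirical = sum(s <= x0 for s in samples) / len(samples)
analytic = hypo_cdf(x0)
```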
The values of the parameters μ_1 and μ_2 of the hypoexponential CDF can be estimated, given the sample mean X̄ and sample coefficient of variation c_X (with 1/2 ≤ c_X² ≤ 1), by matching the first two moments:

μ_1 = (2/X̄)·[1 + √(2c_X² − 1)]^{−1}, μ_2 = (2/X̄)·[1 − √(2c_X² − 1)]^{−1}.

More generally, for a hypoexponential random variable with k phases and rates μ_i, the pdf is

f_X(x) = Σ_{i=1}^{k} a_i μ_i e^{−μ_i x}, with a_i = Π_{j=1, j≠i}^{k} μ_j/(μ_j − μ_i), 1 ≤ i ≤ k,

and:

mean: X̄ = Σ_{i=1}^{k} 1/μ_i,

coefficient of variation: c_X = √(Σ_{i=1}^{k} 1/μ_i²) / (Σ_{i=1}^{k} 1/μ_i) ≤ 1.
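A sketch of a standard two-moment fit for the two-phase case (the closed form below matches the first two moments; it may differ in presentation from the book's own equations). The check confirms that the fitted phases reproduce the prescribed mean and coefficient of variation:

```python
from math import sqrt

def fit_hypo2(mean, cv):
    """Two-moment fit of a two-phase hypoexponential (assumes 1/2 <= cv^2 <= 1)."""
    d = sqrt(2 * cv**2 - 1)
    mu1 = 2 / (mean * (1 + d))
    mu2 = 2 / (mean * (1 - d))
    return mu1, mu2

mu1, mu2 = fit_hypo2(1.0, 0.8)
m1 = 1 / mu1 + 1 / mu2                        # mean of the sum of the two phases
cv_fit = sqrt(1 / mu1**2 + 1 / mu2**2) / m1   # coefficient of variation
```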
(e) Gamma Distribution
Another generalization of the Erlang-k distribution for an arbitrary coefficient of variation is the gamma distribution. Its pdf is given by:

f_X(x) = λ(λx)^{α−1} e^{−λx} / Γ(α), x ≥ 0,

where the gamma function is defined by:

Γ(α) = ∫_{0}^{∞} u^{α−1} e^{−u} du, α > 0. (1.31)

If α = k is a positive integer, then Γ(k) = (k − 1)!.
Thus the Erlang-k distribution can be considered as a special case of the gamma distribution, obtained by setting α = k (a positive integer) and λ = kμ.
(f) Generalized Erlang Distribution
Fig. 1.7 A random variable with generalized Erlang distribution
Rather complex distributions can be generated by combining hyperexponential and Erlang-k distributions; these are known as generalized Erlang distributions. An example is shown in Fig. 1.7, where n parallel levels are depicted and each level j consists of a series of r_j phases, each with exponentially distributed time and rate r_j μ_j. Each level j is selected with probability q_j. The pdf is given by:

f_X(x) = Σ_{j=1}^{n} q_j (r_j μ_j)^{r_j} x^{r_j − 1} e^{−r_j μ_j x} / (r_j − 1)!, x ≥ 0. (1.32)
Another type of generalized Erlang distribution can be obtained by lifting the restriction that all phases within the same level have the same rate, or, alternatively, by assigning non-zero exit probabilities so that remaining phases can be skipped. These generalizations lead to very complicated equations that are not described further here.
(g) Cox Distribution, C_k (Branching Erlang Distribution)

In [Cox55] the principle of the combination of exponential distributions is generalized to such an extent that any distribution that possesses a rational Laplace transform can be represented by a sequence of exponential phases with possibly complex probabilities and complex rates. Figure 1.8 shows a model of a Cox distribution with k exponential phases.
Fig. 1.8 A random variable with C_k distribution
The model consists of k phases in series with exponentially distributed times and rates μ_1, μ_2, …, μ_k. After phase j, another phase j + 1 follows with probability a_j, and with probability b_j = 1 − a_j the total time span is completed. As described in [SaCh81], there are two cases that must be distinguished when using the sample mean value X̄ and the sample coefficient of variation c_X to estimate the parameters of the Cox distribution: