11 Hydrologic Simulation
11.1 INTRODUCTION
Evaluating the effects of future watershed change is even more difficult than evaluating the effects of change that has already taken place. In the latter case, some data, albeit nonstationary, are available. Where data are available at the site, some measure of the effect of change is possible, with the accuracy of the effect dependent on the quantity and quality of the data. Where on-site data are not available, which is obviously true for the case where watershed change has not yet occurred, it is sometimes necessary to model changes that have taken place on similar watersheds and then a priori project the effects of the proposed change onto the watershed of interest. Modeling for the purpose of making an a priori evaluation of a proposed watershed change would be a simulation task.
Simulation is a category of modeling that has been widely used for decades. One of the most notable early uses of simulation that involved hydrology was with the Harvard water program (Maass et al., 1962; Hufschmidt and Fiering, 1966). Continuous streamflow models, such as the Stanford watershed model (Crawford and Linsley, 1964) and its numerous offspring, are widely used in the simulation mode. Simulation has also been used effectively in the area of flood frequency studies (e.g., Wallis, Matalas, and Slack, 1974).
Because of recent advances in the speed and capacity of computers, simulation has become a more practical tool for use in spatial and temporal hydrologic modeling. Effects of spatial changes to the watershed on the hydrologic processes throughout the watershed can be simulated. Similarly, temporal effects of watershed changes can be assessed via simulation. For example, gradual urbanization will change the frequency characteristics of peak discharge series, and simulation can be used to project the possible effects on, for example, the 100-year flood. It is also possible to develop confidence limits on the simulated flood estimates, which can be useful in making decisions.

While simulation is a powerful tool, it is important to keep in mind that the simulated data are not real. They are projected values obtained from a model, and their accuracy largely depends on the quality of the model, including both its formulation and calibration. An expertly developed simulation program cannot overcome the inaccuracy introduced by a poorly conceived model; however, with a rational model, simulation can significantly improve decision making.
11.1.1 DEFINITIONS
Before defining what is meant by hydrologic simulation, it is necessary to provide a few definitions. First, a system is defined herein as a set of processes or components that are interdependent. This could be a natural system such as a watershed, a geographic system such as a road network, or a structural system such as a high-rise building. Because an accident on one road can lead to traffic congestion on a nearby road, the roads are interdependent.

Second, a model is defined herein as a representation of a real system. The model could be either a physical model, such as those used in laboratories, or a mathematical model. Models can be developed from either theoretical laws or empirical analyses. The model includes components that reflect the processes that govern the functioning of the system and provides for interaction between components.

Third, an experiment is defined for our purposes here as the process of observing the system or the model. Where possible, it is usually preferable to observe the real system; however, the lack of control may make this option impossible or unrealistic. Thus, simulation enables experiments with a model to replace experiments on the real system when it is not possible to control the real system.

Given these three definitions, a preliminary definition can now be provided for simulation. Specifically, simulation is the process of conducting experiments on a model when we cannot experiment directly on the system. The uncertainty or randomness inherent in model elements is incorporated into the model, and the experiments are designed to account for this uncertainty. The term simulation run or cycle is defined as an execution of the model through all operations for a length of simulated time. Some additional terms that need to be defined are as follows:
1. A model parameter is a value that is held constant over a simulation run but can be changed from run to run.
2. A variable is a model element whose value can vary during a simulation run.
3. Input variables require values to be input prior to the simulation run.
4. Output variables reflect the end state of the system and can consist of single values or a vector of values.
5. Initial conditions are values of model variables and parameters that establish the initial state of the model at the beginning of a simulation run.
11.1.2 BENEFITS OF SIMULATION
Simulation is widely used in our everyday lives, such as flight simulators in the space and aircraft industries. Activities at the leading amusement parks simulate exciting space travel. Even video games use simulation to mimic life-threatening activities.
Simulation is widely used in engineering decision making. It is a popular modeling tool because it enables a representation of the system to be manipulated when manipulating the real system is impossible or too costly. Simulation allows the time or space framework of the problem to be changed to a more convenient framework. That is, the length of time or the spatial extent of the system can be expanded or compressed. Simulation enables the representation of the system to be changed in order to better understand the real system; of course, this requires the model to be a realistic representation of the system. Simulation enables the analyst to control any or all model parameters, variables, or initial conditions, which is not possible for conditions that have not occurred in the past.
While simulation is extremely useful, it is not without problems. First, it is quite possible to develop several different, but realistic, models of the same system. The different models could lead to different decisions. Second, the data that are used to calibrate the model may be limited, so extrapolations beyond the range of the measured data may be especially inaccurate. Sensitivity analyses are often used to assess how a decision based on simulation may change if other data had been used to calibrate the model.
11.1.3 MONTE CARLO SIMULATION
The interest in simulation methods started in the early 1940s for the purpose of developing inexpensive techniques for testing engineering systems by imitating their real-world behavior. These methods are commonly called Monte Carlo simulation techniques. The principle behind the methods is to develop an analytical model, which is usually computer based, that predicts the behavior of a system. The parameters of the model are calibrated using data measured from the system. The model can then be used to predict the response of the system for a variety of conditions. Next, the analytical model is modified by incorporating stochastic components into its structure. Each input parameter is assumed to follow a probability function, and the computed output depends on the values drawn from the respective probability distributions. As a result, an array of predictions of the behavior is obtained. Then statistical methods are used to evaluate the moments and distribution types for the system's behavior.
The analytical and computational steps of a Monte Carlo simulation follow:
1. Define the system using a model.
2. Calibrate the model.
3. Modify the model to allow for random variation and the generation of random numbers to quantify the values of random variables.
4. Run a statistical analysis of the resulting model output.
5. Perform a study of the simulation efficiency and convergence.
6. Use the model in decision making.
The definition of the system should include its boundaries, input parameters, output (or behavior) measures, architecture, and models that specify the relationships of input and output parameters. The accuracy of the results of simulation is highly dependent on an accurate definition of the system. All critical parameters and variables should be included in the model. If an important variable is omitted from the model, then the calibration accuracy will be less than what is potentially possible, which will compromise the accuracy of the results. The definition of the input parameters should include their statistical or probabilistic characteristics, that is, knowledge of their moments and distribution types. It is common to assume in Monte Carlo simulation that the architecture of the system is deterministic, that is, nonrandom; however, model uncertainty is easily incorporated into the analysis by including bias factors and measures of sampling variation of the random variables. Random values are then generated for each input parameter; the results of these generations are values for the input parameters. These values should then be substituted into the model to obtain an output measure. By repeating the procedure N times (for N simulation cycles), N response measures are obtained. Statistical methods can now be used to obtain, for example, the mean value, variance, or distribution type for each of the output variables. The accuracy of the resulting measures of the behavior is expected to increase with the number of simulation cycles. The convergence of the simulation methods can be investigated by studying their limiting behavior as N is increased.
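To make these steps concrete, the following minimal sketch carries out steps 3 through 5 for an assumed model y = x² with a normally distributed input; the model, input moments, cycle counts, and seed are all illustrative assumptions, not values from the text.

```python
import numpy as np

# Skeleton of Monte Carlo steps 3-5: the "system" is an assumed model
# y = g(x) = x**2 with a normally distributed input parameter.
rng = np.random.default_rng(seed=2)

def g(x):
    return x ** 2                # assumed deterministic model structure

mu_x, sigma_x = 3.0, 0.5         # assumed input parameter distribution

for n_cycles in (100, 1_000, 10_000):        # step 5: study convergence with N
    x = rng.normal(mu_x, sigma_x, n_cycles)  # step 3: generate random inputs
    y = g(x)                                 # run the model N times
    # step 4: statistical analysis of the output measures
    print(f"N = {n_cycles:6d}: mean = {y.mean():.3f}, std = {y.std(ddof=1):.3f}")
```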
Example 11.1
To illustrate a few of the steps of the simulation process, assume that theory suggests that the relationship between two variables, Y and X, is linear, Y = a + bX. The calibration data are collected as four pairs of values. The mean and standard deviation of the two variables follow: X̄ = 5.0, Sx = 2.582, Ȳ = 6.75, and Sy = 2.630. Least-squares fitting yields a = 2.5, b = 0.85, and a standard error of estimate Se = 1.7748. The goal is to be able to simulate random pairs of X and Y. Values of X can be generated by assuming that X is normally distributed with µx = X̄ and σx = Sx, using the linear model

X = X̄ + z Sx    (11.1a)

Y = a + bX + z Se    (11.1b)

Standard normal deviates z are used to generate the values of X with Equation 11.1a. The generated values of X are then inserted into Equation 11.1b, along with new deviates z, to generate values of Y. Consider the following example:
z       X      z       Y
−0.37   4.04   0.42    6.68
0.82    7.12   −0.60   7.48
0.12    5.31   1.03    8.84
0.58    6.50   −0.54   7.06
The sample statistics for the generated values of X and Y are X̄ = 5.74, Sx = 1.36, Ȳ = 7.515, and Sy = 0.94. These deviate considerably from the calibration data, but are within the bounds of sampling variation for a sample size of four.
The above analyses demonstrate the first four of the six steps outlined. The linear model was obtained from theory, the model was calibrated using a set of data, a data set was generated, and the moments of the generated data were computed and compared to those of the calibration data. To demonstrate the last two steps would require the generation of numerous data sets such that the average characteristics of all generated samples approached the expected values. The number of generated samples would be an indication of the size of the simulation experiment.
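A sketch of this example in Python, using the calibration statistics quoted above; the seed is arbitrary, so the generated values will differ from those in the table.

```python
import numpy as np

# Sketch of the generation scheme of Equations 11.1a and 11.1b.
# The calibration results (X-bar = 5.0, Sx = 2.582, a = 2.5, b = 0.85,
# Se = 1.7748) are taken from the text; the seed is arbitrary.
rng = np.random.default_rng(seed=1)

x_bar, s_x = 5.0, 2.582
a, b, s_e = 2.5, 0.85, 1.7748

n = 4                                # sample size of the calibration data
z1 = rng.standard_normal(n)          # deviates for generating X
z2 = rng.standard_normal(n)          # deviates for generating Y
x = x_bar + z1 * s_x                 # Equation 11.1a
y = a + b * x + z2 * s_e             # Equation 11.1b

print("generated X:", x.round(2), " mean:", x.mean().round(2))
print("generated Y:", y.round(2), " mean:", y.mean().round(2))
```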
11.1.4 ILLUSTRATION OF SIMULATION
The sampling distribution of the mean is analytically expressed by the following theorem:

If a random sample of size n is obtained from a population that has the mean µ and variance σ², then the sample mean X̄ is a value of a random variable whose distribution has the mean µ and the variance σ²/n.
If this theorem were not known from theory, it could be uncovered with simulation. The following procedure illustrates the process of simulating the sampling distribution of the mean:

1. From a known population with mean µ and variance σ², generate a random sample of size n.
2. Compute the mean X̄ and variance S² of the sample.
3. Repeat steps 1 and 2 a total of Ns times, which yields Ns values of X̄ and S².
4. Repeat steps 1 to 3 for different values of µ, σ², and n.
5. For each simulation run (i.e., steps 1 to 3), plot the Ns values of X̄, examine the shape of the distribution of the values, compute the central tendency and spread, and relate these to the values of µ, σ², and n.
The analysis of the data would show that the theorem stated above is valid. For this example, the model is quite simple, computing the means and variances of samples. The input parameters are µ, σ², and n. The number of samples generated, Ns, is the length of a simulation run. The number of executions of step 4 would be the number of simulation runs. The output variables are X̄ and S².
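A sketch of steps 1 through 3 for one choice of µ, σ, and n; the normal population and the parameter values are illustrative assumptions.

```python
import numpy as np

# Simulate the sampling distribution of the mean for one simulation run.
rng = np.random.default_rng(seed=7)

mu, sigma, n = 10.0, 4.0, 25
n_s = 5000                                   # Ns samples in the run

samples = rng.normal(mu, sigma, size=(n_s, n))   # step 1, repeated Ns times
means = samples.mean(axis=1)                     # step 2: Ns values of X-bar

# The theorem predicts E[X-bar] = mu and Var[X-bar] = sigma**2 / n.
print("mean of X-bar:", means.mean())            # approx. 10.0
print("variance of X-bar:", means.var(ddof=1))   # approx. 16/25 = 0.64
```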
Example 11.1 and the five steps described for identifying the sampling distribution of the mean illustrate the first four steps of the simulation process. In the fifth step, the efficiency and convergence of the process are studied. In Example 11.1, only one sample was generated. The third step of the above description indicates that Ns samples should be generated. How large does Ns need to be in order to identify the sampling distribution? The process of answering this question would provide a measure of the convergence of the process to a reliable answer. Assuming that the above five steps constitute a valid algorithm for identifying the sampling distribution of the mean, the number of simulations needed to develop the data would indicate that the algorithm has converged to a solution.
If the effect of the assumed population from which the n values in step 1 were sampled is of interest, the experiment could be repeated using different underlying populations for generating the sample values of X. The outputs of step 5 could be compared to assess whether the distribution is important. This would qualitatively evaluate the sensitivity of the result to the underlying distribution. A sensitivity analysis performed to assess the correctness of the theorem when the population is finite of size N rather than infinite would show that the variance is σ²(N − n)/[n(N − 1)] rather than σ²/n.
The above simulation of the distribution of the mean would require a random-number generator for step 1. Random numbers are real values that are usually developed by a deterministic algorithm, with the resulting numbers having a uniform distribution in the range (0, 1). A sequence of random numbers should also satisfy the condition of being uncorrelated, that is, the correlation between adjacent values equals zero. The importance of uniform random numbers is that they can be transformed into real values that follow any other probability distribution of interest. Therefore, they are the initial form of random variables for most engineering simulations.
In the early years of simulation, mechanical random-number generators were used, such as drawing numbered balls, throwing dice, or dealing out cards. Many lotteries are still operated this way. After several stages of development, computer-based, arithmetic random-number generators were developed that use some analytical generating algorithm. In these generators, a random number is obtained based on a previous value (or values) and fixed mathematical equations. Therefore, a seed is needed to start the process of generating a sequence of random numbers. The main advantages of arithmetic random-number generators over mechanical generators are speed, that they do not require memory for storage of numbers, and repeatability. The conditions of a uniform distribution and the absence of serial correlation should also be satisfied. Due to the nature of the arithmetic generation of random numbers, a given seed should result in the same stream of random values every time the algorithm is executed. This property of repeatability is important for debugging the simulation algorithm and for comparative studies of design alternatives for a system.
11.2 COMPUTER GENERATION OF RANDOM NUMBERS
A central element in simulation is a random-number generator. In practice, computer packages are commonly used to generate the random numbers used in simulation; however, it is important to understand that these random numbers are generated from a deterministic process and thus are more correctly called pseudo-random numbers. Because the random numbers are derived from a deterministic process, it is important to understand the limitations of these generators.
Random-number generators produce numbers with specific statistical characteristics. Obviously, if the generated numbers are truly random, an underlying population exists that can be represented by a known probability function. A single die is the most obvious example of a random-number generator. If a single die were rolled many times, a frequency histogram could be tabulated. If the die were fair, the sample histogram for the generated population would consist of six bars of approximately equal height. Rolling the die produces values of a random variable that has a discrete mass function. Other random-number generators would produce random numbers having different distributions, including continuously distributed random variables. When a computerized random-number generator is used, it is important to know the underlying population.
11.2.1 MIDSQUARE METHOD
The midsquare method is one of the simplest but least reliable methods of generating random numbers. However, it illustrates problems associated with deterministic procedures. The general procedure follows:

1. Select at random a four-digit number; this is referred to as the seed.
2. Square the number and write the square as an eight-digit number, using preceding (lead) zeros if necessary.
3. Use the four digits in the middle as the new random number.
4. Repeat steps 2 and 3 to generate as many numbers as necessary.
As an example, consider the seed number of 2189. Squaring this yields the eight-digit number 04791721, which gives the first random number of 7917. The following sequence of 7 four-digit numbers results from using 2189 as the seed:

7917, 6788, 0769, 5913, 9635, 8332, 4222

The midsquare method is deficient in that a number will recur after more than a few values are generated. While five-digit numbers could be used to produce ten-digit squares, the midsquare method has serious flaws that limit its usefulness. However, it is useful for introducing the concept of random-number generation.
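A minimal sketch of the procedure in Python, which reproduces the sequence above; the function name is ours, not the text's.

```python
def midsquare(seed: int, count: int) -> list[int]:
    """Generate `count` four-digit midsquare values from a four-digit seed."""
    values = []
    x = seed
    for _ in range(count):
        square = f"{x * x:08d}"   # eight digits, with leading zeros if needed
        x = int(square[2:6])      # middle four digits become the next value
        values.append(x)
    return values

# Reproduces the sequence in the text for seed 2189
# (0769 prints as 769 because leading zeros are dropped from integers).
print(midsquare(2189, 7))   # [7917, 6788, 769, 5913, 9635, 8332, 4222]
```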
11.2.2 ARITHMETIC GENERATORS
Many arithmetic random-number generators are available, including the midsquare method, linear congruential generators, mixed generators, and multiplicative generators. All of these generators are based on the same principle of starting with a seed and having fixed mathematical equations for obtaining the random value. The resulting values are used in the same equations to obtain additional values. By repeating this recursive process N times, N random numbers in the range (0, 1) are obtained. However, these methods differ according to the algorithms used as the recursive model. In all recursive models, the period of the generator is of concern. The period is defined as the number of generated random values before the stream of values starts to repeat itself. It is always desirable to have random-number generators with periods much larger than the number of simulation cycles needed in a simulation study of a system.
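As an illustration of the recursive principle, the following sketch implements a multiplicative linear congruential generator. The constants are the widely used "minimal standard" values (a = 16807, m = 2³¹ − 1), which are an assumption for illustration, not parameters given in this chapter.

```python
def lcg(seed: int, count: int, a: int = 16807, c: int = 0, m: int = 2**31 - 1):
    """Linear congruential generator: x_{i+1} = (a*x_i + c) mod m,
    returned as uniform values u_i = x_i / m in the range (0, 1)."""
    x = seed
    u = []
    for _ in range(count):
        x = (a * x + c) % m      # fixed recursive equation
        u.append(x / m)          # scale to the (0, 1) range
    return u

print(lcg(seed=12345, count=5))
```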
11.2.3 TESTING OF GENERATORS
Before using a random-number generator, the following two tests should be performed on the generator: a test for uniformity and a test of serial correlation. These tests can be performed either theoretically or empirically. A theoretical test is defined as an evaluation of the recursive model itself of a random-number generator. The theoretical tests include an assessment of the suitability of the parameters of the model without performing any generation of random numbers. An empirical test is a statistical evaluation of streams of random numbers resulting from a random-number generator. The empirical tests start by generating a stream of random numbers, that is, N random values in the range (0, 1). Then, statistical tests for distribution types, that is, goodness-of-fit tests such as the chi-square test, are used to assess the uniformity of the random values. Therefore, the objective in the uniformity test is to make sure that the resulting random numbers follow a uniform continuous probability distribution.

To test for serial correlation, the Spearman–Conley test (Conley and McCuen, 1997) could be used. The runs test for randomness is an alternative. Either test can be applied to a sequence of generated values to assess the serial correlation of the resulting random vector, where each value in the stream is considered to come from a different but identical uniform distribution.
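A minimal empirical uniformity check along these lines bins a generated stream into k cells and compares the observed counts with the expected count N/k using the chi-square statistic; the generator, cell count, and significance level below are illustrative choices.

```python
import numpy as np

# Empirical chi-square test of uniformity for a generated stream.
rng = np.random.default_rng(seed=42)

n, k = 10_000, 10
u = rng.random(n)                       # stream of N uniform (0, 1) values
observed, _ = np.histogram(u, bins=k, range=(0.0, 1.0))
expected = n / k
chi_square = ((observed - expected) ** 2 / expected).sum()

# Compare with the chi-square critical value for k - 1 = 9 degrees of
# freedom (16.92 at the 5% level); smaller values support uniformity.
print("chi-square statistic:", round(chi_square, 2))
```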
11.2.4 DISTRIBUTION TRANSFORMATION
In simulation exercises, it is necessary to generate random numbers from the population that underlies the physical processes being simulated. For example, if annual floods at a site follow a log-Pearson type III distribution, then random numbers having a uniform distribution would be inappropriate for generating random sequences of flood flows. The problem can be circumvented by transforming the generated uniform variates to log-Pearson type III variates.

Distribution transformation refers to the act of transforming variates x from distribution f(x) to variates y that have distribution f(y). Both x and y can be either discrete or continuous. Most commonly, an algorithm is used to generate uniform variates, which are continuously distributed, and then the uniform variates are transformed to a second distribution using the cumulative probability distribution for the desired distribution.
The task of distribution transformation is best demonstrated graphically. Assume that values of the random variate x with the cumulative distribution F(x) are generated and values of a second random variate y with the cumulative distribution F(y) are needed. Figure 11.1(a) shows the process for the case where both x and y are discrete random variables. After graphing the cumulative distributions F(x) and F(y), the value of x is entered on the x-axis and the value of its cumulative probability found. The cumulative value for y is assumed to equal the cumulative value of x. Therefore, the value of y is found by moving horizontally from F(x) to F(y) and then down to the y-axis, where the value of yi is obtained. Given a sample of n values of xi, a sample of n values of yi is generated by repeating this process.

FIGURE 11.1 Transformation curves: (a) X and Y are discrete random variables; (b) X continuous, Y discrete; (c, d) X is U(0, 1), Y is discrete; and (e, f) Y is continuous, U is U(0, 1).
The same transformation process can be used when x is a continuously distributed random variable; Figure 11.1(b) shows this case. Because many random-number generators generate uniformly distributed random numbers, F(x) is most often the cumulative uniform distribution. This is illustrated in Figure 11.1(c). Since the cumulative distribution for a uniform variate is a constant-sloped line, Figure 11.1(c) can be simplified to Figure 11.1(d). Both Figures 11.1(c) and 11.1(d) show y as a discretely distributed random variable. Figures 11.1(e) and 11.1(f) show the corresponding graphs when y is a continuously distributed random variable.
Example 11.2
Assume that the number of runoff-producing storms per year (x in column 1 of Table 11.1) has the mass function given in column 2 of Table 11.1. Using f(x) in column 2, the cumulative mass function F(x) is formed (see column 3 of Table 11.1). The rule for transforming the uniform variate ui to a value of the discrete random variable x follows:

xi = 0 if ui ≤ F(0)    (11.2a)

xi = i if F(i − 1) < ui ≤ F(i)    (11.2b)

Assume that ten simulated values of the annual number of storms are needed. Then ten uniform variates ui would be generated (see column 5). Using the transformation algorithm of Equations 11.2, the values of ui are used to obtain generated values of xi (see column 6). For example, u1 is 0.62. Entering column 3, u1 lies between F(2) of 0.5 and F(3) of 0.70; therefore, x1 equals 3. The value of u7 of 0.06 is the case where Equation 11.2a would be applied. Since u7 is less than F(0) of 0.1 (column 3), x7 is equal to 0. The ten simulated values of xi are given in column 6.

TABLE 11.1 Continuous to Discrete Transformation (columns include: (1) x; (2) f(x); (3) F(x); (5) uniform variate ui; (6) simulated xi).
The number of runoff-producing storms for the ten simulated years of record represents a sample and should reflect the distribution from which it was generated. Histograms of a sample and a population can, in general, be compared to assess the representativeness of the sample. For this example, a sample size of ten is too small to construct a meaningful histogram. For the purpose of comparison, the sample (X̄) and population (µ) means will be computed:

X̄ = (1/10) Σ xi    (11.3a)

µ = Σ x f(x) = 2.65    (11.3b)

The difference between the two values only reflects the sampling variation of the mean for the small sample size. As the number of simulated values is increased, the value of the sample mean would approach the population mean of 2.65. The sample and population variances could also be computed and compared.
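The following sketch applies Equations 11.2 in Python. Because Table 11.1 is not reproduced here, the mass function is an assumption chosen so that F(0) = 0.1, F(2) = 0.5, and F(3) = 0.7 agree with the values cited above.

```python
import numpy as np

# Discrete transformation of Equations 11.2 with an assumed f(x).
x_values = np.array([0, 1, 2, 3, 4, 5])
f_x = np.array([0.10, 0.20, 0.20, 0.20, 0.20, 0.10])   # assumed f(x)
F_x = np.cumsum(f_x)                                   # cumulative F(x)

rng = np.random.default_rng(seed=3)
u = rng.random(10)                     # ten uniform variates
# Equations 11.2: x_i is the smallest x with u_i <= F(x);
# searchsorted returns the index i with F(i-1) < u <= F(i).
x_sim = x_values[np.searchsorted(F_x, u)]

print("u:", u.round(2))
print("simulated x:", x_sim)
print("sample mean:", x_sim.mean(), " population mean:", (x_values * f_x).sum())
```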
Example 11.3
Example 11.2 illustrates the transformation of a continuous variate to a discrete variate. The process is similar when the transformed variate is continuously distributed. Consider the case where it is necessary to obtain a simulated sequence of continuously distributed variates that have the following density function f(x):

(11.4a)

(11.4b)

The cumulative distribution of x is

(11.5)

(11.6)

Equation 11.5 can be used to transform uniform variates ui to variates xi that have the density function of Equation 11.4. If the uniform variates have location and scale parameters of 0 and 1, respectively, then the values of ui can be set equal to F(x) of Equation 11.5 and the value of x obtained by inverting the cumulative function (see Table 11.2).

TABLE 11.2 Continuous to Continuous Transformation.
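The density of Equations 11.4 is not recoverable from this excerpt, so the following sketch demonstrates the same inverse-transformation idea with an assumed exponential density, for which F(x) = 1 − e^(−λx) can be inverted in closed form.

```python
import numpy as np

# Inverse-transformation sketch with an assumed exponential density
# f(x) = lam * exp(-lam * x); setting u = F(x) = 1 - exp(-lam * x)
# and solving for x gives x = -ln(1 - u) / lam.
rng = np.random.default_rng(seed=11)
lam = 2.0                       # assumed rate parameter
u = rng.random(5)               # uniform variates on (0, 1)
x = -np.log(1.0 - u) / lam      # exponential variates

print("u:", u.round(3))
print("x:", x.round(3))
```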
11.3 SIMULATION OF DISCRETE RANDOM VARIABLES
11.3.1 TYPES OF EXPERIMENTS
The following four types of experiments are introduced in this section as examples of discrete variable experiments: binomial, multinomial, Markov chain, and Poisson. A binomial experiment is one with n independent trials, with each trial limited to two possible outcomes and the probabilities of each outcome remaining constant from trial to trial. Examples of binomial experiments include the flip of a coin, even versus odd on the roll of a die, or failure versus nonfailure. In a multinomial experiment, the n trials are independent and more than two outcomes are possible for each trial. The roll of a die that has six possible outcomes is one example. In a Markov chain experiment, the value of the discrete random variable in one time period depends on the value experienced in the previous time period. Thus, it is a sequential, nonindependent experimental approach. In a Poisson experiment, the discrete random variable is the number of occurrences in an interval (spatial or temporal). For example, the number of hurricanes hitting the U.S. coast in a year could be represented by a Poisson distribution.
11.3.2 BINOMIAL DISTRIBUTION
A discrete random variable that has two possible outcomes can follow a binomial mass function. Four assumptions underlie a binomial process:

• Only two outcomes are possible for each trial.
• Each experiment consists of n trials.
• The n trials are independent.
• The probability of both outcomes remains constant from trial to trial.

The probability of exactly x occurrences in n trials is given by

b(x; n, p) = C(n, x) p^x (1 − p)^(n−x)    (11.7)
where C(n, x) is the binomial coefficient and equals n!/[x!(n − x)!]. In this section, the notation b(x; n, p) is used for the binomial distribution and indicates the probability computed with Equation 11.7.
Binomial experiments can be simulated using uniform variates. The procedure is as follows for N experiments, with each having n trials and an outcome A with an occurrence probability of p and an outcome B with an occurrence probability of 1 − p:
1. For each trial in the experiment, generate a uniform variate ui.
2. If ui < p, then assume outcome A has occurred; otherwise, assume outcome B has occurred.
3. Determine the number of outcomes A, x, that occurred in the n trials.
4. After completing the N experiments, with each having n trials, compute the relative frequency of x outcomes.
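A sketch of this four-step procedure, using the values of Example 11.4 below (n = 3, p = 0.3); the number of experiments and the seed are arbitrary.

```python
import numpy as np

# Simulate N binomial experiments of n trials each with P(A) = p.
rng = np.random.default_rng(seed=5)

N, n, p = 15, 3, 0.3
u = rng.random((N, n))            # step 1: uniform variates for every trial
x = (u < p).sum(axis=1)           # steps 2-3: count outcomes A per experiment

# Step 4: relative frequency of each possible x
for k in range(n + 1):
    print(f"p(x = {k}) = {(x == k).mean():.3f}")
```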
Example 11.4
Assume three trials per experiment are of interest, where the probability (p) of outcome A is 0.3. For example, if we have an unbalanced coin with a probability of 0.3 of getting a head, then we would be interested in the probabilities of getting 0, 1, 2, or 3 heads in three flips of the coin. From the binomial mass function of Equation 11.7, we can compute the following population probabilities:

b(0; 3, 0.3) = C(3, 0) (0.3)^0 (0.7)^3 = 0.343    (11.9a)

b(1; 3, 0.3) = C(3, 1) (0.3)^1 (0.7)^2 = 0.441    (11.9b)

b(2; 3, 0.3) = C(3, 2) (0.3)^2 (0.7)^1 = 0.189    (11.9c)

b(3; 3, 0.3) = C(3, 3) (0.3)^3 (0.7)^0 = 0.027    (11.9d)

The sum of these probabilities is 1.
To illustrate the simulation of binomial probabilities, let N = 15. For each of the 15 experiments, three uniform variates are generated, and the number of ui values less than 0.3 is counted and used as the sample estimate of x. One possible simulation of 15 experiments is shown in Table 11.3.

TABLE 11.3 Generation of a Binomial Distribution (for each experiment: the three ui variates and the resulting x).
From the 15 experimental estimates of x, seven 0s, six 1s, one 2, and one 3 were generated. This yields sample estimates of the probabilities of p(x = 0) = 7/15 = 0.467, p(x = 1) = 6/15 = 0.400, p(x = 2) = 1/15 = 0.067, and p(x = 3) = 1/15 = 0.066. These sample values can be compared with the population values of Equations 11.9. The differences are obvious. Specifically, the sample probability of x = 0 is greater than that for x = 1, while the population probability of x = 0 is less than that for x = 1. As N becomes larger and larger, the sample estimates of the probabilities would more closely approximate the population probabilities of Equations 11.9.
Example 11.5
Consider the case of a project that has a design life of 5 years, with the probability of failure in any one year equal to 0.1. A binomial process such as this can be simulated using n = 5 and p = 0.1, with values of X = 0, 1, 2, 3, 4, and 5. To estimate the probability of no failures, the case of X = 0 would be of interest. Ten experiments were made, and the sample estimate of the probability of a project not failing was computed as shown in Table 11.4(a). The sample (p̂) and population (p) probabilities are shown in Table 11.4(b). For a sample of 10, the sample estimates are in surprisingly good agreement with the population values. The sample estimate of the probability of a project not failing in 5 years is 0.60.

TABLE 11.4 Generation of Design Life: (a) simulation results for a binomial distribution; (b) sample and population probabilities.
surpris-11.3.3 M ULTINOMIAL E XPERIMENTATION
Multinomial experimentation is a generalization of binomial experimentation in which each trial can have more than two outcomes. A multinomial experiment has n independent trials with k mutually exclusive outcomes. Outcome i has a probability pi of occurring on any one trial, and in an experiment xi is used to denote the number of occurrences of outcome i. The multinomial mass function is

f(x1, x2, …, xk) = [n!/(x1! x2! … xk!)] p1^x1 p2^x2 … pk^xk    (11.10a)

subject to the constraints that x1 + x2 + … + xk = n and p1 + p2 + … + pk = 1.

… the probability that no damage and minor damage will occur in the fifth flood is:

(11.11)
Similarly, the probability that major damage would not result during five floods is

(11.12)
11.3.4 GENERATION OF MULTINOMIAL VARIATES
Given a sequence of uniformly distributed random numbers ui, values of the multinomial outcome xi can be simulated by the following algorithm:

if ui ≤ F(1), then xi = outcome 1    (11.13a)

if F(j − 1) < ui ≤ F(j), then xi = outcome j    (11.13b)

in which F(·) is the cumulative mass function based on the outcome probabilities pi.
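A sketch of Equations 11.13 in Python, using the wetland states and probabilities of Example 11.7 below; the seed is arbitrary.

```python
import numpy as np

# Generate one year (12 months) of multinomial variates using the
# cumulative mass function, per Equations 11.13.
rng = np.random.default_rng(seed=9)

states = np.array(["F", "N", "S", "D"])   # flooded, normal, saturated, dry
p = np.array([0.10, 0.70, 0.15, 0.05])
F = np.cumsum(p)                          # F(j) = {0.1, 0.8, 0.95, 1.0}

u = rng.random(12)                        # one uniform variate per month
months = states[np.searchsorted(F, u)]    # smallest j with u <= F(j)

print("simulated year:", "".join(months))
```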
Example 11.7
Assume that during any month a wetland can be in any one of the following four states: flooded (F), the permanent pool at normal level (N), no pool but saturated ground (S), or dry (D). Assume that the probabilities of these conditions are 0.1, 0.7, 0.15, and 0.05, respectively. The cumulative mass function is, therefore, F(j) = {0.1, 0.8, 0.95, 1.0}. In generating 1 year of record, the pool is at normal level if the uniform variate is between 0.1 and 0.8, while it is dry if ui is greater than 0.95. In any month the probability of the dry condition is only 5%. The true probability of 1F, 8N, and 3S in a 12-month period is computed with Equation 11.10a:

f(1, 8, 3, 0) = [12!/(1! 8! 3! 0!)] (0.1)^1 (0.7)^8 (0.15)^3 (0.05)^0 = 0.0385

While this might seem like a small probability, it is quite reasonable given that the number of combinations of F, N, S, and D in a 12-month period is large. Note that in this case, the probability does not specify the order of the (1F, 8N, 3S, 0D) occurrences.

The generation of the probabilities of F, N, S, and D may seem useless since
they can be computed from Equation 11.10, but generated sequences may be necessary to examine probabilities such as that of four dry (D) months in succession or the average length of time between dry periods. These other probabilities cannot be computed with Equation 11.10.
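For instance, the probability of four successive dry months can be estimated only by generating sequences. A minimal sketch, assuming independent months with P(D) = 0.05 as in Example 11.7; the cycle count and seed are arbitrary:

```python
import numpy as np

# Estimate by simulation a probability that Equation 11.10 cannot give:
# four dry (D) months in succession within a 12-month year.
rng = np.random.default_rng(seed=13)

p_dry = 0.05
n_years, n_months = 100_000, 12

dry = rng.random((n_years, n_months)) < p_dry          # True where a month is dry
runs4 = dry[:, :-3] & dry[:, 1:-2] & dry[:, 2:-1] & dry[:, 3:]
p_four_dry = runs4.any(axis=1).mean()                  # fraction of years with a 4-month dry run

print("P(four successive dry months in a year) ≈", p_four_dry)
```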
11.3.5 POISSON DISTRIBUTION

The cumulative-transformation approach can also be used to generate variates that have a Poisson distribution for any λ. The procedure for any given λ is as follows:

1. Calculate the cumulative mass function FX(xi) for xi = 0, 1, 2, ….
2. Generate a uniform variate ui in the range (0, 1).
3. If FX(xi − 1) ≤ ui < FX(xi), then the Poisson variate is xi.
Example 11.8
The number of floods per year (x) on a river in northern Idaho was obtained from streamflow records over a 38-year period. The annual number of floods is given in Table 11.5 for each year. The number of years Nx with x floods per year is given in column 2 of Table 11.6. The sample probability of x is given in column 3. A total of 120 floods occurred during the 38 years, which yields a mean annual flood count of 3.158; thus, the sample estimate of the Poisson parameter λ is 3.158. The assumed population mass function is

f(x) = e^(−3.158) (3.158)^x / x!,  x = 0, 1, 2, …
Assume that it is necessary to simulate the number of floods in a 10-year period. The assumed cumulative function is given in column 5 of Table 11.6. For a given vector of uniform variates ui (column 2 of Table 11.7), the generated number of floods per year is shown in column 3 of Table 11.7. A total of 26 floods occurred in the 10-year period. Based on this sample of ten simulated values, the mean number of floods is 26/10 = 2.6. As the number of years for which floods are generated increases, the sample mean would be expected to move closer to the assumed population mean of 3.158.

TABLE 11.7 Synthetic Series of Flood Frequency (columns: year, ui, xi, FX(xi − 1), FX(xi)).
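A sketch of the three-step Poisson procedure for this example (λ = 3.158); the truncation point of the table and the seed are arbitrary choices.

```python
import math
import numpy as np

# Generate Poisson variates by inverting the cumulative mass function,
# per the three-step procedure above.
rng = np.random.default_rng(seed=17)

lam = 3.158
x_max = 15                                  # truncation point for the table
f = np.array([math.exp(-lam) * lam**x / math.factorial(x) for x in range(x_max)])
F = np.cumsum(f)                            # step 1: cumulative mass function

u = rng.random(10)                          # step 2: one variate per year
# step 3: pick x_i such that F(x_i - 1) <= u_i < F(x_i)
x = np.searchsorted(F, u, side="right")

print("u:", u.round(3))
print("floods per year:", x, " mean:", x.mean())
```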
11.3.6 MARKOV PROCESS SIMULATION
Markov process modeling differs from the previous analyses in that independence of sequential events is not assumed. In a Markov process, the value of the random variable xt depends only on its value xt−1 for the previous time period t − 1, but not on the values of x that occurred for time periods before t − 1 (i.e., t − 2, t − 3, etc.). Assume that the random variable can take on any one of a set of states A, where A consists of n states ai (i = 1, 2, …, n). Then by definition of a Markov process, the probability that x exists in state j if it was in state i in the previous time period is denoted by p(xt = aj | xt−1 = ai). For example, assume that (1) a watershed can be classified as being wet (W) or dry (D), which would depend on the value of one or more criteria; and (2) the probabilities of being wet or dry are stationary. Then the conditional probabilities of interest are as follows:

P(xt = W | xt−1 = W)    (11.17a)

P(xt = W | xt−1 = D)    (11.17b)

P(xt = D | xt−1 = W)    (11.17c)

P(xt = D | xt−1 = D)    (11.17d)

Alternatively, if we assumed that three states (wet, moderate, and dry) were possible, then nine conditional probabilities would be of interest. The probabilities of Equations 11.17 are called one-step transition probabilities because they define the probabilities of making the transition from one state to another state in sequential time periods. The cumbersome notation of Equations 11.17 can be simplified to pij, where the first subscript refers to the previous time period t − 1 and the second subscript refers to the present time period t. For example, p26 would indicate the probability of transitioning from state 2 in time period t − 1 to state 6 in time period t. In a homogeneous Markov process, the transition probabilities are independent of time.

As the number of states increases, the number of probabilities increases proportionally. For this reason, the transition probabilities are often presented in a square matrix:

    P = | p11  p12  …  p1n |
        | p21  p22  …  p2n |
        |  ⋮    ⋮        ⋮ |
        | pn1  pn2  …  pnn |    (11.18)
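A minimal sketch of simulating the two-state wet/dry Markov process of Equations 11.17 and 11.18; the transition probabilities, starting state, and seed are illustrative assumptions, not values from the text.

```python
import numpy as np

# Simulate a two-state (W, D) Markov chain from a transition matrix.
rng = np.random.default_rng(seed=21)

states = ["W", "D"]
P = np.array([[0.7, 0.3],     # row 1: p(W->W), p(W->D)  (assumed values)
              [0.4, 0.6]])    # row 2: p(D->W), p(D->D)

current = 0                   # start in state W (an assumption)
sequence = []
for _ in range(24):           # simulate 24 time periods
    u = rng.random()
    # move to the state j where the cumulative row probability first exceeds u
    current = int(np.searchsorted(np.cumsum(P[current]), u, side="right"))
    sequence.append(states[current])

print("".join(sequence))
```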