Modeling of Combustion Systems: A Practical Approach (2006)
…some understanding of statistics.

This chapter begins with some elementary statistics and distributions, and progresses to its seminal tool for separating information from noise — the analysis of variance. Next, the chapter covers factorial designs — the foundation of statistically cognizant experiments. We find that simple rules produce fractional factorials and reduce the number of required experiments. We show that by modifying the alias structure, we can clear factors of certain biases. We then discuss the importance of replication in obtaining an independent estimate of statistical error, and we show how blocking can reduce it further. Further discussion shows how orthogonality eliminates mutual factor biases. The chapter moves on to consider how to mute certain adulterating effects, including hysteresis and lurking factors, and how to validate analytical integrity with residuals plots. For looking at many factors with few experiments, we introduce screening designs such as simplex and highly fractionated designs. The reader then learns how random and fixed effects differ, and how they affect the analysis. To show how one may assess curvature in factor space, a discussion of second-order designs follows. The chapter concludes by considering the sequential assembly of the various experimental designs.

3.1 Some Statistics

A statistic is a descriptive measure that summarizes an important property of a collection of data. For example, consider the group of numbers in braces: {1, 10, 100}. Though there are only three data values, we could define an unlimited number of statistics related to them. Here are a few:

• The maximum, 100, is a statistic because it summarizes a property of the data, namely, that all data are equal to or below a certain value, 100.

• The minimum, 1, is also a statistic, defining the magnitude that all data meet or exceed.

• The half range, 49.5, that is, (100 – 1)/2, is a statistic. It is a measure of the dispersion of the data.

• The count, 3, tells us the number of data points. If the data were repeated measures of the same quantity differing only by measurement error, the count would relate to a measure of certainty. Intuitively, we would expect that the more replicates we measure, the more certain we become of the true value.

• The median, 10, is the middle value of an ordered data set. It measures central tendency. Presuming the data comprise replicate observations, one intuitively expects the true value to be closer to the middle than the extremes of the observations.

There are ever more statistics, but let us pause here to answer some interesting questions:

• Can we really describe three observations with five or more statistics? Yes.

• How can we have five statistics for only three observations? Not all of the statistics are independent. In fact, no more than three can be independent if we are deriving our statistics from these particular data — the number of independent statistics cannot exceed the number of data points. The reason for so many statistics is that we have so many questions that we want to ask of our data; for example:
  – What limit represents a safe upper bound for a NOx prediction?
  – How likely are we to exceed this upper limit?
  – What confidence do we have that our predicted value represents the true value?
  – How precisely does our model fit the data?
  – Is a particular point within or beyond the pale of the data?
  – How large a margin should we state if we want 99.9% of all future data to be below it?

For every important question, it seems someone or several have invented one or more statistics. In this chapter, we shall describe important statistics that relate to modeling in general and combustion modeling in particular.

3.1.1 Statistics and Distributions

Suppose we wish to measure a response (y) contaminated by a randomly distributed error term (e). We would like to separate the information (μ) from the noise (e). One option would be to repeatedly measure the response at the same condition and average the results. In summation notation we have $\sum y = \sum \mu + \sum e$, or $\sum y = n\mu + \sum e$, where n is the number of replicate measurements. We may divide by the total number of measurements to give the average. Here we will designate it with an overbar. That is,

$$\bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k \qquad (3.1)$$

Now if e (the error vector) were truly random, we would expect the long-run average to be zero. This occurs when n → ∞. We refer to long-run results as expected values and designate them with the expectation operator, E( ). We will refer to the true value for y as μ. Therefore, E(y) = μ and $\bar{y}$ is an unbiased estimator for μ. Intuitively, we would expect all the y values to distribute about the true value, differing only by e. Since our best estimate of μ from the data is $\bar{y}$, then $\bar{y}$ is a measure of central tendency — the inclination of the average of repeated measures to converge to the true value.

The mean is an important statistic — one might dare say the most important statistic — but it is insufficient to characterize certain aspects of some populations. For example, suppose the average height of 100 adult male humans is 1.80 m (5.9 ft). How many will be 0.01 m (<1 in.) tall? How many will be 3.59 m (11.8 ft) tall? We know from experience that there are no adult male humans at either of these extremes. Yet the average of these two numbers is 1.80 m. Therefore, as important as central tendency statistics are, we are also interested in other measures. That is, we would also like some measure of dispersion. Dispersion indicates how values differ from the mean. One statistic that quantifies dispersion is the variance.

Let us define the variance (V) of a sample (y) as follows:

$$V(y) = \sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2 \qquad (3.2)$$

Then the following equation gives the mean variance:

$$\frac{V(y)}{n} = \frac{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2}{n} \qquad (3.3)$$

The long-run average of the mean variance is

$$E\!\left[\frac{V(y)}{n}\right] = \sigma^2 \qquad (3.4)$$

However, if we are using the sample mean derived from a finite data set to estimate the variance, Equation 3.3 tends to underestimate σ² unless n is large. The reason for the underestimation is that we have already used the data to determine $\bar{y}$. Therefore, $\bar{y}$ plus n – 1 data points exactly determine the nth data value; i.e., Equation 3.3 is not a completely independent measure of dispersion. So the proper denominator to estimate σ² in Equation 3.3 is n – 1, not n. In other words, we use up (lose) one degree of freedom when we use a finite data set to estimate $\bar{y}$. Thus, n – 1 are the degrees of freedom for the estimated variance. We shall use the symbol s² to denote this quantity:

$$s^2 = \frac{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2}{n-1} \qquad \text{(3.5a)}$$

This is also called the sample-adjusted variance. Obviously, Equation 3.5a and Equation 3.3 become identical as n → ∞.

One problem with Equation 3.5a is that the units differ from y because the variance uses the squared value of the response. For this reason, we define the sample standard deviation as

$$s = \sqrt{\frac{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2}{n-1}} \qquad (3.6)$$
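As a concrete illustration (our sketch, not the book's; the replicate values are hypothetical), the following Python computes the mean of Equation 3.1, the sample-adjusted variance of Equation 3.5a, and the sample standard deviation of Equation 3.6:

```python
import math

def sample_stats(y):
    """Mean (Eq. 3.1), sample-adjusted variance (Eq. 3.5a),
    and sample standard deviation (Eq. 3.6) of replicate data."""
    n = len(y)
    y_bar = sum(y) / n                                   # Eq. 3.1
    s2 = sum((yk - y_bar) ** 2 for yk in y) / (n - 1)    # Eq. 3.5a, n - 1 dof
    return y_bar, s2, math.sqrt(s2)

# Hypothetical replicate measurements of a single response
mean, var, std = sample_stats([9.8, 10.1, 10.0, 9.9, 10.3])
print(mean, var, std)
```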

3.1.2 The Normal, Chi-Squared (χ²), F, and t Distributions

To develop the idea of distributions further, let us consider Figure 3.1.

Galton's board looks a bit like a pinball machine. It comprises a vertical slate with pegs arranged in a triangle pattern that widens from top to bottom. A ball dropped onto the topmost peg may fall either to the left or to the right, whereupon it strikes the next peg and again may fall to either the left or the right. The ball continues in this fashion until it ultimately drops into one of the bins below. What is the probability that a dropped ball will fill any particular bin? To answer the question, we begin by calculating the distribution of the possibilities.

3.1.2.1 The Normal Distribution

Most of us are familiar with the normal distribution — the so-called bell-shaped curve — perhaps it is more nearly cymbal shaped. At any rate, Equation 3.7 gives the mathematical representation:

$$P(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}} \qquad (3.7)$$

Equation 3.8 gives the corresponding cumulative probability, which is bounded by 0 < P(y) < 1:

$$P(y < a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}\, dy \qquad (3.8)$$

FIGURE 3.1
The Galton board. The Galton board comprises a vertical arrangement of pegs such that a ball may take one of two possible paths at each peg, finally arriving at a bin below. The numbers between the pegs show the number of paths leading through each space. The numbers follow Pascal's triangle (superimposed numbers). The total number of paths for this Galton board sums to 256. Thus, for the ball shown arriving at the bin, the probability is 56/256 = 21.9%. One of 56 possible paths leading to that bin is shown (dotted line). The distribution approaches the normal probability distribution as the number of rows in the Galton board increases.

The statistics μ and σ² completely characterize the normal distribution. One may standardize the normal distribution using the coding z = (y – μ)/σ, which reduces it to

$$N(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad (3.9)$$

Figure 3.2 depicts the above equation.

The normal distribution has the following properties:

• It is symmetrical about its center at z = 0.
• It has an inflection point where the curve changes from concave down to concave up (at z = ±1).
• The area under the curve sums to unity.

3.1.2.2 Probability Distribution for Galton’s Board

Galton’s board is a very good simulator of random error even though tonian physics dictate the ball’s motion Yet, we have no way of predictingwhat bin the ball will fall into on any given trial because very small variationsaffect the path of the ball Such variations include:

New-• The elasticity and roundness of the ball’s surface

• The elasticity, angle, and roundness of each peg

• The mutual interactions among balls

At each peg, the distribution is a binary one: the ball will fall either to theleft or to the right In no way can we consider this a normal distribution It

is an equiprobable binary distribution Notwithstanding, statistical erations allow us to do the following:

consid-• Calculate the ultimate distribution of the balls into the slots

• Calculate the probability of any given ball falling into a particular slot

• Show that the ultimate distribution is a normal probability distribution

2

1 / 2 πe

Trang 7

To derive the probability distribution for Galton’s board, we proceed asfollows First, we count the total number of paths through each space Atthe first peg, we have one path to the left and one path to the right So thepossible paths from left to right are distributed as {1, 1} At the second row

of pegs, we may take one path to the left and fall outside the far left peg.But if the ball jumps left and then right, it will fall between the two pegs onthe second row Likewise, if the ball falls to the right of the first peg and tothe left of the second peg, it will also fall between the two pegs of the secondrow; therefore, there are two paths leading between the two pegs of thesecond row Finally, if the ball takes a right jump at the first peg and then aright jump at the second peg, it will fall to the right of the right peg There-fore, the number of paths from left to right at this level is {1, 2, 1} Now thetotal number of paths between any two pegs will be the sum of the pathscommunicating with it For Galton’s board there are two such paths over-head and to the left and right Thus, the distribution of paths for the nextrow of pegs is {1, 3, 3, 1} We may continue in this fashion all the way downthe board

3.1.2.3 Pascal’s Triangle

We know the pattern {1}, {1 1}, {1 2 1}, {1 3 3 1} … as Pascal's triangle (Figure 3.3). Pascal's triangle is a numerical triangle having outside edges of 1.

FIGURE 3.3
Pascal's triangle. Each term is calculated by adding the two terms above it. In some versions, the second row of ones (1 1) is omitted, but we include it here for consistency. Horizontal rows (f) are numbered starting with zero at top and incrementing by 1. Entries (k) in each row are numbered starting from 0 at left and incrementing by 1. Thus, the coordinates (f, k) = (4, 2) correspond to the value 6. f also indicates the number of factors in a factorial design discussed presently. The sum of any horizontal row equals the number of terms in the saturated factorial (2^f). k indicates the overall order of the term. One may calculate the value of entry k in row f directly using the formula f!/[k!(f – k)!], e.g., 4!/[2!(4 – 2)!] = 6.


The sum of the two numbers immediately above forms each lower entry. The sum of the numbers in a horizontal row is always

$$2^f \qquad (3.10)$$

where f is the number of the row starting from the top down; all counting for rows and entries begins with zero, i.e., 0, 1, 2, …. Equation 3.11 gives the kth entry in row f directly:

$$m = \frac{f!}{k!\,(f-k)!} \qquad (3.11)$$

where m is the number contained in the kth entry of the fth row.

For reference, we have superimposed Pascal's triangle onto Galton's board in Figure 3.1. Each number represents the number of possible paths traversing through the interstice. As shown, the board has eight rows of pegs. At the bottom, we have nine slots and the distribution of paths is {1, 8, 28, 56, 70, 56, 28, 8, 1}. The total number of paths is 1 + 8 + 28 + 56 + 70 + 56 + 28 + 8 + 1 = 256 = 2⁸. So, the probabilities for a ball falling into any given slot from left to right are 1/256, 8/256, 28/256, 56/256, 70/256, 56/256, 28/256, 8/256, and 1/256, whose fractions sum to 1. This is a binomial frequency distribution. We may find this directly by the ratio of Equations 3.10 and 3.11:

$$B(f,k) = \frac{f!}{k!\,(f-k)!\;2^f} \qquad (3.12)$$

where B(f, k) is the probability of the ball finding its way to the kth interstice (counting from zero) under the fth peg. For reference, we have superimposed a bar graph in Figure 3.1 for each bin. The bar is proportional to the probability of a ball finding its way to that particular bin. The distribution approaches the normal probability distribution as f → ∞. But even after several rows, the resemblance to Equation 3.7 is unmistakable. In fact,

$$B(f,k) \approx \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(k-\mu)^2}{2\sigma^2}} \qquad (3.13)$$
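As a numeric check of Equation 3.12 (our sketch, not from the book), the following computes the bin probabilities for the eight-row board and confirms that they sum to 1:

```python
from math import comb

def galton_probs(f):
    """Bin probabilities B(f, k) = f! / [k!(f - k)! 2^f] (Eq. 3.12)."""
    return [comb(f, k) / 2 ** f for k in range(f + 1)]

probs = galton_probs(8)   # eight rows of pegs, nine bins
print(probs[3])           # 56/256 = 0.21875, the bin shown in Figure 3.1
print(sum(probs))         # 1.0
```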


Letting μ = f/2 and σ = 1, Equation 3.13 reduces to Equation 3.9:

$$N(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad (3.9)$$

We call Equation 3.9 the probability density function. Then the cumulative probability function for –a < z < a is

$$P[N(z)] = \int_{-a}^{a} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz \qquad (3.15)$$

We call the variable z = (y – μ)/σ (Equation 3.14) the standard unit variate, and when the limits of the integration are from –∞ to +∞, the integral attains unity.

Equation 3.15 implies a two-tail test because we are asking for the probability of z being between –a and a. If we were only interested in P[N(z)] being greater than –a or less than a, we would only be interested in one tail of the distribution. Since the distribution is symmetrical, the probability of the one-tailed test is exactly half that of the two-tailed test.

Most computer spreadsheets have functions to calculate this. Excel™ has several related functions. The function normdist(x,m,s,TRUE) evaluates Equation 3.8, where x is a particular value, m the mean, and s the standard deviation. The function normdist(x,m,s,FALSE) evaluates the probability density of Equation 3.7 instead. The function normsdist(z) — note the s in the middle of this function name — evaluates the cumulative probability of the standard normal distribution, so Equation 3.15 follows as normsdist(a) – normsdist(–a).
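Equivalent calculations are available outside of spreadsheets. A minimal sketch using scipy (our mapping of the Excel functions above, not the book's):

```python
from scipy.stats import norm

# Analog of normdist(x, m, s, TRUE): the cumulative probability (Eq. 3.8)
print(norm.cdf(1.8, loc=1.8, scale=0.1))    # 0.5

# Analog of normdist(x, m, s, FALSE): the probability density (Eq. 3.7)
print(norm.pdf(0.0, loc=0.0, scale=1.0))    # 0.3989...

# Two-tailed probability of Eq. 3.15 for a = 1.96
a = 1.96
print(norm.cdf(a) - norm.cdf(-a))           # ~0.95
```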

Many statistical tests are strictly valid only for normally distributed errors. However, according to the central limit theorem, even if the parent distribution is not normally distributed, the accumulation of several levels of randomly distributed deviations will tend toward a normal distribution. This was the case with Galton's board. The parent distribution was binary (highly nonnormal), yet the final distribution approached the normal distribution quite closely. So we expect estimates of the mean to distribute around the arithmetic mean, approaching the normal distribution. In fact, according to Equation 3.13, μ and σ completely determine the shape of the normal probability curve. Thus, from these two statistics alone, we can derive any other property or statistic for normally distributed data. If data do not distribute normally, often one can transform them to such. For example, emissions data such as CO and NOx are always nonnegative, and therefore they do not distribute normally. However, the logarithms of these quantities are normally distributed (e.g., ln(NOx)). As we shall see, this will permit us to estimate:

• How likely we are to exceed some upper limit
• What limit represents a safe upper bound for a prediction
• What confidence we have that our predicted value represents the true value

3.1.2.4 The Chi-Squared Distribution

Another important distribution is the distribution of variance. The variation will never be negative because variance is a squared quantity. Thus, variance cannot distribute normally. In fact, it distributes as a chi-squared distribution. Knowing something about this distribution will allow us to develop a list of additional statistics related to the model, such as:

• The goodness of fit

• The confidence that a particular factor belongs in the model

• The probability that we can accurately predict future values

The chi-squared distribution has the following form:

$$P\!\left(\chi^2\right) = \frac{\left(\chi^2\right)^{\frac{n-2}{2}}\, e^{-\chi^2/2}}{2^{n/2}\,\Gamma\!\left(\frac{n}{2}\right)}, \qquad \chi^2 = \sum_{k=1}^{n} z_k^2 \qquad (3.16)$$

where n is the degrees of freedom (an integer); z is the standard variate, defined in Equation 3.14; and Γ is the gamma function, defined as

$$\Gamma(g) = \int_0^\infty t^{\,g-1}\, e^{-t}\, dt$$

Excel has several related functions. The function gammaln(z) will return the natural log of the gamma function for positive arguments of z. To obtain the gamma function itself, one uses exp(gammaln(z)). Equation 3.17 gives the cumulative probability function for the chi-squared distribution:

$$P\!\left(\chi^2 > a\right) = \int_a^\infty \frac{\left(\chi^2\right)^{\frac{n-2}{2}}\, e^{-\chi^2/2}}{2^{n/2}\,\Gamma\!\left(\frac{n}{2}\right)}\, d\chi^2 \qquad (3.17)$$

One may use the Excel function CHIDIST(z,n) to calculate this.
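The chi-squared tail probability and the gamma function also have common library equivalents; a sketch (ours, not the book's):

```python
from math import exp, gamma, lgamma
from scipy.stats import chi2

# The gamma function directly, and via exp(gammaln(z)) as in Excel
print(gamma(4.5), exp(lgamma(4.5)))   # identical values

# Upper-tail chi-squared probability (Eq. 3.17), analog of CHIDIST(z, n)
z, n = 11.07, 5
print(chi2.sf(z, n))                  # ~0.05
```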

The gamma function has some interesting properties. One way to think of it is as a generalization of the discrete factorial function. That is,

$$\Gamma(n) = (n-1)!$$

$$\Gamma(n+1) = n\,\Gamma(n)$$

$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$

The F statistic is a ratio of variances from independent populations, and the result must always be nonnegative. Equation 3.23 gives the F distribution:

$$P(F) = \frac{\Gamma\!\left(\frac{m+n}{2}\right)}{\Gamma\!\left(\frac{m}{2}\right)\Gamma\!\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{m/2} \frac{F^{\frac{m-2}{2}}}{\left(1 + \frac{m}{n}F\right)^{\frac{m+n}{2}}} \qquad (3.23)$$

where m and n are the degrees of freedom of the numerator and denominator variances, respectively.

The distribution of $\sqrt{F(1,n)}$ is a special distribution called the t distribution.* It accounts for deviation from the normal distribution owing to less than an infinite number of independent trials:

$$P(t) = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\!\left(\frac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}} \qquad (3.25)$$

The t distribution approaches the normal distribution as n → ∞. In general, the t distribution has a flatter peak and broader tails than the normal distribution. The t distribution adjusts the normal probability function for the uncertainty of having less than an infinite number of samples. For n > 20, the t and normal distributions are practically identical. The associated cumulative probability function is

$$P(-a < t < a) = \int_{-a}^{a} P(t)\, dt \qquad (3.26)$$

* W.S. Gosset, writing under the pen name "Student" while working for the Guinness Brewing Company, derived the t distribution that bears his pen name — Student's t distribution. Details are available in any statistics text. See, for example, Mendenhall, W., Scheaffer, R.L., and Wackerly, D.D., Mathematical Statistics with Applications, 3rd ed., PWS Publishers, Boston, 1986, p. 273.

The Excel function TDIST(z,n,1) gives the single-tailed probability; TDIST(z,n,2) gives the two-tailed test of Equation 3.26.
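A library equivalent of the two TDIST calls (our sketch, not the book's):

```python
from scipy.stats import t

z, n = 2.571, 5
print(t.sf(z, n))        # one-tailed, analog of TDIST(z, n, 1): ~0.025
print(2 * t.sf(z, n))    # two-tailed, analog of TDIST(z, n, 2): ~0.05
```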

The F distribution allows us to estimate probabilities for ratios of variances. We use it in an important technique known as the analysis of variance (ANOVA). ANOVA is one of the most important concepts in statistical experimental design (SED). It is based on an amazing identity:

$$SST = SSM + SSR \qquad (3.27)$$

where SST stands for sum of squares, total; SSM is the sum of squares, model; and SSR is the sum of squares, residual term. (The reader should note that although the statistical literature uses these abbreviations often, some texts use slightly different acronyms.)* The residual error includes whatever variation the model does not explain. SSM and SSR are the estimators for the model and the residual variance, respectively.

The above identity seems to break the rules of algebra, by ignoring the cross product. But in fact, the cross product vanishes for least squares solutions. This is easy to show:

$$SST = (y - \bar{y})^T(y - \bar{y}) = (y - \hat{y})^T(y - \hat{y}) + (\hat{y} - \bar{y})^T(\hat{y} - \bar{y}) + 2\,(y - \hat{y})^T(\hat{y} - \bar{y})$$

* For example, Montgomery uses SST (sum of squares, treatments) in lieu of our SSM and he uses SSTO (sum of squares, total) in lieu of our SST: Montgomery, D.C., Design and Analysis of Experiments, 5th ed., John Wiley & Sons, New York, 2001, pp. 531–535. However, other texts are consistent with our nomenclature here. See Box and Draper² for example.

Now, noting that

$$(y - \hat{y})^T(\hat{y} - \bar{y}) = (y - \hat{y})^T\hat{y} - \bar{y}\,(y - \hat{y})^T\mathbf{1}$$

we have the cross product as the difference of two terms. But both terms are identically 0 for least squares (see Equations 1.86 and 1.87). Therefore, the cross product is also identically zero.

We may construct a table making use of the identity of Equation 3.27. It will have the slots shown in Table 3.1.

The column headers are M = model, R = residual, and T = total. The row headers are SS (sum of squares), DF (degrees of freedom), and F (F ratio). The entries, defined by appending the row header to the column header, are SSM, SSR, SST, DFM (degrees of freedom, model), DFR (degrees of freedom, residual), and DFT (degrees of freedom, total). Those are the slots, now for the filler.

Consider the following model:

$$y = a_0 + a_1 x_1 + e \qquad (3.28)$$

We desire to know if a₁ is significant. In other words, does x really influence the response or not? If not, then the experiments amount to repeated measures of y, which differ only by experimental error (e); this would be equivalent to y = a₀ + e and a₁ = 0 for Equation 3.28. We shall call y = a₀ + e the null hypothesis. For the null hypothesis, a₀ is the only model parameter. Then the total degrees of freedom becomes n – 1, where n is the total number of observations. Therefore, Equation 3.28 reduces to Equation 3.5b, and the estimate of random variance is equivalent to the total sum of squares divided by the total degrees of freedom:

$$s^2 = \frac{SST}{DFT} = \frac{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2}{n-1} \qquad \text{(3.5b)}$$

The alternative hypothesis is that |a₁| > 0. If |a₁| > 0, then Equation 3.5b is not the correct measure of random variance because the total variance includes a nonrandom contribution from the model. If the alternative hypothesis is true, the appropriate estimation of experimental error will be the variance left over after subtracting the model from the data. In other words,

$$s^2 = \frac{SSR}{DFR} = \frac{\sum_{k=1}^{n}\left(y_k - \hat{y}_k\right)^2}{n-p} \qquad \text{(3.5c)}$$

Here, p is the total number of model parameters. For the current case, p = 2, i.e., a₀ and a₁. Note that SSR = SST – SSM and DFR = DFT – DFM. SSR is the sum of squared residual error. The mean squared residual (MSR) gives an estimate of the error variance. The residual is what remains after subtracting the portion of the total variance that belongs to the model, SSM. Equation 3.29 gives the mean squared model variance:

$$MSM = \frac{SSM}{DFM} \qquad (3.29)$$

This accounting is easy to remember. The model variance, SSM, is the variance over and above the mean; the residual variance (SSR) is the total variance (SST) minus the model variance (SSR = SST – SSM). If we add these two contributions (SSM + SSR), we obtain the total variance (SST = SSM + SSR) — the variance of the actual data over and above the mean. Perhaps vector notation is more straightforward and easier to remember:

$$SSM = \hat{y}^T\hat{y} - \bar{y}^T\bar{y} \quad \text{(model – mean)} \qquad (3.30)$$

$$SSR = y^Ty - \hat{y}^T\hat{y} \quad \text{(actual – model)} \qquad (3.31)$$

$$SST = y^Ty - \bar{y}^T\bar{y} \quad \text{(actual – mean)} \qquad (3.32)$$

These relations are easy to prove. Note that

$$(y - \hat{y})^T(y - \hat{y}) = \sum_{k=1}^{n}\left(y_k - \hat{y}_k\right)^2$$

which we may expand as

$$(y - \hat{y})^T(y - \hat{y}) = y^Ty - y^T\hat{y} - \hat{y}^Ty + \hat{y}^T\hat{y}$$

But note that $y^T\hat{y} = \hat{y}^Ty = \hat{y}^T\hat{y}$ for a least squares fit, so the expansion collapses to $SSR = y^Ty - \hat{y}^T\hat{y}$, which is Equation 3.31; parallel arguments give Equations 3.30 and 3.32.
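A quick numerical confirmation of Equation 3.27 and the vector forms of Equations 3.30 through 3.32 (our sketch; the data are hypothetical):

```python
import numpy as np

# Hypothetical data and a least squares straight-line fit
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
X = np.column_stack([np.ones_like(x), x])
a, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ a
n_ybar2 = len(y) * y.mean() ** 2   # ybar^T ybar for a constant mean vector

SSM = y_hat @ y_hat - n_ybar2      # model - mean  (Eq. 3.30)
SSR = y @ y - y_hat @ y_hat        # actual - model (Eq. 3.31)
SST = y @ y - n_ybar2              # actual - mean  (Eq. 3.32)
print(np.isclose(SST, SSM + SSR))  # True: Eq. 3.27
```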

3.2.1 Use of the F Distribution

The proper measure of random error depends on knowing whether or not the null hypothesis is true. But how can we know? Recall that a ratio of variances distributes as an F distribution. Now if MSM/MSR ~ 1, then the null hypothesis is true, and if MSM/MSR >> 1, the null hypothesis is false, leaving the alternative hypothesis. But what if the ratio is 1.5? Should we accept the null hypothesis then? A ratio greater than 1 could occur merely by chance. Now we are 100% confident that if our ratio were infinitely large, then MSM would be significant. If we were willing to be less confident, say 95% confident, then our theoretical ratio would not need to be as large as infinity.

A normal distribution needed μ and σ to determine its shape. An F distribution needs three things: the degrees of freedom in the numerator of the variance ratio, the degrees of freedom in the denominator of the variance ratio, and the confidence with which we desire to say that the ratio differs from chance. Here is the general procedure.

Once we have the sum-of-squares relations, we calculate the mean sums of squares by dividing by the degrees of freedom. Thus, MSM = SSM/DFM and MSR = SSR/DFR. The mean squares are variances, and they distribute as chi-squared variables. Therefore, the ratio of mean squares distributes as an F distribution. Again, to determine an F distribution we need three things:

1. The degrees of freedom used to determine MSM (i.e., DFM).

2. The degrees of freedom used to determine MSR (i.e., DFR).

3. Decide the probability (P) we will use in judging that MSM/MSR ≠ 1; that is, the ratio differs from 1 by more than chance alone. We shall call C = 1 – P the confidence. Thus, if P = 0.05, then we have C = 95% confidence that MSM/MSR > F by more than chance alone. (Most texts use a lowercase p to indicate probability. However, this text uses a lowercase p to indicate the number of model parameters. Therefore, we use an uppercase P so as not to confuse the two.)

Step 1: Specify P. We must do this before beginning any analysis. We will certainly want P to be less than 0.10, indicating that we have 90% confidence or more (1 – 0.10 = 0.90) that our correlation is not due to chance. The typical test is P ≤ 0.05 (95% confidence), denoted F₉₅(m, n), where the subscript indicates the percent confidence, and m and n are the DFM and DFR, respectively.

Step 2: Determine MSM/MSR, m, and n.

Step 3: Compare MSM/MSR with the minimum F ratio. If MSM/MSR > F_C(m, n), then reject the null hypothesis and accept the alternative hypothesis — that the model is significant at the C × 100% confidence level.

Appendix E, Table E.4 gives the minimum ratio for which this is so, depending on C (or alternatively P), m, and n. For example, suppose we want to be sure that a term belongs in the model with 95% confidence. Let us say that m = DFM = 3 and n = DFR = 2 and C = 95% (P = 0.05). Then according to the table, F_C(m, n) = 19.2. (We may also find this from the Excel function FINV(0.05,3,2) = 19.2.) If MSM/MSR ≥ F₉₅(3, 2) = 19.2, then we shall reject the null hypothesis and conclude that MSM is a significant effect. Statistical programs will give the P value directly and obviate the need for the table. Spreadsheets will also do the same. In Excel, the command FDIST(F,m,n) gives the P value for the observed F ratio. Let us consider an example.

Example 3.1 ANOVA for a Single-Factor Investigation

Problem statement: Derive the ANOVA for the hypothetical data of Table 3.2. Compare the MSM/MSR ratio with that of an F₉₅ distribution having the same degrees of freedom and determine whether the null hypothesis is valid at the given confidence level. Use a spreadsheet to assess the value of P.

Solution: We solve for the model using least squares and generate MSM/MSR = 729.30/2.96 = 246.77. This gives Table 3.3.

Now the ratio MSM/MSR = 246.77 far exceeds the minimum ratio F₉₅(1, 5) = 6.61 of Appendix E, Table E.4. Therefore, we conclude that the effect is significant. Using the Excel function we have FDIST(246.77,1,5) = 1.9E-05, or P < 0.0001. At any rate, the model is statistically significant and differs from the null hypothesis that y = a₀ + e.
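Table 3.2 is not reproduced in this excerpt, so the following sketch (ours) runs the same single-factor ANOVA accounting on hypothetical data; the mechanics (least squares fit, SSM/SSR partition, F ratio, and P value) follow Equations 3.27 through 3.29 and Equation 3.5c:

```python
import numpy as np
from scipy.stats import f

# Hypothetical single-factor data standing in for Table 3.2
x = np.array([-1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0])
y = np.array([2.1, 1.9, 5.0, 5.2, 4.9, 8.1, 7.8])
n, p = len(y), 2                       # p = 2 parameters: a0 and a1

X = np.column_stack([np.ones(n), x])   # model y = a0 + a1*x + e
a, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ a

SST = np.sum((y - y.mean()) ** 2)
SSR = np.sum((y - y_hat) ** 2)
SSM = SST - SSR                        # Eq. 3.27

DFM, DFR = p - 1, n - p
F_ratio = (SSM / DFM) / (SSR / DFR)    # MSM / MSR
P = f.sf(F_ratio, DFM, DFR)            # analog of FDIST(F, m, n)
print(F_ratio, P)                      # large F, tiny P -> effect is significant
```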

3.3 Two-Level Factorial Designs

A two-level factorial is an experimental design that investigates each factor at two levels: high and low. Factorial designs generate orthogonal matrices and allow us to assess separately each effect in a partitioned ANOVA. To begin, let us consider the effect of three factors on NOx: excess oxygen, O₂ (x₁); air preheat temperature, APH (x₂); and furnace bridgewall temperature, BWT (x₃). To investigate every possible combination of high and low factors requires a minimum of 2^f points. For example, consider Table 3.4.

Table 3.4 gives all possible combinations of high and low factor levels along with the response ln(NOx). We use + and – to signify high and low levels in the table. Actually, we can let + and – refer to +1 and –1, respectively, by coding all values as follows:

$$x_k = \frac{\xi_k - \bar{\xi}_k}{\hat{\xi}_k} \qquad (3.33)$$

where k is an index from 1 to p, $x_k$ is the kth coded factor, $\xi_k$ is the kth factor in the original metric, $\bar{\xi}_k$ is the average value defined by Equation 3.34, and $\hat{\xi}_k$ is the half range, defined by Equation 3.35:

$$\bar{\xi}_k = \frac{\xi_{k+} + \xi_{k-}}{2} \qquad (3.34)$$

$$\hat{\xi}_k = \frac{\xi_{k+} - \xi_{k-}}{2} \qquad (3.35)$$

These are the typical coding relations for experimental design. However, two other single-factor linear transforms may be useful in some situations: 0/1 coding and deviation-normalized coding:

$$x_k = \frac{\xi_k - \xi_{k-}}{\xi_{k+} - \xi_{k-}}, \qquad \xi_k = \xi_{k-} + \left(\xi_{k+} - \xi_{k-}\right)x_k \quad \text{(inverse for 0/1 coding)} \qquad (3.39)$$

$$x_k = \frac{\xi_k - \bar{\xi}_k}{s_{\xi_k}}, \quad \text{where } s_{\xi_k} \text{ is the standard deviation of } \xi_k \quad \text{(deviation-normalized coding)} \qquad (3.40)$$

$$\xi_k = \bar{\xi}_k + s_{\xi_k}\, x_k \quad \text{(inverse for deviation-normalized coding)} \qquad (3.41)$$

It follows that the linear transforms have the following relations: (3.42)

One may use Equation 3.42 to convert among transforms. For factorial designs, we will generally use ±1 coding. For the regression of historical data sets from unplanned experiments, deviation-normalized coding may be preferable. 0/1 coding will not generate orthogonal matrices, but sometimes the investigator desires a model where the low level corresponds to zero. For example, in classical experimentation, 0/1 coding better highlights that we are exploring along a single-factor axis at a time. Also, 0/1 coding has some advantage for representing a categorical factor with multiple levels (see …).

Example 3.2 Factor Coding

Problem statement: If the factors have the following ranges, give the equations to code their values to ±1: oxygen, 1% to 5%; air preheat temperature, 25 to 325°C; and furnace temperature, 800 to 1100°C. Give the inverse relations also.

Solution: From Equations 3.33 through 3.35 we have

$$x_1 = \frac{\xi_1 - 3}{2}, \qquad x_2 = \frac{\xi_2 - 175}{150}, \qquad x_3 = \frac{\xi_3 - 950}{150}$$

and the inverse relations are

$$\xi_1 = 2x_1 + 3, \qquad \xi_2 = 150x_2 + 175, \qquad \xi_3 = 150x_3 + 950$$
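A sketch of this coding arithmetic (ours, not the book's); the factor range used is that of the example:

```python
def make_coding(lo, hi):
    """Coding and inverse maps per Equations 3.33 through 3.35."""
    center = (hi + lo) / 2       # Eq. 3.34, average value
    half = (hi - lo) / 2         # Eq. 3.35, half range
    code = lambda xi: (xi - center) / half   # Eq. 3.33
    uncode = lambda x: center + half * x     # inverse relation
    return code, uncode

code_o2, uncode_o2 = make_coding(1.0, 5.0)        # oxygen, 1% to 5%
print(code_o2(1.0), code_o2(3.0), code_o2(5.0))   # -1.0 0.0 1.0
print(uncode_o2(0.5))                             # 4.0 (% oxygen)
```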

3.3.1 ANOVA for Several Model Effects

If X is orthogonal, we can assess each effect separately in the ANOVA table. In Table 3.4, the model we are attempting to fit is

$$y = a_0 + a_1x_1 + a_2x_2 + a_3x_3 + e$$

and it has the following normal matrix equation:

$$X^TXa = X^Ty$$

which, for this orthogonal eight-point design, reduces to $8Ia = X^Ty$.

From the ANOVA, we may also derive a statistic to measure overall goodness of fit. We shall call it the coefficient of determination and represent it with the symbol r². It has the following definition:

$$r^2 = 1 - \frac{SSR}{SST} \qquad (3.43)$$

If we desire, we may augment the ANOVA table with r², though strictly speaking it is not part of the accounting. However, the table has some empty space, so why not? For that matter, we can also show s = √MSR. Table 3.5 (Basic ANOVA for Table 3.4) gives the consolidated form of ANOVA that we will use.

However, because each effect in a factorial design is orthogonal to every other, we may partition SSM and DFM further, as shown in Table 3.6 (Partitioned ANOVA for Table 3.4).

We see now that the SSM in Table 3.5 partitions as 12.63 + 0.89 + 3.19 = 16.70, where the sums are for x₁, x₂, and x₃, respectively. We can also see that though Table 3.5 showed that the model as a whole was significant at greater than 99.9% confidence (P = 0.0006), not all effects are significant at exactly the same confidence level. In this particular case, it happens that all effects are significant with greater than 95% confidence (P < 0.05). However, this need not be the case. Even if the model is very significant, some individual effects may not be.

3.3.2 General Features of Factorial Designs

A factorial design uses all possible high- and low-factor combinations of f factors and comprises 2^f experimental points. Therefore, the factorial design can fit at most 2^f terms according to

$$y = a_0 + \sum_{k} a_k x_k + \sum_{j<k} a_{jk}\, x_j x_k + \sum_{h<j<k} a_{hjk}\, x_h x_j x_k + \cdots \qquad (3.44)$$

The last term in Equation 3.44 will be the f-factor interaction. For example, if f = 3, the last interaction (the eighth term, 2³ = 8) will be a₁₂₃x₁x₂x₃. Also, in Equation 3.44, the number of summands indicates the overall order of the term. For example,

$$\sum_{h<j<k} a_{hjk}\, x_h x_j x_k$$

comprises all the third-order interaction terms for a given factorial design, presuming f = 3 or greater.

To construct the X matrix for all terms, we may make use of Equation 3.44. We may also use Figure 3.3 to determine the number of terms that are first, second, and third order, etc. Using Equation 3.11, the entry value m indicates the number of terms of overall order k. For example, for f = 2 we have entries (1, 2, 1), indicating that there is 1 zero-order term (k = 0), 2 first-order terms (k = 1), and 1 second-order (overall) term (k = 2). Since the factorial has only high and low values of each factor, no term may contain factors having an individual order above 1. Therefore, factorial terms that overall are second order are of the form x_j x_k, terms that overall are third order have the form x_h x_j x_k, and so forth. If we want to know the number of third-order terms for the 2⁵ factorial design, we can use Equation 3.11 directly to find

$$m = \frac{5!}{3!\,(5-3)!} = 10$$

or we can view Figure 3.3 and note that for f = 5 and k = 3, m = 10.

3.3.3 Construction Details of the Two-Level Factorial

To construct the X matrix for a two-level factorial design, we use the following rules.

1. There will be n = 2^f design points, where f is the number of factors.

2. Construct factor columns by alternating between low (–) and high (+) factor values in blocks of 2^(f–k), where k is the factor subscript.

3. Continue for f factors. The resulting matrix comprises the design matrix of factor coordinates.

Example 3.3 Construction of a 2³ Factorial Design

Problem statement: Construct a 2³ factorial design using the foregoing procedure.

Solution: As this is a 2³ factorial design, f = 3 and n = 2³ = 8. For the first factor, k = 1; therefore, we alternate the first column in blocks of 2^(3–1) = 4; for k = 2, in blocks of 2^(3–2) = 2; and for k = 3, in blocks of 2^(3–3) = 1:

Pt   x1   x2   x3
 1   –    –    –
 2   –    –    +
 3   –    +    –
 4   –    +    +
 5   +    –    –
 6   +    –    +
 7   +    +    –
 8   +    +    +

This is the 2³ factorial design having eight data points comprising all possible high/low combinations of three factors.

One may also derive the factorial matrix from binary counting. For readers unfamiliar with binary and related bases, see Appendix F. For full factorial designs, conversion of the whole numbers from 0 to 2^f – 1 to binary gives the sign pattern directly. Binary order is the sign pattern of the whole numbers expressed in binary. The method also allows one to jump immediately to the nth point's pattern by expressing n – 1 as a binary whole number. Some authors refer to this as standard order; others use the phrase to refer to the reverse sign pattern. On occasion, we will renumber our points starting from 1 rather than zero.

Example 3.4 Binary Counting Method

Problem statement: Construct a 2³ factorial design using binary counting. For a 2⁶ design in binary order, what is the sign pattern of the 40th design point?

Solution: Converting the whole numbers 0 through 7 to three-digit binary and mapping 1 to + and 0 to – gives the sign patterns directly: 000 → – – –, 001 → – – +, 010 → – + –, 011 → – + +, 100 → + – –, 101 → + – +, 110 → + + –, and 111 → + + +. If we prefer to number our points starting from 1 rather than 0, we add one to the above entries. In matrix form, this reproduces the design of Example 3.3.

To calculate the sign pattern of the 40th point in a 2⁶ factorial design we have 40 – 1 = 39 = 100111₂ = + – – + + +; that is, x₁ = +, x₂ = –, x₃ = –, x₄ = +, x₅ = +, x₆ = +. Note that the numerical subscript following the main number indicates the base.
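A sketch of the binary counting rule (ours, not the book's): converting n – 1 to f binary digits and mapping 1 to + and 0 to – gives any point's sign pattern, and looping over all points reproduces the whole design:

```python
def sign_pattern(point, f):
    """Sign pattern of design point `point` (numbered from 1) in a 2^f factorial."""
    bits = format(point - 1, f"0{f}b")   # f binary digits
    return "".join("+" if b == "1" else "-" for b in bits)

for pt in range(1, 9):                   # full 2^3 design in binary order
    print(pt, sign_pattern(pt, 3))

print(sign_pattern(40, 6))               # 40 - 1 = 39 = 100111_2 -> "+--+++"
```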

3.3.4 Contrast of Factorial and Classical Experimentation

Traditionally, experimentalists held constant all factors but one in order to assess the individual effect of that factor on the desired response. Proponents have even misstated this strategy as a requirement of the scientific method; it is not. In fact, factorial designs vary all factors at once. So then, how do we know which factors correlate with the response if several vary simultaneously? Actually, if one is clever about the experimental design, not only can one vary all factors at once, but also the strategy is more efficient than the classical one-factor-at-a-time approach, with better statistical properties (such as better estimates for the factor's effect on the response). For example, let us contrast classical and factorial designs for two factors with some hypothetical data. Table 3.7 gives the factor patterns.

TABLE 3.7
Contrast of Classical and Factorial Designs

In the classical design, we suppose that an investigator, seeking only to vary one factor at a time, starts at some point in ξ₁·ξ₂ factor space (0, 0), which is the current optimum and maximal response, 2.5 units (see bottom left corner of Figure 3.4). We shall further suppose that the graph represents the entire operating space; that is, the operating constraints do not allow for excursions beyond the boundaries shown. The investigator begins by increasing ξ₂ one unit, keeping ξ₁ constant, and recording a value of 0. Obviously, the response is going in the wrong direction, but just to be sure, he increases ξ₂ by another unit. The response drops to –2.5. At this point, he has reached an operating constraint and has not been successful in increasing the response value.

Returning to the historic maximum, he holds ξ₂ constant and increases ξ₁ by one unit. Regrettably, the response drops from 2.5 to 0.5 units. He decides to increase ξ₁ by another unit, but the response stays the same at 0.5 units and he has reached the maximum value for ξ₁. Based on his data, he concludes the following:

FIGURE 3.4
Classical and factorial experiments. In the classical design (circles) the investigator starts at some origin (bottom left corner) and changes one factor while holding the other constant. This results in the vertical path along the left axis. Finding no increase in response (note contours), he returns to the origin and increases x₁ with x₂ constant, proceeding along the bottom right of the figure. He concludes (falsely) that the origin is the place of maximum response. In fact, a factorial design (squares) would have generated the proper model and found that the higher response within constraints is in the upper right-hand corner.

• A least squares fit of his data in the given metric is

$$y = 2.5 + \xi_1\left(\xi_1 - 3\right) - 2.5\,\xi_2 \qquad (3.45)$$

• Increases in ξ₂ are a disaster; they reduce the response precipitously.

• Increases in ξ₁ are also a bad idea as they decrease the response. However, the decline flattens out as ξ₁ increases.

• The response is a maximum. One cannot increase things any further within the operating constraints.

As we show presently, the last conclusion is incorrect. However, it is not due to an error in analysis, but to the limitations of the experimental design itself. A factorial design would have allowed for different conclusions. A factorial design uses all possible high and low combinations of the factors. For two factors, there are 2² = 4 possible combinations. The investigator runs them in random order. We describe randomization in detail in Section 3.7. For now, we merely mention that randomizing the run order is essential to negate the effects of bias from unknown factors correlated in time (e.g., humidity, air temperature, batch cycles, etc.). If we know that a factor is influential, we can account for it in the experiment. But to mute the potential effect of those factors we do not know about, we randomize.

In accordance with standard practice, we transform the factors. This bounds the operating region by –1 < x₁ < 1 and –1 < x₂ < 1. With this design, the investigator comes to the following conclusions:

• The regression equation in transformed coordinates is

$$y = 1 + x_1 - \frac{x_2}{2} + 2\,x_1x_2 \qquad (3.46)$$

• Because the matrix is orthogonal, one may directly compare the coefficients. There is a strong interaction term, meaning that if x₁ and x₂ move together, the response increases.

• Ultimately, the surface slopes downward along the diagonal running from bottom right to top left (see Figure 3.4), and upward along the diagonal running from bottom left to top right.

• There is a region of improved response within the design constraints (top right corner of the figure, response = 3.5).

• Figure 3.4 maps both investigations in factor space. In the figure, the circles represent the classical design points and the squares represent the factorial approach.

The investigator could not come to the proper conclusions because his experimental strategy was flawed. His analysis of the data is correct as far as it goes, but the distorted factor space compromised the results.

3.3.4.1 Statistical Properties of Classical Experimentation

Generally, classical experimental strategies have poor statistical properties for the following reasons:

• The information is concentrated along a few axes rather than spread over the entire factor space. We will always have more certainty of information near the design points or for interpolations among them, compared to distant and extrapolated regions. Box and Draper present an information function for gauging the certainty of estimating a response function:²

$$I(x) = \frac{\sigma^2}{n\,V\!\left[\hat{y}(x)\right]} = \frac{1}{n\, x^T\left(X^TX\right)^{-1}x} \qquad (3.47)$$

where σ² is the variance, n is the total number of points in the design, x is the position vector that we describe shortly, and X is the matrix of the presumably true model. For our purposes, we shall be interested only in the fraction of maximal information as we traverse the factor space,

$$i(x) = \frac{I(x)}{I\left(x_{max}\right)} = \frac{x_{max}^T\left(X^TX\right)^{-1}x_{max}}{x^T\left(X^TX\right)^{-1}x} \qquad (3.48)$$

which we shall call i(x), the information fraction; $x_{max}$ is the vector in factor space having the greatest certainty of response estimation. We can find $x_{max}$ as the point where $x^T\left(X^TX\right)^{-1}x$ is a minimum.

For the factorial design, the presumed model is $y = a_0 + a_1x_1 + a_2x_2 + a_{12}x_1x_2$, and the design gives $X^TX = 4I$.

Trang 30

With position vector $x^T = (1,\ x_1,\ x_2,\ x_1x_2)$, we then have

$$x^T\left(X^TX\right)^{-1}x = \frac{1 + x_1^2 + x_2^2 + x_1^2x_2^2}{4} = \frac{\left(1 + x_1^2\right)\left(1 + x_2^2\right)}{4}$$

leading to a minimum of 1/4 at the design center. Therefore, $x_{max}^T = (1,\ 0,\ 0,\ 0)$, which is obvious even by inspection. For the factorial design, the information fraction becomes

$$i_F(x) = \frac{1}{\left(1 + x_1^2\right)\left(1 + x_2^2\right)} \qquad (3.49)$$

where the subscript F denotes the factorial design. Using the same method with the classical design matrix, we arrive at the information fraction for the classical design:

$$i_C(x) = \frac{x_{max}^T\left(X^TX\right)^{-1}x_{max}}{x^T\left(X^TX\right)^{-1}x} \qquad (3.50)$$

where the subscript C denotes the classical design. Figure 3.5 presents the results graphically.

The factorial design has better and more even-handed certainty in estimating the response. The classical design has highly asymmetric information contours and poor certainty for about two thirds of the operating region — basically, anywhere but near the coordinate axes. In other words, even if Equation 3.45 were appropriate, the classical design would not be the best way to estimate it from real data.
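As a numeric check of Equations 3.48 and 3.49 (our sketch, not the book's), we can build X for the 2² factorial and evaluate the information fraction directly:

```python
import numpy as np

# 2^2 factorial design matrix; columns are 1, x1, x2, x1*x2
X = np.array([[1, -1, -1,  1],
              [1,  1, -1, -1],
              [1, -1,  1, -1],
              [1,  1,  1,  1]])
XtX_inv = np.linalg.inv(X.T @ X)   # = I/4 for this orthogonal design

def info_fraction(x1, x2):
    """i_F(x) per Eqs. 3.48 and 3.49; position vector matches the model terms."""
    pos = np.array([1.0, x1, x2, x1 * x2])
    v = pos @ XtX_inv @ pos        # x^T (X^T X)^{-1} x
    return 0.25 / v                # minimum value 1/4 occurs at the center

print(info_fraction(0.0, 0.0))     # 1.0 at the design center
print(info_fraction(1.0, 1.0))     # 0.25 = 1/((1+1)(1+1)) at a corner
```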


• The classical one-factor-at-a-time strategy fails in the case of strong interaction. Strong interaction is quite typical for combustion problems due to inherent nonlinearities and complex response surfaces (e.g., NOx and CO emissions, excess air relations, etc.). Equation 3.45 has no interaction term, nor can the design support one. Unless one varies both factors at once, it is impossible to estimate interaction.

3.3.4.2 How Factorial Designs Estimate Coefficients

At this point, it may seem like a minor miracle that one can vary several factors at once and come to any conclusions, let alone sound ones. If all the factors vary at the same time, how can one know which factor or factors have changed the response? To see, let us compare the classical and factorial designs and note their similarities rather than their differences. In a classical design, one varies a single factor.

For example, suppose we measure a response in two places, y₀ and y₁, corresponding to two positions along the factor x₁ axis, namely, x₁,₀ and x₁,₁, respectively. Now, we would like to fit the model y = a₀ + a₁x₁. Then y₁ = a₀ + a₁x₁,₁ and y₀ = a₀ + a₁x₁,₀. Subtracting y₀ from y₁ gives y₁ – y₀ = a₁(x₁,₁ – x₁,₀). Thus,

$$a_1 = \frac{y_1 - y_0}{x_{1,1} - x_{1,0}}$$

FIGURE 3.5
Information fraction for a factorial and classical design. The factorial design has its maximum certainty at the center of the operating region (100%, x marks the spot). It also has symmetrical information contours. The classical design has a very asymmetrical information distribution. The design provides adequate certainty for estimating the response function only in close proximity to the coordinate axes. It provides poor certainty for most (about two thirds) of the operating region. (Panels: Factorial Design, Classical Design.)

Beginning our experiments at the design origin (x₁,₀ = 0) and substituting this into our original equation gives a₀ = y₀. This is the classical design strategy.

Now suppose we want better certainty and we perform our experiments a number of times at these two points. Then clearly

$$a_0 = \frac{1}{n}\sum_{k=1}^{n} y_{0,k} \qquad \text{(3.51a)}$$

$$a_1 = \frac{\dfrac{1}{n}\sum_{k=1}^{n} y_{1,k} - \dfrac{1}{n}\sum_{k=1}^{n} y_{0,k}}{x_{1,1}} \qquad \text{(3.51b)}$$

Suppose now that we extend this strategy to three factor directions. Since all the factor axes coincide at the origin, we may write x₁,₀ = x₂,₀ = x₃,₀ = 0 and call this point x₀,₀. Then we have a total of four design points: x₀,₀, x₁,₁, x₂,₁, and x₃,₁. If we replicate each point eight times, then we perform 4·8 = 32 experiments. Thirty-two experiments for four coefficients is not an efficient strategy. However, if we are clever, we can perform only eight experiments yet replicate each point four times. To know how, consider the sneaky farmer.

3.3.4.3 The Sneaky Farmer

Suppose a farmer’s land must show at least five rows of mature palm trees

in order to qualify for a tax break, with the law requiring at least four palmtrees per row The farmer could purchase 20 mature trees and plant them in

a five-by-four grid, but mature palms are expensive, especially since the newlaw has taken effect So the farmer decides to plant the trees in a star patternThis only takes 10 trees because the farmer uses every tree in two rows.Factorial designs are similar in the sense that every point serves multipleduty For the case of a three-factor factorial, the design uses each point in

eight averages To see this, we shall examine the XTX matrix for a 23 factorialdesign in some detail Here it is:


Consider that we shall fit the following model from the above eight experiments:

$$y = a_0 + a_1x_1 + a_2x_2 + a_3x_3 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3 + a_{123}x_1x_2x_3 \qquad (3.52)$$

In matrix form, this becomes y = Xa + e, where the columns of X are the ±1 sign patterns of the eight terms (1, x₁, x₂, x₃, x₁x₂, x₁x₃, x₂x₃, x₁x₂x₃) evaluated at the eight design points, and a is the vector of eight coefficients.

FIGURE 3.6
The sneaky farmer. Option (a) gives the conventional solution to planting 5 rows of 4 trees each and requires 20 trees. The sneaky farmer chooses option (b), planting only 10 trees. Yet, undeniably, there are five rows of four trees each. Each tree has been made to count twice, being in two rows at once. Factorial designs are like that. In a 2³ factorial design, each point is used in eight different averages to give eight different coefficients. Classical one-factor-at-a-time experiments are more like option (a) and require more points for the same statistical certainty. (a) Five rows of four trees each; (b) five rows of four trees each.


Because the design is orthogonal, XᵀX = 8I, and the least squares solution is

$$a = \left(X^TX\right)^{-1}X^Ty = \frac{1}{8}\,X^Ty \qquad (3.54)$$

Written out term by term (Equations 3.55a to h), each coefficient is a signed sum of all eight responses divided by 8, from

$$a_0 = \frac{y_1 + y_2 + y_3 + y_4 + y_5 + y_6 + y_7 + y_8}{8} \qquad \text{(3.55a)}$$

through

$$a_{123} = \frac{-y_1 + y_2 + y_3 - y_4 + y_5 - y_6 - y_7 + y_8}{8} \qquad \text{(3.55h)}$$

In fact, all the denominators are 8. We may also write the coefficients in terms of the response values by considering the actual signs of the columns of X (Equations 3.56a to h).

Comparing Equations 3.55a to h with Equations 3.51a and b shows that the form is identical. It is as if we replicated each point eight times. For a₀, we use an average comprising all eight points, as is the case for the classical experimentation. Yet, we have only run eight experiments. This is the power of the factorial design. Because the design is orthogonal, truncating the model to fewer terms will not change the value of the remaining ones. This is not true for least squares, in general, but it is true with orthogonal designs. This latter fact will allow us to independently test effects for statistical significance. For nonorthogonal matrices, one may need to consider all possible models before passing judgment to retain or reject a model term. In the present case there are eight possible terms in the model (a₀ – a₁₂₃) generating 2⁸ = 256 possible models. Although hierarchical strategies may allow us to reduce the search,* this will require dedicated software.

Example 3.5 Calculation of Factorial Coefficients

Problem statement: Calculate the value of the coefficients for the factorial data set given in Table 3.8. The data show the dependence of NOx on excess oxygen (O₂), air preheat temperature (APH), and bridgewall temperature (BWT) for a particular burner and furnace.

1. Calculate the coefficients for the full factorial using the relations of Equation 3.54.

2. Repeat the calculation using matrix algebra.

3. Then calculate the coefficients for the reduced model y = a₀ + a₁x₁ + a₂x₂ + a₁₂x₁x₂.

4. Comparing the coefficients, what do you notice?

* A mixed strategy, where one alternately adds terms that most improve or subtracts terms that least harm model significance, is likely to result in optimum models. For a discussion, see Draper, N.R. and Smith, H., Selecting the 'best' regression equation, in Applied Regression Analysis, 3rd ed., John Wiley & Sons, New York, 1998, chap. 15, pp. 327f.

FIGURE 3.7
The signed sums forming each respective coefficient. All eight points are used to calculate each coefficient. The numerators for the first four coefficients are y₁+y₂+y₃+y₄+y₅+y₆+y₇+y₈ (for a₀); (y₅+y₆+y₇+y₈) – (y₁+y₂+y₃+y₄) (for a₁); (y₃+y₄+y₇+y₈) – (y₁+y₂+y₅+y₆) (for a₂); and (y₂+y₄+y₆+y₈) – (y₁+y₃+y₅+y₇) (for a₃).

Solution: First we apply coding transforms to center and scale all factors to a uniform dimensionless metric of ±1.

1. Using the relations of Equation 3.54 we obtain the following estimates rounded to the decimal accuracy given: a₀ = 14.8, a₁ = 3.38, a₂ = 3.27, a₃ = 1.28, a₁₂ = 0.51, a₁₃ = 0.46, a₂₃ = 0.08, a₁₂₃ = 0.10.

3. Truncating the model to y = a₀ + a₁x₁ + a₂x₂ + a₁₂x₁x₂, we obtain a₀ = 14.8, a₁ = 3.38, a₂ = 3.27, a₁₂ = 0.51.

4. In the second case, the coefficients are identical to the first. In the third case, the coefficients for the remaining effects are identical to the respective coefficients in the first two cases. For a factorial design, one may omit any number of terms from the model and the remaining coefficients retain their original values. This property is not true for least squares solutions in general, but it is true with factorial designs.

This is how one may vary many factors at once, yet obtain increased certainty for individual coefficients, as if we had taken repeated measures along a single-factor axis.
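Table 3.8 is not reproduced in this excerpt, so the following sketch (ours) demonstrates the same mechanics on hypothetical responses: the coefficients come from a = Xᵀy/8 (Equation 3.54), and dropping columns leaves the remaining coefficients unchanged:

```python
import numpy as np
from itertools import product

# X for the 2^3 factorial: columns 1, x1, x2, x3, x1x2, x1x3, x2x3, x1x2x3
rows = [[1, x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3]
        for x1, x2, x3 in product([-1, 1], repeat=3)]
X = np.array(rows, dtype=float)

y = np.array([8.2, 10.1, 13.9, 16.4, 13.5, 16.9, 19.2, 20.5])  # hypothetical

a_full = (X.T @ y) / 8                    # Eq. 3.54, since X^T X = 8I
a_reduced, *_ = np.linalg.lstsq(X[:, [0, 1, 2, 4]], y, rcond=None)  # 1,x1,x2,x1x2

print(a_full[[0, 1, 2, 4]])   # identical to a_reduced: columns are orthogonal
print(a_reduced)
```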

3.3.5 Interpretation of the Coefficients

Factorial designs also generate coefficients with sensible interpretations. For example, a₀ is the average response. This is also the expected response value at the design center (0, 0, 0 for the case at hand). Note that the least squares method derives each coefficient from all the response values, resulting in the highest certainty for each estimate.

When aₖ > 0, the coefficient represents the increase in the response value as xₖ moves one unit in the positive direction. Conversely, if aₖ < 0, then the response decreases. Since we have coded xₖ at ±1 for the boundaries of the factorial design, aₖ represents the first-order effect at the design extremes. Thus, examining the signs of these coefficients will tell us if an increase in the factor will increase or decrease the response. Since we have standardized all the factors to unit range and zero mean, they are unbiased; we may compare the coefficients directly.

The aⱼₖ coefficients represent the interaction terms. Interaction terms account for synergy or moderation of the factors. The statistical literature uses the term effect as a synonym for a model term. If effects interact, such interactions will show up as an interaction coefficient. An aⱼₖ coefficient represents a binary interaction between xⱼ and xₖ. An aₕⱼₖ coefficient represents a ternary interaction among xₕ, xⱼ, and xₖ. The overall order of an effect is the sum of the factor exponents expressed as an ordinal number (e.g., first, second, third, etc.). The individual order of a factor within an effect is the ordinal number of its exponent. Therefore, the product xₕxⱼxₖ is third order overall and first order in xₕ, xⱼ, and xₖ. Third- or higher-order terms are rarely significant in factorial designs. If they are insignificant, then they represent an estimate of experimental error, e.g., aₕⱼₖxₕxⱼxₖ ≈ e. If so, Equation 3.52 becomes

$$y = a_0 + a_1x_1 + a_2x_2 + a_3x_3 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3 + e \qquad (3.57)$$

Example 3.6 Interpretation of the Example 3.5 Coefficients

Problem statement: Examine and interpret the coefficients of Example 3.5.

Solution: Example 3.5 gave the following coefficients:

a₀ = 14.8, a₁ = 3.38, a₂ = 3.27, a₃ = 1.28,
a₁₂ = 0.51, a₁₃ = 0.46, a₂₃ = 0.08, a₁₂₃ = 0.10

Recall that the subscripts have the following references: 1 = oxygen, 2 = air preheat temperature, and 3 = furnace temperature.

For discussion purposes, we start with a graphical depiction of the NOx relation. Figure 3.8 gives the pictures. Since NOx is a function of three factors, the actual response surface is four-dimensional. We may represent this as a series of contour slices or as a series of superimposed three-dimensional surfaces. For an overall qualitative understanding of the NOx surface, the stacked surfaces are better. For more quantitative results, the contours are better. For exact numbers and computer programs, Equation 3.52 is best. From the coefficients, we understand the following:
