Illustration of the Two Methodologies Using a Case Study Data Set


The data set used in this section is taken from a process with mean 12.61 and standard deviation 1.38. The upper specification limit of this process is 91. From the distribution of the data shown in Figure 10.1, we should expect a Cpk much higher than 3, since the specification limit is far away from the mean and we do not expect the USL to be exceeded.

Figure 10.1 Histogram of case study data (sample N = 30, sample mean = 12.61467, StDev (within) = StDev (overall) = 1.38157, USL = 91).

As the distribution is highly skewed, a normal approximation will not yield a good estimate. Two methods were used to estimate the capability of this process; these are considered in turn in Sections 10.2.1 and 10.2.2 and compared in Section 10.2.3.

10.2.1 Estimation of process capability using Box--Cox transformation

Traditionally, when the distribution of the data is not normal, the advice is to first transform the data using the Box--Cox transformation. If the transformation is successful, and the transformed data follows the normal distribution as confirmed by a normality test, then we can proceed to do a PCA of the transformed data against the transformed specification, using formula (10.1).

10.2.1.1 Box--Cox transformation

The Box--Cox transformation is also known as the power transformation. It is done by searching for a power value, λ, which minimizes the standard deviation of a standardized transformed variable. The resulting transformation is Y′ = Y^λ for λ ≠ 0 and Y′ = ln Y when λ = 0. Normally, after the optimum λ has been determined, it is rounded either up or down to a value that makes some sense; common values are −1, −1/2, 0, 1/2, 2, and 3.

The transformation can be done by most statistical software. Figure 10.2 was obtained from MINITAB 14. It shows a graph of the standard deviation over values of λ from −5 to 5. The optimum λ (corresponding to the lowest standard deviation) suggested was −3.53725. It is recommended to choose λ = −3; it is evident from Figure 10.2 that this gives a standard deviation close to the minimum.

Figure 10.2 Box--Cox plot for case study data (estimated λ = −3.53725; 95% upper confidence limit −0.50142).
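The same λ search can be reproduced outside MINITAB. The following is a minimal Python sketch, assuming the 30 case-study observations are stored in a text file (the file name is hypothetical). Note that scipy chooses λ by maximizing the log-likelihood rather than by minimizing the standard deviation of the standardized transformed variable, so its optimum may differ slightly from the value in Figure 10.2.

```python
import numpy as np
from scipy import stats

data = np.loadtxt("case_study_data.txt")   # hypothetical file holding the 30 observations

# Let the software search for the optimum lambda and transform in one call.
transformed, lmbda_opt = stats.boxcox(data)
print(f"optimum lambda: {lmbda_opt:.3f}")

# Round lambda to a 'sensible' value (here -3, as chosen in the text) and
# apply Y' = Y**lambda directly; for lambda = 0 the transformation is ln(Y).
lmbda = -3
transformed_rounded = data ** lmbda
```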

10.2.1.2 Verifying the transformed data

Having transformed the data, we need to check that the transformation is a good one using both a normality test and visual assessment (fitting a normal curve over the histogram).

There are many normality tests available. The Anderson--Darling test is the test recommended by Stephens.1 It tests the null hypothesis that the data come from a specified normal distribution by measuring the area between the fitted line based on the distribution and the nonparametric step function based on the plotted points.

The statistic is a squared distance and is weighted more heavily in the tails of the distribution. It is given by

$$
A^2 = -N - \frac{1}{N}\sum_{i=1}^{N}(2i-1)\bigl[\ln F(Y_i) + \ln\bigl(1 - F(Y_{N+1-i})\bigr)\bigr],
$$

where the Y_i are the ordered data and

$$
F(Y_i) = \Phi\!\left(\frac{Y_i - \bar{x}}{s}\right)
$$

is the cumulative distribution function of the standard normal distribution evaluated at the standardized observations. Smaller Anderson--Darling values indicate fewer differences between the data and the normal distribution, and hence that the data fits the specified distribution better.

Another quantitative measure for reporting the result of the Anderson--Darling normality test is the p-value, representing the probability of concluding that the null hypothesis is false when it is true. If you know A², you can calculate the p-value. Let

$$
A^{2*} = A^2\left(1 + \frac{0.75}{N} + \frac{2.25}{N^2}\right),
$$

where N is the sample size. Then

$$
p = \begin{cases}
\exp\bigl(1.2937 - 5.709\,A^{2*} + 0.0186\,(A^{2*})^2\bigr), & \text{if } 13 > A^{2*} > 0.600,\\
\exp\bigl(0.9177 - 4.279\,A^{2*} - 1.38\,(A^{2*})^2\bigr), & \text{if } 0.600 > A^{2*} > 0.340,\\
1 - \exp\bigl(-8.318 + 42.796\,A^{2*} - 59.938\,(A^{2*})^2\bigr), & \text{if } 0.340 > A^{2*} > 0.200,\\
1 - \exp\bigl(-13.436 + 101.14\,A^{2*} - 223.73\,(A^{2*})^2\bigr), & \text{if } A^{2*} < 0.200.
\end{cases}
$$

Figure 10.3 Histogram of case study data after Box--Cox transformation (N = 30, mean = 0.000528, StDev = 0.000146; Anderson--Darling A² = 0.29, p-value = 0.597).

Generally, if the p-value of the test is greater than 0.05, we do not have enough evidence to reject the null hypothesis (that the data is normally distributed).

Any statistical software can again easily do the above. Figure 10.3 was obtained from MINITAB 14, which provides both the histogram with a fitted normal curve and the Anderson--Darling normality test on the same page. From the Anderson--Darling test we can conclude that the transformed data is normally distributed, as the p-value was 0.597, well above the critical value of 0.05.
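For readers without MINITAB, the test is easy to script. The sketch below implements the A² statistic, the small-sample adjustment, and the piecewise p-value approximation described above; it should reproduce values close to those in Figure 10.3, with small differences possible because of estimation conventions. The input file name is hypothetical.

```python
import numpy as np
from scipy import stats

def anderson_darling_normality(x):
    """A-squared and approximate p-value for normality, following the
    adjustment and piecewise p-value formulas given above."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # CDF of a normal fitted to the data, evaluated at the ordered observations.
    f = stats.norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))
    i = np.arange(1, n + 1)
    a2 = -n - np.sum((2 * i - 1) * (np.log(f) + np.log(1 - f[::-1]))) / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)   # small-sample adjustment
    if a2_adj >= 0.600:
        p = np.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj >= 0.340:
        p = np.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj >= 0.200:
        p = 1 - np.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1 - np.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2, p

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations
transformed = data ** -3                   # Box-Cox transformed data (lambda = -3)

a2, p = anderson_darling_normality(transformed)
print(f"A-squared = {a2:.2f}, p-value = {p:.3f}")
```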

10.2.1.3 Estimating the process capability using the transformed data

Having successfully transformed the data, we can estimate the process capability using the formula below with the transformed data and the transformed specification, 91⁻³ = 1.327 × 10⁻⁶ (note that as the power is negative, the original upper specification limit becomes the lower specification limit):

$$
C_{pk} = \frac{\bar{x} - \mathrm{LSL}}{3\sigma}.
$$

For our example data set, Cpk = 1.19 (see Figure 10.4).
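A minimal sketch of this step, applying the formula above to the transformed data and transformed specification limit (same hypothetical data file as before):

```python
import numpy as np

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations
transformed = data ** -3                   # Box-Cox transformed data (lambda = -3)

# With a negative power, the original USL of 91 maps to a lower limit of
# 91**-3 (about 1.33e-6) on the transformed scale.
lsl_t = 91.0 ** -3

mean_t = transformed.mean()                # roughly 0.00053
sigma_t = transformed.std(ddof=1)          # roughly 0.00015

cpk = (mean_t - lsl_t) / (3 * sigma_t)
print(f"Cpk (transformed scale) = {cpk:.2f}")   # roughly 1.19, as in Figure 10.4
```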

Figure 10.4 Process capability study of case study data using Box--Cox transformation (λ = −3; transformed sample mean = 0.00053, StDev = 0.00015): Cpk = Ppk = 1.19, expected performance about 179.41 PPM beyond specification.

10.2.2 Best-fit distribution

An alternative to the Box--Cox transformation is to search for the distribution that best fits the data and use that to estimate the process capability. The search for the best-fit distribution is done by first assuming a few theoretical distributions and then comparing them.

10.2.2.1 Distribution fitting

There are various ways of selecting a distribution for comparison.

1. By the shape of the distribution (skewness and kurtosis). Kurtosis is a measure of how ‘sharp’ the distribution is as compared to a normal distribution. A high (positive) kurtosis distribution has a sharper peak and fatter tails, while a low (negative) kurtosis distribution has a more rounded peak with wider shoulders.

Kurtosis can be calculated using the formula

$$
\frac{N(N+1)}{(N-1)(N-2)(N-3)}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{s}\right)^4 - \frac{3(N-1)^2}{(N-2)(N-3)},
$$

where x_i is the ith observation, x̄ is the mean of the observations, N is the number of nonmissing observations, and s is the standard deviation.

Skewness is a measure of how symmetrical the distribution is. A negative value indicates skewness to the left (the long tail pointing towards the negative in the pdf), and a positive value indicates skewness to the right (the long tail pointing towards the positive in the pdf). However, a zero value does not necessarily indicate symmetry.

Table 10.1 Common distributions and their parameters.

Smallest extreme value, Normal, Logistic: μ = location (−∞ < μ < ∞); σ = scale (σ > 0)

Lognormal, Loglogistic: μ = location (μ > 0); σ = scale (σ > 0)

Three-parameter lognormal, Three-parameter loglogistic: μ = location (μ > 0); σ = scale (σ > 0); λ = threshold (−∞ < λ < ∞)

Weibull: α = scale (α = exp(μ)); β = shape (β = 1/σ)

Three-parameter Weibull: α = scale (α = exp(μ)); β = shape (β = 1/σ); λ = threshold (−∞ < λ < ∞)

Exponential: θ = mean (θ > 0)

Two-parameter exponential: θ = scale (θ > 0); λ = threshold (−∞ < λ < ∞)

Skewness can be calculated using the following formula:

$$
\frac{N}{(N-1)(N-2)}\sum_{i=1}^{N}\left(\frac{x_i-\bar{x}}{s}\right)^3,
$$

where the symbols have the same meanings as in the kurtosis formula above (both formulas are sketched in code after this list).

Table 10.1 shows the commonly used distributions and their parameters to be estimated. If the distribution is skewed, try the lognormal, loglogistic, exponential, Weibull, and extreme value distributions. If the distribution is not skewed, depending on the kurtosis, try the uniform (if the kurtosis is negative), normal (if the kurtosis is close to zero), or Laplace or logistic (if the kurtosis is positive).

2. By nature of the data.

(a) Cycle time and reliability data typically follow either an exponential or Weibull distribution.

(b) If the data is screened from a highly incapable process, it is likely to be uniformly distributed.

(c) If the data is generated from selecting the highest or lowest value of multiple measurements, it is likely to follow the extreme value distribution.
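The two shape statistics from item 1 are simple to compute directly. The sketch below implements the skewness and kurtosis formulas quoted above (same hypothetical data file as in the earlier sketches):

```python
import numpy as np

def sample_skewness(x):
    """Adjusted sample skewness, using the formula quoted above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    return n / ((n - 1) * (n - 2)) * np.sum(z ** 3)

def sample_kurtosis(x):
    """Adjusted sample (excess) kurtosis, using the formula quoted above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * np.sum(z ** 4)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations

# A right-skewed data set (like the case study data) gives a clearly positive
# skewness, pointing towards the skewed candidate distributions in Table 10.1.
print(sample_skewness(data), sample_kurtosis(data))
```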

10.2.2.2 Parameter estimation

After short-listing the possible distributions that are likely to fit the data well, the parameters for the distribution need to be estimated. The two common estimation methods are least squares and maximum likelihood.

Least squares is a mathematical optimization technique which attempts to find a function that closely approximates the given data. It is done by minimizing the sum of the squared errors (also known as residuals) between points generated by the function and the corresponding points in the data.

Maximum likelihood estimation is a statistical method used to make inferences about parameters of the underlying probability distribution of a given data set. Maximum likelihood estimates of the parameters are calculated by maximizing the likelihood function with respect to the parameters. The likelihood function gives, for each set of distribution parameters, the probability of observing the sample data under those parameters.

The Newton--Raphson algorithm can be used to calculate maximum likelihood estimates of the parameters that define the distribution. It is an iterative method for computing the maximum of a function.
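As an illustration, scipy's .fit() method produces maximum likelihood estimates numerically (using a general-purpose optimizer rather than Newton--Raphson specifically). The sketch below fits a three-parameter lognormal, one of the candidates considered later; the data file name is hypothetical.

```python
import numpy as np
from scipy import stats

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations

# Maximum likelihood fit; for the 3-parameter lognormal the `loc` parameter
# returned by scipy plays the role of the threshold.
shape, threshold, scale = stats.lognorm.fit(data)
print(f"shape = {shape:.3f}, threshold = {threshold:.3f}, scale = {scale:.3f}")

# Log-likelihood of the fitted model, useful later when comparing candidates.
loglik = np.sum(stats.lognorm.logpdf(data, shape, threshold, scale))
print(f"log-likelihood = {loglik:.2f}")
```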

10.2.2.3 Selecting the best-fit distribution

Selection of the best-fit distribution can be done either qualitatively (by seeing how well the data points fit the straight line in the probability plot), quantitatively (using goodness-of-fit statistics), or by a combination of the two. Most statistical programs provide the plots and the statistics together.

Probability plot

The probability plot is a graphical technique for assessing whether or not a data set follows a given distribution such as the normal or Weibull.2 The data are plotted against a theoretical distribution in such a way that the points should approximately form a straight line. Departures from this straight line indicate departures from the specified distribution.

The probability plot provided by MINITAB includes the following:

- plotted points, which are the estimated percentiles for corresponding probabilities of an ordered data set;

- fitted line, which is the expected percentile from the distribution based on maximum likelihood parameter estimates;

- confidence intervals, which are the confidence intervals for the percentiles.

Because the plotted points do not depend on any distribution, they are the same (before being transformed) for any probability plot made. The fitted line, however, differs depending on the parametric distribution chosen. So you can use a probability plot to assess whether a particular distribution fits your data. In general, the closer the points fall to the fitted line, the better the fit.

Anderson--Darling statistic

The Anderson--Darling statistic was mentioned in Section 10.2.1.2. Note that for a given distribution, the Anderson--Darling statistic may be multiplied by a constant (which usually depends on the sample size, n). This is the 'adjusted Anderson--Darling' statistic that MINITAB uses. The p-values are based on the table given by D'Agostino and Stephens.1 If no exact p-value is found in the table, MINITAB calculates the p-value by interpolation.

Pearson correlation

The Pearson correlation measures the strength of the linear relationship between two variables on a probability plot. If the distribution fits the data well, then the plot points on a probability plot will fall on a straight line. The strength of the correlation is measured by

$$
r = \frac{\sum z_x z_y}{N},
$$

where z_x is the standard normal score for variable X and z_y is the standard normal score for variable Y. If r = +1 (−1) there is a perfect positive (negative) correlation between the sample data and the specified distribution; r = 0 means there is no correlation.
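A minimal sketch of this calculation, using scipy's probability-plot helper to pair the ordered data with theoretical quantiles and then applying the z-score formula above. The candidate distribution and its shape parameter are illustrative assumptions; in practice each short-listed distribution is tried in turn.

```python
import numpy as np
from scipy import stats

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations

# Theoretical quantiles vs. ordered data for one candidate distribution.
(quantiles, ordered), _ = stats.probplot(data, sparams=(0.3,), dist="lognorm", fit=True)

z_x = (quantiles - quantiles.mean()) / quantiles.std()   # population std, as in the formula
z_y = (ordered - ordered.mean()) / ordered.std()
r = np.sum(z_x * z_y) / len(ordered)
print(f"r = {r:.3f}")   # values close to +1 indicate a good fit
```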

Capability estimation for our case study data using the best-fit distribution method

The graph in Figure 10.5 was plotted using MINITAB 14 with two goodness-of-fit statistics, the Anderson--Darling statistic and Pearson correlation coefficient, to help the user compare the fit of the distributions. From the probability plots, it looks like the three-parameter loglogistic and three-parameter lognormal are equally good fits for the data. From the statistics (Anderson--Darling and Pearson correlation), the three-parameter loglogistic is marginally better.

Figure 10.5 Probability plots for identification of best-fit distribution (LSXY estimates). Adjusted Anderson--Darling / correlation coefficient: 3-parameter Weibull 1.010 / 0.982; 3-parameter lognormal 0.698 / 0.989; 3-parameter loglogistic 0.637 / 0.991; 2-parameter exponential 3.071.
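The comparison can also be scripted. The sketch below fits each candidate by maximum likelihood and computes the unadjusted A² against the fitted CDF, so the numbers will not match Figure 10.5 exactly (MINITAB uses least-squares estimates and the adjusted statistic), but the ranking of candidates should usually agree. The data file name is hypothetical.

```python
import numpy as np
from scipy import stats

def anderson_darling(x, dist, params):
    """Unadjusted A-squared statistic against a fully specified fitted distribution."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f = np.clip(dist.cdf(x, *params), 1e-12, 1 - 1e-12)   # guard against F = 0 or 1
    i = np.arange(1, n + 1)
    return -n - np.sum((2 * i - 1) * (np.log(f) + np.log(1 - f[::-1]))) / n

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations

# scipy names for the candidates: fisk is the log-logistic; the loc parameter
# plays the role of the threshold in the three-parameter forms.
candidates = {
    "3-parameter Weibull":     stats.weibull_min,
    "3-parameter lognormal":   stats.lognorm,
    "3-parameter loglogistic": stats.fisk,
    "2-parameter exponential": stats.expon,
}

for name, dist in candidates.items():
    params = dist.fit(data)                 # maximum likelihood estimates
    a2 = anderson_darling(data, dist, params)
    print(f"{name:>25s}: A-squared = {a2:.3f}")
```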

Figure 10.6 Histogram and three-parameter loglogistic fit for case study data (Loc = 0.8271, Scale = 0.2900, Threshold = 10.01, N = 30).

It is generally good practice to plot the fitted distribution over the histogram to make sure the fit is good enough. The graph in Figure 10.6 provides convincing evidence that the three-parameter loglogistic distribution provides an excellent fit for the data.
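One way to reproduce such an overlay in Python is sketched below. It assumes the mapping between MINITAB's loglogistic parameterization and scipy's fisk distribution given in the comments, and the usual hypothetical data file.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

data = np.loadtxt("case_study_data.txt")   # hypothetical file with the 30 observations

# scipy (fisk) equivalents of the MINITAB loglogistic fit in Figure 10.6:
# shape = 1/Scale, scale = exp(Loc), loc = Threshold.
c, loc, scale = 1 / 0.2900, 10.01, np.exp(0.8271)
xs = np.linspace(data.min(), data.max(), 200)

plt.hist(data, bins=8, density=True, alpha=0.5, label="data")
plt.plot(xs, stats.fisk.pdf(xs, c, loc=loc, scale=scale), label="3-parameter loglogistic fit")
plt.legend()
plt.show()
```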

After the best-fit distribution has been identified, the process capability can be estimated easily by computer. Figure 10.7 shows that the estimated Cpk is 5.94, which is close to what we expected. (Note that instead of Cpk, Ppk is given in the graph. In Six Sigma, Ppk is the long-term capability while Cpk is the short-term process capability. MINITAB assumes that non-normal data is long term. In this chapter, we will skip the discussion on whether the data is long term or short term, and treat all estimated capabilities as Cpk.)

Figure 10.7 Process capability analysis using three-parameter loglogistic fit (Location = 0.82715, Scale = 0.29003, Threshold = 10.01439, USL = 91, sample N = 30): PPU = Ppk = 5.94; observed performance 0 PPM > USL; expected overall performance 4.56 PPM > USL.
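To see where the 5.94 comes from, the calculation can be reproduced from the fitted parameters. The sketch below assumes the percentile (ISO) capability definition for non-normal data and the mapping between MINITAB's loglogistic parameterization and scipy's fisk distribution; both are assumptions on my part, although the results agree closely with Figure 10.7.

```python
import numpy as np
from scipy import stats

# Fitted three-parameter loglogistic from Figure 10.7, in MINITAB's
# parameterisation: ln(X - threshold) follows a logistic(location, scale).
location, scale, threshold = 0.82715, 0.29003, 10.01439
usl = 91.0

# Equivalent scipy distribution: fisk with shape 1/scale, scale exp(location)
# and loc equal to the threshold.
dist = stats.fisk(1 / scale, loc=threshold, scale=np.exp(location))

# Percentile (ISO) definition for non-normal capability:
# PPU = (USL - median) / (99.865th percentile - median).
median = dist.ppf(0.5)
ppu = (usl - median) / (dist.ppf(0.99865) - median)

# Expected fraction of the distribution beyond the USL, in parts per million.
ppm_above_usl = dist.sf(usl) * 1e6

print(f"PPU (no LSL, so Ppk = PPU) = {ppu:.2f}")        # about 5.9
print(f"expected PPM above USL     = {ppm_above_usl:.2f}")  # about 4.6
```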

10.2.3 Comparison of results

The estimate of the process capability, Cpk, is 1.19 using the Box--Cox transformation and 5.94 using the best-fit distribution. The difference is 4.75, which is huge. Of course, all estimates are wrong, but some are more wrong than others. What is needed is an estimate that is good enough for decision-making: for example, to decide which process needs 100% screening and which needs only sampling for an Xbar-R chart.

Looking at the data and specification limit for the case study, there is no room for doubt that this is a highly capable process, much more so than the common industrial standard for Cpk (1.33). Whether the Cpk is 3 or 5 does not really matter, as it does not affect the decision. But the Box--Cox transformation method alone would lead us to suspect that the process is not capable enough. Relying on this method alone would here have led us to commit resources that would ultimately have been wasted.

The reason for the gross inaccuracy when λ is negative lies in the fact that the transformed data is bounded below by 0, violating the assumption that normal data can take any value from −∞ to +∞. This violation is not a big problem if the mean of the transformed distribution is far away from zero. But if the mean is close to zero (which in most cases it will be when the distribution is transformed with a negative λ, as any value greater than one will lie between zero and one after the transformation), the normal model behaves as if the distribution had a 'tail' extending all the way to −∞, and this brings down the Cpk estimate. In our case, the normal model fitted to the transformed data predicts about 180 PPM beyond the specification limit (Figure 10.4), whereas the loglogistic fit to the raw data puts the same figure below 5 PPM (Figure 10.7).
