Experiments: Planning, Analysis, and Optimization, Second Edition

DOCUMENT INFORMATION

Basic information

Title: Experiments: Planning, Analysis, and Optimization, Second Edition
Authors: C. F. Jeff Wu, Michael S. Hamada
Institution: Georgia Institute of Technology
Subject: Experimental Design
Type: Textbook
Year of publication: 2009
City: Atlanta
Format:
Pages: 743
File size: 11.43 MB

Experiments

Planning, Analysis, and Optimization Second Edition

C. F. JEFF WU

School of Industrial and Systems Engineering

Georgia Institute of Technology

Atlanta, Georgia

MICHAEL S. HAMADA

Los Alamos National Laboratory

Los Alamos, New Mexico

WILEY

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2009 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor the author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993, or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Wu, Chien-Fu Jeff

Experiments: planning, analysis, and optimization / C. F. Jeff Wu, Michael S. Hamada. 2nd ed.

To my parents and Jung Hee, Christina, and Alexandra

M. S. H.

To my mother and family

C. F. J. W.

Contents

1 Basic Concepts for Experimental Design and Introductory Regression Analysis

1.1 Introduction and Historical Perspective, 1

1.2 A Systematic Approach to the Planning and Implementation of Experiments, 4

1.3 Fundamental Principles: Replication, Randomization, and Blocking, 8

1.4 Simple Linear Regression, 11

1.5 Testing of Hypothesis and Interval Estimation, 14

1.6 Multiple Linear Regression, 20

1.7 Variable Selection in Regression Analysis, 26

1.8 Analysis of Air Pollution Data, 29

Quantitative Factors and Orthogonal Polynomials, 57

Expected Mean Squares and Sample Size Determination, 63

One-Way Random Effects Model, 70

Residual Analysis: Assessment of Model Assumptions, 74

Practical Summary, 79

Exercises, 80

References, 86

3.1 Paired Comparison Designs, 87

3.2 Randomized Block Designs, 90

3.3 Two-Way Layout: Factors with Fixed Levels, 94

3.3.1 Two Qualitative Factors: A Regression Modeling Approach, 97

*3.4 Two-Way Layout: Factors with Random Levels, 99

3.5 Multi-Way Layouts, 108

3.6 Latin Square Designs: Two Blocking Variables, 110

3.7 Graeco-Latin Square Designs, 114

*3.8 Balanced Incomplete Block Designs, 115

4.1 An Epitaxial Layer Growth Experiment, 155

4.2 Full Factorial Designs at Two Levels: A General Discussion, 157

4.3 Factorial Effects and Plots, 161

4.3.1 Main Effects, 162

4.3.2 Interaction Effects, 164

4.4 Using Regression to Compute Factorial Effects, 169

*4.5 ANOVA Treatment of Factorial Effects, 171

4.6 Fundamental Principles for Factorial Effects: Effect Hierarchy, Effect Sparsity, and Effect Heredity, 172

4.7 Comparisons with the "One-Factor-at-a-Time" Approach, 173

4.8 Normal and Half-Normal Plots for Judging Effect

4.11 Use of Log Sample Variance for Dispersion Analysis, 184

4.12 Analysis of Location and Dispersion: Revisiting the Epitaxial Layer Growth Experiment, 185

*4.13 Test of Variance Homogeneity and Pooled Estimate of Variance, 188

*4.14 Studentized Maximum Modulus Test: Testing Effect Significance for Experiments with Variance Estimates, 190

4.15 Blocking and Optimal Arrangement of $2^k$ Factorial Designs in $2^q$

5 Fractional Factorial Experiments at Two Levels 211

5.1 A Leaf Spring Experiment, 211

5.2 Fractional Factorial Designs: Effect Aliasing and the Criteria of Resolution and Minimum Aberration, 213

5.3 Analysis of Fractional Factorial Experiments, 219

5.4 Techniques for Resolving the Ambiguities in Aliased Effects, 225

5.4.1 Fold-Over Technique for Follow-Up Experiments, 225

5.4.2 Optimal Design Approach for Follow-Up Experiments, 229

5.5 Selection of $2^{k-p}$ Designs Using Minimum Aberration and Related Criteria, 234

5.6 Blocking in Fractional Factorial Designs, 238

5.7 Practical Summary, 240

Exercises, 242

Appendix 5A: Tables of $2^{k-p}$ Fractional Factorial Designs, 252

Appendix 5B: Tables of $2^{k-p}$ Fractional Factorial Designs in $2^q$ Blocks, 260

References, 264

6 Full Factorial and Fractional Factorial Experiments at Three Levels

6.1 A Seat-Belt Experiment, 267

6.2 Larger-the-Better and Smaller-the-Better Problems, 268

6.3 $3^k$ Full Factorial Designs, 270

6.4 $3^{k-p}$ Fractional Factorial Designs, 275

6.5 Simple Analysis Methods: Plots and Analysis of Variance, 279

6.6 An Alternative Analysis Method, 287

6.7 Analysis Strategies for Multiple Responses I: Out-of-Spec Probabilities, 293

6.8 Blocking in $3^k$ and $3^{k-p}$ Designs, 302

6.9 Practical Summary, 303

Exercises, 305

Appendix 6A: Tables of $3^{k-p}$ Fractional Factorial Designs, 312

Appendix 6B: Tables of $3^{k-p}$ Fractional Factorial Designs in $3^q$ Blocks, 313

References, 317

7 Other Design and Analysis Techniques for Experiments at More than Two Levels

7.1 A Router Bit Experiment Based on a Mixed Two-Level and Four-Level Design, 319

7.2 Method of Replacement and Construction of $2^m 4^n$ Designs, 322

7.3 Minimum Aberration $2^m 4^n$ Designs with n = 1, 2, 325

7.4 An Analysis Strategy for $2^m 4^n$ Experiments, 328

7.5 Analysis of the Router Bit Experiment, 330

7.6 A Paint Experiment Based on a Mixed Two-Level and Three-Level Design, 334

7.7 Design and Analysis of 36-Run Experiments at Two and Three Levels, 334

7.8 $r^{k-p}$ Fractional Factorial Designs for any Prime Number r, 341

7.8.1 25-Run Fractional Factorial Designs at Five Levels, 342

7.8.2 49-Run Fractional Factorial Designs at Seven Levels, 345

7.8.3 General Construction, 345

*7.9 Related Factors: Method of Sliding Levels, Nested Effects Analysis, and Response Surface Modeling, 346

7.9.1 Nested Effects Modeling, 348

7.9.2 Analysis of Light Bulb Experiment, 350

7.9.3 Response Surface Modeling, 353

7.9.4 Symmetric and Asymmetric Relationships Between Related Factors, 355

7.10 Practical Summary, 356

Exercises, 357

Appendix 7A: Tables of $2^m 4^1$ Minimum Aberration Designs, 364

Appendix 7B: Tables of $2^m 4^2$ Minimum Aberration Designs, 366

Appendix 7C: OA(25, $5^6$), 368

Appendix 7D: OA(49, $7^8$), 368

References, 370

8 Nonregular Designs: Construction and Properties 371

8.1 Two Experiments: Weld-Repaired Castings and Blood Glucose Testing, 371

8.2 Some Advantages of Nonregular Designs Over the $2^{k-p}$ and $3^{k-p}$ Series of Designs, 373

8.3 A Lemma on Orthogonal Arrays, 374

8.4 Plackett-Burman Designs and Hall's Designs, 375

8.5 A Collection of Useful Mixed-Level Orthogonal Arrays, 379

*8.6 Construction of Mixed-Level Orthogonal Arrays Based on

Appendix 8D: Some Useful Difference Matrices, 416

Appendix 8E: Some Useful Orthogonal Main-Effect Plans, 418

References, 419

9.1 Partial Aliasing of Effects and the Alias Matrix, 421

9.2 Traditional Analysis Strategy: Screening Design and Main Effect Analysis, 424

9.3 Simplification of Complex Aliasing via Effect Sparsity, 424

9.4 An Analysis Strategy for Designs with Complex Aliasing, 426

9.4.1 Some Limitations, 432

*9.5 A Bayesian Variable Selection Strategy for Designs with Complex Aliasing, 433

9.5.1 Bayesian Model Priors, 435

9.5.2 Gibbs Sampling, 437

9.5.3 Choice of Prior Tuning Constants, 438

9.5.4 Blood Glucose Experiment Revisited, 439

10 Response Surface Methodology 459

10.1 A Ranitidine Separation Experiment, 459

10.2 Sequential Nature of Response Surface Methodology, 461

10.3 From First-Order Experiments to Second-Order Experiments: Steepest Ascent Search and Rectangular Grid Search, 464

10.3.1 Curvature Check, 465

10.3.2 Steepest Ascent Search, 466

10.3.3 Rectangular Grid Search, 470

10.4 Analysis of Second-Order Response Surfaces, 473

10.4.1 Ridge Systems, 475

10.5 Analysis of the Ranitidine Experiment, 477

10.6 Analysis Strategies for Multiple Responses II: Contour Plots and the Use of Desirability Functions, 481

10.7 Central Composite Designs, 484

10.8 Box-Behnken Designs and Uniform Shell Designs, 489

10.9 Practical Summary, 492

Exercises, 494

Appendix 10A: Table of Central Composite Designs, 505

Appendix 10B: Table of Box-Behnken Designs, 507

Appendix 10C: Table of Uniform Shell Designs, 508

References, 509

11 Introduction to Robust Parameter Design

11.1 A Robust Parameter Design Perspective of the Layer Growth and Leaf Spring Experiments, 511

11.1.1 Layer Growth Experiment Revisited, 511

11.1.2 Leaf Spring Experiment Revisited, 512

11.2 Strategies for Reducing Variation, 514

11.3 Noise (Hard-to-Control) Factors, 516

11.4 Variation Reduction Through Robust Parameter Design, 518

11.5 Experimentation and Modeling Strategies I:

11.8.1 Compound Noise Factor, 542

11.9 Signal-to-Noise Ratio and Its Limitations for Parameter Design Optimization, 543

11.9.1 SN Ratio Analysis of Layer Growth Experiment, 546

12.1 An Injection Molding Experiment, 563

12.2 Signal-Response Systems and Their Classification, 565

12.2.1 Calibration of Measurement Systems, 570

12.3 Performance Measures for Parameter Design Optimization, 571

12.4 Modeling and Analysis Strategies, 575

12.5 Analysis of the Injection Molding Experiment, 577

13.1 Experiments with Failure Time Data, 599

13.1.1 Light Experiment, 599

13.1.2 Thermostat Experiment, 600

13.1.3 Drill Bit Experiment, 600

13.2 Regression Model for Failure Time Data, 604

13.3 A Likelihood Approach for Handling Failure Time Data with Censoring, 605

13.3.1 Estimability Problem with MLEs, 608

13.4 Design-Dependent Model Selection Strategies, 609

13.5 A Bayesian Approach to Estimation and Model Selection for Failure Time Data, 610

13.6 Analysis of Reliability Experiments with Failure Time Data, 613

13.6.1 Analysis of Light Experiment, 613

13.6.2 Analysis of Thermostat Experiment, 614

13.6.3 Analysis of Drill Bit Experiment, 615

13.7 Other Types of Reliability Data, 617

13.8 Practical Summary, 618

Exercises, 619

References, 623

14 Analysis of Experiments with Nonnormal Data 625

14.1 A Wave Soldering Experiment with Count Data, 625

14.2 Generalized Linear Models, 627

14.2.1 The Distribution of the Response, 627

14.2.2 The Form of the Systematic Effects, 629

14.2.3 GLM versus Transforming the Response, 630

14.3 Likelihood-Based Analysis of Generalized Linear Models, 631

14.4 Likelihood-Based Analysis of the Wave Soldering Experiment, 634

14.5 Bayesian Analysis of Generalized Linear Models, 635

14.6 Bayesian Analysis of the Wave Soldering Experiment, 637

14.7 Other Uses and Extensions of Generalized Linear Models and Regression Models for Nonnormal Data, 639

*14.8 Modeling and Analysis for Ordinal Data, 639

14.8.1 The Gibbs Sampler for Ordinal Data, 642

*14.9 Analysis of Foam Molding Experiment, 644

14.10 Scoring: A Simple Method for Analyzing Ordinal Data, 647

14.11 Practical Summary, 649

Exercises, 649

References, 661

Appendix A Upper Tail Probabilities of the Standard Normal Distribution, $\int_z^\infty \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$ 663

Appendix B Upper Percentiles of the t Distribution 665

Appendix C Upper Percentiles of the $\chi^2$ Distribution 667

Appendix D Upper Percentiles of the F Distribution 669

Appendix E Upper Percentiles of the Studentized Range

Appendix F Upper Percentiles of the Studentized Maximum

Appendix G Coefficients of Orthogonal Contrast Vectors 699

Preface to the Second Edition

Nearly a decade has passed since the publication of the first edition. Many instructors have used the first edition to teach master's and Ph.D. students. Based on their feedback and our own teaching experience, it became clear that we needed to revise the book to make it more accessible to a larger audience, including upper-level undergraduates. To this end, we have expanded and reorganized the early chapters in the second edition. For example, our book now provides a self-contained presentation of regression analysis (Sections 1.4-1.8), which prepares those students who have not previously taken a regression course. We have found that such a foundation is needed because most of the data analyses in this book are based on regression or regression-like models. Consequently, this additional material will make it easier to adopt the book for courses that do not have a regression analysis prerequisite.

In the early chapters, we have provided more explanation and details to clarify the calculations required in the various data analyses considered. The ideas, derivations, and data analysis illustrations are presented at a more leisurely pace than in the first edition. For example, first edition Chapter 1 has been expanded into second edition Chapters 1 and 2. Consequently, second edition Chapters 3-14 correspond to first edition Chapters 2-13. We have also reorganized second edition Chapters 3-5 in a more logical order to teach with. For example, analysis methods for location effects that focus on the mean response are presented first and separated from analysis methods for dispersion effects that consider the response variance. This allows instructors to skip the material on dispersion analysis if it suits the needs of their classes. In this edition, we have also removed material that did not fit in the chapters and corrected errors in a few calculations and plots.

To aid the reader, we have marked more difficult sections and exercises by a "*"; they can be skipped unless the reader is particularly interested in the topic. Note that the starred sections and exercises are more difficult than those in the same chapter and are not necessarily more difficult than those in other chapters.

The second edition presents a number of new topics, which include:

• expected mean squares and sample size determination (Section 2.4),

• one-way ANOVA with random effects (Section 2.5),

• two-way ANOVA with random effects (Section 3.4) with application to measurement system assessment,

• split-plot designs (Section 3.9),

• ANOVA treatment of factorial effects (Section 4.5) to bridge the main analysis method of Chapters 1-3 with the factorial-effect-based analysis method in Chapter 4,

• a response surface modeling method for related factors (Section 7.9.3), which allows expanded prediction capability for two related factors that are both quantitative,

• more details on the Method II frequentist analysis method for analyzing experiments with complex aliasing (Section 9.4),

• more discussion of the use of compound noise factors in robust parameter design (Section 11.8.1),

• more discussion and illustration of the Bayesian approach to analyzing GLMs and other models for nonnormal data (Sections 14.5-14.6).

In addition, ANOVA derivations are given in Section 3.8 on balanced incomplete block designs, and more details are given on optimal designs in Section 5.4.2. In this edition, we have also rewritten extensively, updated the references throughout the book, and have sparingly pointed the reader to some recent and important papers in the literature on various topics in the later chapters. All data sets, sample lecture notes, and a sample syllabus can be accessed on the book's FTP site:

ftp://ftp.wiley.com/public/sci_tech_med/experiments-planning/

Solutions to selected exercises are available to instructors from the authors.

The preparation of this edition has benefited from the comments and assistance of many colleagues and former students, including Nagesh Adiga, Derek Bingham, Ying Hung, V. Roshan Joseph, Lulu Kang, Rahul Mukerjee, Peter Z. Qian, Matthias Tan, Huizhi Xie, Kenny Qian Ye, and Yu Zhu. Tirthankar Dasgupta played a major role in the preparation and writing of new sections in the early chapters; Xinwei Deng provided meticulous support throughout the preparation of the manuscript. We are grateful to all of them. Without their support and interest, this revision could not have been completed.

Atlanta, Georgia

Los Alamos, New Mexico

June 2009

C. F. JEFF WU

MICHAEL S. HAMADA

Preface to the First Edition

(Note that the chapter numbering used below refers to first edition chapters.)

Statistical experimental design and analysis is an indispensable tool for experimenters and one of the core topics in a statistics curriculum. Because of its importance in the development of modern statistics, many textbooks and several classics have been written on the subject, including the influential 1978 book Statistics for Experimenters by Box, Hunter, and Hunter. Many new methodological developments have appeared since 1978 and thus are not covered in standard texts. The writing of this book was motivated in part by the desire to make these modern ideas and methods accessible to a larger readership in a reader friendly fashion.

Among the new methodologies, robust parameter design stands out as an innovative statistical/engineering approach to off-line quality and productivity improvement. It attempts to improve a process or product by making it less sensitive to noise variation through statistically designed experiments. Another important development in theoretical experimental design is the widespread use of the minimum aberration criterion for optimal assignment of factors to columns of a design table. This criterion is more powerful than the maximum resolution criterion for choosing fractional factorial designs. The third development is the increasing use of designs with complex aliasing in conducting economical experiments. It turns out that many of these designs can be used for the estimation of interactions, which is contrary to the prevailing practice that they be used for estimating the main effects only. The fourth development is the widespread use of Generalized Linear Models (GLMs) and Bayesian methods for analyzing nonnormal data. Many experimental responses are nonnormally distributed, such as binomial and Poisson counts as well as ordinal frequencies, or have lifetime distributions and are observed with censoring that arises in reliability and survival studies. With the advent of modern computing, these tools have been incorporated in texts on medical statistics and social statistics. They should also be made available to experimenters in science and engineering. There are also other experimental methodologies that originated more than 20 years ago but have received scant attention in standard application-oriented texts. These include mixed two- and four-level designs, the method of collapsing for generating orthogonal main-effect plans, Plackett-Burman designs, and mixed-level orthogonal arrays. The main goal of writing this book is to fill in these gaps and present a new and integrated system of experimental design and analysis, which may help in defining a new fashion of teaching and for conducting research on this subject.

The intended readership of this book includes general practitioners as well as specialists. As a textbook, it covers standard material like analysis of variance (ANOVA), two- and three-level factorial and fractional factorial designs, and response surface methodologies. For reading most of the book, the only prerequisite is an undergraduate level course on statistical methods and a basic knowledge of regression analysis. Because of the multitude of topics covered in the book, it can be used for a variety of courses. The material contained here has been taught at the Department of Statistics and the Department of Industrial and Operations Engineering at the University of Michigan to undergraduate seniors, master's, and doctoral students. To help instructors choose which material to use from the book, a separate "Suggestions of Topics for Instructors" follows this preface.

Some highlights and new material in the book are outlined as follows. Chapters 1 and 2 contain standard material on analysis of variance, one-way and multi-way layout, randomized block designs, Latin squares, balanced incomplete block designs, and analysis of covariance. Chapter 3 addresses two-level factorial designs and provides new material in Sections 3.13-3.17 on the use of formal tests of effect significance in addition to the informal tests based on normal and half-normal plots. Chapter 4, on two-level fractional factorial designs, uses the minimum aberration criterion for selecting optimal fractions and emphasizes the use of follow-up experiments to resolve the ambiguities in aliased effects. In Chapter 5, which deals with three-level designs, the linear-quadratic system and the variable selection strategy for handling and analyzing interaction effects are new. A new strategy for handling multiple responses is also presented. Most of the material in Chapter 6 on mixed two- and four-level designs and the method of sliding levels is new. Chapter 7, on nonregular designs, is the only theoretical chapter in the book. It emphasizes statistical properties and applications of the designs rather than their construction and mathematical structure. For practitioners, only the collections of tables in its appendices and some discussions in the sections on their statistical properties may be of interest. Chapter 7 paves the way for the new material in Chapter 8. Both frequentist and Bayesian analysis strategies are presented. The latter employs Gibbs sampling for efficient model search. Supersaturated designs are also briefly discussed. Chapter 9 contains a standard treatment of response surface methodologies. Chapters 10 and 11 present robust parameter design. The former deals with problems with a simple response while the latter deals with those with a signal-response relationship. The three important aspects of parameter design are considered: choice of performance measures, planning techniques, and modeling and analysis strategies. Chapter 12 is concerned with experiments for reliability improvement. Both failure time data and degradation data are considered. Chapter 13 is concerned with experiments with nonnormal responses. Several approaches to analysis are considered, including generalized linear models and Bayesian methods.

The book has some interesting features not commonly found in experimental design texts. Each of Chapters 3 to 13 starts with one or more case studies, which include the goal of the investigation, the data, the experimental plan, and the factors and their levels. It is then followed by sections devoted to the description of experimental plans (i.e., experimental designs). Required theory or methodology for the experimental designs are developed in these sections. They are followed by sections on modeling and analysis strategies. The chapter then returns to the original data, analyzes it using the strategies just outlined, and discusses the implications of the analysis results to the original case studies. The book contains more than 80 experiments, mostly based on actual case studies; of these, 30 sets are analyzed in the text and more than 50 are given in the exercises. Each chapter ends with a practical summary which provides an easy guide to the methods covered in that chapter and is particularly useful for readers who want to find a specific tool but do not have the patience to go through the whole chapter.

The book takes a novel approach to design tables. Many tables are new and based on recent research in experimental design theory and algorithms. For regular designs, only the design generators are given. Full designs can be easily generated by the readers from these generators. The collections of clear effects are given in these tables, however, because it would require some effort, especially for the less mathematically oriented readers, to derive them. The complete layouts of the orthogonal arrays are given in Chapter 8 for the convenience of the readers. With our emphasis on methodologies and applications, mathematical derivations are given sparingly. Unless the derivation itself is crucial to the understanding of the methodology, we omit it and refer to the original source.

The majority of the writing of this book was done at the University of Michigan. Most of the authors' research that is cited in the book was done at the University of Michigan with support from the National Science Foundation (1994-1999) and at the University of Waterloo (1988-1995) with support from the Natural Sciences and Engineering Research Council of Canada and the GM/NSERC Chair in Quality and Productivity. We have benefited from the comments and assistance of many colleagues and former students, including Julie Berube, Derek Bingham, Ching-Shui Cheng, Hugh Chipman, David Fenscik, Xiaoli Hou, Longcheen Huwang, Bill Meeker, Randy Sitter, Huaiqing Wu, Hongquan Xu, Qian Ye, Runchu Zhang, and Yu Zhu. Shao-Wei Cheng played a pivotal supporting role as the book was completed; Jock MacKay read the first draft of the entire book and made numerous penetrating and critical comments; Jung-Chao Wang provided invaluable assistance in the preparation of tables for Chapter 8. We are grateful to all of them. Without their efforts and interest, this book could not have been completed.

Ann Arbor, Michigan

Los Alamos, New Mexico

C. F. JEFF WU

MICHAEL S. HAMADA

Suggestions of Topics for Instructors

One term for senior and master's students in Statistics, Engineering, Physical, Life and Social Sciences (with no background in regression analysis):

Chapters 1, 2, 3 (some of 3.4, 3.8, 3.9, 3.11 can be skipped), 4 (except 4.5, 4.13, 4.14), 5; optional material from Chapters 11 (11.1-11.5, part of 11.6-11.9), 6 (6.1-6.6), 8 (8.1-8.5), 9 (9.1-9.4), 10 (10.1-10.3, 10.5, 10.7). For students with a background in regression analysis, Sections 1.4-1.8 can be skipped or briefly reviewed.

One term for a joint master's and Ph.D. course in Statistics/Biostatistics:

Chapters 1 (1.1-1.3), 2 (except 2.4), 3 (3.4 and 3.9 may be skipped), 4 (except 4.13-4.14), 5, 6 (6.7-6.8 may be skipped), 10 (10.4 and 10.8 may be skipped), 11 (11.1-11.6, part of 11.7-11.9); optional material from Chapters 7 (7.1-7.5, 7.9), 8 (8.1-8.5), 9 (except 9.5). Coverage of Chapters 1 to 3 can be accelerated for those with a background in ANOVA.

Two-term sequence for master's and Ph.D. students in Statistics/Biostatistics:

First term: Chapters 1 to 3 (can be accelerated if ANOVA is a prerequisite), Chapters 4 (4.13-4.14 may be skipped), 5, 6, 7.

Second term: Chapters 8 (the more theoretical material may be skipped), 9 (9.5 may be skipped), 10 (10.8 may be skipped), 11, 12 (12.6 may be skipped), 13 (13.5 may be skipped), 14 (14.5-14.6, 14.8-14.9 may be skipped).

One-term advanced topics course for Ph.D. students with background in introductory graduate experimental design course:

Selected topics from Chapters 5 to 14 depending on the interest and background of the students.

One-term course on theoretical experimental design for Ph.D. students in Statistics and Mathematics:

Sections 1.3, 3.6-3.9, 4.2-4.3, 4.6-4.7, 4.15, 5.2-5.6, 6.3-6.4, 6.8, 7.2-7.3, 7.7-7.8, Chapter 8, 9.6, 10.4, 10.7-10.8, 11.6-11.8, 12.6.

List of Experiments and Data Sets

Brain and Body Weight Data 38

Long Jump Data 39

Ericksen Data 41

Gasoline Consumption Data 43

Reflectance Data, Pulp Experiment 46

Strength Data, Composite Experiment 57

Adapted Muzzle Velocity Data 83

Summary Data, Airsprayer Experiment 84

Packing Machine Data 84

Blood Pressure Data 85

Residual Chlorine Readings, Sewage Experiment 88

Strength Data, Girder Experiment 91

Torque Data, Bolt Experiment 94

Sensory Data, Measurement System Assessment Study 101

Weight Loss Data, Wear Experiment 111

Wear Data, Tire Experiment 116

Water Resistance Data, Wood Experiment 122

Data, Starch Experiment 129

Design Matrix and Response Data, Drill Experiment 135

Strength Data, Original Girder Experiment 139

Strength Data, Revised Composite Experiment 141

Yield Data, Tomato Experiment 141

Worsted Yarn Data 142

Data, Resistor Experiment 144

Data, Blood Pressure Experiment 145

Throughput Data 146

Muzzle Velocity Data 147

Corrosion Resistances of Steel Bars, Steel Experiment 148

Data, Thickness Gauge Study 149

Task Efficiency Experiment 204

Design Matrix and Roughness Data, Drive Shaft Experiment 205

Metal Alloy Crack Experiment 206

Design Matrix and Free Height Data, Leaf Spring

Design Matrix and Response Data, Ultrasonic Bonding Experiment 308

Thickness Data, Paint Experiment 336

Design Matrix and Covariates, Light Bulb Experiment 351

Appearance Data, Light Bulb Experiment 352

Design Matrix and Response Data, Reel Motor

A 10-Factor 12-Run Experiment with Six Added Runs 393

Design Matrix and Response Data, Plackett-Burman Design Example Experiment 433

Supersaturated Design Matrix and Adhesion Data, Epoxy Experiment 443

Original Epoxy Experiment Based on 28-Run Plackett-Burman Design 444

Design Matrix and Lifetime Data, Heat Exchanger Experiment 448

Design Matrix, Window Forming Experiment 449

Pre-Etch Line-Width Data, Window Forming Experiment 450

Post-Etch Line-Width Data, Window Forming Experiment 451

Design Matrix and Strength Data, Ceramics Experiment 452

Design Matrix and Response Data, Wood Pulp Experiment 453

CHAPTER 10

Design Matrix and Response Data, Final Second-Order Ranitidine Experiment 479

Runs Along the First Steepest Ascent Direction 496

Central Composite Design 496

Design Matrix and Response Data, Amphetamine Experiment 497

Design Matrix and Response Data, Whiteware Experiment 498

Design Matrix and Response Data, Drill Experiment 499

Design Matrix and Response Data, Ammonia Experiment 500

Design Matrix and Response Data, TAB Laser Experiment 501

Design Matrix and Response Data, Cement Experiment 502

Design Matrix, Bulking Process Experiment 503

Cross Array and Thickness Data, Layer Growth

Control Array, Injection Molding Experiment 565

Response Data, Injection Molding Experiment 566

Design Matrix and Weight Data, Coating Experiment 590

Control Array, Drive Shaft Experiment 591

Response Data, Drive Shaft Experiment 592

Control Array, Surface Machining Experiment 593

Single Array for Signal and Noise Factors, Surface Machining Experiment 593

Response Data, Surface Machining Experiment 594


Table 12.20 Control Array and Fitted Parameters, Engine Idling

Failure Time Data, Thermostat Experiment 602

Cross Array and Failure Time Data (with Censoring Time of 3000), Drill Bit Experiment 603

Design Matrix and Failure Time Data, Ball Bearing

Table 14.14 Poppy Counts, Weed Infestation Experiment 657

Table 14.15 Larvae Counts, Larvae Control Experiment 657

Table 14.16 Unsuccessful Germination Counts (Out of 50 Seeds), Wheat Experiment 658

Table 14.17 Window Size Data, Window Forming Experiment 659

Table 14.18 Design and Max Peel Strength Data, Sealing Process Experiment 660

CHAPTER 1

Basic Concepts for Experimental Design and Introductory Regression Analysis

Some basic concepts and principles in experimental design are introduced in this chapter, including the fundamental principles of replication, randomization, and blocking. A brief and self-contained introduction to regression analysis is also included. Commonly used techniques like simple and multiple linear regression, least squares estimation, and variable selection are covered.

1.1 INTRODUCTION AND HISTORICAL PERSPECTIVE

Experimentation is one of the most common activities that people engage in. It covers a wide range of applications from household activities like food preparation to technological innovation in material science, semiconductors, robotics, life science, and so on. It allows an investigator to find out what happens to the output or response when the settings of the input variables in a system are purposely changed. Statistical or often simple graphical analysis can then be used to study the relationship between the input and output values. A better understanding of how the input variables affect the performance of a system can thereby be achieved. This gain in knowledge provides a basis for selecting optimum input settings. Experimental design is a body of knowledge and techniques that enables an investigator to conduct better experiments, analyze data efficiently, and make the connections between the conclusions from the analysis and the original objectives of the investigation.

Experimentation is used to understand and/or improve a system. A system can be a product or process. A product can be one developed in engineering, biology, or the physical sciences. A process can be a manufacturing process, a process that describes a physical phenomenon, or a nonphysical process such as those found in service or administration. Although most examples in the book are from engineering or the physical and biological sciences, the methods can also be applied to other disciplines, such as business, medicine, and psychology. For example, in studying the efficiency and cost of a payroll operation, the entire payroll operation can be viewed as a process with key input variables such as the number of supervisors, the number of clerks, method of bank deposit, level of automation, administrative structure, and so on. A computer simulation model can then be used to study the effects of changing these input variables on cost and efficiency.

Modern experimental design dates back to the pioneering work of the great statistician R. A. Fisher in the 1930s at the Rothamsted Agricultural Experimental Station in the United Kingdom. Fisher's work and the notable contributions by F. Yates and D. J. Finney were motivated by problems in agriculture and biology. Because of the nature of agricultural experiments, they tend to be large in scale, take a long time to complete, and must cope with variations in the field. Such considerations led to the development of blocking, randomization, replication, orthogonality, and the use of analysis of variance and fractional factorial designs. The theory of combinatorial designs, to which R. C. Bose has made fundamental contributions, was also stimulated by problems in block designs and fractional factorial designs. The work in this era also found applications in social science research and in the textile and woolen industries.

The next era of rapid development came soon after World War II. In attempting to apply previous techniques to solve problems in the chemical industries, G. E. P. Box and co-workers at Imperial Chemical Industries discovered that new techniques and concepts had to be developed to cope with the unique features of process industries. The new techniques focused on process modeling and optimization rather than on treatment comparisons, which was the primary objective in agricultural experiments. The experiments in process industries tend to take less time but put a premium on run size economy because of the cost of experimentation. These time and cost factors naturally favor sequential experimentation. The same considerations led to the development of new techniques for experimental planning, notably central composite designs and optimal designs. The analysis for these designs relies more heavily on regression modeling and graphical analysis. Process optimization based on the fitted model is also emphasized. Because the choice of design is often linked to a particular model (e.g., a second-order central composite design for a second-order regression model) and the experimental region may be irregularly shaped, a flexible strategy for finding designs to suit a particular model and/or experimental region is called for. With the availability of fast computational algorithms, optimal designs (which were pioneered by J. Kiefer) have become an important part of this strategy.

The relatively recent emphasis on variation reduction has provided a new source of inspiration and techniques in experimental design. In manufacturing, the ability to make many parts with few defects is a competitive advantage. Therefore variation reduction in the quality characteristics of these parts has become a major focus of quality and productivity improvement. G. Taguchi advocated the use of robust parameter design to improve a system (i.e., a product or process) by making it less sensitive to variation, which is hard to control during normal operating or use conditions of the product or process. The input variables of a system can be divided into two broad types: control factors, whose values remain fixed once they are chosen, and noise factors, which are hard to control during normal conditions. By exploiting the interactions between the control and noise factors, one can achieve robustness by choosing control factor settings that make the system less sensitive to noise variation. This is the motivation behind the new paradigm in experimental design, namely, modeling and reduction of variation. Traditionally, when the mean and variance are both considered, variance is used to assess the variability of the sample mean as with the t test or of the treatment comparisons as with the analysis of variance. The focus on variation and the division of factors into two types led to the development of new concepts and techniques in the planning and analysis of robust parameter design experiments. The original problem formulation and some basic concepts were developed by G. Taguchi. Other basic concepts and many sound statistical techniques have been developed by statisticians since the mid-1980s.

Given this historical background, we now classify experimental problems into five broad categories according to their objectives.

1. Treatment Comparisons. The main purpose is to compare several treatments and select the best ones. For example, in the comparison of six barley varieties, are they different in terms of yield and resistance to drought? If they are indeed different, how are they different and which are the best? Examples of treatments include varieties (rice, barley, corn, etc.) in agricultural trials, sitting positions in ergonomic studies, instructional methods, machine types, suppliers, and so on.

2. Variable Screening. If there is a large number of variables in a system but only a relatively small number of them is important, a screening experiment can be conducted to identify the important variables. Such an experiment tends to be economical in that it has few degrees of freedom left for estimating error variance and higher-order terms like quadratic effects or interactions. Once the important variables are identified, a follow-up experiment can be conducted to study their effects more thoroughly. This latter phase of the study falls into the category discussed next.

3. Response Surface Exploration. Once a smaller number of variables is identified as important, their effects on the response need to be explored. The relationship between the response and these variables is sometimes referred to as a response surface. Usually the experiment is based on a design that allows the linear and quadratic effects of the variables and some of the interactions between the variables to be estimated. This experiment tends to be larger (relative to the number of variables under study) than the screening experiment. Both parametric and semiparametric models may be considered. The latter is more computer-intensive but also more flexible in model fitting.

4. System Optimization. In many investigations, interest lies in the optimization of the system. For example, the throughput of an assembly plant or the yield of a chemical process is to be maximized; the amount of scrap or number of reworked pieces in a stamping operation is to be minimized; or the time required to process a travel claim reimbursement is to be reduced. If a response surface has been identified, it can be used for optimization. For the purpose of finding an optimum, it is, however, not necessary to map out the whole surface as in a response surface exploration. An intelligent sequential strategy can quickly move the experiment to a region containing the optimum settings of the variables. Only within this region is a thorough exploration of the response surface warranted.

5. System Robustness. Besides optimizing the response, it is important in quality improvement to make the system robust against noise (i.e., hard-to-control) variation. This is often achieved by choosing control factor settings at which the system is less sensitive to noise variation. Even though the noise variation is hard to control in normal conditions, it needs to be systematically varied during experimentation. The response in the statistical analysis is often the variance (or its transformation) among the noise replicates for a given control factor setting.

1.2 A SYSTEMATIC APPROACH TO THE PLANNING AND IMPLEMENTATION OF EXPERIMENTS

In this section, we provide some guidelines on the planning and implementation of experiments. The following seven-step procedure summarizes the important steps that the experimenter must address.

1. State Objective. The objective of the experiment needs to be clearly stated. All stakeholders should provide input. For example, for a manufactured product, the stakeholders may include design engineers who design the product, process engineers who design the manufacturing process, line engineers who run the manufacturing process, suppliers, lineworkers, customers, marketers, and managers.

2. Choose Response. The response is the experimental outcome or observation. There may be multiple responses in an experiment. Several issues arise in choosing a response. Responses may be discrete or continuous. Discrete responses can be counts or categories, for example, binary (good, bad) or ordinal (easy, normal, hard). Continuous responses are generally preferable. For example, a continuous force measurement for opening a door is better than an ordinal (easy, normal, hard to open) judgment; the recording of a continuous characteristic is preferred to the recording of the percent that the characteristic is within its specifications. Trade-offs may need to be made. For example, an ordinal measurement of force to open a door may be preferable to delaying the experiment until a device to take continuous measurements can be developed. Most importantly, there should be a good measurement system for measuring the response. In fact, an experiment called a gauge repeatability and reproducibility (R&R) study can be performed to assess a continuous measurement system (AIAG, 1990). When there is a single measuring device, the variation due to the measurement system can be divided into two types: variation between the operators and variation within the operators. Ideally, there should be no between-operator variation and small within-operator variation. The gauge R&R study provides estimates for these two components of measurement system variation. Finally, the response should be chosen to increase understanding of mechanisms and physical laws involved in the problem. For example, in a process that is producing underweight soap bars, soap bar weight is the obvious choice for the response in an experiment to improve the underweight problem. By examining the process more closely, there are two subprocesses that have a direct bearing on soap bar weight: the mixing process that affects the soap bar density and the forming process that impacts the dimensions of the soap bars. In order to better understand the mechanism that causes the underweight problem, soap bar density and soap bar dimensions are chosen as the responses. Even though soap bar weight is not used as a response, it can be easily determined from its density and dimensions. Therefore, no information is lost in studying the density and dimensions. Such a study may reveal new information about the mixing and forming subprocesses, which can in turn lead to a better understanding of the underweight problem. Further discussions on and other examples of the choice of responses can be found in Phadke (1989) and Leon, Shoemaker, and Tsui (1993).

The chosen responses can be classified according to the stated objective. Three broad categories will be considered in this book: nominal-the-best, larger-the-better, and smaller-the-better. The first one will be addressed in Section 4.10, and the last two will be discussed in Section 6.2.

3. Choose Factors and Levels. A factor is a variable that is studied in the experiment. In order to study the effect of a factor on the response, two or more values of the factor are used. These values are referred to as levels or settings. A treatment is a combination of factor levels. When there is a single factor, its levels are the treatments. For the success of the experiment, it is crucial that potentially important factors be identified at the planning stage. There are two graphical methods for identifying potential factors. First, a flow chart of the process or system is helpful to see where the factors arise in a multistage process. In Figure 1.1, a rough sketch of a paper pulp manufacturing process is given which involves raw materials from suppliers, a chemical process to make a slurry which is passed through a mechanical process to produce the pulp. Involving all the stakeholders is invaluable in capturing an accurate description of the process or system. Second, a cause-and-effect diagram can be used to list and organize the potential factors that may impact the response. In Figure 1.2, a cause-and-effect diagram is given which lists the factors thought to affect the product quality of an injection molding process. Traditionally, the factors are organized under the headings: Man, Machine, Measurement, Material, Method, and Environment (Mother Nature for those who like M's). Because of their appearance, cause-and-effect diagrams are also called fishbone diagrams. Different characteristics of the factors need to be recognized because they can affect the choice of the experimental design. For example, a factor such as furnace temperature is hard to change. That

Figure 1.1 Flow chart, pulp manufacturing process (surviving labels: Slurry Concentration, Mechanical Phase, Refiner Plate Gap).

force. Other factors that may be hard or impossible to control are referred to as noise factors. Examples of noise factors include environmental and customer use conditions. (An in-depth discussion of noise factors will be given in Section 11.3.)

Factors may be quantitative or qualitative. Quantitative factors like temperature, time, and pressure take values over a continuous range. Qualitative factors take on a discrete number of values. Examples of qualitative factors include operation mode, supplier, position, line, and so on. Of the two types of factors, there is more freedom in choosing the levels of quantitative factors. For example, if temperature (in degrees Celsius) is in the range 100-200°C, one could choose 130°C and 160°C for two levels or 125°C, 150°C, and 175°C for three levels. If only a linear effect is expected, two levels should suffice. If curvature is expected, then three or more levels are required. In general, the levels of quantitative factors must be chosen far enough apart so that an effect can be detected but not too far so that different physical mechanisms are involved (which would make it difficult to do statistical modeling and prediction). There is less flexibility in choosing the levels of qualitative factors. Suppose there are three testing methods under comparison. All three must be included as three levels of the factor "testing method," unless the investigator is willing to postpone the study of one method so that only two methods are compared in a two-level experiment.

When there is flexibility in choosing the number of levels, the choice may depend on the availability of experimental plans for the given combination of factor levels. In choosing factors and levels, cost and practical constraints must be considered. If two levels of the factor "material" represent expensive and cheap materials, a negligible effect of material on the response will be welcomed because the cost can be drastically reduced by replacing the expensive material by the cheap alternative. Factor levels must be chosen to meet practical constraints. If a factor level combination (e.g., high temperature and long time in an oven) can potentially lead to disastrous results (e.g., burned or overbaked), it should be avoided and a different plan should be chosen.

4. Choose Experimental Plan. Use the fundamental principles discussed in Section 1.3 as well as other principles presented throughout the book. The choice of the experimental plan is crucial. A poor design may capture little information which no analysis can rescue. On the other hand, if the experiment is well planned, the results may be obvious so that no sophisticated analysis is needed.

5. Perform the Experiment. The use of a planning matrix is recommended. This matrix describes the experimental plan in terms of the actual values or settings of the factors. For example, it lists the actual levels such as 50 or 70 psi if the factor is pressure. To avoid confusion and eliminate potential problems of running the wrong combination of factor levels in a multifactor experiment, each of the treatments, such as temperature at 30°C and pressure at 70 psi, should be put on a separate piece of paper and given to the personnel performing the experiment. It is also worthwhile to perform a trial run to see if there will be difficulties in running the experiment, namely, if there are problems with setting the factors and measuring the responses. Any deviations from the planned experiment need to be recorded. For example, for hard-to-set factors, the actual values should be recorded.

6. Analyze the Data. An analysis appropriate for the design used to collect the data needs to be carried out. This includes model fitting and assessment of the model assumptions through an analysis of residuals. Many analysis methods will be presented throughout the book.

7. Draw Conclusions and Make Recommendations. Based on the data analysis, conclusions are presented which include the important factors and a model for the response in terms of the important factors. Recommended settings or levels for the important factors may also be given. The conclusions should refer back to the stated objectives of the experiment. A confirmation experiment is worthwhile, for example, to confirm the recommended settings. Recommendations for further experimentation in a follow-up experiment may also be given. For example, a follow-up experiment is needed if two models explain the experimental data equally well and one must be chosen for optimization.
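The planning matrix recommended in step 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor names and the 50°C temperature level are hypothetical (the text mentions only 30°C and 50 or 70 psi), the function name `planning_matrix` is ours, and a full factorial in randomized run order is just one possible plan.

```python
import itertools
import random

# Hypothetical factors and levels, chosen only for illustration.
# The text mentions 30 degrees C and 50 or 70 psi; 50 degrees C is assumed.
factors = {
    "temperature_C": [30, 50],
    "pressure_psi": [50, 70],
}

def planning_matrix(factors, seed=None):
    """List every factor-level combination (treatment) in actual units,
    in a randomized run order."""
    names = list(factors)
    runs = [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]
    rng = random.Random(seed)
    rng.shuffle(runs)  # randomize the order in which treatments are run
    return runs

# Each printed row could go on its own sheet for the person running that trial.
for run_number, run in enumerate(planning_matrix(factors, seed=1), start=1):
    print(run_number, run)
```

Listing the settings in actual units, rather than coded levels, is what guards against running the wrong combination.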

For further discussion on the planning of experiments, see Coleman and Montgomery (1993), Knowlton and Keppinger (1993), and Barton (1997).

1.3 FUNDAMENTAL PRINCIPLES: REPLICATION, RANDOMIZATION, AND BLOCKING

There are three fundamental principles that need to be considered in the design of an experiment: replication, randomization, and blocking. Other principles will be introduced later in the book as they arise.

An experimental unit is a generic term that refers to a basic unit such as material, animal, person, machine, or time period, to which a treatment is applied.

By replication, we mean that each treatment is applied to experimental units that are representative of the population of units to which the conclusions of the experiment will apply. It enables the estimation of the magnitude of experimental error (i.e., the error variance) against which the differences among treatments are judged. Increasing the number of replications, or replicates, decreases the variance of the treatment effect estimates and provides more power for detecting differences in treatments. A distinction needs to be made between replicates and repetitions. For example, three readings from the same experimental unit are repetitions, while the readings from three separate experimental units are replicates. The error variance from the former is less than that from the latter because repeated readings only measure the variation due to errors in reading while the latter also measures the unit-to-unit variation. Underestimation of the true error variance can result in the false declaration of an effect as significant.

The second principle is that of randomization. It should be applied to the allocation of units to treatments, the order in which the treatments are applied in performing the experiment, and the order in which the responses are measured. It provides protection against variables that are unknown to the experimenter but may impact the response. It reduces the unwanted influence of subjective judgment in treatment allocation. Moreover, randomization ensures validity of the estimate of experimental error and provides a basis for inference in analyzing the experiments. For an in-depth discussion on randomization, see Hinkelmann and Kempthorne (1994).
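The distinction between replicates and repetitions drawn above can be illustrated with a small simulation. This is a rough sketch, not an analysis from the book: the two standard deviations, the number of readings, and the function name `error_variances` are all assumptions made for illustration.

```python
import random
import statistics

def error_variances(n=2000, sigma_unit=2.0, sigma_reading=0.5, seed=0):
    """Compare the sample variance of repeated readings on one unit
    with that of single readings on separate units.

    sigma_unit and sigma_reading are assumed (hypothetical) standard
    deviations of unit-to-unit variation and of reading error.
    """
    rng = random.Random(seed)

    # Repetitions: many readings on the SAME experimental unit,
    # so only reading error varies from one value to the next.
    one_unit = rng.gauss(0.0, sigma_unit)
    repetitions = [one_unit + rng.gauss(0.0, sigma_reading) for _ in range(n)]

    # Replicates: one reading on each of n SEPARATE units,
    # so unit-to-unit variation and reading error both contribute.
    replicates = [rng.gauss(0.0, sigma_unit) + rng.gauss(0.0, sigma_reading)
                  for _ in range(n)]

    return statistics.variance(repetitions), statistics.variance(replicates)

var_repetitions, var_replicates = error_variances()
print("variance from repetitions:", round(var_repetitions, 3))
print("variance from replicates: ", round(var_replicates, 3))
```

Under these assumptions the repetitions estimate only the reading-error variance, which is why treating them as replicates would understate the true error variance.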

A prominent example of randomization is its use in clinical trials. If a physician were free to assign a treatment or control (or a new treatment versus an old treatment) to his/her patients, there might be a tendency to assign the treatment to those patients who are sicker and would not benefit from receiving a control. This would bias the outcome of the trial as it would create an imbalance between the control and treatment groups. A potentially effective treatment like a new drug may not even show up as promising if it is assigned to a larger proportion of "sick" patients. A random assignment of treatment/control to patients would prevent this from happening. Particularly commonplace is the use of the double-blind trial, in which neither the patient nor the doctor or investigator has access to the information about the actual treatment assignment. More on clinical trials can be found in Rosenberger and Lachin (2002).

A group of homogeneous units is referred to as a block. Examples of blocks include days, weeks, morning vs. afternoon, batches, lots, sets of twins, and pairs of kidneys. For blocking to be effective, the units should be arranged so that the within-block variation is much smaller than the between-block variation. By comparing the treatments within the same block, the block effects are eliminated in the comparison of the treatment effects, thereby making the experiment more efficient. For example, there may be a known day effect on the response so that if all the treatments can be applied within the same day, the day-to-day variation

in randomized block designs, to be discussed in Section 3.2.

These three principles are generally applicable to physical experiments but not to computer experiments because the same input in a computer experiment gives rise to the same output. Computer experiments (see Santner et al., 2003) are not considered in the book, however.

A simple example will be used to explain these principles. Suppose two keyboards denoted by A and B are being compared in terms of typing efficiency. Six different manuscripts denoted by 1-6 are given to the same typist. First the test is arranged in the following sequence:

Because the manuscripts can vary in length and difficulty, each manuscript is treated as a "block" with the two keyboards as two treatments. Therefore, the experiment is replicated six times (with six manuscripts) and blocking is used to compare the two keyboards with the same manuscript. The design has a serious flaw, however. After typing the manuscript on keyboard A, the typist will be familiar with the content of the manuscript when he or she is typing the same manuscript on keyboard B. This "learning effect" will unfairly help the performance of keyboard B. The observed difference between A and B is the combination of the treatment effect (which measures the intrinsic difference between A and B) and the learning effect. For the given test sequence, it is impossible to disentangle the learning effect from the treatment effect. Randomization would help reduce the unwanted influence of the learning effect, which might not have been known to the investigator who planned the study. By randomizing the typing order for each manuscript, the test sequence may appear as follows:

With four AB's and two BA's in the sequence, it is a better design than the first one. A further improvement can be made. The design is not balanced because B benefits from the learning effect in four trials while A only benefits from two trials. There is still a residual learning effect not completely eliminated by the second design. The learning effect can be completely eliminated by requiring that half of the trials have the order AB and the other half the order BA. The actual assignment of AB and BA to the six manuscripts should be done by randomization. This method is referred to as balanced randomization. Balance is a desirable design property, which will be discussed later.
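Balanced randomization as just described, with half of the manuscripts typed in the order AB and half in the order BA and the assignment made at random, might be carried out as in the sketch below. The function name `balanced_randomization` and the seed value are illustrative choices, not part of the original text.

```python
import random

def balanced_randomization(manuscripts, seed=None):
    """Assign the order AB to a random half of the manuscripts
    and the order BA to the other half."""
    if len(manuscripts) % 2 != 0:
        raise ValueError("an even number of manuscripts is needed for exact balance")
    rng = random.Random(seed)
    half = len(manuscripts) // 2
    orders = ["AB"] * half + ["BA"] * half
    rng.shuffle(orders)  # which manuscripts get AB is decided at random
    return dict(zip(manuscripts, orders))

plan = balanced_randomization([1, 2, 3, 4, 5, 6], seed=2009)
for manuscript, order in plan.items():
    print(manuscript, order)
```

Shuffling a list that already holds three AB's and three BA's guarantees the balance, whereas an independent coin toss for each manuscript would not.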

For simplicity of discussion, we have assumed that only one typist was involved in the experiment. In a practical situation, such an experiment should involve several typists that are representative of the population of typists so that the conclusions made from the study would apply more generally. This and other aspects of the typing experiment will be addressed in the exercises.

With these principles in mind, a useful addition to the cause-and-effect diagram is to indicate how the proposed experimental design addresses each listed factor. The following designations are suggested: E for an experimental factor, B for a factor handled by blocking, O for a factor held constant at one value, and R for a factor handled by randomization. This designation clearly indicates how the proposed design deals with each of the potentially important factors. The designation O, for "one value," serves to remind the experimenter that the factor is held constant during the current experiment but may be varied in a future experiment. An illustration is given in Figure 1.3 from the injection molding experiment discussed in Section 1.2.

Other designations of factors can be considered. For example, experimental factors can be further divided into two types (control factors and noise factors), as in the discussion on the choice of factors in Section 1.2. For the implementation of experiments, we may also designate an experimental factor as "hard-to-change" or "easy-to-change." These designations will be considered later as they arise.

Figure 1.3 Revised cause-and-effect diagram, injection molding experiment (surviving labels: MACHINE: injection pressure (E), injection speed (E), nozzle temperature (O); MATERIAL: pre-blend pigmentation (B)).

1.4 SIMPLE LINEAR REGRESSION


Throughout the book, we will often model experimental data by the general linear model (also called the multiple regression model). Before considering the general linear model in Section 1.6, we present here the simplest case known as the simple linear regression model, which consists of a single covariate. We use the following data to illustrate the analysis technique known as simple linear regression.

Lea (1965) discussed the relationship between mean annual temperature and a mortality index for a type of breast cancer in women. The data (shown in Table 1.1), taken from certain regions of Great Britain, Norway, and Sweden, consist of the mean annual temperature (in degrees Fahrenheit) and a mortality index for neoplasms of the female breast.

Table 1.1 Breast Cancer Mortality Data

Mortality Index (M): 102.5 104.5 100.4 95.9 87.0 95.0 88.6 89.2

Temperature (T): 51.3 49.9 50.0 49.2 48.5 47.8 47.3 45.1

Mortality Index (M): 78.9 84.6 81.7 72.2 65.1 68.1 67.3 52.5

Temperature (T): 46.3 42.1 44.2 43.5 42.3 40.2 31.8 34.0

Figure 1.4 Scatter plot of temperature versus mortality index, breast cancer example

The first step in any regression analysis is to make a scatter plot. A scatter plot of mortality index against temperature (Figure 1.4) reveals an increasing linear relationship between the two variables. Such a linear relationship between a response $y$ and a covariate $x$ can be expressed in terms of the following model:

$$y = \beta_0 + \beta_1 x + \epsilon,$$

where $\epsilon$ is the random part of the model which is assumed to be normally distributed with mean 0 and variance $\sigma^2$, that is, $\epsilon \sim N(0, \sigma^2)$; because $\epsilon$ is normally distributed, so is $y$ with mean $E(y) = \beta_0 + \beta_1 x$ and $\mathrm{Var}(y) = \sigma^2$.

If $N$ observations are collected in an experiment, the model for them takes the form

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, \ldots, N, \tag{1.1}$$

where $y_i$ is the $i$th value of the response and $x_i$ is the corresponding value of the covariate.

The unknown parameters in the model are the regression coefficients $\beta_0$ and $\beta_1$ and the error variance $\sigma^2$. Thus, the purpose for collecting the data is to estimate and make inferences about these parameters. For estimating $\beta_0$ and $\beta_1$, the least squares criterion is used; that is, the least squares estimators (LSEs), denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$, respectively, minimize the following quantity:

$$\sum_{i=1}^{N} \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2$$
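For the simple linear regression model (1.1), the sum of squared deviations of the $y_i$ from $\beta_0 + \beta_1 x_i$ has a well-known closed-form minimizer: the slope estimate is the ratio of the corrected cross-product sum to the corrected sum of squares of $x$, and the intercept estimate follows from the two sample means. The sketch below applies these standard formulas to the Table 1.1 data in plain Python. The function and variable names are ours, and the code is an illustration rather than the book's own computation.

```python
# Table 1.1 data: mortality index (response y) and temperature (covariate x).
mortality = [102.5, 104.5, 100.4, 95.9, 87.0, 95.0, 88.6, 89.2,
             78.9, 84.6, 81.7, 72.2, 65.1, 68.1, 67.3, 52.5]
temperature = [51.3, 49.9, 50.0, 49.2, 48.5, 47.8, 47.3, 45.1,
               46.3, 42.1, 44.2, 43.5, 42.3, 40.2, 31.8, 34.0]

def least_squares(x, y):
    """Return (b0, b1), the intercept and slope minimizing
    sum((y_i - (b0 + b1 * x_i)) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)                        # corrected sum of squares of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # corrected cross products
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares(temperature, mortality)

# Residuals from the fitted line; with an intercept in the model they sum to zero.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(temperature, mortality)]

print("intercept:", round(b0, 2))
print("slope:    ", round(b1, 2))
```

A positive slope here agrees with the increasing relationship seen in the scatter plot.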
