4.2 Software FMEA
Failure mode and effects analysis (FMEA) is a well-known analysis method with an established position in traditional reliability analysis. The purpose of FMEA is to identify, up front, the possible failure modes of the system components, evaluate their influence on system behaviour, and propose proper countermeasures to suppress these effects. A failure mode and effects analysis (FMEA) can be described as a systematic way of identifying the failure modes of a system, item or function, and evaluating the effects of those failure modes on the higher level. A bottom-up technique such as FMEA is an effective way to identify component failures or system malfunctions, and to “design rightly” the system under consideration (Pentti & Atte, 2002).
The standard FMEA guidelines cannot be used directly; they have to be tailored for application to software. Typically, the best definition of “Severity” is the one the software teams already use for classifying their problem reports. Similarly, for “Occurrence” and “Detection” it is better that the teams use their own tailored guidelines, based on a simple scale from “Very High” to “Very Low”.
By nature, software failure modes are generally unknown (“software modules do not fail in the literal sense that hardware fails; they only display incorrect behaviour”) and depend on the dynamic behaviour of the application. The aim of the FMEA is then to uncover those situations.
The following are certain alerts, pitfalls and learnings to be aware of when doing software FMEA:
1) Use-case explosion. Software, by its very nature, has many permutations and combinations of inputs and outputs that could be prone to failure, so an FMEA can soon run into thousands of use-case combinations of failure modes. Hence it is advisable to focus on failure modes associated with CTQs and with critical components, modules and functionalities.
2) Capturing unmet requirements as failure modes, e.g. “set does not record” as a failure mode for a DVD recorder. Recording is a basic requirement of a recorder, so listing it as a failure mode at a global level does not help. Instead, the failure modes should delve deeper into the features.
3) Not having the appropriate subject matter experts in the analysis. The failure modes identified depend largely on competence; knowledge of the domain (not software engineering, but rather the usage of the product in its actual environment) is crucial.
4) Attempting to perform FMEA on 100% of the design or code, instead of sampling the design/code most likely to cause a serious failure.
5) Excluding hardware from the analysis, or isolating the software from the rest of the system; many failures result from the combination, and not from software alone.
6) Typically for software, the severity “SEV” remains unchanged; it is mainly the occurrence and detection that can be improved. For example, a hang/crash during normal user operation is a severity “A” failure mode, translating to a SEV value of 8. By taking various actions, its occurrence can be reduced or eliminated, or its detectability can be improved; however, even after taking actions, the severity remains unchanged.
7) The occurrence “OCC” value can sometimes be tricky for software. In a product development environment a test is normally done on a few devices, say 5 to 10, and issues do not surface; when long-duration tests are conducted in the factory on a larger sample, say 100 devices, the product starts failing. So the OCC value can differ based on the sample taken, and has to be adapted accordingly when validating the results.
8) From a software development life-cycle perspective, the DET value can take on different values for the same detection level. For example, a control mechanism may have a high chance of detecting a failure mode, making the DET value 4 as per the guideline; however, the value may vary depending on whether that detection can happen in design itself or only in testing. The team might give a higher DET value to something that can be detected only in testing, as against something that can be detected in design.
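To make the interplay of the three ratings concrete, the sketch below computes a conventional Risk Priority Number (RPN = SEV × OCC × DET) before and after corrective actions. The chapter does not prescribe this formula, and the ratings are purely illustrative; the sketch only shows how actions reduce OCC and DET while SEV stays fixed.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    sev: int  # severity rating (stays fixed: actions do not change severity)
    occ: int  # occurrence rating (can be reduced by corrective actions)
    det: int  # detection rating (can be improved by better controls)

    def rpn(self) -> int:
        # Conventional Risk Priority Number: SEV x OCC x DET
        return self.sev * self.occ * self.det

# Illustrative ratings for a hang/crash in normal user operation (SEV = 8)
before = FailureMode("hang/crash in normal use", sev=8, occ=6, det=7)
after  = FailureMode("hang/crash in normal use", sev=8, occ=2, det=3)

print(before.rpn(), "->", after.rpn())  # 336 -> 48: risk falls, severity does not
```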
4.3 Use of Statistics in software
This is often one of the most important challenges when it comes to using concepts like DfSS for software. Many software requirements fall into the Yes/No, Pass/Fail category, so limit setting is fuzzy; most of them become critical factors (CFs) rather than CTQs in the “continuous data” sense.
Predicting DPMO (defects per million opportunities) can be misleading, because the specification limits in cases like responsiveness are soft targets. Just because start-up takes 0.5 seconds more than the Upper Specification Limit does not necessarily make the product defective; yet in Six Sigma terms, anything beyond the upper specification limit or below the lower specification limit counts as a defect.
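For reference, the DPMO arithmetic the text cautions about is straightforward; a minimal sketch with hypothetical numbers:

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000

# Hypothetical: 30 of 500 start-ups exceed the Upper Specification Limit,
# counting one opportunity (the start-up requirement) per run.
print(dpmo(defects=30, units=500, opportunities_per_unit=1))  # 60000.0
```

Whether those 30 runs are true defects is exactly the judgment call discussed above, since the limit is a soft target.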
Random failures due to software alone are rare, so a concept like Mean Time Between Failures (MTBF) for software on its own is questionable; it does, however, make sense at the overall product level.
There is no concept of samples: the same piece of code is corrected and reused, so advanced statistical concepts have to be applied with discretion.
However, this does not mean that statistical concepts cannot be applied at all.
The starting point is to challenge each specification to see whether numbers can be associated with it. Even abstract elements such as “Usability” can be measured, as seen in section 3.5.2.
For many of the software CTQs, the upper and lower limits may not be hard targets; nevertheless, it is good to use them as such and relax them during the course of development.
The change in Z-scores over the releases is more meaningful than the absolute Z-scores.
All statistical concepts can be applied to the “continuous CTQs”.
Many designs of experiments in software involve discrete Xs, owing to the nature of software. So the purpose of running them is often not to generate a transfer function, but to understand which “Xs” impact the Y the most, i.e. the cause and effect; main effects plots and interaction plots therefore have high utility in such scenarios.
Hypothesis tests such as t-tests, F-tests and ANOVA are useful in the Verify and Monitor phases to determine whether there have indeed been statistically significant changes over the life cycle, or from one product generation to the next.
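As a sketch of such a check, the following compares a hypothetical continuous CTQ (start-up time, in seconds) across two releases with a two-sample t-test; the data are invented for illustration:

```python
from scipy import stats

release_1 = [5.8, 6.1, 5.9, 6.3, 6.0, 6.2, 5.7, 6.1]  # hypothetical times (s)
release_2 = [5.4, 5.6, 5.5, 5.8, 5.3, 5.7, 5.5, 5.6]

t_stat, p_value = stats.ttest_ind(release_1, release_2)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. < 0.05) suggests a statistically significant change
# between releases; otherwise the observed difference may be noise.
```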
Statistical capability analysis, to understand the variation in many of the CTQs in simulated environments as well as on actual hardware, can be a good starting point for designing robustness into the software system.
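Such a capability analysis can likewise be sketched with the standard Cp/Cpk indices, assuming approximate normality of the data; the specification limits and values below are hypothetical:

```python
import statistics

def capability(data, lsl, usl):
    """Cp = (USL - LSL) / 6s; Cpk = min(USL - mean, mean - LSL) / 3s."""
    mu, s = statistics.mean(data), statistics.stdev(data)
    return (usl - lsl) / (6 * s), min(usl - mu, mu - lsl) / (3 * s)

times = [5.8, 6.1, 5.9, 6.3, 6.0, 6.2, 5.7, 6.1, 5.5, 6.4]  # hypothetical CTQ
cp, cpk = capability(times, lsl=4.0, usl=7.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```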
5 References
Ajit Ashok Shenvi (2008). Design for Six Sigma: Software Product Quality. Proceedings of the 1st India Software Engineering Conference (ISEC '08), Hyderabad, India, February 19-22, 2008, pp. 97-106. ACM, New York, NY. ISBN 978-1-59593-917-3. DOI: http://doi.acm.org/10.1145/1342211.1342231
Haapanen Pentti & Helminen Atte (2002). Failure Modes and Effects Analysis of Software-Based Automation Systems. STUK-YTO-TR 190, August 2002.
Jeannine M. Siviy & Eileen C. Forrester (2004). Accelerating CMMI Adoption Using Six Sigma. Carnegie Mellon Software Engineering Institute.
Jeannine M. Siviy (SEI) & Dave Halowell (Six Sigma Advantage) (2005). Bridging the Gap between CMMI & Six Sigma (training material). Carnegie Mellon Software Engineering Institute.
Jiantao Pan (1999). Software Reliability. Carnegie Mellon University. http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
Minitab statistical tool. http://www.minitab.com
Philips DFSS training material for Philips (2005). SigMax Solutions LLC, USA.
Statistical Process Control for Software: Fill the Gap
Maria Teresa Baldassarre, Nicola Boffoli and Danilo Caivano
University of Bari
Italy
1 Introduction
The characteristic of software processes, unlike manufacturing ones, is that they have a very high human-centred component and are primarily based on cognitive activities. As such, each time a software process is executed, inputs and outputs may vary, as may the process performances. This phenomenon is identified in the literature as “Process Diversity” (IEEE, 2000). Given the characteristics of a software process, its intrinsic diversity makes it difficult to predict, monitor and improve, unlike what happens in other contexts. In spite of this, Software Process Improvement (SPI) is a very important activity that cannot be neglected. To face these problems, the software engineering community stresses the use of measurement-based approaches such as QIP/GQM (Basili et al., 1994) and time series analysis: the first approach is usually used to determine what improvement is needed, while time series analysis is adopted to monitor process performances. It thus supports decision making about when the process should be improved, and provides a means to verify the effectiveness of the improvement itself.
A technique for time series analysis, well established in the literature, which has given insightful results in manufacturing contexts, although not yet in software process ones, is known as Statistical Process Control (SPC) (Shewhart, 1980; Shewhart, 1986). The technique was originally developed by Shewhart in the 1920s and then used in many other contexts. The basic idea it relies on is the use of so-called “control charts”, together with their indicators, called run tests, to establish operational limits for acceptable process variation, and to monitor and evaluate the evolution of process performances in time. In general, process performance variations are mainly due to two types of causes, classified as follows:
• Common cause variations: the result of normal interactions of people, machines, environment, techniques used and so on.
• Assignable cause variations: arising from events that are not part of the process and that make it unstable.
In this sense, the statistically based approach, SPC, helps determine whether a process is stable or not by discriminating between common cause variation and assignable cause variation. We can classify a process as “stable” or “under control” if only common causes occur. More precisely, in SPC, data points representing measures of process performances are collected.
These values are then compared to the values of central tendency and to the upper and lower limits of admissible performance variation.
While SPC is a well-established technique in manufacturing contexts, there are only a few works in the literature (Card, 1994; Florac et al., 2000; Weller, 2000(a); Weller, 2000(b); Florence, 2001; Sargut & Demirors, 2006; Weller & Card, 2008; Raczynski & Curtis, 2008) that present successful outcomes of adopting SPC for software. Not only are there few cases of successful application, but they do not clearly illustrate the meaning of control charts and related indicators in the context of software process application.
Given the above considerations, the aim of this work is to generalize and bring together the experiences collected by the authors in previous studies on the use of Statistical Process Control in the software context (Baldassarre et al., 2004; Baldassarre et al., 2005; Caivano, 2005; Boffoli, 2006; Baldassarre et al., 2008; Baldassarre et al., 2009), and to present the resulting stepwise approach that: starting from the stability tests known in the literature, selects the most suitable ones for software processes (tests set); reinterprets them from a software process perspective (tests interpretation); and suggests a recalculation strategy for tuning the SPC control limits.
The paper is organized as follows: section 2 briefly presents SPC concepts and their peculiarities; section 3 discusses the main differences and gaps of SPC for software and presents the approach proposed by the authors; finally, in section 4, conclusions are drawn.
2 Statistical Process Control: Pills
Statistical Process Control (SPC) (Shewhart, 1980; Shewhart, 1986) is a technique for time series analysis. It was developed by Shewhart in the 1920s and then used in many contexts. It uses several “control charts”, together with their indicators, to establish operational limits for acceptable process variation. Using few data points, it is able to dynamically determine upper and lower control limits of acceptable process performance variability. This peculiarity makes SPC a suitable instrument for detecting process performance variations. Process performance variations are mainly due to: common cause variations (the result of normal interactions of people, machines, environment, techniques used and so on); and assignable cause variations (which arise from events that are not part of the process and make it unstable). A process can be described by measurable characteristics that vary in time due to common or assignable cause variations. If the variation in process performances is only due to common causes, the process is said to be stable and its behaviour is predictable within a certain error range; otherwise, an assignable cause (external to the process) is assumed to be present and the process is considered unstable. A control chart usually adopts an indicator of the central tendency of the process performances (CL), an upper control limit (UCL = CL + 3sigma) and a lower control limit (LCL = CL - 3sigma). Process performances are tracked over time on a control chart, and if one or more values fall outside these limits, or exhibit “non random” behaviour, an assignable cause is assumed to be present.
Fig 1 Example of SPC charts (X charts)
“Sigma” is calculated using a set of factors tabulated by statisticians (for more details refer to (Wheeler & Chambers, 1992)); it is based on statistical reasoning, on simulations carried out, and on the heuristic experience that “it works”. A good theoretical model for a control chart is the normal distribution shown in figure 2, where the percentage values reported express the percentage of observations that fall in the corresponding area, μ is the theoretical mean, and σ is the theoretical standard deviation. In the [μ-3σ, μ+3σ] interval fall 99.73% (i.e. 2.14 + 13.59 + 34.13 + 34.13 + 13.59 + 2.14) of the total observations; thus only 0.27% of the observations are admissible to fall outside the [μ-3σ, μ+3σ] interval.
Fig 2 Normal distribution, the bell curve
If we consider CL in place of μ and sigma in place of σ, the meaning and rationale behind a control chart become clear. For completeness, it is necessary to say that the normal distribution is only a good theoretical model; however, the simulations carried out have shown that, independently of the data distribution, the following rules of thumb work:
Rule 1: from 60% to 75% of the observations fall within [CL-1sigma, CL+1sigma]
Rule 2: from 90% to 98% of the observations fall within [CL-2sigma, CL+2sigma]
Rule 3: from 99% to 100% of the observations fall within [CL-3sigma, CL+3sigma]
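For the normal model itself, these coverages can be verified directly; a quick sketch:

```python
from scipy.stats import norm

for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)  # mass within ±k sigma of the mean
    print(f"within ±{k} sigma: {coverage:.4%}")
# 68.27%, 95.45%, 99.73% for normal data; the rules of thumb above give the
# wider ranges observed in simulations for non-normal distributions.
```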
The control limits obtained using SPC are based on observation of the process and are an expression of it. They are not the result of expert judgment and, furthermore, they can be obtained in a clear, repeatable way.
In general, control charts are used as follows: samples are taken from the process; statistics (for example, average and range) are calculated and plotted on charts; and the results are interpreted with respect to process limits or, as they are known in SPC terminology, control limits. Control limits are the limits within which the process operates under normal conditions. They tell us how far we can expect sample values to stray from the average, given the inherent variability of the process or, in SPC terms, the magnitude of common-cause variation. Data points beyond the control limits, or other unusual patterns, indicate special-cause variation.
3 SPC for Software
Software processes and manufacturing ones present deep differences that the use of SPC in software cannot avoid considering. Following the discussions in (Jalote, 2002(a); Eickelmann & Anant, 2003), we can identify three main differences between manufacturing and software processes that have to be kept in mind in order to assure a more appropriate use of SPC in the software context, in terms of control charts, run test indicators, anomaly interpretation, and control limit calculation.
Measurement of Software Processes. In manufacturing, the observed and actual numbers of defects do not differ significantly. In software development, these two numbers routinely vary significantly. Possible causes of extreme variation in software measurement include the following:
• People are the software production process.
• Software measurement might introduce more variation than the process itself.
• Size metrics do not count discrete and identical units.
Such extreme variations in software processes call for different anomaly-detection indicators and more specific interpretations.
Product Control and Product Rework. The primary focus of using SPC control charts in manufacturing is to bring the process back into control by removing assignable causes, minimizing future production losses as much as possible. In a manufacturing process, when an anomaly occurs, the products usually do not conform to the expected standards and must therefore be discarded. In a software process, on the other hand, the product can be “reworked”. For example, when using control charts for an inspection process, if a point falls outside the control limits, then besides process improvement actions such as improving the checklist, product improvement actions such as re-reviews and scheduling extra testing inevitably also occur. With software processes, besides improving the process, an important objective of using control charts is also to control the product. In (Gardiner & Montgomery, 1987), which is perhaps the first paper on the use of SPC in software, Gardiner and Montgomery suggest "rework" as one of the three actions that management should carry out if a point falls outside the control limits. The use described in (Ebenau, 1994) clearly shows this aspect of product control. The survey of high maturity organizations also indicates that project managers use control charts for project-level control (Jalote, 2002(b)). Because of this product control, project managers are more likely to want test indicators and interpretations that highlight potential warning signals, rather than risk missing such signals, even if this means more false alarms.
Shutdown and Startup are “Cheaper”. The cost parameters that affect the selection of control limits are likely to be quite different in software processes. For example, if a manufacturing process has to be stopped (perhaps because a point falls outside the control limits), the cost of doing so can be quite high. In software, on the other hand, the cost of stopping a process is minimal, as elaborate "shutdown" and "startup" activities are not needed. Similarly, the cost of evaluating a point that falls outside the control limits is likely to be very different in software processes compared to manufacturing ones. For these reasons, the control limits can be recalculated more often than in manufacturing processes.
Due to these differences, it is reasonable to assume that, to get the best results, the control charts, the use of the indicators and their interpretation, as well as the tuning of process limits, need to be adapted to take into account the characteristics of software processes. Finally, in spite of the rather simple concepts underlying statistical process control, it is rarely straightforward to implement (Card, 1994). The main gaps for software processes are listed below:
Focus on individual or small events. The indicators generally used in SPC highlight assignable causes related to individual events. However, the high variability of a software process and its predominant human factor make such indicators ineffective, because they usually discover occasional variations due to passing phenomena, which should be managed as false positives (false alarms). Therefore, in software processes, the SPC indicators should detect the assignable variations and then also interpret them, distinguishing occasional variations (false positives) from actual changes in the process (in manufacturing processes, passing phenomena are very rare). For such reasons, the control charts should be constructed with a view toward detecting process trends rather than identifying individual nonconforming events (Figure 3).
Fig 3 SPC variations tree
Failure to investigate and act. Statistical process control only signals that a problem may exist; if you do not follow through with a detailed investigation, such as an audit, and follow-up corrective action, there is no benefit in using it. In this sense, a larger set of anomaly indicators and a more precise interpretation of anomalies are necessary.
Incorrect computation of control limits. Several formulas exist for computing control limits and analyzing distributions in different situations. Although they are straightforward, without the proper background it is easy to make mistakes. Such mistakes might concern:
• the correct calculation of control limits;
• the appropriate timing for the recalculation of control limits (“tuning” activities).
In order to mitigate such differences and face these issues, the authors have previously proposed and experimented with an SPC framework for software processes (Baldassarre et al., 2007). The framework, based on the peculiarities of software processes, proposes the most appropriate control charts, a set of indicators (run-test set) and related interpretations (run-test interpretation) in order to effectively monitor process variability. When such indicators are used, SPC is able to discover software process variations and discriminate between them. For these reasons such indicators:
• are able to detect process trends rather than identify individual nonconforming events (i.e. occasional variations that in software processes would be considered false alarms);
• make it possible to discover assignable variations and to provide quality information about “what happens” in the process, thereby supporting the manager during cause-investigation activities.
Furthermore, the framework addresses the problems related to incorrect computation of control limits and proposes “when” and “how” to recalculate the SPC control limits (the “tuning” activities), supporting the manager in:
• choosing the control charts and the measurement object to use in the SPC analysis;
• selecting the appropriate data points, building the reference set and calculating the control limits needed for monitoring process variations;
• monitoring the process variations and detecting run-test failures;
• evaluating the assignable events that have occurred and then undertaking the appropriate actions (for example, recalculating the control limits).
Figure 4 summarizes the steps for applying the framework. First, process characterization is carried out, i.e. a process characteristic to monitor is observed over time and the related data points are collected; the appropriate control chart is selected and the upper and lower control limits are calculated (Step 1). Secondly, anomaly detection occurs: each new data point observed is plotted on the chart, keeping the control limits and central line the same; the set of run tests (RT1…RT8) is executed, and an anomaly is detected each time a test fails (Step 2). At this point, cause investigation is carried out, i.e. the cause of the anomaly pointed out is investigated in order to provide an interpretation (Step 3). Finally, according to the process changes that occurred and were identified in the previous step, appropriate tuning actions are applied to tune the sensitivity of the monitoring activity and adapt it to the new process performances (Step 4).
Fig 4 SPC based Process Monitoring guidelines
3.1 Process Characterization
A reference set must be determined in order to characterize a process, i.e. a set of observations that represent the process performances and do not suffer from exceptional causes. In short, the reference set provides a reference point against which to compare future performances. After determining the reference set, each following observation must be traced on the control chart obtained, and the set of tests included in the test set must be carried out in order to identify whether any exceptional causes arise. More precisely, the following two steps are executed:
• Identify the measurement object
• Identify the reference set
Identify the measurement object. The process to evaluate is identified, along with the measurement characteristics that describe the performances of interest. The most appropriate control charts for the phenomena being observed are selected. There are charts for variables data (measurement data such as length, width, thickness and moisture content) and charts for attributes data (“counts” data such as the number of defective units in a sample).
Fig 5 Decision Tree for Control Chart Selection
In software processes, where data points are not so frequent, each data point is generally plotted and evaluated individually. Hence, charts that work on single observation points (like the XmR or the U charts) are more suitable for software (Gardiner & Montgomery, 1987; Weller, 2000(a); Zultner, 1999) and are the most commonly used charts, as reported in the survey (Radice, 2000). In manufacturing, on the other hand, the Xbar-R charts, which employ a sampling-based technique, are most commonly used; consequently, modeling and analysis for selecting control limits with optimal performance have also focused on Xbar-R charts.
Identify the Reference Set. Identifying the “reference set” is a mandatory activity for correctly monitoring and evaluating the evolution of process performances in time. It consists of a set of observations of the measurement characteristics of interest. The set expresses the “normal” process behaviour, i.e. the process performances under the assumption that variations are determined only by common causes. Thus, first, process performances in time must be measured, and the CL and control limits must be calculated. The observations collected are then traced on the control charts, and the tests included in the test set are carried out. If no anomalies are detected, the process can be considered stable during the observation period, and the observations collected, along with the CL and control limit values, become the reference set. If one of the tests points out anomalies, then the process is not stable and must be investigated further. The exceptional causes, if present, need to be eliminated from the process, and the CL and control limits must be recalculated. This is repeated until a period of observed data points indicates a stable process, i.e. until a new reference set can be determined.
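A minimal sketch of this iterative procedure, using only the out-of-limits test (RT1) for brevity; the framework itself runs the full run-test set described in section 3.2:

```python
import statistics

def xmr_limits(xs):
    """X-chart centre line and 3-sigma limits estimated from moving ranges."""
    mr_bar = statistics.mean(abs(a - b) for a, b in zip(xs[1:], xs))
    cl = statistics.mean(xs)
    return cl, cl + 2.660 * mr_bar, cl - 2.660 * mr_bar

def build_reference_set(xs):
    """Recalculate limits until no point falls outside them (stable process)."""
    while True:
        cl, ucl, lcl = xmr_limits(xs)
        stable = [x for x in xs if lcl <= x <= ucl]
        if len(stable) == len(xs):
            return xs, (cl, ucl, lcl)  # reference set plus its control limits
        xs = stable  # exceptional causes investigated and removed, then recompute
```

In practice, of course, each out-of-limit point is investigated before being discarded, as the text above requires; the loop only illustrates the recalculation cycle.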
In an X chart, each point represents a single value of the measurable process characteristic under observation; CLX is calculated as the average of all the available values; UCLX and LCLX are set at 3sigmaX around the CLX, where sigmaX is the estimated standard deviation of the observed sample of values, calculated using a set of factors tabulated by statisticians (for more details refer to (Wheeler & Chambers, 1992; Park, 2007)). In an mR chart, each point represents a moving range (i.e. the absolute difference between a successive pair of observations); CLmR is the average of the moving ranges; UCLmR = CLmR + 3sigmamR and LCLmR = 0, where sigmamR is the estimated standard deviation of the moving ranges sample.
For example, given a set of 15 observations X = {213.875, 243.600, 237.176, 230.700, 209.826, 226.375, 167.765, 242.333, 233.250, 183.400, 201.882, 182.133, 235.000, 216.800, 134.545}, the following values are determined:
mR_i = |x_i - x_(i-1)| for i = 2, …, m;  avg(mR) = (1 / (m-1)) * Σ mR_i

X chart:  CL_X = avg(X) = 210.58;  3sigma_X = 2.660 * avg(mR) = 88.07;
UCL_X = avg(X) + 2.660 * avg(mR) = 298.64;  LCL_X = avg(X) - 2.660 * avg(mR) = 122.52
mR chart:  CL_mR = avg(mR) = 33.11;  UCL_mR = 3.268 * avg(mR) = 108.2;  LCL_mR = 0
Fig 6 Example of Individual and moving ranges charts (XmR charts)
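These figures can be reproduced directly from the 15 observations; a short sketch (any small discrepancies with the values above are due to rounding in the chapter):

```python
data = [213.875, 243.600, 237.176, 230.700, 209.826, 226.375, 167.765,
        242.333, 233.250, 183.400, 201.882, 182.133, 235.000, 216.800, 134.545]

x_bar = sum(data) / len(data)                       # CL_X  ~ 210.58
mr = [abs(b - a) for a, b in zip(data, data[1:])]   # moving ranges
mr_bar = sum(mr) / len(mr)                          # CL_mR ~ 33.1
print(f"CL_X = {x_bar:.2f}, UCL_X = {x_bar + 2.660 * mr_bar:.2f}, "
      f"LCL_X = {x_bar - 2.660 * mr_bar:.2f}")
print(f"CL_mR = {mr_bar:.2f}, UCL_mR = {3.268 * mr_bar:.2f}, LCL_mR = 0")
```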
3.2 Anomalies Detection
In software processes, one should look for systematic patterns of points instead of single-point exceptions, because such patterns indicate that the process performance has shifted or is shifting. This surely leads to more insightful remarks and observations. There is a set of tests for such patterns, referred to as “run rules” or “run tests” (see (AT&T, 1956; Nelson, 1984; Grant & Leavenworth, 1980; Shirland, 1993)), which are not well known (or used) in the software engineering community:
RT1: Three Sigma: 1 point beyond a control limit (±3sigma)
RT2: Two Sigma: 2 out of 3 points in a row beyond ±2sigma
RT3: One Sigma: 4 out of 5 points in a row beyond ±1sigma
RT4: Run above/below CL: 7 consecutive points above or below the centreline
RT5: Mixing/Overcontrol: 8 points in a row on both sides of the centreline, avoiding the ±1sigma area
RT6: Stratification: 15 points in a row within the ±1sigma area
RT7: Oscillatory Trend: 14 alternating up and down points in a row
RT8: Linear Trend: 6 points in a row steadily increasing or decreasing
Table 1 Run-Test Set Details
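As an illustration of how such tests can be mechanized, the sketch below implements two of them (RT1 and RT4) against limits obtained from a reference set; the remaining run tests follow the same pattern:

```python
def rt1_three_sigma(xs, ucl, lcl):
    """RT1: indices of single points beyond a control limit (±3 sigma)."""
    return [i for i, x in enumerate(xs) if x > ucl or x < lcl]

def rt4_run_above_below(xs, cl, run_length=7):
    """RT4: indices where `run_length` consecutive points sit on one side of CL."""
    hits, streak, side = [], 0, 0
    for i, x in enumerate(xs):
        s = 1 if x > cl else (-1 if x < cl else 0)  # which side of the centreline
        streak = streak + 1 if (s == side and s != 0) else 1
        side = s
        if s != 0 and streak >= run_length:
            hits.append(i)
    return hits
```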