This quantification of the CCF beta value, indicating that a significant portion of the total failure rate of a component arises from common cause failures, has placed upper limit constraints on obtaining sufficiently high levels of reliability and safety in an information integrated technology (IIT) program considered in Sect. 3.3.3.4 and illustrated in Fig. 3.46.
The beta factor model is extensively used in predictions, in which the appropriate values for beta are selected on the basis of expert engineering judgement. The problem with the model, though, is the lack of any detailed data for generic systems, assemblies and components to provide an adequate assessment of safety in engineering design, especially in the preliminary design phase. As a result, quite large beta factors have been applied without any justification, and caution needs to be exercised when selecting these beta values, otherwise the estimates will give unjustifiably pessimistic results, with possible over-design of safety-related systems.
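As an illustration of how a selected beta value propagates into a design assessment, the short sketch below splits an assumed total failure rate into its independent and common cause parts and compares a duplicated (one-out-of-two) pair with and without the CCF contribution. All numerical values and the one-year mission time are assumptions chosen for illustration only; they are not data from the text.

```python
# Minimal sketch of the beta factor CCF model (all values are assumed
# placeholders, not data from the text).
lambda_total = 1.0e-5    # assumed total failure rate of one component, per hour
beta = 0.10              # assumed beta factor: fraction of failures that are common cause
mission_time = 8760.0    # assumed mission time, hours (one year)

lambda_ind = (1.0 - beta) * lambda_total   # independent part of the failure rate
lambda_ccf = beta * lambda_total           # common cause part of the failure rate

# Rare-event approximation: probability of failure over the mission ~ rate * time.
p_ind = lambda_ind * mission_time
p_ccf = lambda_ccf * mission_time

# One-out-of-two redundant pair: either both channels fail independently,
# or a single common cause event fails both channels together.
p_pair_ignoring_ccf = (lambda_total * mission_time) ** 2
p_pair_with_ccf = p_ind ** 2 + p_ccf

print(f"pair failure probability ignoring CCF  : {p_pair_ignoring_ccf:.2e}")
print(f"pair failure probability with beta CCF : {p_pair_with_ccf:.2e}")
```

Under these assumptions even a modest beta dominates the pair's failure probability, which is why an unjustifiably large beta leads directly to pessimistic predictions and possible over-design.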
A somewhat different approach to the beta factor model has thus been taken, in which beta values are not used but predictions are made directly from event data using expert judgement. This approach necessitates identifying the root causes of failure and the likelihood of generating simultaneous failures in similar equipment (Hughes 1987). Fundamentally, the basis of this approach, typical of IIT, is to represent the variability of a component failure probability by distributions that can be estimated directly from a relatively small database. However, some researchers have pointed out the deficiencies of expert engineering judgement as applied to common cause failures, and contend that analysis of such failures is a knowledge-based decision process and, therefore, is itself subject to error or uncertainty (Doerre 1987).
b) Problems with Applying CCF in Safety and Risk Analysis for Engineering Design
Problems with applying CCF in safety and risk analysis for engineering design assessment in the preliminary design phase can thus be reviewed. These problems are summarised as follows (Hanks 1998):
• The lack of a suitable comprehensive database for CCF.
• Use of simple CCF models giving pessimistic results for redundancy systems.
• The assumption that similar components will be similarly affected.
• Errors in understanding the nature of CCF and applying the appropriate methodology.
Various alternative models that refine the beta factor method have thus been proposed, such as a binomial failure rate model that assumes that a component has a constant independent failure rate and a susceptibility to common cause shocks at a constant rate (NUREG/CR-1401 1980). This has been extended to include common cause shocks that do not necessarily result in catastrophic failure. A practical method of common cause failure modelling modifies the beta factor model to take account of the levels of redundancy and diversity in systems (Martin et al. 1987).
It was previously noted that the simple beta model is pessimistic when applied to redundancy systems. This can also be the case when it is applied to a range of similar components, even if they are installed in one system. To illustrate this problem, an example is given based on a simplified high-pressure protection redundancy configuration relating to the high-integrity protection sub-system (HIPS) illustrated in Fig. 5.22. As indicated in Sect. 5.2.3.2, the function of the HIPS sub-system is to prevent a high-pressure surge passing through the process, thereby protecting the process equipment from exceeding its individual pressure ratings.
A schematic of the simplified configuration is given in Fig. 5.23. In this example, a possible source of CCF is contamination of the upstream high-pressure line. In theory, all the regulators should be equally affected; however, much depends upon the design features of the main valves and their control systems. Contamination of the high-pressure line will affect the active control valve, A1, in the operating stream. Whether it will affect the monitor valve M1, and to what extent, depends on the way the control system functions. Both the regulators in the standby stream should be unaffected. In this example, there is a potential for CCF to occur, but normal practice would be to assume that CCF applies equally to all four identical valves, so the result of any prediction would be pessimistic. Another problem in this case would be the total misunderstanding of how to apply CCF prediction methodology.
In Fig. 5.23, there are two control streams, one functioning as an operating stream with two identical regulators (pressure valves), M1 and A1, and the other functioning as a standby stream with two identical regulators, M2 and A2. Each stream's regulator configuration consists of a monitor valve, Mi, and an active control valve, Ai (i = 1, 2). The first regulator in the operating stream, the monitor valve M1, is fully open in readiness to take over control should the pressure rise above a predetermined level due to failure of the active control valve A1. The active control valve A1 controls the outlet pressure. Similarly, the first regulator in the standby stream, M2, is fully open and will function in a similar manner as valve M1, should the standby stream be activated. The second regulator in the standby stream, A2, is closed and will take over pressure regulation if either of the regulators in the operating stream fails to reduce the outlet pressure to below a predetermined level.
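The pessimism argument can be made concrete with a rough calculation. The sketch below compares two CCF assumptions for the four valves of Fig. 5.23: the "normal practice" of placing all four identical valves in one CCF group, and the case argued in the text, where contamination couples only the operating-stream valves M1 and A1 while the standby stream remains unaffected. The valve failure probability, the beta value and the simple "a stream protects unless both of its valves fail" logic are all assumptions for illustration, not data or a model from the text.

```python
# Hedged sketch of the CCF pessimism argument for the simplified HIPS example
# (assumed values and assumed success logic, for illustration only).
p_valve = 1.0e-3   # assumed failure-on-demand probability of one valve
beta = 0.10        # assumed CCF beta factor

p_ind = (1.0 - beta) * p_valve   # independent part of each valve's failure probability
p_ccf = beta * p_valve           # common cause part

# Assumed logic: a stream fails to protect only if both its valves fail,
# and the system fails only if both the operating and standby streams fail.

# (a) Normal practice: one CCF group across all four identical valves, so a
#     single common cause event is taken to defeat both streams at once.
p_system_all_four = (p_ind * p_ind) * (p_ind * p_ind) + p_ccf

# (b) CCF restricted to the contaminated operating stream (M1 and A1 coupled),
#     with the standby-stream valves failing independently.
p_operating = p_ind * p_ind + p_ccf
p_standby = p_valve * p_valve
p_system_restricted = p_operating * p_standby

print(f"CCF applied to all four valves        : {p_system_all_four:.2e}")
print(f"CCF restricted to the operating stream: {p_system_restricted:.2e}")
```

With these assumed numbers, grouping all four valves inflates the predicted failure probability by several orders of magnitude, which is the pessimism the text warns against.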
Common cause failures can arise from a wide range of potential problems, typically identified through factor tree charts and associated questions concerning the potential root causes of design integrity problems (as indicated in Sect. 5.2.1.2, and Figs. 5.6 through to 5.8).
Fig. 5.23 Schematic of a simplified high-pressure protection system
Table 5.13 Analysis of valve data to determine CCF beta factor

Valve type                          CCF beta factor
Ball, plug and gate valves          0.01–0.02
Relief valves, all types            0.05
Check and non-return valves         0.05
Large regulators/control valves     0.10–0.19
Small regulators/control valves     0.05–0.12
Actuators, all types                0.05–0.16
Minimising the effects of CCF is thus an ongoing process, from the early phases of engineering design through to the in-service life. Any attempt to cut corners or costs will almost inevitably expose engineered installations to a higher level of CCF-induced failures, with resulting increased costs of failure maintenance, lost production and possible loss of life.
Design criteria databases do not usually include common cause failure data. One problem is that CCF data for a particular component can be specific to the application and, hence, require a whole series of design, operation and maintenance considerations for a particular process. Detailed analysis of valve data from a large database collected from maintenance and operational records has yielded useful information on the incidence of CCF (Hanks 1998). The data, summarised in Table 5.13, cover a wide range of valve types and applications, including actuators.
This quantification of the CCF beta value, indicating that a significant portion of the total failure rate of a component arises from common cause failures, has placed upper limit constraints on obtaining sufficiently high levels of reliability and safety in critical risk control circuits. As indicated previously, redundancy is one method of avoiding this problem, although the approach does lead to large increases in the false alarm rate (FAR).
This is overcome, however, by utilising voting redundancy, where several items are used in parallel and a selected number are required to be in working order. The voting redundancy problem involves the simultaneous evaluation and selection of available components and a system-level design configuration that collectively meets all design constraints and, at the same time, optimises some objective function, usually system safety, reliability, or cost. In practice, though, each of these parameters may not be exactly known, and there is some element of risk that the constraint will not actually be met or the objective function value may not be achieved.
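A common way to express the voting redundancy trade-off is the k-out-of-n reliability formula: the system performs its function if at least k of n parallel channels work, while a spurious trip likewise needs k channels to raise a false alarm. The sketch below evaluates a two-out-of-three arrangement with assumed per-channel values; the numbers, and the assumption of independent channels with no CCF, are illustrative only.

```python
from math import comb

def k_out_of_n(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent, identical channels are in
    a given state, when each channel is in that state with probability p."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))

# Assumed per-channel figures, for illustration only.
p_channel_works = 0.99    # probability a channel responds correctly on demand
p_spurious = 0.01         # probability a channel raises a spurious alarm

# Two-out-of-three voting: protection succeeds if any two channels respond,
# and a false trip requires two simultaneous spurious channel alarms.
p_protects_2oo3 = k_out_of_n(3, 2, p_channel_works)
p_false_trip_2oo3 = k_out_of_n(3, 2, p_spurious)
p_false_trip_1oo2 = k_out_of_n(2, 1, p_spurious)   # simple duplication, for comparison

print(f"2oo3 probability of successful protection: {p_protects_2oo3:.6f}")
print(f"2oo3 probability of a spurious trip      : {p_false_trip_2oo3:.2e}")
print(f"1oo2 probability of a spurious trip      : {p_false_trip_1oo2:.2e}")
```

The comparison shows why voting keeps the protection benefit of parallel channels while suppressing the false alarm rate that simple duplication would otherwise inflate.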
5.2.4 Theoretical Overview of Safety and Risk Evaluation in Detail Design
Safety and risk evaluation determines safety risk and criticality values for each individual item of equipment at the lower systems levels of the systems breakdown structure (SBS). Safety and risk evaluation determines the causes and consequences of hazardous events that occur randomly, together with a determination of the frequencies with which these events occur over a specified period of time, based on critical component failure rates. Safety and risk evaluation is considered in the detail design phase of the engineering design process, and includes basic concepts of modelling such as:
i Point process event tree analysis in designing for safety.
ii Cause-consequence analysis for safety systems design.
iii Failure modes and safety effects evaluation.
5.2.4.1 Point Process Event Tree Analysis in Designing for Safety
The most extensive safety study to date is the US Nuclear Regulatory Commission's report "Reactor Safety Study" (NUREG-75/014 1975). In October 1975, the NRC issued the final results of a 3-year study of the risks from postulated accidents during the operation of nuclear power reactors of the type used in the USA. This report, known as the "Reactor Safety Study (RSS)", or by its NRC document number, WASH 1400, was the first comprehensive study that attempted to quantify a variety of risks associated with power reactor accidents. Since that time, about 40 reactors have been analysed using the same general methodology as WASH 1400 but with considerably improved computer codes and data.
The most recent and the most detailed of these studies has been the effort undertaken by the NRC to analyse five different reactors using the very latest methodology and experience data available. In June 1989, the second draft of this work, "Severe Accident Risks: An Assessment for Five U.S. Nuclear Power Plants" (NUREG 1150 1989), was issued for public comment. There is, however, a widely held belief that the risks of severe nuclear accidents are small. This conclusion rests in part upon the probabilistic analysis used in these studies (NUREG/CR-0400 1978).
The analysis used in these studies provides a suitable example in order to better understand the application of point process event tree analysis in the evaluation of safety and risk in the detail design phase. The approach to safety evaluation, as researched in these two studies, considered the sources of the risk, its magnitude, design requirements, and risk determination through probabilistic safety evaluation (PSE). These points of approach, although very specific to the example, need to be briefly explained (Rasmussen 1989).
a) Determining the Source of Risk
During full power operation, a nuclear power reactor generates a large amount of radioactivity. Most of this radioactivity consists of fission products, resulting from the fission process, which are produced inside the reactor fuel. The fuel is uranium dioxide, a ceramic material that melts at about 5,000 °F. The fuel effectively contains the radioactive fission products unless it is heated to the melting point. At temperatures in this range, essentially all the gaseous forms of radioactivity will be released from the fuel. In addition, some of the more volatile forms of the solid fission products may be released as fine aerosols. If any of these forms were to be released into the atmosphere, they could be spread by prevailing winds.
b) Designing for Safety Requirements
Design requirements for safety in US nuclear plants mandate that the plants have systems to contain any radioactivity accidentally released from the fuel. The main system for accomplishing this is the containment building, an airtight structure that surrounds the reactor. In addition, all reactors have a system for removing aerosols from the containment atmosphere. In many reactors, this system consists of a water spray that can create the equivalent of a heavy rainstorm inside the containment building. Boiling water reactors (BWR) accomplish this function by passing released gases through a pool of water. The principal goal of the reactor safety philosophy is to prevent the accidental release of radioactivity. As a backup, systems are added that prevent the release of radioactivity to the atmosphere even if it were released from the fuel. Despite these efforts, one can always postulate ways in which these systems might fail to prevent the accidental release of radioactivity.
It is the task of probabilistic safety evaluation (PSE) to identify how this might happen, to determine how likely it is to happen and, finally, to determine the health effects and economic impacts of the radioactive releases upon the public.
c) Probabilistic Safety Evaluation (PSE)
• The first step in a PSE analysis begins by developing the causes and likelihood of heating the fuel to its melting point due to either external causes (earthquakes, floods, tornadoes, etc.) or internal causes. This analysis involves developing a logical relationship between the failures of plant components and operators, and the failure of system safety functions. The result of this analysis is an estimate of the probability of accidentally melting the fuel, a condition often called 'core melt'. Of the plants analysed thus far, most have an estimated likelihood of core melt of between 1 in 10,000 and 1 in 100,000 per plant year.
• The second step in a PSE analysis is to determine the type and amount of radioactivity that might be released in the different accidents identified. These fractions of the various types of radioactivity released are called the 'source terms' for the accident. The values from WASH 1400 are in most cases significantly larger than those from NUREG 1150. The lower values of NUREG 1150 are the result of new information gained from major research in the USA, Japan and Western Europe. These experiments, and the measurements at Three Mile Island, confirm that the values used in WASH 1400 are too high.
• The final step in a PSE analysis is to calculate the effects of any radioactivity released in the accident. Sophisticated computer models have been developed to do this calculation. These models require input of the source terms, the population density around the site, and weather data for recent years from the plant site. The code then calculates thousands of cases to generate curves that give the magnitude of given risks versus their probabilities. The results of the calculations are in the form of fatality curves. The curves generally give the frequency in units per reactor year for events of a given size, and have a wide range of consequences, from quite small at high frequencies to quite large at very low frequencies.
• Curves of this shape are typical of all accidents where a number of factors affect the magnitude of the event. In the case of catastrophic accidents, clearly this refers to accidents of low probability near the high-consequence end of the scale. These extreme accidents come about only if the various factors affecting the magnitude of the consequences are all in their worst states. Thus, for example, the core must melt, then the containment must fail above ground level, the wind must be blowing towards an area of relatively high population density, inversion conditions must prevail, and civil protection efforts must fail to be effective.

Criticism of the Reactor Safety Study pointed to inadequacies in the statistical methodology, particularly the uncritical use of the log-normal distribution to derive probability estimates for the failure of individual nuclear safety systems (NUREG/CR-0400 1978). There is an inherent weakness to the approach, in that there is no way of being sure that a critical initiating event has not been overlooked. The logic event tree consists of the initiating event and the success or failure response of each of the applicable engineered safety features. After identifying the accident sequence, the probability of occurrence of each engineered safety system in the sequence must be evaluated. As no empirical data are available on which to base estimates of system failure rates, it is necessary to use techniques that generate system failure rates from comparative estimates of failures of similar equipment. The extended use of event trees to derive probability estimates for both the failure of individual nuclear safety systems and the accident sequences was developed by the US Department of Defense and the US National Aeronautics and Space Administration (NASA; NUREG 1150 1989).
With the identification of potential accidents and the quantification of their probability and magnitude, accident sequences are identified by means of logic diagrams, in this case, logic event trees. The starting point for the development of these logic event trees is the identification of the event that initiates a potential accident (due to a catastrophic failure event) or potential incident (due to a critical failure event). A typical initiating event for the nuclear reactor safety example would be a pipe break that results in a loss of coolant. Initiating events are usually identified using technical information and engineering judgment, similar to an integrated information technology (IIT) program considered in Sect. 3.3.3.4 and in Fig. 3.41.

Fig. 5.24 Typical logic event tree for nuclear reactor safety (NUREG-75/014 1975)
Figure 5.24 shows a typical logic event tree with an initiating event of a pipe break in a nuclear reactor coolant line, with probability of occurrence of λ. The logic event tree is simplified in that only seven out of the 2⁴ possibilities need to be considered; for example, if electric power fails, with an event rate of λP_2, then none of the engineered safety features will function. The output of the logic event tree is the release category consequences with their event rates. Since the probability of occurrence is small (i.e. equivalent to the concept of a rare event), the probabilities are approximated by omitting all (1 − P_i) terms.
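To show how the rare-event approximation is applied when reading sequence frequencies off a logic event tree such as Fig. 5.24, the sketch below multiplies an assumed initiating event rate by the failure probabilities of the branches queried along a sequence, while treating every success branch (1 − P_i) as unity. The branch names and all numerical values are assumed placeholders and are not the values of Fig. 5.24.

```python
# Rare-event approximation for logic event tree sequences (assumed values).
lam = 1.0e-4   # assumed initiating event rate (pipe break), per reactor year

# Assumed failure probabilities of the engineered safety features
# questioned in turn along the tree.
p_electric_power = 1.0e-3    # failure probability P_2 in the text's notation (sequence rate lam * P_2)
p_eccs = 1.0e-2              # emergency core cooling fails
p_fission_removal = 1.0e-2   # fission product removal fails
p_containment = 1.0e-2       # containment integrity fails

# Because every P_i is small, the (1 - P_i) success branches are approximated by 1,
# so a sequence frequency is the initiating rate times the failed branches only.
sequences = {
    "electric power fails (no feature can act)": lam * p_electric_power,
    "ECCS fails, containment holds":             lam * p_eccs,
    "ECCS and fission product removal fail":     lam * p_eccs * p_fission_removal,
    "ECCS and containment integrity fail":       lam * p_eccs * p_containment,
}

for name, frequency in sequences.items():
    print(f"{name:45s}: {frequency:.1e} per reactor year")
```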
d) Point Process Consequence Analysis
The basic methodology of the Reactor Safety Study used an approach of determining a demand failure rate. This can briefly be explained in terms of the control of the rate of reaction in an atomic power plant by the insertion of control rods. The times at which control is needed are termed the transient demand, and were assumed to occur in operating time according to a Poisson process. When a transient demand occurs, the conditional probability that the safety system does not function, resulting in the consequence of an accident, was determined. Based on the Reactor Safety Study, a method for evaluating consequences as a result of safety system failure in a catastrophic-event process such as a nuclear reactor has been researched (Thompson 1988).
Suppose the events initiating accident sequences (as in Fig. 5.24) occur in time according to a stochastic point process with an event rate of μ(t). Furthermore, let N(t) denote the number of events up to time t, and T_i (i = 1, 2, 3, ..., k) denote the time at which the ith initiating event occurs. Suppose further that the ith initiating event yields the consequence C_i. Assume that C_i is a non-negative random variable with failure distribution function P(C_i ≤ c) = F(c), and survival function P(C_i ≥ c) = F̄(c). The consequences can be assumed to be identically distributed and independent of one another, with the understanding that there are several kinds of risks, each with its own initiating event rate and consequence distribution. Finally, the evolution of consequences has been assumed to follow a point process. Actually, the consequences of many accidents and incidents are difficult to express numerically, and most are vector valued in time. The basic methodology of the Reactor Safety Study in dealing with this problem was to conduct a separate study for each type of consequence, and to present the risk in terms of an event rate against a consequence in the form of a risk curve, as illustrated in Fig. 5.25. In mathematical terms, if μ_k(t) is the event rate at time t of consequences exceeding k, the critical number of consequences, then the risk curve is a graph, for fixed t, of μ_k(t) versus k. The event rate of consequences that exceed k is related to the process of initiating events and the distribution of consequences:

μ_k(t) = [1 − F(k)] μ(t)   (5.66)

where:
F(k) is the failure distribution function of C_i.
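Equation (5.66) can be turned into a risk curve directly once a consequence distribution is chosen. The sketch below assumes, purely for illustration, a constant initiating event rate and an exponential consequence distribution; neither is taken from the Reactor Safety Study.

```python
import math

# Sketch of Eq. (5.66): mu_k(t) = [1 - F(k)] * mu(t), with assumed inputs.
mu = 1.0e-3              # assumed (constant) initiating event rate, per year
mean_consequence = 10.0  # assumed mean consequence per initiating event

def F(k: float) -> float:
    """Assumed failure distribution function of C_i (exponential)."""
    return 1.0 - math.exp(-k / mean_consequence)

def mu_k(k: float) -> float:
    """Event rate of consequences exceeding the critical value k."""
    return (1.0 - F(k)) * mu

# The risk curve is mu_k plotted against k for a fixed time.
for k in (1.0, 10.0, 50.0, 100.0):
    print(f"k = {k:5.0f}: exceedance rate = {mu_k(k):.2e} per year")
```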
Fig. 5.25 Risk curves from nuclear safety study (NUREG 1150 1989); Appendix VI, WASH 1400: c.d.f. for early fatalities
Different process systems designs have different consequence sequences. The consequence sequences, S_P, of a particular process, P, over a time period, t, can be expressed as

S_P(t) = C_1 + C_2 + C_3 + ... + C_{N(t)},   N(t) > 0   (5.67)

Characteristics of S_P(t) are determined by N(t) and the distribution of the consequences C_i, where the sequence of consequences constitutes a point process. Of specific interest is to determine the catastrophic event having the greatest consequence, when one accident is too much in the sequence of consequences. This is done by defining an expression for S_P in the catastrophic case

S_P(t) = max[C_1, C_2, C_3, ..., C_{N(t)}],   N(t) > 0   (5.68)
The probability of S_P(t) being less than k, the critical number of consequences, is given by

P(S_P(t) ≤ k) = Σ_{n=0}^{∞} P[C_i ≤ k; i = 1, 2, 3, ..., n | N(t) = n] · P[N(t) = n]   (5.69)

If C_i is a non-negative random variable with the failure distribution function P(C_i ≤ c) = F(c), then

P(S_P(t) ≤ k) = Σ_{n=0}^{∞} [F(k)]^n · P[N(t) = n]   (5.70)

where:
ψ_t = probability generating function of N(t) (i.e. Bernoulli transform), so that the sum in Eq. (5.70) equals ψ_t[F(k)]

and if C_i is a non-negative random variable with the survival function P(C_i ≥ c) = F̄(c), then

P(S_P(t) > k) = 1 − ψ_t[F(k)] ≤ [1 − F(k)] · E N(t)   (5.72)
Thus, for consequences exceeding k, the critical number of consequences (or the 'cut-off value' between acceptable and unacceptable consequences), the probability of the occurrence of an unacceptable consequence within time t will be less than F̄(k) E N(t), where F̄(k) is the survival function P(C_i > k), and E N(t) is the expected value or mean of N(t), the number of events on (0, t].
If system failure is now identified with obtaining an unacceptable consequence, then F̄(k) is the demand failure rate (such as the demand to control the rate of reaction in an atomic power plant by the insertion of control rods). This demand failure rate yields an upper bound for the probability of failure. Since k is the unacceptable critical number of consequences, the probability of a consequence exceeding that value must be as small as possible; that is, F̄(k) will be near 0, with an upper bound when F(k) is near 1.
The expected maximum consequence, Ĉ(t), can be expressed as

E Ĉ(t) = ∫_0^∞ {1 − ψ_t[F(k)]} dk   (5.73)

E Ĉ(t) = Σ_{n=0}^{∞} ∫_0^∞ {1 − [F(k)]^n} dk · P[N(t) = n]   (5.74)

From Eqs. (5.72) and (5.74) we get:

EC(t) · P[N(t) ≥ 1] ≤ E Ĉ(t) ≤ EC(t) · E N(t)

where:
E Ĉ(t) = the expected value of the maximum consequence Ĉ in period t
EC(t) = the expected value of consequence C in period t
P[N(t) ≥ 1] = the probability that the number of events ≥ 1.
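The bounds on the expected maximum consequence can be checked numerically. The Monte Carlo sketch below draws a Poisson number of initiating events with independent, exponentially distributed consequences and compares the simulated expected maximum against EC(t)·P[N(t) ≥ 1] and EC(t)·E N(t); the distributions and parameter values are assumptions made only for this check.

```python
import math
import random

# Monte Carlo check of EC(t)*P[N(t)>=1] <= E C_max(t) <= EC(t)*E N(t)
# for a Poisson N(t) with i.i.d. exponential consequences (assumed values).
random.seed(1)

expected_events = 0.5     # assumed E N(t) over the period considered
mean_consequence = 10.0   # assumed mean of each consequence C_i
trials = 200_000

def poisson(mean: float) -> int:
    """Draw a Poisson variate by simple inversion (adequate for small means)."""
    u, term, n, cdf = random.random(), math.exp(-mean), 0, math.exp(-mean)
    while u > cdf:
        n += 1
        term *= mean / n
        cdf += term
    return n

max_sum = 0.0
no_event_count = 0
for _ in range(trials):
    n = poisson(expected_events)
    if n == 0:
        no_event_count += 1
        continue
    max_sum += max(random.expovariate(1.0 / mean_consequence) for _ in range(n))

e_c_max = max_sum / trials                       # simulated E C_max(t)
p_at_least_one = 1.0 - no_event_count / trials   # P[N(t) >= 1]

print(f"lower bound EC*P(N>=1): {mean_consequence * p_at_least_one:.3f}")
print(f"simulated  E C_max    : {e_c_max:.3f}")
print(f"upper bound EC*E N(t) : {mean_consequence * expected_events:.3f}")
```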
The expected time to the first critical event with unacceptable consequence is given as

EV_k = Σ_{n=1}^{∞} E T_n [F(k)]^n [1 − F(k)]   (5.76)

where:
T_n = time of occurrence of the nth initiating event.
The probability generating function (p.g.f.), or Bernoulli transform ψ_t, needs to be defined in greater detail. Thus, given a random variable N(t), its generating function ψ_t(z) is expressed as

ψ_t(z) = Σ_{n=1}^{∞} z^n P[N(t) = n]   (5.77)
ψ_t(z) is a function in terms of z with the following properties (illustrated numerically in the sketch after this list):
• The p.g.f. is determined by, and also determines, the distribution of N(t).
• ψ′_t(1) is the expectation of N(t).
• ψ″_t(1) is the expectation of N(t) · [N(t) − 1].
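The two derivative properties can be verified numerically for any specific counting distribution. The sketch below uses a Poisson N(t) with an assumed mean, for which ψ_t(z) = exp[m(z − 1)], and recovers E N(t) and E[N(t)·(N(t) − 1)] by finite differences at z = 1; the Poisson choice and the mean are assumptions for illustration only.

```python
import math

# Numerical check of the p.g.f. properties for an assumed Poisson N(t)
# with mean m, for which psi_t(z) = exp(m * (z - 1)).
m = 2.5   # assumed E N(t)

def psi(z: float) -> float:
    return math.exp(m * (z - 1.0))

h = 1.0e-5
first_derivative = (psi(1.0 + h) - psi(1.0 - h)) / (2.0 * h)               # ~ psi'(1)
second_derivative = (psi(1.0 + h) - 2.0 * psi(1.0) + psi(1.0 - h)) / h**2  # ~ psi''(1)

print(f"psi'(1)  = {first_derivative:.4f}   (E N(t) = {m})")
print(f"psi''(1) = {second_derivative:.4f}   (E[N(t)(N(t)-1)] = {m * m})")
```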
Probability generating functions also provide for the addition of independent random variables. For example, if N(t) and C(t) are independent, then the p.g.f. of