5 Safety and Risk in Engineering Design

whom died within weeks from radiation exposure. It also caused radiation sickness in a further 200–300 staff and fire fighters, and contaminated large areas of Belarus, Ukraine, Russia and beyond. It is estimated that at least 5% of the total radioactive material in the Chernobyl-4 reactor core was released from the plant, due to the lack of any containment structure. Most of this was deposited as dust close by. Some was carried by wind over a wide area. About 130,000 people received significant radiation doses (i.e. above internationally accepted ICRP limits) and have been closely monitored. About 800 cases of thyroid cancer in children have been linked to the accident. Most of these were curable, though about ten have been fatal. No increase in leukaemia or other cancers has been observed, but some ongoing occurrences are expected.
The World Health Organisation is closely monitoring most of those affected. An OECD expert report concluded that “the Chernobyl accident has not brought to light any new, previously unknown phenomena or safety issues that are not resolved or otherwise covered by current reactor safety programs for commercial power reactors in OECD member countries” (OECD NEA 1995).
The IAEA has given a high priority to addressing the safety of nuclear power plants in Eastern Europe, where deficiencies remain. However, energy demand in these countries is such that there is little flexibility for closing even those plants that are of most concern, though the European Union is bringing pressure to bear, particularly in countries that aspire to EU membership. A major international program of assistance has been carried out by the OECD, IAEA and Commission of the European Communities to bring early Soviet-designed reactors up to near-Western safety standards, or at least to effect significant improvements to the plants and their operation. Modifications have been made to overcome deficiencies in the 13 reactors still operating in Russia and Lithuania. Automated inspection equipment has also been installed in these reactors as an added safety precaution. Another class of reactors that has been the focus of international attention for safety upgrades is the first generation of pressurised water reactors. These were designed before formal safety standards were issued in the Soviet Union, and they lack many basic safety features. Eleven are operating in Bulgaria, Russia, Slovakia and Armenia (ANSTO 1994). In 1996, the Nuclear Safety Convention (NSC) came into force as the first international legal instrument on the safety of nuclear power plants worldwide. It commits participating countries to maintain a high level of safety by setting international benchmarks to which they subscribe and against which they report. The NSC has 65 signatories and has been ratified by 41 states.
For the past two decades, the University of Washington, Seattle, WA, has been developing a theoretical foundation and methodology for analysing safety in complex systems, under grants from the US National Aeronautics and Space Administration (NASA Langley, NASA Ames). The methodology includes safety analysis, system hazard analysis, control software design, and special techniques for the design of human–machine interaction (Leveson 1995). What is especially appealing in this methodology is that it not only formulates system safety using control software in system automation for enhanced control of complex integrations of systems but also considers human error analysis.
The problem of technology-induced human error in process systems control has been approached in two ways in designing for safety.
The first approach is the detection of error-prone automation features early in the conceptual design phase of the engineering design process, while significant changes can still be made. The information produced from this approach can be used to redesign process automation to eliminate any error-inducing features, or to design better human–machine interfaces, process operator procedures, and training programs.
The second approach to safety analysis of human error is the more traditional form of human factors analysis. This method looks at the types of human errors that could arise in the system, and then performs a comparative analysis of the controller’s job before and after system automation control. Potential safety issues are identified that involve decreased awareness, increased vigilance requirements, and skills degradation. Identification, classification and evaluation of potential hazards are done through modelling and analysis in which the hardware, software, and human components in the system are all considered.
Risk in engineering design may simply be described as the process of risk analysis of hazardous systems at the conceptual, preliminary and detail design phases, with respect to risk prediction, risk assessment and risk evaluation respectively. The risk analysis process in engineering design is both iterative and progressive, in that it is composed of five basic steps that are repeated for each progressive design phase as the design becomes increasingly complex and detailed. These five steps include the following:
• Design definition
• Hazards definition
• Risk estimation
• Risk verification
• Results application.
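The iterative and progressive structure described above can be pictured as the same five steps being run once for each design phase. The following is a minimal sketch under that reading: the phase and step labels come from the text, while the function name `run_risk_analysis` and the idea of returning a simple schedule are assumptions made only for illustration.

```python
# Illustrative sketch only. Phase and step labels follow the text;
# the function name and the returned "schedule" are hypothetical.
DESIGN_PHASES = ["conceptual", "preliminary", "detail"]
RISK_ANALYSIS_STEPS = [
    "design definition",
    "hazards definition",
    "risk estimation",
    "risk verification",
    "results application",
]


def run_risk_analysis(phases=DESIGN_PHASES, steps=RISK_ANALYSIS_STEPS):
    """Return the ordered (phase, step) pairs visited by the process."""
    schedule = []
    for phase in phases:      # progressive: one pass per design phase
        for step in steps:    # iterative: the same five steps in each pass
            schedule.append((phase, step))
    return schedule


schedule = run_risk_analysis()
for phase, step in schedule:
    print(f"{phase:12s} -> {step}")
```

In a real design process each step would consume the results of the previous phase; the sketch shows only the order in which the steps recur.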
Design definition entails defining the system under consideration according to the level of detail achieved at each particular design phase. Thus, at the conceptual design phase, the process and major systems are defined together with environmental conditions and general system physical and functional boundaries.
At the schematic or preliminary design phase, the systems are reviewed inclusive of their major items of equipment (predominantly sub-systems and assembly sets), together with integrated systems conditions and specific equipment physical and functional boundaries.
At the detail design phase, the systems are reviewed in greater depth to include all items of equipment (e.g. assemblies and components) as well as major parts of components, together with intrinsic system conditions and component physical and functional boundaries.
Hazards definition is concerned with the identification of hazards that are evident at each progressive level of design detail in the systems hierarchical structure. This step includes estimates of the significance of the identified hazards, whereby each phase of the risk analysis process results not only in an accumulation of potential hazards but also in the elimination of hazards that are found to be non-significant through progressive clarity of the level of detail achieved at each design phase.
Analysis of hazards is done either through causal analysis or through consequence analysis, or both, depending on the need to identify causes or consequences of the hazards respectively. Identifying the causes of hazards usually makes use of techniques such as root cause analysis, whereas consequence analysis makes use of systems engineering analysis.
Risk estimation may be perceived as the application of a variety of methods and techniques for risk prediction, risk assessment and risk evaluation. The prediction of risk is usually at a higher process and systems level with minimal clarity on detail, and is fundamentally useful in determining the configuration (inclusion of parallel redundancy) and initial sizing (maximum strength-stress safety margins) of the engineering design.
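To make the two design decisions named above more concrete, the sketch below evaluates the effect of parallel redundancy on reliability and computes a strength-stress safety margin. Both formulas are standard textbook forms rather than anything given in this section, and they rest on assumptions stated in the comments: independent units for the redundancy calculation, and independent, normally distributed strength and stress for the margin. The function names and all numerical values are hypothetical.

```python
import math


def parallel_reliability(unit_reliabilities):
    """Reliability of a parallel (redundant) group.

    Assumes independent units: the group fails only if every unit fails.
    """
    prob_all_fail = 1.0
    for r in unit_reliabilities:
        prob_all_fail *= (1.0 - r)
    return 1.0 - prob_all_fail


def safety_margin(mean_strength, sd_strength, mean_stress, sd_stress):
    """Strength-stress safety margin, in combined standard deviations.

    Assumes strength and stress are independent and normally distributed.
    """
    return (mean_strength - mean_stress) / math.sqrt(
        sd_strength ** 2 + sd_stress ** 2
    )


# Hypothetical numbers, for illustration only.
single = parallel_reliability([0.90])
duplicated = parallel_reliability([0.90, 0.90])
margin = safety_margin(mean_strength=500.0, sd_strength=30.0,
                       mean_stress=350.0, sd_stress=40.0)
print(single, duplicated, margin)
```

With these illustrative values, duplicating a unit of reliability 0.90 raises the group reliability to about 0.99, and the margin works out to three combined standard deviations.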
Risk assessment is usually conducted at equipment level, and includes investigation of potential sources of hazards to determine the probability/likelihood of occurrence of the originating hazard and its associated consequences for the system’s operation as a whole. Risk assessment may also be targeted at the component level, whereby functional failures are identified based on the severity of their intrinsic effects and the likelihood of their occurrence.
Risk verification is basically concerned with verifying the suitability of the risk estimation techniques and their end results. It is fundamentally a design review process used to determine the integrity of engineering design through verification of the risk estimates. This is accomplished by considering the relevance and suitability of the various risk estimation and analysis methods with respect to their appropriateness in analysing the type of system and hazard being studied, as well as the format of the results with respect to a correct understanding of the priority, occurrence and severity of the risk.
Results application effectively incorporates the contents and results of the previous four risk analysis steps with the application of automated continual design reviews in concurrent engineering design throughout the engineering design process. In this research, these design reviews are modelled in an artificial intelligence-based (AIB) blackboard system that is targeted for use by multi-disciplinary groups of design engineers, whereby each designed system is evaluated for integrity by locally or remotely located design groups communicating via an intranet or via the internet, within an integrated collaborative engineering design environment. The reviews should contain information such as systems definition, analysis methodology and associated assumptions and limitations, modelling descriptions, quantitative data and methods of accumulation, the specific techniques of risk estimation and the results obtained, together with a discussion of the results, associated assumptions and sensitivity analysis, all within an intelligent computer automated methodology for determining the integrity of engineering design.
5.2 Theoretical Overview of Safety and Risk in Engineering Design
Safety, in contrast to risk, is a system property, not a component property. Therefore, safety analysis must consider the entire system, and not its component parts. However, there exists no single safety analysis technique that can cope with all aspects of complex systems or complex integrations of systems. Safety analysis of complex systems is an inter-disciplinary effort, and must include systems design engineers, software engineers, and human factors and cognitive experts. Safety analysis in engineering design is, in effect, a program consisting of systems design activities and special safety tasks and techniques that significantly interact with one another at each progressive phase of the engineering design process. Such a program is highly iterative and includes continual updating of what has been done previously in the earlier phases, as new information and clarity of the design are gained. A safe system is one that is free from accidents or incidents resulting in unacceptable losses. Accidents or incidents result from hazards, where a hazard is defined as a system state or condition that can lead to an accident or incident, given certain uncontrollable or unpredictable environmental conditions.
Thus, safety in engineering design starts with a hazards analysis that identifies and analyses the hazards in the system. Once these hazards have been identified, steps can be taken to eliminate them, reduce their likelihood, or mitigate their effect. In addition, some hazard causes can be identified and eliminated or controlled. Although it is usually impossible to anticipate all potential causes of hazards, obtaining more information about these usually allows greater protection to be provided with fewer trade-offs, especially if the hazards are identified in the early design phases. Identifying hazards, and hazard causes, enables safety requirements to be established during the engineering design process.
A hazard may be defined as “a source of potential harm or a situation with a potential for harm”, where harm is defined as “a physical injury or damage to health, property or the environment”. Furthermore, an accidental event is defined as “an event which can cause harm” (IEC 60300-3-9 1995). A hazard may thus lead to an accidental event. To create a sound basis for further analysis, all the hazards have to be identified in a systematic way. A commonly used technique for such a survey is hazard identification (HAZID).
Hazard identification (HAZID) analysis is usually carried out in the early design phases of a system. The objective of the analysis is to reveal potential hazards at an early stage, such that the hazards may be eliminated, minimised or controlled as early as possible in the development process. For each hazard that is identified, all possible causes, effects and severity of potential accidents are described. Possible improvements and precautions are also described. It is important that this analysis is based on previous experience with similar equipment. Checklists of various types are useful during the analysis. The analysis should be conducted with one or two experienced engineers in attendance, with a background in safety engineering. Since the HAZID analysis is carried out in the early phases of the engineering design process, a limited amount of information about the specific system will normally be available. For a process plant, the process concept has to be settled before the analysis is initiated. At that point in time, the most important chemicals and reactions are known, together with the main elements of the process equipment (vessels, pumps, etc.). The HAZID analysis must be based on all safety-related information about the system, with respect to design criteria, equipment specifications, specifications of materials and chemicals, operational procedures, previous hazard studies of similar systems, and previous accident details if available.
The following input information should be available:
• Design sketches, drawings and data describing the system and sub-system elements for the various conceptual approaches under consideration
• Functional flow diagrams and related data describing the proposed sequence of activities, functions and operations of the system elements during the contemplated life span
• Background information relating to safety requirements associated with the contemplated testing, manufacturing, storage, repair and use locations, and safety-related experiences of similar previous programs or activities
The HAZID analysis is conducted by identifying hazards and thereby potential accidental events that may lead to unwanted consequences. The analysis must also identify design criteria or alternatives that may eliminate or reduce the hazard. During the analysis, certain factors must be considered (AIChE 1992).
These factors are:
• Hazardous equipment and materials
(e.g. fuels, highly reactive chemicals, toxic substances, explosives, high-pressure systems, and other energy storage systems)
• Safety-related interfaces between equipment and materials
(e.g. material interactions, fire/explosion initiation and propagation, and control/shutdown systems)
• Environmental factors that influence the equipment and materials
(e.g. storms, earthquakes, vibration, flooding, extreme temperatures, electrostatic discharge, and humidity)
• Operating, testing, maintenance and emergency procedures
(e.g. human error, operator functions, equipment layout and/or accessibility, and personnel safety protection)
• Facility support
(e.g. storage, testing equipment, training and utilities)
• Safety-related equipment
(e.g. safety devices, fire suppression, personal protective equipment)
Some hazards can be identified by the following (Rausand 2000):
• examining similar existing systems,
• reviewing existing checklists and standards,
• considering energy flows through the system,
• considering inherently hazardous materials,
• considering interactions between system components,
• reviewing previous hazard analyses for similar systems,
• reviewing operation specifications,
• considering all environmental factors,
• considering human/machine interfaces,
• considering usage mode changes,
• trying small-scale testing, and theoretical analyses,
• thinking through a worst-case scenario, what-if analysis.
The results from a HAZID analysis are usually presented in a specific HAZID worksheet, identifying the hazards, the causes, the potential consequences, and possible improvements and precautions. In most applications, it is relevant to start with the accidental events. A generic list of hazards may often be useful in supporting a brainstorming process to identify potential accidental events. A number of similar methods with other names are also used. Among these are preliminary hazard analysis (PHA) and rapid risk ranking (RRR).
The preliminary hazard analysis technique was developed by the US Army (MIL-STD-882C 1993), and has been successfully used within defence-related industries, and for safety analysis of engineering processes.
Hazard severity categories are defined to provide a qualitative measure of the worst credible mishap resulting from personnel error, environmental conditions, design inadequacies, procedural deficiencies, or system, sub-system or component failure or malfunction. The starting point of a HAZID worksheet is defining the potential accidental events. The worksheet has a column called ‘hazard category’. In this column, the severity of the potential consequences is ranked. MIL-STD-882 requires that such a column for severity ranking be part of the HAZID (or PHA) worksheet. These hazard severity categories are defined in the Military Standard under the sub-title of ‘hazard severity’, and are presented in Table 5.1. An example of a HAZID worksheet is shown in Table 5.2.
In some cases, it may be relevant to include a more detailed severity ranking, e.g. distinguishing between human, environmental, and process or product consequences. Such a ranking depends on the actual context of the consequences, i.e. on the risk assessment scale, which is given later in Table 5.7.
Table 5.1 Hazard severity ranking (MIL-STD-882C 1993)

Hazard severity category | Description | Mishap definition
II | Critical | Severe injury, severe occupational illness, or major system damage
III | Marginal | Minor injury, minor occupational illness, or minor system damage
IV | Negligible | Less than minor injury, occupational illness, or system damage
Table 5.2 Sample HAZID worksheet

System: Acid separation plant
Subsystem: Precipitation tanks - piping
Drawing: Al-ASP/PT 02-004
Date: Nov 2006    Page: 11 of 32

Reference | Accidental event | Probable causes | Major effects | Consequence/severity | Corrective/preventive action
HC piping | Gas leak/losses | CO2 corrosion | Thin wall cracking | Safety, catastrophic (I) | Ultrasonic corrosion probe
Slurry piping | Containment losses | Pipe wall penetration | Lack of inhibition | Environment, critical (II) | Inhibitor piping injection check
Water piping | Containment losses | External erosion | Damage/water traps | Environment, marginal (III) | Visual inspections
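A HAZID worksheet such as the sample in Table 5.2 maps naturally onto a list of records with one field per column. The sketch below holds the three sample rows in that form and orders them by hazard severity category. It is only an illustration: the class name `HazidEntry`, its field names, and the `rank_by_severity` helper are assumptions, not part of any standard worksheet format.

```python
from dataclasses import dataclass

# Severity categories as used in the worksheet; a lower rank is more severe.
SEVERITY_RANK = {"I": 1, "II": 2, "III": 3, "IV": 4}


@dataclass
class HazidEntry:
    """One worksheet row; field names are hypothetical."""
    reference: str
    accidental_event: str
    probable_cause: str
    major_effect: str
    consequence: str
    severity: str  # hazard severity category, "I" to "IV"
    action: str


# Rows taken from the sample worksheet, entered in arbitrary order.
worksheet = [
    HazidEntry("Water piping", "Containment losses", "External erosion",
               "Damage/water traps", "Environment", "III",
               "Visual inspections"),
    HazidEntry("HC piping", "Gas leak/losses", "CO2 corrosion",
               "Thin wall cracking", "Safety", "I",
               "Ultrasonic corrosion probe"),
    HazidEntry("Slurry piping", "Containment losses", "Pipe wall penetration",
               "Lack of inhibition", "Environment", "II",
               "Inhibitor piping injection check"),
]


def rank_by_severity(entries):
    """Return the entries ordered from most to least severe."""
    return sorted(entries, key=lambda e: SEVERITY_RANK[e.severity])


ranked = rank_by_severity(worksheet)
for entry in ranked:
    print(entry.severity, entry.reference, "->", entry.action)
```

Ordering by severity is one simple way to decide which corrective or preventive actions to review first; a fuller ranking would also weigh likelihood.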
Table 5.3 Categories of hazards relative to various classifications of failure

Hazard category by cause: Stress-related failure; Failure due to misuse; Failure due to damage; Failure due to weakness; Failure due to wear-out; Maintenance failures
Hazard category by effect: Immediate functional; Gradual degradation
Hazard category by consequence: Critical safety; Critical operational; Major functional; Minor functional; Hidden failure; Non-operational
Hazard category by severity: Catastrophic; Critical; Marginal; Negligible
Hazard consequences depend on the cause-effect nature of functional failure within a system as well as system states that define system hazards. The various combinations of the different defining categories of hazards (i.e. by cause, effect, consequence and/or severity), relating to the various classifications of failure, are presented in Table 5.3.
Hazards analysis can take the form of backward analysis, alternatively termed causal analysis, or of forward analysis, alternatively termed consequence analysis, depending on the need to identify causes or consequences respectively. Hazard analysis techniques that use backward search or analysis begin with a hazardous state and then determine the events that could lead to this state. The analysis starts from hazards identified at the process and/or systems level, and identifies their precursors further down in the systems hierarchical structure. Thus, in the case of backward analysis, the analysis is causal and the search is top-down. Conversely, in hazard analysis techniques that use forward search or analysis, in which the next sequences of events that could lead from a hazard to an accident or incident are identified, the analysis is consequential and the search is bottom-up.
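The difference between backward (causal) and forward (consequence) search can be illustrated on a small directed graph of cause-to-effect links. The sketch below is an assumption-laden illustration: the graph contents are hypothetical, loosely inspired by the piping example in the sample worksheet, and the function names are invented for this example. Backward search walks the links against their direction to find precursors; forward search follows them to find possible consequences.

```python
from collections import deque

# Hypothetical cause -> effect links, for illustration only.
CAUSE_EFFECT = {
    "CO2 corrosion": ["thin wall cracking"],
    "lack of inhibition": ["thin wall cracking"],
    "thin wall cracking": ["gas leak"],
    "gas leak": ["fire", "toxic release"],
}


def _invert(graph):
    """Reverse every link so that effects point back to their causes."""
    inverted = {}
    for cause, effects in graph.items():
        for effect in effects:
            inverted.setdefault(effect, []).append(cause)
    return inverted


def _reachable(graph, start):
    """Breadth-first search: every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def backward_search(hazard, graph=CAUSE_EFFECT):
    """Causal analysis: events that could lead to the hazardous state."""
    return _reachable(_invert(graph), hazard)


def forward_search(hazard, graph=CAUSE_EFFECT):
    """Consequence analysis: events that could follow from the hazard."""
    return _reachable(graph, hazard)


causes = backward_search("gas leak")
consequences = forward_search("gas leak")
print(sorted(causes))
print(sorted(consequences))
```

Starting both searches from the same hazardous state, as here, corresponds to the combined cause-consequence view of a hazard.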
Information that is derived from both backward and forward analysis, i.e. cause-consequence analysis, relates to recognisable failed system states that can be used to redesign the system to prevent or minimise the probability of occurrence and/or the severity of the hazard. In order to be able to apply causal analysis techniques, such as fault-tree analysis (FTA), a more detailed specification or model of the behaviour of the system is required at the lower equipment levels of the systems hierarchy (e.g. assembly, sub-assembly and component levels). A high-level system design may appear to be safe, while the detailed design contains hazardous equipment interactions inherent within the system. However, these interactions may not be inherent within a single system and may arise only on the functional interface of equipment belonging to different systems, due to a complex integration of the various systems by design. The hazards and design constraints must be traced right down to the system components where feasible, so that assurance may be provided that the hazards have been eliminated or mitigated. Although theoretically this type of analysis can be comprehensively performed only on the detail design of the system, the practical approach is to progressively structure the design according to systems hierarchical modelling that is continually enhanced as the emerging systems breakdown structure (SBS) becomes more definite with each phase of the engineering design process. The use of object-oriented programming (OOP) simulation modelling that provides graphic displays (preferably animated) of the various systems and equipment of the design enables realistic three-dimensional visualisation of the model appropriate to the system’s domain. The OOP simulation must have the capability of backward and forward processing to accommodate both causal and consequence analysis respectively.
5.2.1 Forward Search Techniques for Safety in Engineering Design
Forward search techniques begin with an initiating event and trace it forwards in time or in effect. At the higher systems levels, the appropriate forward analysis technique is consequence analysis. As indicated previously in Sect. 3.3.2.1 dealing with failure modes and effects analysis, the consequences of failure are associated with the overall results that occur in the system or process as a whole, whereas the effects of failure are associated with the immediate results that initially occur within the assembly’s or component’s environment.
Thus, at the lower systems levels, the appropriate forward analysis techniques are failure modes and effects analysis (FMEA) and failure modes and effects criticality analysis (FMECA) or, in the case of safety analysis, event tree analysis (ETA), hazard and operability (HAZOP) studies, and failure modes and safety effects (FMSE) analysis. The difference between FMSE and FMECA is in the construct of the inductive FMSE spreadsheet that, in addition to the standard columns of an FMECA, includes safety-related aspects such as failure root causes, integrity measures, and inspection methods.
Hazard analysis thus conventionally includes the following deductive and inductive analysis techniques:
• Fault-tree analysis (FTA)
• Root cause analysis (RCA)
• Event tree analysis (ETA)
• Cause-consequence analysis (CCA)
• Hazard and operability studies (HAZOP).
These techniques have been developed for visualising the sequence of events in the operation of a complex engineered system and for estimating the probability of occurrence of the end result. They start either with an expected, unwanted effect (i.e. fault) and work backwards to the logical cause, or with a proposed cause (i.e. event) and proceed forwards through the relevant analysis steps, ending with one end effect or several effects. As indicated in Sect. 3.2.2, backward analysis techniques are deductive, whereas forward analysis techniques are inductive. These are often termed ‘top-down’ or ‘bottom-up’ procedures that emulate the diagrammatical arrangement representing the systems hierarchical structure that is used to define the path from effect to cause, or from cause to effect respectively.
Fault-tree analysis is a typical application of deductive analysis, in which the analysis begins with the system in a hazardous state and then works backwards one step at a time, during which irrelevant branches of possible causes can be omitted, or specific branches of greater significance can be followed further. However, when a reachability graph is applied to visualise the structural extent of backward analysis of process engineering systems, it becomes evident that the graph explodes quickly for complex systems, and itself becomes complex (Leveson 1995). Many of the branches of the structural graph are either incomplete or impossible to pursue, requiring alternative analytic approaches, which will be considered in Sect. 5.3. Figure 5.1 shows the format for a fault-tree analysis that is a ‘top-down’ deductive analysis method, and here the branch points lead to the typical question: ‘What are the conditions that lead to this point?’
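As a rough illustration of how a fault tree is evaluated once its branches have been established, the sketch below computes the probability of a top event from basic-event probabilities. The AND/OR gate representation, the event names and the numbers are all assumptions introduced for this example, and the calculation is valid only if the basic events are independent and no basic event appears in more than one branch.

```python
def gate_probability(node, basic_event_probs):
    """Probability of a fault-tree node.

    A node is either a basic-event name (str) or a tuple (gate, children)
    with gate "AND" or "OR". Assumes independent, non-repeated basic events.
    """
    if isinstance(node, str):
        return basic_event_probs[node]
    gate, children = node
    probs = [gate_probability(child, basic_event_probs) for child in children]
    if gate == "AND":
        # All inputs must occur.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if gate == "OR":
        # At least one input occurs: complement of "none occur".
        none_occur = 1.0
        for p in probs:
            none_occur *= (1.0 - p)
        return 1.0 - none_occur
    raise ValueError(f"unknown gate type: {gate}")


# Hypothetical tree: the top event occurs if both pumps fail or the valve sticks.
tree = ("OR", [("AND", ["pump A fails", "pump B fails"]), "valve stuck"])
probs = {"pump A fails": 0.1, "pump B fails": 0.1, "valve stuck": 0.01}

top_event_probability = gate_probability(tree, probs)
print(round(top_event_probability, 6))
```

When basic events are shared between branches, this simple recursion overcounts or undercounts, and a cut-set-based evaluation is normally used instead.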
Root cause analysis utilises the deductive logic tree approach, similar to fault-tree analysis, in establishing the root causes of a problem, whether it is a functional failure or a system state. Such a logic tree approach to problem solving is particularly useful for determining safety in detail engineering designs. By organising problem analysis results in an orderly manner as the design progresses, the time spent to find the root causes of possible problems is minimised. The method uses factor trees to guide the course of the analysis. These factor trees diagrammatically present the major functions to be considered in the design’s project stages, and provide an excellent method for sorting out facts and zeroing in on the root causes of problems.
Fig. 5.1 Fault-tree analysis (diagram labels: fault, analysis, possible initiating causes)
Event tree analysis, unlike fault-tree analysis, uses inductive logic. This technique is a method for illustrating the sequence of outcomes that may arise after the occurrence of a selected initial event. It is mainly used in consequence analysis for pre-incident and post-incident application. In the event tree diagram, the left side connects with the initiator and the right side with the plant damage state; the top defines the systems; nodes (dots) call for branching probabilities obtained from the system analysis. If the path goes up at a node, the system succeeded; if it goes down, the system failed. Event trees have been applied in the nuclear industries for operability analysis of nuclear power plant as well as for accident sequence in the Three Mile Island nuclear power generator accident (INPO 84-027 1984).
Figure 5.2 shows an event tree format classified as a ‘bottom-up’ inductive analysis method. Here, the branch points follow a YES or NO criterion for a specific question of the type ‘is valve V1 closed?’.
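The branching logic of an event tree can be sketched by enumerating every YES/NO path that follows an initiating event and multiplying the branch probabilities along it. The following is a minimal sketch under stated assumptions: the branch outcomes are treated as independent, every branch is expanded (practical event trees often prune paths), and the system names, the function name `event_tree_outcomes` and all probabilities are hypothetical.

```python
from itertools import product


def event_tree_outcomes(p_initiator, systems):
    """Probability of every success/failure path after an initiating event.

    systems is a list of (name, probability_of_success) pairs, in the order
    in which they are challenged. Branches are assumed independent.
    Returns a dict keyed by a tuple of booleans (True = YES/success).
    """
    outcomes = {}
    for states in product([True, False], repeat=len(systems)):
        p = p_initiator
        for (_, p_success), succeeded in zip(systems, states):
            p *= p_success if succeeded else (1.0 - p_success)
        outcomes[states] = p
    return outcomes


# Hypothetical initiating event probability and branch questions.
systems = [("valve V1 closes", 0.99), ("relief system operates", 0.95)]
outcomes = event_tree_outcomes(0.01, systems)

for states, p in outcomes.items():
    path = ", ".join(
        f"{name}: {'YES' if ok else 'NO'}"
        for (name, _), ok in zip(systems, states)
    )
    print(f"{path} -> {p:.6f}")
```

The path probabilities sum to the probability of the initiating event, which is a useful consistency check on any event tree.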
Cause-consequence analysis is a combination of deductive analysis and of inductive analysis. This technique combines cause analysis (described by fault trees) and consequence analysis (described by event trees). The purpose of cause-consequence analysis is to identify chains of events that can result in undesirable consequences. With the probabilities of the various events in the CCA diagram, the probabilities of the various consequences can be calculated, thus establishing the risk level of the system. This technique was developed by RISO Laboratories in Denmark to be used in risk analysis of nuclear power stations (Aven 1992). It can also be adapted for process engineering in the estimation of the safety of protective systems. Figure 5.3 shows a layout of a cause-consequence analysis that is both a ‘top-down’ deductive analysis and a ‘bottom-up’ inductive analysis.
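A very small numerical illustration of this combination is given below: the cause side uses fault-tree style OR logic to obtain the probability of the initiating event, and the consequence side uses a single event-tree branch on a protective system. The event names and probabilities are hypothetical, and independence of all events is assumed.

```python
# Cause side (fault-tree logic): the initiating event occurs if either
# hypothetical cause occurs. Independence of the causes is assumed.
p_causes = {"corrosion": 0.02, "overpressure": 0.01}

p_no_cause = 1.0
for p in p_causes.values():
    p_no_cause *= (1.0 - p)
p_initiating = 1.0 - p_no_cause

# Consequence side (event-tree logic): one branch on a protective system,
# with a hypothetical probability that it works on demand.
p_protection_works = 0.9
consequences = {
    "contained leak": p_initiating * p_protection_works,
    "uncontained release": p_initiating * (1.0 - p_protection_works),
}

for name, p in consequences.items():
    print(f"{name}: {p:.5f}")
```

A full cause-consequence diagram would have several causes and several branching systems; the principle of multiplying along each chain of events stays the same.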
These tree-based methods are used mainly to find cut sets leading to the undesired events. Fault trees and event trees have been widely used to quantify the probabilities of occurrence of accidents and other undesired events leading to the loss of life or economic losses in probabilistic risk assessment. However, use of fault tree and event tree analysis techniques is usually confined to static, logic modelling of accident scenarios, and does not cover risk assessment for dynamic systems (Siu 1994).
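A cut set is a combination of basic events whose joint occurrence causes the undesired top event, and a minimal cut set is one from which no event can be removed without losing that property. The sketch below derives minimal cut sets from a small fault tree by top-down expansion of its gates. It is an illustrative sketch only: the AND/OR tuple representation, the function names and the example tree are assumptions, and the brute-force expansion is suitable for small trees only.

```python
def cut_sets(node):
    """Cut sets of a fault-tree node, as a list of frozensets.

    A node is a basic-event name (str) or a tuple (gate, children) with
    gate "AND" or "OR".
    """
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any input is a cut set of the gate.
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":
        # Combine one cut set from every input.
        combined = [frozenset()]
        for sets in child_sets:
            combined = [a | b for a in combined for b in sets]
        return combined
    raise ValueError(f"unknown gate type: {gate}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = set(cut_sets(node))
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree: both pump trains must fail; each train fails on
# loss of power or on failure of its own pump.
tree = ("AND", [
    ("OR", ["power loss", "pump A fails"]),
    ("OR", ["power loss", "pump B fails"]),
])

mcs = minimal_cut_sets(tree)
for cs in sorted(mcs, key=len):
    print(sorted(cs))
```

In this example the shared event "power loss" appears as a single-event minimal cut set, which is the kind of common-cause weakness that cut-set analysis is meant to expose.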
Methodologies for the analysis of dynamic systems include techniques such as digraphs or fault graphs, dynamic event logic, as well as Markov modelling, which will be considered later. In giving the same treatment to process failures and human errors in fault tree and event tree analysis, the conditions affecting human behaviour cannot be modelled explicitly. This affects the assessed level of dependency between events. Relatively new techniques such as human cognitive reliability