Modelling and Optimizing Maintenance Intervals

Part of the document Complex System Maintenance Handbook (pages 109-115)

A wide range of general models and methods for maintenance optimization have been proposed, e.g., see Rausand and Høyland (2004), Pierskalla and Voelker (1979), Valdez-Florez and Feldman (1989), Cho and Parlar (1991), Gertsbakh (2000) and Wang (2002). A large number of models and methods for specific applications have also been developed, e.g., see Vatn and Svee (2002), Chang (2005), Castanier and Rausand (2006), and Welte et al. (2006). In this section we present the basic elements required to optimize the maintenance interval, τ, and propose a standard procedure for setting up the cost function, C(τ).

A computerized tool called OptiRCM has been developed by the authors to support the RCM procedure presented in this chapter. OptiRCM is currently being used by the Norwegian National Railway (NSB). The Norwegian National Rail Administration (JBV) has also adopted the same procedure. OptiRCM imports the FMECA results generated by Steps 6 and 7 of the RCM analysis process. Cost information is usually not available in the FMECA; hence information about preventive and corrective maintenance costs must be provided separately. A screen presenting the information on the MSI level is shown in Figure 4.6.

OptiRCM uses a procedure with three steps to optimize maintenance intervals:

(i) the component performance is established (left-hand part of Figure 4.6), (ii) the system model is established (centre part of Figure 4.6), and (iii) the total cost is calculated (right-hand part of Figure 4.6).

4.4.1 Component Model

The aim of the component model is to establish the effective failure rate with respect to a specific failure mode, λE(τ), as a function of the maintenance interval τ. The effective failure rate is the unconditional expected number of failures per time unit for a given maintenance level. Typically, the effective failure rate is an increasing function of τ. A large number of models for determining the effective failure rate as a function of the maintenance strategies, the degradation models, and so on, have been proposed in the literature.

Figure 4.6. OptiRCM input and analysis screen

The interpretation of the effective failure rate is not straightforward for hidden functions. For such functions we also need to specify the rate at which the hidden function is demanded. In this situation we may approximate the effective failure rate by the product of the demand rate and the probability of failure on demand (PFD) for the hidden function.
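As a minimal sketch of this approximation, the snippet below multiplies a demand rate by a PFD. The PFD expression λ⋅τ/2 for a periodically tested item with a constant hidden failure rate is a common textbook approximation (reasonable when λ⋅τ is small) and is an assumption here, not something stated in this section. All function and parameter names are illustrative.

```python
def pfd_periodic_test(hidden_failure_rate, tau):
    """Approximate average PFD of an item tested every tau time units.

    Assumption: constant hidden failure rate and hidden_failure_rate * tau << 1.
    """
    return hidden_failure_rate * tau / 2.0


def effective_failure_rate_hidden(demand_rate, hidden_failure_rate, tau):
    """Effective failure rate ~ demand rate * PFD (per time unit)."""
    return demand_rate * pfd_periodic_test(hidden_failure_rate, tau)


# Illustrative numbers only: 2 demands per year, 0.01 hidden failures per year,
# function test once per year.
rate = effective_failure_rate_hidden(demand_rate=2.0, hidden_failure_rate=0.01, tau=1.0)
print(rate)
```

Doubling the test interval doubles the PFD under this approximation, and therefore also the effective failure rate.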

In the following we indicate models that may be used for modelling the effective failure rate, and we refer to the literature for details. The aim of OptiRCM has been:

• To cover the standard situations, with respect to both evident/hidden failures and the type of failure progression.

• To provide formulae that do not require too many reliability parameters to be specified.

• To limit the number of probabilistic models used as a basis for the optimization.

Only the Weibull distribution is used to model aging failures in OptiRCM. There may, of course, be situations where another distribution would be more realistic, but our experience is that the users of such a tool rarely have data or insight that would allow them to do better than applying the Weibull model.

4.4.1.1 Effective Failure Rate in the Situation of Aging

A standard block replacement policy is considered where an aging component is periodically replaced after intervals of length τ. Upon a failure in one interval, the component is replaced without affecting the next planned replacement. The effective failure rate, i.e., the average number of failures per time unit, is then given by λE(τ) = W(τ)/τ, where W(τ) is the renewal function (e.g., see Rausand and Høyland 2004). Approximation formulas for the effective failure rate exist if we assume Weibull distributed failure times (e.g., see Chang et al. 2006). OptiRCM uses the renewal equation to establish an iterative scheme for the effective failure rate based on an initial approximation.
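To illustrate how λE(τ) = W(τ)/τ can be evaluated numerically, the sketch below discretizes the renewal equation W(t) = F(t) + ∫ W(t − x) dF(x) on a regular grid for Weibull distributed failure times. This is a plain first-order discretization chosen for brevity; it is not the iterative scheme implemented in OptiRCM, and the function names, the Weibull scale parameterization and the step count are assumptions made for the example.

```python
import math


def weibull_cdf(t, shape, scale):
    """Weibull distribution function F(t) = 1 - exp(-(t/scale)^shape)."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def renewal_function(tau, shape, scale, steps=500):
    """Approximate W(tau) by discretizing the renewal equation on a grid."""
    h = tau / steps
    F = [weibull_cdf(i * h, shape, scale) for i in range(steps + 1)]
    W = [0.0] * (steps + 1)
    for i in range(1, steps + 1):
        # Convolution term: sum of W(t_i - t_j) * [F(t_j) - F(t_{j-1})].
        # The j = i term is omitted because W(0) = 0.
        conv = sum(W[i - j] * (F[j] - F[j - 1]) for j in range(1, i))
        W[i] = F[i] + conv
    return W[steps]


def effective_failure_rate(tau, shape, scale, steps=500):
    """Effective failure rate under block replacement: W(tau) / tau."""
    return renewal_function(tau, shape, scale, steps) / tau
```

For a shape parameter of 1 the failure times are exponential and W(τ) = τ/scale exactly, which gives a simple check of the discretization error. For shape parameters above 1 and intervals up to roughly the characteristic life, the computed rate grows with τ, as expected for an aging component.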

4.4.1.2 Effective Failure Rate in the Situation of Gradual Observable Failure Progression

The assumption behind this situation is that the failure progression, say Y(t), can be observed as a function of time. In the simplest situation Y(t) is one-dimensional, whereas in more complex situations Y(t) may be multidimensional. We may also have situations where Y(t) denotes some kind of signal where, for example, the fast Fourier transform of the signal is available. In OptiRCM a very simple situation is considered, where Y(t) is monotonically increasing. As Y(t) increases, the probability of failure also increases, and at a predefined level (maintenance limit), say l, the component is replaced or overhauled. The effective failure rate, λE(τ,l), is now a function of both the inspection interval and the maintenance limit. In OptiRCM a Markov chain model is used to model the failure progression (e.g., see Welte et al. 2006 for details of the Markov chain modelling, and also an extension where it is possible to reduce the inspection intervals as we approach the maintenance limit). In the Markov chain model it is easy to treat the situation where Y(t) is a nonlinear function of time. If we restrict ourselves to linear failure progression, continuous models such as the Wiener and gamma processes may also be used.
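The dependence of λE(τ,l) on both the inspection interval and the maintenance limit can be illustrated with a toy discrete-time Markov chain. The sketch below is not the OptiRCM or Welte et al. (2006) model: it assumes that degradation moves up one level per time step with a fixed probability, that inspections are perfect, and that replacement is instantaneous. It follows one component from new until replacement and uses the renewal-reward argument, i.e., rate = P(cycle ends in failure) / E[cycle length]. All names and parameters are hypothetical.

```python
def effective_failure_rate_markov(p_step, n_states, limit, tau, eps=1e-12):
    """Toy model: working states 0..n_states-1, failure on leaving the last one.

    p_step : probability of moving one degradation level up per time step
    limit  : maintenance limit l; at an inspection, states >= limit are replaced
    tau    : inspection interval, in time steps
    """
    dist = [0.0] * n_states   # probability mass still in each working state
    dist[0] = 1.0             # the component starts as new
    t = 0
    p_fail = 0.0              # probability that the cycle ends in a failure
    exp_len = 0.0             # expected cycle length (time to replacement)
    while sum(dist) > eps:
        t += 1
        new = [0.0] * n_states
        for s, mass in enumerate(dist):
            new[s] += mass * (1.0 - p_step)
            if s + 1 < n_states:
                new[s + 1] += mass * p_step
            else:
                # Leaving the last working state is a failure (corrective replacement).
                p_fail += mass * p_step
                exp_len += mass * p_step * t
        dist = new
        if t % tau == 0:
            # Inspection: preventive replacement at or above the maintenance limit.
            for s in range(limit, n_states):
                exp_len += dist[s] * t
                dist[s] = 0.0
    return p_fail / exp_len
```

Setting the limit equal to the number of states switches preventive replacement off, in which case the rate reduces to one over the mean time to failure. Shortening the inspection interval or lowering the limit reduces the rate at the price of more frequent replacements.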

4.4.1.3 Effective Failure Rate in the PF Model

The assumption behind the PF model is that failure progression is not observable for a rather long time, and then at some point in time we have a rather fast failure progression. This is the typical situation for cracks (potential failures) that can be initiated after a large number of load cycles. The cracks may develop rather fast, and it is important to detect them before they develop into breakages. The time from when a crack becomes observable until a failure (breakage) occurs is denoted the PF-interval. The important reliability parameters are the rate of potential failures, the mean and standard deviation of the PF-interval, and the coverage of the inspection method. The model implemented in OptiRCM for the PF situation is described in Vatn and Svee (2002). See also Castanier and Rausand (2006) for a similar approach, and the more general application of delay time models (Christer and Waller 1984).
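A rough feel for how these parameters interact can be obtained from a simplified calculation. The sketch below is not the Vatn and Svee (2002) model. It assumes a deterministic PF-interval (standard deviation zero), a potential failure that appears at a uniformly distributed point within the inspection cycle, and independent inspections that each detect an observable crack with probability equal to the coverage. Function and parameter names are hypothetical.

```python
import math


def prob_undetected(pf_interval, tau, coverage):
    """Probability that a potential failure escapes all inspections.

    With a fixed PF-interval, the number of inspections falling inside it is
    floor(pf_interval / tau), plus one more with probability equal to the
    fractional part of pf_interval / tau.
    """
    ratio = pf_interval / tau
    n = math.floor(ratio)
    frac = ratio - n
    miss = 1.0 - coverage
    return (1.0 - frac) * miss ** n + frac * miss ** (n + 1)


def effective_failure_rate_pf(potential_failure_rate, pf_interval, tau, coverage):
    """Effective failure rate ~ rate of potential failures * P(not detected)."""
    return potential_failure_rate * prob_undetected(pf_interval, tau, coverage)
```

Under these assumptions, an inspection interval twice as long as the PF-interval lets half of the potential failures develop into failures even with perfect coverage, which is why the interval is normally chosen well below the PF-interval.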

4.4.2 System Model

Figure 4.4 shows a simplified model of the risk picture related to the component failure being analyzed. In order to quantify the risk related to safety, we need the following input data:

• The effective failure rate, λE(τ)

• The probability that the other barriers against the TOP event with respect to safety all fail, PTE−S

• The probability, PCj, that the TOP event results in consequence class Cj, for each consequence class j.

Table 4.2. PLL and cost contributions for each consequence class

Consequence             PLLj = PLL contribution   SCj = Cost (Euro)
C1: Minor injury        0.01                      2,000
C2: Medical treatment   0.05                      30,000
C3: Permanent injury    0.1                       300,000
C4: 1 fatality          0.7                       1,600,000
C5: 2–10 fatalities     4.5                       13,000,000
C6: >10 fatalities      30                        160,000,000

The frequency of the consequence class Cj is given by

Fj =λE(τ)⋅PTE−S⋅PCj (4.1)

where PCj is the probability that the TOP event results in consequence class Cj. We will later indicate how we can model Equation 4.1 as a function of the maintenance interval, τ.

In some situations we also assign a cost and/or a PLL (potential loss of life) contribution to the various consequence classes. PLL denotes the annual, statistically expected number of fatalities in a specified population. Proposed values adopted by the Norwegian National Rail Administration are given in Table 4.2. See the discussion by Vatn (1998) regarding what it means to assign monetary values to safety.

The total PLL contribution related to the component failure being analyzed is then

PLL = PTE−S ⋅ ∑_{j=1}^{6} (PCj ⋅ PLLj) ⋅ λE(τ) (4.2)

and the total cost contribution related to the component is

CS = PTE−S ⋅ ∑_{j=1}^{6} (PCj ⋅ SCj) ⋅ λE(τ) (4.3)

where SCj is the safety cost of consequence class Cj.

Note that in the FMECA analysis we can have an automatic procedure that calculates both the PLL contribution and the safety cost contribution based on the reliability parameters, and the type of TOP event.
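Equations 4.1 to 4.3 amount to a few weighted sums. The sketch below evaluates them with the PLL and cost values of Table 4.2 and, as an example, the FIRE probabilities of Table 4.3 together with PTE−S = 0.0005 from the pump example in Section 4.4.3. The function and variable names are illustrative and are not taken from OptiRCM.

```python
# Table 4.2: PLL contribution and safety cost (Euro) for classes C1..C6
PLL_CONTRIB = [0.01, 0.05, 0.1, 0.7, 4.5, 30]
SAFETY_COST = [2_000, 30_000, 300_000, 1_600_000, 13_000_000, 160_000_000]

# Table 4.3: generic probabilities PC1..PC6 for the TOP event FIRE
P_CONSEQ_FIRE = [0.1, 0.2, 0.2, 0.1, 0.02, 0.005]


def consequence_frequencies(lam_e, p_te_s, p_conseq):
    """Equation 4.1: Fj = lambda_E * P_TE-S * PCj for each class j."""
    return [lam_e * p_te_s * p for p in p_conseq]


def pll_contribution(lam_e, p_te_s, p_conseq, pll=PLL_CONTRIB):
    """Equation 4.2: PLL = P_TE-S * sum_j(PCj * PLLj) * lambda_E."""
    return p_te_s * sum(p * w for p, w in zip(p_conseq, pll)) * lam_e


def safety_cost(lam_e, p_te_s, p_conseq, cost=SAFETY_COST):
    """Equation 4.3: CS = P_TE-S * sum_j(PCj * SCj) * lambda_E."""
    return p_te_s * sum(p * c for p, c in zip(p_conseq, cost)) * lam_e


# Safety cost per unit of effective failure rate for the pump example
print(safety_cost(1.0, 0.0005, P_CONSEQ_FIRE))
```

With an effective failure rate of 1, the last call returns the coefficient that multiplies λE(τ) in Equation 4.3, which is about 643 Euros for the FIRE event.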

In the same way as we have done for safety consequences, we proceed with punctuality or unavailability costs. Here we simplify, and assume that there exists a fixed (expected) cost for each TOP event for punctuality, say PC(TOP). The punctuality cost per time unit is then

CP = PTE−P ⋅ PC(TOP) ⋅ λE(τ) (4.4)

This procedure may, if required, be repeated for other dimensions like environment, material damage, and so on.

4.4.3 Total Cost and Interval Optimization

The approach to interval optimization is based on minimizing the total cost related to safety, punctuality, availability, material damage, etc. Within an ALARP regime (e.g., see Vatn 1998) this requires that the risk is not unacceptable. Assuming that risk is acceptable, we proceed by calculating the total cost per time unit:

C(τ) = CS(τ) + CP(τ) + CPM(τ) + CCM(τ) (4.5)

where CS(τ) and CP(τ) are given by Equations 4.3 and 4.4, respectively. Further,

CPM(τ)=PMCost/τ (4.6)

where PMCost is the cost per preventive maintenance activity. Note that for condition-based tasks we distinguish between the cost of monitoring the item and the cost of physically improving the item by some restoration or renewal activity. This complicates Equation 4.6 slightly because we have to calculate the average number of renewals.

Further, if CMCost is the cost of a corrective maintenance activity, we have

CCM(τ)=CMCost⋅λE(τ) (4.7)

Table 4.3. Generic probabilities, PCj, of consequence class Cj for the different TOP events

TOP event                                      PC1    PC2    PC3    PC4    PC5    PC6
Derailment                                     0.1    0.1    0.1    0.1    0.05   0.01
Collision train-train                          0.02   0.03   0.05   0.5    0.3    0.1
Collision train-object                         0.1    0.2    0.3    0.15   0.01   0.001
Fire                                           0.1    0.2    0.2    0.1    0.02   0.005
Passengers injured or killed at platforms      0.3    0.3    0.2    0.05   0.01   0.001
Persons injured or killed at level crossings   0.1    0.2    0.3    0.3    0.09   0.01
Persons injured or killed in or at the track   0.2    0.2    0.2    0.3    0.1    0.0001

To find the optimum maintenance interval we can now calculate C(τ) in Equation 4.5 for various values of the maintenance interval, τ, and then choose the τ value that minimizes C(τ).

As a numerical example we consider a pump used for oil cooling of the main high-voltage transformer in a locomotive. The relevant figures in the example have been assessed by experts in the Norwegian National Railway. Upon failure of the oil pump, the TOP event for punctuality will most likely be a FULL STOP, with probability PTE−P = 0.75 for this punctuality consequence. A full stop is considered to give an average delay of 15 min, and the cost of 1 min of delay is set to 150 Euros. The potential TOP event for safety is a FIRE, but the likelihood is very small, i.e., PTE−S = 0.0005. The reliability parameters of the pump are the aging parameter, α = 3.5, and the mean time to failure without any preventive maintenance, MTTF = 10 million km. To calculate the safety cost we find

∑_j (PCj ⋅ SCj) = 1.286 million Euros

by combining Tables 4.2 and 4.3. Equation 4.3 thus reads CS(τ) = 643 ⋅ λE(τ).

The cost of a full stop is PC(TOP) = 15 ⋅ 150 = 2250 Euros, so the punctuality cost in Equation 4.4 is similarly given by CP(τ) = 0.75 ⋅ 2250 ⋅ λE(τ) ≈ 1688 ⋅ λE(τ). For PM and CM cost we have PMCost = 3100 Euros and CMCost = 4400 Euros, respectively. Chang et al. (2006) argue that a good approximation for the effective failure rate is

λE(τ) = (Γ(1 + 1/α) / MTTF)^α ⋅ τ^(α−1) ⋅ [1 − 0.1 ⋅ α ⋅ τ²/MTTF² + (0.09 ⋅ α − 0.2) ⋅ τ/MTTF] (4.8)

The total cost C(τ) in Equation 4.5 can now be found as a function of τ; see Figure 4.7 for a graphical illustration. The optimum interval is found to be 7.5 million km. The maintenance action is scheduled replacement of the pump; see Figure 4.5.
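The pump example can be reproduced approximately with a simple grid search over τ. The sketch below evaluates Equations 4.3 to 4.8 with the figures given above, with τ and MTTF in million km and costs in Euros per million km. The grid limits, the step of 0.1 million km and all function names are choices made for this illustration; the tool itself may search differently.

```python
import math

ALPHA = 3.5        # Weibull aging parameter
MTTF = 10.0        # mean time to failure without PM, million km
PM_COST = 3100.0   # Euros per preventive replacement
CM_COST = 4400.0   # Euros per corrective replacement

# Equation 4.3: P_TE-S * sum_j(PCj * SCj), Euros per failure (FIRE)
SAFETY_COST_PER_FAILURE = 0.0005 * 1_286_200.0
# Equation 4.4: P_TE-P * PC(TOP), with PC(TOP) = 15 min * 150 Euros/min
PUNCTUALITY_COST_PER_FAILURE = 0.75 * 15 * 150.0


def effective_failure_rate(tau):
    """Equation 4.8: approximate failures per million km for interval tau."""
    x = tau / MTTF
    base = (math.gamma(1.0 + 1.0 / ALPHA) / MTTF) ** ALPHA * tau ** (ALPHA - 1.0)
    return base * (1.0 - 0.1 * ALPHA * x ** 2 + (0.09 * ALPHA - 0.2) * x)


def total_cost(tau):
    """Equation 4.5: C = CS + CP + CPM + CCM, Euros per million km."""
    lam = effective_failure_rate(tau)
    per_failure = SAFETY_COST_PER_FAILURE + PUNCTUALITY_COST_PER_FAILURE + CM_COST
    return PM_COST / tau + per_failure * lam


def optimal_interval(lo=1.0, hi=10.0, step=0.1):
    """Return the grid point in [lo, hi] with the lowest total cost."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return min(grid, key=total_cost)


tau_opt = optimal_interval()
print(tau_opt, total_cost(tau_opt))
```

On this grid the minimum falls between 7 and 8 million km, in line with the optimum read from Figure 4.7. The cost curve is flat around the minimum, so moderate deviations from the optimal interval change the total cost only slightly.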

Figure 4.7. Cost elements as a function of the maintenance interval
