The 4 standard failure models to be used in maintenance optimization, with focus on state modelling. Professor Jørn Vatn
Situations and maintenance tasks • Observable gradual failure progression: inspect at regular intervals (or with shorter and shorter intervals); replace when degradation is high • Observable “sudden” failure progression: inspect at regular intervals; replace if failure progression is detected • Non-observable failure progression: replace based on age • Shock: perform functional test to identify hidden failures
Examples: observable gradual failure progression • The brake discs on a train • The wear on a railway rail • The corrosion on a pipe • Cracks in an airplane structure • The level of degradation determines the next inspection, and whether a repair action is required
Examples: observable “sudden” failure progression • Cracks in a train wheel • Insulation resistance in a signalling cable
Multistate systems • Multistate systems are described by performance measures • We use a state variable, Y(t), to describe the state of the system at time t, e.g., • Performance (pump capacity, compressor efficiency etc) • For binary systems Y(t) reduces to take only the values 0 and 1; Y(t) = 1 represents a functioning state, and Y(t) = 0 represents a fault state • Y(t) is a random quantity, i.e. expressed in probabilistic terms, involving model parameters
Content of the state variable Y(t) • Y(t) was introduced as a performance variable • However, we will let Y(t) be more general: Y(t) will be used to express the state of the system at time t, i.e., • the direct performance of the system, capacities etc., or • a direct measure of wear, or • an indication of wear or increased failure probability • We use W(t) as a general quantity that is simply related to degradation of the system:
Degradation quantities of interest • $W_P(t)$: Quantities that are direct performance measures ($!!!) • E.g., the pumping capacity of a pump • $W_I(t)$: Quantities that are only indicators of the degradation of the component • E.g., the bearing temperature • $W_D(t)$: Quantities that represent measurable degradation • Examples are crack shape and size, corrosion level, geometrical defects (including wear) • $W_S(t)$: Stressors that influence the degradation process • Examples could be cyclic loads and a corrosive medium • The stressors themselves do not measure the likelihood of failure, but are important for forecasting the failure progression • $W_P(t)$, $W_I(t)$ and $W_D(t)$ will be modelled (probabilistically) by the state variable Y(t)
Challenges in failure modelling • How to measure Y(t)? • For quantities that could be measured: • Use the quantity directly, e.g., crack length • Transformations, for example FFT (Fast Fourier Transform) • Non-measurable quantities • Define patterns for similarity comparison • What is the relation between the readings from the measurements and the real physical state? • Reliability of the measurement techniques • How to model failure (fixed failure limits rarely exist) • To model failure, we generally specify the failure probability as a function of the value of the state variable, i.e., p = p(y) • A simplification would be to assume that a failure occurs the first time the state variable reaches a fixed limit (failure limit)
Purpose of modelling – binary systems • We want to establish a mathematical model describing the relation between • the effective failure rate, $\lambda_E$, and • the maintenance, i.e., • the inspection interval, $\tau$, and • the intervention level, $l$ • $\lambda_E = \lambda_E(\tau, l)$ • Establish a cost model: • PM cost $\propto$ (inspection interval)$^{-1}$ • Renewal cost increases with a restrictive intervention level • CM cost/unavailability cost increases with increasing inspection interval • CM cost/unavailability cost decreases with a restrictive intervention level
Classes of probabilistic models used • PF model • Failure progression is defined between a potential failure (P) and a failure (F) • The Wiener process • During an arbitrary time interval $\Delta t$, the “failure progression” is increased by a normally distributed quantity with mean $\mu \Delta t$ and variance $\sigma^2 \Delta t$ • A failure occurs the first time the failure progression passes the critical value • The Gamma process • Similar to the Wiener process, but the increments are gamma distributed • The shock model • The system is exposed to shocks, and each shock causes a damage $X_i$ • When the accumulated damage increases, so does the failure probability • The Markov state model • The failure progression is approximated by a discrete set of states • The transitions between the states are assumed to follow a Markov process • The model is very flexible, and allows for modelling a large range of situations
The PF model • The objective of the inspection is to detect, e.g., a crack (potential failure) before it develops into a breakage (critical failure) • The time from when a crack is detectable (P) until, e.g., the rail breakage is a fact (F) is denoted the PF interval
Variation in the PF interval • The length of the PF interval is assumed to vary from case to case • cracks can be initiated in different places of the component • crack propagation depends on several different factors such as load, structure quality, temperature etc. • The cracks that propagate very fast represent the largest risk of not being detected by the ultrasonic inspection • The objective of the modelling is to obtain the probability, $Q$, of not detecting the crack in due time as a function of the inspection interval $\tau$ • $Q = Q(\tau)$
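As a rough illustration of how $Q(\tau)$ behaves, the sketch below makes three assumptions that are not stated on the slides: inspections always detect a detectable crack, the potential failure P arises at a time uniformly distributed within an inspection interval, and the PF interval is exponentially distributed with a hypothetical mean `mean_pf`. Under these assumptions $Q(\tau) = \frac{1}{\tau}\int_0^\tau F_{PF}(x)\,dx$, which has a closed form; the code compares it with a midpoint-rule integration.

```python
import math

def q_closed_form(tau, mean_pf):
    """Q(tau) for an exponential PF interval and perfect inspections (assumed)."""
    return 1.0 - (mean_pf / tau) * (1.0 - math.exp(-tau / mean_pf))

def q_numeric(tau, mean_pf, n=10_000):
    """Midpoint-rule evaluation of (1/tau) * integral_0^tau F_PF(x) dx."""
    h = tau / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h                      # time left until the next inspection
        total += 1.0 - math.exp(-x / mean_pf)  # Pr(PF interval < x)
    return total * h / tau

if __name__ == "__main__":
    # Hypothetical numbers: mean PF interval of 12 months, three candidate intervals.
    for tau in (1.0, 3.0, 6.0):
        print(tau, q_closed_form(tau, mean_pf=12.0), q_numeric(tau, mean_pf=12.0))
```

With a different PF-interval distribution only `q_numeric` needs to change, by swapping in the corresponding distribution function.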
The shock model • The shocks represent $W_S(t)$ • The magnitude of the shock also represents $W_S(t)$ • The impact $X_i$ represents $W_D(t)$
Model assumptions • The state variable, Y(t), describes the state of the system at time t; Y(t) is a random quantity • The state variable could take one of the values $y_0, y_1, \ldots, y_r$ • The values could either be numerical, or a qualitative description of a state or phenomenon • The system starts in state $y_0$, and jumps to a higher state ($y_i$ to $y_{i+1}$) with a time-independent intensity $\lambda_i$ • There is generally a cost associated with being in state $y_i$ • The system fault state is $y_r$ • The system is inspected at intervals of length $\tau$ (offline) • The system is renewed if $Y(t) \ge y_l$ at an inspection
Markov differential equations • Introduce $P_i(t) = \Pr(\text{the system is in state } i \text{ at time } t)$ • Consider the change in a small time interval $\Delta t$: • Standard Markov considerations give: $P_i(t+\Delta t) = P_i(t)(1 - \lambda_i \Delta t) + P_{i-1}(t)\lambda_{i-1}\Delta t$ (*) • Equation (*) could now be used to obtain the state probabilities, $P_i(t)$, as a function of time by numerical integration
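Rearranging equation (*) and letting $\Delta t \to 0$ gives the differential form that the numerical integration approximates, with $\lambda_i$ the transition intensity out of state $i$. The boundary conventions $\lambda_{-1} = 0$ (nothing enters state 0) and $\lambda_r = 0$ (the fault state is absorbing) are assumptions added here to cover the end states.

```latex
\begin{align*}
\frac{P_i(t+\Delta t) - P_i(t)}{\Delta t}
  &= -\lambda_i P_i(t) + \lambda_{i-1} P_{i-1}(t) \\
\frac{dP_i(t)}{dt}
  &= -\lambda_i P_i(t) + \lambda_{i-1} P_{i-1}(t),
  \qquad i = 0, 1, \ldots, r
\end{align*}
```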
The easy situation: no maintenance • If no maintenance is carried out then • integrate equation (*) • starting from the initial state • Mean time to failure is given by: • $\mathrm{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty [1 - P_r(t)]\,dt$ • in fact a sum … • To verify our calculations we should compare with the analytical result: • $\mathrm{MTTF} = \sum_{i=0}^{r-1} \mathrm{MTTF}_i = \sum_{i=0}^{r-1} 1/\lambda_i$
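The no-maintenance check can be sketched in a few lines. This is a minimal illustration, not the spreadsheet's implementation: it applies equation (*) as a forward Euler step, accumulates $[1 - P_r(t)]\,\Delta t$ as the "sum", and compares the result with $\sum 1/\lambda_i$. The three transition rates are hypothetical.

```python
def euler_step(P, lam, dt):
    """Advance the state probabilities one step using equation (*)."""
    r = len(P) - 1
    # Work from the highest state down so P[i - 1] still holds its old value.
    for i in range(r, 0, -1):
        P[i] = P[i] * (1.0 - lam[i] * dt) + P[i - 1] * lam[i - 1] * dt
    P[0] = P[0] * (1.0 - lam[0] * dt)

def mttf_numeric(rates, dt=0.001, tol=1e-9):
    """Sum R(t) * dt = [1 - P_r(t)] * dt until the survival probability is negligible."""
    lam = list(rates) + [0.0]        # the fault state y_r is absorbing
    P = [1.0] + [0.0] * len(rates)   # the system starts in state y_0
    mttf = 0.0
    while 1.0 - P[-1] > tol:
        mttf += (1.0 - P[-1]) * dt
        euler_step(P, lam, dt)
    return mttf

if __name__ == "__main__":
    rates = [0.5, 1.0, 2.0]                    # hypothetical lambda_0 .. lambda_{r-1}
    analytical = sum(1.0 / x for x in rates)   # sum of the mean sojourn times
    print(mttf_numeric(rates), analytical)
```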
Calculation procedure: with maintenance • The system is inspected at intervals of length $\tau$ • The system is renewed if $Y(t) \ge y_l$ at an inspection (Fig.) • The model is integrated as before, but when $t$ equals $\tau, 2\tau, 3\tau, \ldots$ special considerations are necessary • Procedure • 1. Define the initial conditions: $P_0(0) = 1$, $P_i(0) = 0$, $i > 0$ • 2. Set $f = 0$, $t = 0$, $\Delta t$ = sufficiently small • 3. Integrate Equation (*) one step, and let $t = t + \Delta t$ • 4. Let $f = f + P_r(t)$, then return the failed probability mass to the initial state: $P_0(t) = P_0(t) + P_r(t)$, $P_r(t) = 0$ • 5. If $t = \tau, 2\tau, 3\tau, \ldots$, then let $P_0(t) = P_0(t) + \sum_{i \ge l} P_i(t)$, and $P_i(t) = 0$, $i \ge l$ • 6. Loop to Step 3 until $t$ is sufficiently large • 7. System failure frequency now equals $\lambda_E(\tau, l) = f/t$
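The procedure can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not the author's spreadsheet code: inspections are taken to be perfect, failed probability mass is returned to state 0 in every step, and the function name, step size and horizon are choices made here.

```python
def effective_failure_rate(rates, tau, l, dt=0.01, t_max=1000.0):
    """Return (lambda_E, renewal rate) for inspection interval tau and intervention level l."""
    r = len(rates)
    lam = list(rates) + [0.0]          # fault state y_r has no outgoing transition
    P = [1.0] + [0.0] * r              # Step 1: P_0(0) = 1, P_i(0) = 0 for i > 0
    f = renewals = 0.0                 # Step 2
    steps_per_inspection = max(1, round(tau / dt))
    n_steps = round(t_max / dt)
    for k in range(1, n_steps + 1):
        # Step 3: one Euler step of equation (*), highest state first
        for i in range(r, 0, -1):
            P[i] = P[i] * (1.0 - lam[i] * dt) + P[i - 1] * lam[i - 1] * dt
        P[0] *= 1.0 - lam[0] * dt
        # Step 4: count the mass that failed in this step and repair it
        f += P[r]
        P[0] += P[r]
        P[r] = 0.0
        # Step 5: at an inspection, renew everything at or above level l
        if k % steps_per_inspection == 0:
            moved = sum(P[l:r])
            renewals += moved
            P[0] += moved
            for i in range(l, r):
                P[i] = 0.0
    t = n_steps * dt                   # Steps 6 and 7
    return f / t, renewals / t

if __name__ == "__main__":
    rates = [0.5, 1.0, 2.0]            # hypothetical transition rates
    print(effective_failure_rate(rates, tau=0.5, l=1))
    print(effective_failure_rate(rates, tau=1e9, l=1))   # effectively no inspections
```

Setting `tau` larger than `t_max` switches the inspections off, which gives a quick sanity check against $1/\mathrm{MTTF}$ from the no-maintenance case.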
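Essential source code in VBA

    ' Main loop: integrate the state probabilities and apply inspections
    Do While t < MaxT
        nFail = nFail + IntegrateDt(dt)   ' probability mass that failed during dt
        P(0) = P(0) + P(r)                ' repair on failure: failed mass returns to state 0
        P(r) = 0
        t = t + dt
        If t > inspection Then            ' an inspection is due
            inspection = inspection + tau
            nRenewal = nRenewal + Inspect(L, q)
        End If
    Loop

    ' One Euler step of equation (*). P(r) is 0 on entry, so on exit it
    ' holds the probability mass that failed during dt.
    Function IntegrateDt(dt As Single)
        For i = r To 1 Step -1
            P(i) = P(i) * (1 - lam(i) * dt) _
                 + P(i - 1) * lam(i - 1) * dt
        Next i
        P(0) = P(0) * (1# - lam(0) * dt)
        IntegrateDt = P(r)
    End Function

    ' Inspection: states L .. r-1 are renewed unless the inspection
    ' misses them (probability q). Returns the renewed probability mass.
    Function Inspect(L As Integer, q As Single)
        rr = 0
        For i = L To r - 1
            rr = rr + P(i) * (1 - q)
            P(0) = P(0) + P(i) * (1 - q)
            P(i) = P(i) * q
        Next i
        Inspect = rr
    End Function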
Specification of model parameters • In principle we need to specify all transition rates, i.e. • $\lambda_0, \lambda_1, \ldots, \lambda_{r-1}$ • We also need the probability of erroneous classification • $Q_{ij} = \Pr(\text{classify into state } i \text{ when the real state is } j)$ • In order to get numerical values (estimates) of the model parameters, we utilise: • Experience data • Expert and engineering judgements • Degradation modelling, e.g. fracture mechanics, FEM etc. • For r > 4-5 this will be a huge number of parameters • We want to simplify the parameter specification procedure
Simplified parameter specification • We specify the parameters in the situation without maintenance, i.e. • What will the mean time to failure (MTTF) be if no maintenance is carried out? (Fig. ) • Is the transition rate between states constant, or increasing? • If it is increasing then we specify the ratio: • $V = \lambda_{r-1}/\lambda_0$ = how much faster failure progression is just before failure compared to initially (Fig. ) • We also need to specify • The number of states in the model (r) • The probability q that an inspection does not reveal that the system is in a critical state
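One way to turn MTTF, $V$ and $r$ into a full set of transition rates is sketched below. The slides do not say how the intermediate rates are interpolated between $\lambda_0$ and $\lambda_{r-1}$; the geometric progression used here is an assumption made purely for illustration, and `transition_rates` is a hypothetical helper name.

```python
def transition_rates(mttf, V, r):
    """Rates lambda_0..lambda_{r-1} with lambda_{r-1}/lambda_0 = V and sum(1/lambda_i) = mttf.

    ASSUMPTION: the rates grow geometrically from lambda_0 to lambda_{r-1}.
    """
    if r == 1:
        return [1.0 / mttf]
    growth = [V ** (i / (r - 1)) for i in range(r)]   # relative rates from 1 up to V
    lam0 = sum(1.0 / g for g in growth) / mttf        # scale so the MTTF matches
    return [lam0 * g for g in growth]

if __name__ == "__main__":
    rates = transition_rates(mttf=10.0, V=4.0, r=5)   # hypothetical inputs
    print(rates, sum(1.0 / x for x in rates))
```

Setting `V = 1` reproduces the constant-rate case, where every rate equals `r / mttf`.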
Calculation example • Excel spreadsheet is provided: MarkovStateModel.xls • Input parameters: (figure not reproduced) • Result: (figure not reproduced)
The effect of maintenance • We have established (by means of the Excel model) the relation between maintenance $(\tau, l)$ and i) the effective failure rate, $\lambda_E(\tau, l)$, and ii) the renewal rate, $rr(\tau, l)$ • Example results
Cost elements - Optimization • The most important cost elements are: • The cost per inspection, $C_I$ • The (unavailability) cost per system failure, $C_F$ • The cost of repairing a system failure, $C_{CM}$ • The cost of renewing the system at state $l$, $C_{RC}$ • The total cost per unit time is then $C(\tau, l) = C_I/\tau + (C_F + C_{CM})\,\lambda_E(\tau, l) + C_{RC}\,rr(\tau, l)$ • The objective is now to minimize $C(\tau, l)$ with respect to the inspection interval $\tau$ and the intervention level $l$
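A simple grid search is one way to carry out this minimisation. The sketch below assumes the model output is available as a function returning the effective failure rate and the renewal rate for a given inspection interval and intervention level. The function `toy_rates` is a hypothetical stand-in with made-up coefficients, included only so the example runs; in practice the Markov state model calculation would be passed in instead. All cost figures are likewise illustrative.

```python
def total_cost(tau, l, rates_fn, c_i, c_f, c_cm, c_rc):
    """C(tau, l) = C_I/tau + (C_F + C_CM) * lambda_E(tau, l) + C_RC * rr(tau, l)."""
    lam_e, rr = rates_fn(tau, l)
    return c_i / tau + (c_f + c_cm) * lam_e + c_rc * rr

def optimise(rates_fn, taus, levels, **costs):
    """Grid search: return (minimum cost, tau, l) over all candidate pairs."""
    return min((total_cost(tau, l, rates_fn, **costs), tau, l)
               for tau in taus for l in levels)

def toy_rates(tau, l):
    """HYPOTHETICAL stand-in for the model output, not taken from the slides."""
    lam_e = 0.02 * tau * l / (1.0 + tau)   # more failures with longer intervals, laxer level
    rr = 0.5 / (l * (1.0 + tau))           # more renewals with a restrictive (low) level
    return lam_e, rr

if __name__ == "__main__":
    costs = dict(c_i=100.0, c_f=1000.0, c_cm=200.0, c_rc=300.0)
    print(optimise(toy_rates, [0.5, 1, 2, 4, 8], [1, 2, 3], **costs))
```

A grid is used rather than a gradient method because the intervention level is discrete and the candidate inspection intervals are usually few.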
Extension of the Markov model • More advanced maintenance strategies could be applied • Reducing the inspection interval as we approach the maintenance limit, $l$ • Conducting imperfect repair before the maintenance limit • Models have been developed for hydropower plants
The gamma process • Stationary gamma process • Background: $X$ is said to be gamma distributed with shape parameter $v$ and scale parameter $u$ if the PDF is given by: • $\mathrm{Ga}(x \mid v, u) = u^v x^{v-1} e^{-ux}/\Gamma(v)$ (in this parameterisation $u$ acts as a rate, i.e. an inverse scale, so the mean is $v/u$) • Let Y(t) be the degradation level at time t • Y(t) follows a stationary gamma process if • $Y(0) = 0$ • $Y(s) - Y(t) \sim \mathrm{Ga}([s-t]v, u)$, $s > t$ • Y(t) has independent increments
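The definition translates directly into a sampling scheme: add independent gamma-distributed increments over successive time steps. The sketch below is a minimal illustration using only the Python standard library; the parameter values are hypothetical. Note that `random.gammavariate` takes a shape and a scale, so the scale argument is $1/u$ for the density above.

```python
import random

def gamma_process_path(v, u, t_end, dt, rng):
    """Sample Y(0), Y(dt), ..., Y(t_end) for a stationary gamma process."""
    n = round(t_end / dt)
    y, path = 0.0, [0.0]               # Y(0) = 0
    for _ in range(n):
        # Increment over dt ~ Ga(v*dt, u); gammavariate takes (shape, scale = 1/u).
        y += rng.gammavariate(v * dt, 1.0 / u)
        path.append(y)
    return path

if __name__ == "__main__":
    rng = random.Random(1)
    v, u, t_end = 2.0, 4.0, 10.0       # hypothetical parameters
    finals = [gamma_process_path(v, u, t_end, 0.5, rng)[-1] for _ in range(4000)]
    print(sum(finals) / len(finals), v * t_end / u)   # sample mean vs. theoretical mean
```

Because the increments are exactly gamma distributed for any step length, the sampled values on the time grid carry no discretisation error; only the path between grid points is unresolved.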
Mean time to failure in the gamma process • Assume that the component fails as soon as the failure progression exceeds the value $\ell$ (the failure limit) • Let $T$ denote the time to failure • It follows that $F_T(t) = \Pr(T < t) = \Pr(Y(t) > \ell) = \Gamma(vt, u\ell)/\Gamma(vt)$ • where $\Gamma(a, x)$ is the (upper) incomplete gamma function • Welte (2008) reports the following: • $E(T) \approx \ell u/v + 1/(2v)$ • $\mathrm{Var}(T) \approx \ell u/v^2 - 1/(12v^2)$
Non-stationary gamma process • The gamma process could be extended to a non-stationary process by letting the shape parameter be a function of time, i.e., $v(t)$ is the shape function, and we have: • $Y(0) = 0$ • $Y(s) - Y(t) \sim \mathrm{Ga}(v(s) - v(t), u)$, $s > t$ • Y(t) has independent increments • The CDF now reads $F_T(t) = \Pr(T < t) = \Pr(Y(t) > \ell) = \Gamma(v(t), u\ell)/\Gamma(v(t))$, where $\ell$ is the failure limit • The expected time to failure, and the variance of the time to failure, can be found by numerical methods
Comparison – Discrete model vs. gamma process • For the discrete model we need to fix the number of states • If the degradation is continuous, this does not seem very natural, hence a gamma process is more appealing • Degradation rate • In the discrete model, the degradation rate (in terms of transition rates) depends on the state of the system, and not on the age (time) • In a gamma process the degradation rate could also be modelled by a non-constant value, but the degradation rate then depends on the age, and not on the state
Exercise • Verify $E(T) \approx \ell u/v + 1/(2v)$ by numerical integration, i.e., $E(T) = \int_0^\infty R(t)\,dt$, where $\ell$ is the failure limit
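One possible way to approach the exercise is sketched below, using only the Python standard library. The survivor function is $R(t) = \Pr(Y(t) \le \ell)$, which is the regularised lower incomplete gamma function evaluated at shape $vt$ and argument $u\ell$. Since the standard library has no incomplete gamma function, the sketch evaluates it with its power series, which is adequate for moderate arguments such as those used here. The helper names, the parameter values and the trapezoidal rule are choices made for this illustration.

```python
import math

def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma P(a, x) = Pr(Ga(a, 1) <= x), via its power series."""
    if a <= 0.0:
        return 1.0                     # Y(0) = 0 lies below any positive limit
    if x <= 0.0:
        return 0.0
    # First series term: x^a * exp(-x) / Gamma(a + 1), computed in log space.
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-17 * total or k < x:
        k += 1
        term *= x / (a + k)
        total += term
    return min(total, 1.0)

def expected_time_to_failure(v, u, limit, dt=0.01):
    """Trapezoidal integration of R(t) = Pr(Y(t) <= limit) until R(t) is negligible."""
    x = u * limit
    t, prev, total = 0.0, 1.0, 0.0     # R(0) = 1
    while prev > 1e-12:
        t += dt
        cur = reg_lower_gamma(v * t, x)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total

if __name__ == "__main__":
    v, u, limit = 2.0, 4.0, 5.0        # hypothetical parameters
    print(expected_time_to_failure(v, u, limit), limit * u / v + 1.0 / (2.0 * v))
```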
Integration of the gamma process • Let $S_{|t,dt} = Y(t+dt) - Y(t)$ be the degradation during a small time interval $dt$ after time $t$ • $S_{|t,dt} \sim \mathrm{Ga}(v(t+dt) - v(t), u)$ • Further, let $g(s \mid t, dt)$ denote the pdf of $S_{|t,dt}$ • If the pdf of $Y(t)$ is known, we may obtain the pdf of $Y(t+dt)$ by a convolution argument: • $f_{Y(t+dt)}(y) = \int_0^y f_{Y(t)}(y - s)\, g(s \mid t, dt)\, ds$ (*) • Assume the system is inspected every $\tau$ time units, and renewed whenever $Y > y_M$ • To find the effective failure rate, we integrate (*) from $t = 0$ onwards, and whenever $t = k\tau$, the probability mass above $y_M$ is moved to 0
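As a cross-check on such an integration, the same maintenance policy can be simulated directly. The sketch below swaps in Monte Carlo simulation for the convolution approach, so it is not the method described on the slide. It also makes several assumptions of its own: the process is stationary with shape rate `v`, a failure occurs when the degradation exceeds a hypothetical limit `fail_limit` and is followed by immediate renewal, and inspections at multiples of `tau` are perfect.

```python
import random

def simulate_failure_rate(v, u, fail_limit, y_m, tau, horizon, dt, seed=0):
    """Estimate the effective failure rate (failures per unit time) by simulation."""
    rng = random.Random(seed)
    steps_per_inspection = max(1, round(tau / dt))
    n_steps = round(horizon / dt)
    y, failures = 0.0, 0
    for k in range(1, n_steps + 1):
        # Exact gamma increment over dt; gammavariate takes (shape, scale = 1/u).
        y += rng.gammavariate(v * dt, 1.0 / u)
        if y > fail_limit:
            failures += 1              # failure: count it and renew
            y = 0.0
        elif k % steps_per_inspection == 0 and y > y_m:
            y = 0.0                    # inspection finds Y > y_M: preventive renewal
    return failures / (n_steps * dt)

if __name__ == "__main__":
    # Hypothetical parameters; y_m = fail_limit disables preventive renewal.
    print(simulate_failure_rate(2.0, 4.0, 5.0, 5.0, 1.0, 20000.0, 0.1))
    print(simulate_failure_rate(2.0, 4.0, 5.0, 3.0, 1.0, 20000.0, 0.1))
```

With preventive renewal disabled, the estimate should be close to $1/E(T)$ from the mean time to failure approximation, up to sampling noise and the rounding of failure times to the simulation grid.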