Systems Engineering Program. Systems Prognostic Health Management April 1, 2008. Christopher Thompson IBM Global Business Services FCS LDMS Program.
Systems Engineering Program. Systems Prognostic Health Management. April 1, 2008. Christopher Thompson, IBM Global Business Services, FCS LDMS Program. Disclaimer: This briefing is unclassified and contains no proprietary information. Any views expressed by the author are his, and in no way represent those of Lockheed Martin Corporation.
My Engineering Experience
• IBM Global Business Services, Dallas TX: Requirements Lead/Prognostics SME, FCS Logistics Data Management Service (LDMS)
• Lockheed Martin Missiles and Fire Control, Dallas TX: Senior Systems Engineer, Multifunction Utility/Logistics Equipment (MULE)
• Lockheed Martin Aeronautics, Fort Worth TX: Vehicle Systems, Prognostic Health Management, F-35 Joint Strike Fighter (Lightning II)
• Lockheed Martin Missiles and Fire Control, Dallas TX: Reliability Engineer, Army Tactical Missile System (TACMS)
• SMU School of Engineering, Dallas TX: TA for Dr. Stracener
My Education
• B.S. in Electrical Engineering, SMU (1997)
• M.S. in Mechanical Engineering, SMU (2001) - Major: Fatigue/Fracture Mechanics
• M.S. in Systems Engineering (2002) - Major: Reliability, Statistical Analysis
• Ph.D. in Applied Science (anticipated ~ 2009) - Major: Systems Engineering (PHM)
My Dissertation Fleet Based Analysis of Mission Equipment Sensor Configuration and Coverage Optimization for Systems Prognostic Health Management
Sensor Tradeoffs. As more sensors are added to your system: GOOD: AO increases; Life cost decreases; MTTR decreases; P(FDI) increases; P(Prog) increases; MTBUMA increases. BAD: weight increases; power increases; volume increases; cabling increases; AUPC increases; R(t) decreases.
PHM Optimization [Figure: Operational Availability (AO) and Cost ($: LCC, AUPC) plotted against # of Sensors (0 to N), with the Optimum AO marked.] * Other metrics will include Weight/Volume, Power (K/W), Specants (Computing Power)
PHM Optimization [Figure: Probability of Detection of Crack in a structural element plotted against x = mean distance between sensors (0 to X), equivalently N = # of sensors (∞ down to 0), with the optimum solution marked.]
For a common LRU (such as an engine), plotting engine power against an environmental measure (such as temperature) over time: [Figure: Engine Power and Engine Internal Temp. versus Time, with thresholds at 100%, 125%, 150% and 200% of the spec limit and regions labeled No Damage, Mild Damage, Moderate Damage and Severe Damage.]
Estimating the damage accumulated (or life consumed) [Figure: damage weighting by region, with the 150% and 200% specification limits marked: Mild Damage x1, Moderate Damage x2, Severe Damage x4 (or more).]
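The weighting idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's model: the x1 / x2 / x4 weights come from the slide, but the band edges (125%, 150%, 200% of the spec limit), the fixed sampling interval, and the names `DAMAGE_BANDS`, `damage_weight` and `life_consumed` are assumptions made for the example.

```python
# Hypothetical band edges (fraction of spec limit) paired with damage weights.
# The weights x1, x2, x4 are from the slide; assigning them to the
# 125-150%, 150-200% and >=200% bands is an assumption.
DAMAGE_BANDS = [(2.00, 4.0), (1.50, 2.0), (1.25, 1.0)]


def damage_weight(load_fraction):
    """Return the damage multiplier for one reading (load / spec limit)."""
    for lower_edge, weight in DAMAGE_BANDS:
        if load_fraction >= lower_edge:
            return weight
    return 0.0  # below the lowest band: treated as "No Damage"


def life_consumed(load_fractions, dt_hours):
    """Sum weighted exposure time over equally spaced readings."""
    return sum(damage_weight(x) * dt_hours for x in load_fractions)


# Four one-hour readings: no damage, mild, moderate, severe.
print(life_consumed([1.0, 1.3, 1.6, 2.1], dt_hours=1.0))
```

Under these assumed bands, one hour in the severe region consumes as much life as four hours in the mild region.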
Hypothetical engine air/oil/fuel filter performance over its life [Figure: filter performance (flow rate) versus filter life (in miles), with bands labeled optimal performance, acceptable performance, degraded performance (engine system damage likely) and hazardous performance (engine system failure likely); the distribution of failure times and the MTBF are marked.]
Common LRU used on multiple vehicle types with platform specific (hidden) failure modes [Figure: Failure Rate for Platform 1, Platform 2, Platform 3 and Platform 4; Platform 4 shows a statistically significant difference.] Why is the Failure Rate for the LRU in Platform 4 higher? What is different about Platform 4?
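One way to check whether a platform's failure rate differs significantly from another's is a two-sample test on Poisson rates. The sketch below is an assumption about how such a check could be done, not the method used in the briefing: it uses a normal approximation, and the function name `rate_z_test` and the failure counts and operating hours in the usage line are hypothetical.

```python
import math


def rate_z_test(failures_a, hours_a, failures_b, hours_b):
    """Two-sided z-test for equal Poisson failure rates (normal approximation).

    Returns (z, p_value). Under the null hypothesis both fleets share one
    rate, estimated by pooling failures and operating hours.
    """
    pooled_rate = (failures_a + failures_b) / (hours_a + hours_b)
    std_err = math.sqrt(pooled_rate * (1.0 / hours_a + 1.0 / hours_b))
    z = (failures_a / hours_a - failures_b / hours_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area
    return z, p_value


# Hypothetical counts: 40 failures vs 10 failures, 1000 operating hours each.
z, p = rate_z_test(40, 1000.0, 10, 1000.0)
print(f"z = {z:.2f}, p = {p:.2g}")
```

A small p-value only says the rates differ; finding what is different about the platform (environment, duty cycle, installation) still requires engineering investigation.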
Standard oil filter used in engines across FCS vehicles, replaced at a scheduled time/miles [Figure: Actual Condition of the oil filter (Wearout, Life Consumption) for Vehicle 1, Vehicle 2 and Vehicle 3 at the Scheduled Replacement Time; outcomes are labeled wasted filter life, Correct action and Increased engine life consumption.]
Standard structural element across several vehicles (under cyclic loading) [Figure: Damage Accumulation versus Time (or miles, or load cycles, or on/off cycles, …) for individual LRU life histories, with the fleet based estimate marked and cases labeled Repair needed before estimate.]
The MULE Program Future Combat Systems Multifunction Utility/Logistics Equipment
Keys to the Success of FCS • Reducing Logistics footprint • Increasing Availability • Reducing Total Cost of Ownership • Implementing Performance Based Logistics • Improvements in the ‘ilities’ (RAM-T) • Reliability • Availability • Maintainability • Testability • Supportability
Prognostics Of or relating to prediction; a sign of a future happening; a portent. The process of calculating an estimate of remaining useful life for a component, within sufficient time to repair or replace it before failure occurs.
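As a minimal sketch of what "calculating an estimate of remaining useful life" can mean in practice, the Python below fits a straight line to a health indicator and extrapolates it to a failure threshold. The linear-degradation assumption, the function name `remaining_useful_life`, and the sample readings are all hypothetical; real PHM models are usually more elaborate.

```python
def remaining_useful_life(times, health, failure_threshold):
    """Estimate remaining life by linear extrapolation of a health trend.

    times, health: equally long sequences of observation times and a health
    indicator that decreases as the component degrades (an assumption).
    Returns time remaining after the last observation, or None if the fitted
    trend is not decreasing.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_h = sum(health) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (h - mean_h) for t, h in zip(times, health))
    slope = sxy / sxx  # least-squares slope
    intercept = mean_h - slope * mean_t
    if slope >= 0:
        return None  # no degrading trend to extrapolate
    time_at_threshold = (failure_threshold - intercept) / slope
    return max(0.0, time_at_threshold - times[-1])


# Hypothetical readings: health falls 0.1 every 10 hours; failure at 0.5.
print(remaining_useful_life([0, 10, 20, 30], [1.0, 0.9, 0.8, 0.7], 0.5))
```

The estimate is only useful if it arrives, as the definition says, with enough lead time to repair or replace the component.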
Prognostic Health Management (PHM) PHM is the integrated system of sensors which: • Monitors system health, status and performance • Tracks system consumables (oil, batteries, filters, ammunition, fuel…) • Tracks system configuration (software versions, component life history…) • Isolates faults/failures to their root causes • Calculates remaining life of components
Diagnostics The identification of a fault or failure condition of an element, component, sub-system or system, combined with the deduction of the lowest measurable cause of that condition through confirmation, localization, and isolation. • Confirmation is the process of validation that a failure/fault has occurred, the filtering of false alarms, and assessment of intermittent behavior. • Localization is the process of restricting a failure to a subset of possible causes. • Isolation is the process of identifying a specific cause of failure, down to the smallest possible ambiguity group.
Faults and Failures Fault: A condition that reduces an element’s ability to perform its required function at desired levels, or degrades performance. Failure: The inability of a component, sub-system or system to perform its intended function. Failure may be the result of one or more faults. Failure Cascade: The result when a failure occurs in a system where the successful operation of a component depends on a preceding component, so that one failure can trigger the failure of successive parts and amplify the result or impact.
Classes of Failures Design Failures: These take place due to inherent errors or flaws in the system design. Infant Mortality Failures: These cause newly manufactured systems to fail, and can generally be attributed to errors in the manufacturing process, or poor material quality control. Random Failures: These can occur at any time during the entire life of a system. Electrical systems are more likely to fail in this manner. Wear-Out Failures: As a system ages, degradation will cause systems to fail. Mechanical systems are more likely to fail in this manner.
The Ultimate Goal of Prognostics The aim of Prognostics is to maximize system availability and life consumption while minimizing Logistical Downtime and Mean Time To Repair, by predicting failures before they occur. This is a notional diagram indicative of a wear out failure.
What is PHM? Prognostic Health Management (PHM) is the integrated hardware and software system which: • Monitors system health, status and performance • Tracks system consumables (oil, batteries, filters, ammunition, fuel…) • Tracks system configuration (software versions, component life history…) • Diagnoses/Isolates faults/failures to their root causes • Calculates remaining life of components • Predicts failures before they occur • Continually updates predictive models with failure data
What is PHM? Prognostic Health Management is a methodology for establishing system status and health, and projecting remaining life and future operational condition, by comparing sensor-based operational parameters to threshold values within knowledge base models. These PHM models utilize predictive diagnostics, fault isolation and corroboration algorithms, and knowledge of the operational history of the system, allowing users to make appropriate decisions about maintenance actions based on system health, logistics and supportability concerns and operational demands, to optimize such characteristics as availability or operational cost.
Availability Analysis • Availability, Achieved: $A_a = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$ where MTBF = Mean Time Between Failures, MTTR = Mean Time To Repair
Availability Analysis • Availability, Operational: $A_O = \frac{\mathrm{MTBUMA}}{\mathrm{MTBUMA} + \mathrm{ALDT} + \mathrm{MTTR}}$ where MTBUMA = Mean Time Between Unscheduled Maintenance Actions, ALDT = Administrative Logistical Down Time, MTTR = Mean Time To Repair
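The two availability measures can be computed with a short Python sketch. It assumes the usual uptime over uptime-plus-downtime ratio built from the terms each slide defines (MTBF and MTTR for achieved; MTBUMA, ALDT and MTTR for operational); the function names and the sample hours are hypothetical.

```python
def achieved_availability(mtbf, mttr):
    """Assumed form: MTBF / (MTBF + MTTR), all in the same time unit."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbuma, aldt, mttr):
    """Assumed form: MTBUMA / (MTBUMA + ALDT + MTTR)."""
    return mtbuma / (mtbuma + aldt + mttr)


# Hypothetical values in hours.
print(achieved_availability(mtbf=90.0, mttr=10.0))
print(operational_availability(mtbuma=80.0, aldt=10.0, mttr=10.0))

# Halving ALDT raises operational availability without touching the hardware.
print(operational_availability(mtbuma=80.0, aldt=5.0, mttr=10.0))
```

Written this way, it is easy to see why the operational figure improves when ALDT or MTTR shrinks or when MTBUMA grows.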
Availability Analysis • MTBUMA = Mean Time Between Unscheduled Maintenance Actions where MTBF = Mean Time Between Failures, MTBM = Mean Time Between Maintenance
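The slide's equation did not survive extraction. One common way to combine these terms, offered here as an assumption and not as the author's formula, is to add the rates of the three kinds of unscheduled maintenance action (inherent failures, induced damage, and removals with no defect found), treating them as non-overlapping categories:

```latex
\frac{1}{\mathrm{MTBUMA}}
  = \frac{1}{\mathrm{MTBF}}
  + \frac{1}{\mathrm{MTBM}_{\text{induced}}}
  + \frac{1}{\mathrm{MTBM}_{\text{no defect}}}
```

Under this form, MTBUMA can never exceed MTBF, and it rises whenever either MTBM term rises.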
Availability Analysis • How can we improve AO? - By decreasing Administrative & Logistical Down Time (ALDT) - By increasing Mean Time Between Failures (MTBF) - By decreasing Mean Time To Repair (MTTR) - By increasing Mean Time Between Unscheduled Maintenance Actions (MTBUMA) [by increasing MTBM induced and MTBM no defect]
Availability Analysis • How can we decrease ALDT? - By improving Logistics: improve scheduling of inspections; improve commonality of parts; decrease time to get replacements - By improving Prognostics: replace parts before they fail, not after; maximize use of component life; improve off-board prognostics trending; more sensors!!
Availability Analysis • How can we increase MTBF? - By improving Reliability: select more rugged components; improve life screening and testing; improve thermal management - By improving Quality: better parts screening; better manufacturing processes - By adding Redundancy: at the cost of Size, Weight and Power!
Availability Analysis • How can we decrease MTTR? - By improving Maintainability: improve quality and efficacy of training; simplify fault isolation; decrease number of tools and special equipment; decrease access time (panels, connectors…); improve Preventative Maintenance - By improving Diagnostics: improve BIT and BITE; decrease ambiguity group size; improve maintenance manuals and training
Availability Analysis • How can we increase MTBM (induced/no defect)? - By improving Safety: limit the potential for accidental damage - By improving Prognostics: improve PHM models to monitor induced damage - By improving Diagnostics: lower the false alarm rate; don’t repair/replace things which aren’t broken!