
Statistical modelling challenges – approaches used in Network Rail

Julian Williams, Network Rail. 29th November, 2011. 17th July 2008.


Presentation Transcript


  1. Statistical modelling challenges – approaches used in Network Rail • Julian Williams, Network Rail • 29th November, 2011 • 17th July 2008

  2. A few key facts • 20,000 miles of track • 40,000 bridges and tunnels • 800 signal boxes • 2,500 stations • Largest private landowner in the UK • Largest purchaser of electricity in the UK • Delivered by a team of 33,000 people

  3. Asset condition and performance

  4. Safety: EU comparison

  5. Our Asset Management Model

  6. [Diagram labels] Route utilisation, output & funding specification • Asset policies / strategies • Asset Policies & Standards • Monitoring and review • Route asset management plans • Route delivery plans • Work execution • Enablers: asset information, analysis tools, competencies, processes

  7. Asset Strategy 10-Stage Process • Asset description – how many, where, construction types • Asset history – construction types, investments, condition, performance • Asset criticality – by cost, performance, safety • Route criticality – segmentation of routes • Asset degradation – condition, performance relationships • Interventions – effectiveness, unit costs • Investment scenarios – different strategies, targets • Models – strategic, WLCC, tactical • Assessments – volumes, costs, condition, performance, sustainability • Policy selection – chosen option, final policy

  8. Model overview

  9. Statistical challenges • Incomplete / inaccurate databases • Inter-database incongruence • Understanding trends • Understanding degradation / intervention effectiveness • Relating condition to performance • Validation • Uncertainty

  10. Challenge 1: Database accuracy • Large number of databases are required covering the whole network • Assessment of impact of incomplete and inaccurate databases used to support investment planning • Two-part confidence grading: • Source data management (A – D) • Accuracy (1 – 6) • Overall contributions weighted according to impact on investment plans for CP5 – formal system developed to derive the weights • Checks on sample data from databases • Number of samples based on required statistical confidence in the level of accuracy
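
The link between the number of sample checks and the required statistical confidence can be sketched as follows. This is a minimal illustration using the standard normal-approximation formula for estimating a proportion, not the method Network Rail used; the function name `sample_size`, the 5% margin and the population of 2,000 records are assumptions made for the example.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, population=None, expected_rate=0.5):
    """Records to check so the estimated accuracy rate is within +/- margin.

    Uses the normal approximation for a proportion. expected_rate=0.5 is the
    most conservative choice (it gives the largest sample).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    n = z * z * expected_rate * (1 - expected_rate) / (margin * margin)
    if population is not None:
        # Finite population correction for a database of known size
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# Hypothetical inputs: 95% confidence, accuracy known to within 5 points
print(sample_size(0.95, 0.05))
print(sample_size(0.95, 0.05, population=2_000))
```

Tightening the margin or raising the confidence level increases the number of records to check; a small database needs fewer checks than the uncorrected formula suggests.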

  11. Challenge 2: Database incongruence • Need to match asset registers, renewals & maintenance works, and condition & performance measures • Inconsistent: • Formats (e.g. single text field vs individual numeric fields) • Asset hierarchies (e.g. point operating equipment vs signalling interlocking vs mileage vs GPS) • Database engines (ORACLE, Access, Excel, text) • A lot of effort goes into matching data – not always successful • Have to account for bias in matched vs unmatched data • Track data segmentation: • Asset registers (various), traffic, track structures, track geometry, rail defects, faults, train delays, planned renewals, maintenance

  12. Challenge 2 Example: S&C rail defects • Rail defects on switches and crossings (S&C) • Recorded in rail defect management system (RDMS) as S&C defect • Clearly must have occurred on an S&C! • Need to match to the S&C in the asset register
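
One way to picture the matching problem is a nearest-location join between defect records and the asset register. The sketch below is illustrative only: the field names (`track_id`, `mileage`, `asset_id`, `defect_id`), the tolerance and all records are hypothetical, and the real RDMS and asset register schemas are not described in the slides.

```python
def match_defects(defects, assets, tolerance_miles=0.05):
    """Match each defect to the nearest S&C unit on the same track.

    Returns (matched, unmatched): matched is a list of
    (defect_id, asset_id) pairs, unmatched is a list of defect ids.
    """
    matched, unmatched = [], []
    for defect in defects:
        candidates = [a for a in assets if a["track_id"] == defect["track_id"]]
        best = min(
            candidates,
            key=lambda a: abs(a["mileage"] - defect["mileage"]),
            default=None,
        )
        if best is not None and abs(best["mileage"] - defect["mileage"]) <= tolerance_miles:
            matched.append((defect["defect_id"], best["asset_id"]))
        else:
            unmatched.append(defect["defect_id"])
    return matched, unmatched


# Invented records for illustration
assets = [
    {"asset_id": "SC-1", "track_id": "T1", "mileage": 10.20},
    {"asset_id": "SC-2", "track_id": "T1", "mileage": 12.75},
    {"asset_id": "SC-3", "track_id": "T2", "mileage": 10.21},
]
defects = [
    {"defect_id": "D1", "track_id": "T1", "mileage": 10.22},
    {"defect_id": "D2", "track_id": "T1", "mileage": 12.73},
    {"defect_id": "D3", "track_id": "T1", "mileage": 11.50},  # no S&C nearby
    {"defect_id": "D4", "track_id": "T3", "mileage": 5.00},   # unknown track
]

matched, unmatched = match_defects(defects, assets)
print("matched:", matched)
print("unmatched:", unmatched)
```

The unmatched list matters as much as the matched one: if unmatched defects cluster on particular routes or asset types, analysis on the matched subset alone will be biased.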

  13. Challenge 3: Understanding trends • Inspection • Have the guidelines changed on condition rating? (e.g. what is a “serious” defect) • Has training improved? • What is the variance between inspectors? • Is there a bias in the inspection frequency (e.g. more critical locations, assets with poor previous rating)? • Maintenance • Maintenance frequency (proactive vs reactive, RCM, campaigns) • Maintenance tools (stoneblowers vs tampers) • Workforce competence (experience, training) • Asset condition • Asset specification (e.g. better sleepers) • Asset condition • Utilisation

  14. Challenge 3 example 1: rail defects • Bolt hole defects • Reduction in the number of joints • Better rail end management • Tache ovales • Removal of older rail • Better guidance and training

  15. Challenge 4: Average degradation curves • What happens “on average” does not represent what happens on the ground • Spurious correlations • Data averaged too much • Not all information known • Possibly just an estimate of asset installation date • Unseen biases in the data • Missing “minor” interventions • Different maintenance policies • Need to retain variance in degradation in the models • Use Markov probability models • Link to location specific condition history (if available)
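
A Markov probability model keeps the spread of condition across assets instead of collapsing it into a single average curve. The sketch below assumes four condition bands (1 = best, 4 = worst) and invented annual transition probabilities; neither comes from the presentation.

```python
# Hypothetical annual transition probabilities between condition bands.
# Row i gives the probabilities of moving from band i+1 to each band.
TRANSITIONS = [
    [0.90, 0.10, 0.00, 0.00],
    [0.00, 0.85, 0.15, 0.00],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],  # worst band is absorbing until an intervention
]


def step(dist, matrix):
    """Advance the condition distribution by one year."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def project(dist, matrix, years):
    """Return the distribution for year 0 through the final year."""
    history = [dist]
    for _ in range(years):
        dist = step(dist, matrix)
        history.append(dist)
    return history


# All assets start in the best band
history = project([1.0, 0.0, 0.0, 0.0], TRANSITIONS, years=10)
for year, dist in enumerate(history):
    mean_band = sum((band + 1) * p for band, p in enumerate(dist))
    print(year, [round(p, 3) for p in dist], "mean band:", round(mean_band, 2))
```

The mean band rises smoothly, but the full distribution shows that some assets reach the worst band well before the average does, which is the variance an average degradation curve hides.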

  16. Challenge 4 example: track geometry • “Average” record looks smooth

  17. Challenge 4 example: track geometry • Shape is different for a specific track section

  18. Challenge 4 example: track geometry • Some tracks behave worse than others

  19. Challenge 5: performance • Predicting failure rates • Helps understand impact of changing asset type / condition • Required for whole lifecycle cost analysis • Failure database designed to manage failure repair, not for failure analysis • Filled in by operators, rather than engineers • Root cause is a text field, rather than a drop-down • Asset hierarchy is variable • Analysis only viable for total failures, rather than root cause • Less confidence that performance improvement will follow expected changes in asset condition

  20. Challenge 5 example: track failure rates • Regression analysis to identify several indicators: • Tonnage • Track geometry • Defect rate • Jointed track • S&C density • Can get good regression statistics with “wrong” relationships, due to correlations with other variables • Failure rate decreases with more jointed track: jointed track is on low tonnage lines, so this becomes a correction factor for the tonnage relationship • Need to adjust the model so that it uses jointed tonnage-km and CWR tonnage-km • Need to understand the relationships
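
The adjustment to jointed tonnage-km and CWR tonnage-km can be sketched as a regression with one failure rate per track type, so that jointed track no longer acts as a correction to a single tonnage term. Everything below is an assumption for illustration: the no-intercept model form, the function name `fit_two_rates`, the section data and the "true" rates used to generate noise-free failures.

```python
def fit_two_rates(jointed_tkm, cwr_tkm, failures):
    """Least-squares fit of failures = r_j * jointed_tkm + r_c * cwr_tkm.

    Solves the 2x2 normal equations directly (no intercept term).
    """
    sjj = sum(j * j for j in jointed_tkm)
    scc = sum(c * c for c in cwr_tkm)
    sjc = sum(j * c for j, c in zip(jointed_tkm, cwr_tkm))
    sjf = sum(j * f for j, f in zip(jointed_tkm, failures))
    scf = sum(c * f for c, f in zip(cwr_tkm, failures))
    det = sjj * scc - sjc * sjc
    if det == 0:
        raise ValueError("predictors are collinear")
    r_j = (sjf * scc - scf * sjc) / det
    r_c = (scf * sjj - sjf * sjc) / det
    return r_j, r_c


# Invented sections: (tonnage, length in km, fraction of jointed track).
# Low-tonnage sections have more jointed track, as on the slide.
sections = [(2, 10, 0.8), (3, 8, 0.6), (10, 12, 0.1), (15, 9, 0.0), (5, 10, 0.5)]
jointed = [t * km * frac for t, km, frac in sections]
cwr = [t * km * (1 - frac) for t, km, frac in sections]

# Synthetic failures generated from assumed rates, without noise
TRUE_JOINTED_RATE, TRUE_CWR_RATE = 0.004, 0.001
failures = [TRUE_JOINTED_RATE * j + TRUE_CWR_RATE * c for j, c in zip(jointed, cwr)]

r_j, r_c = fit_two_rates(jointed, cwr, failures)
print("jointed rate:", r_j, "CWR rate:", r_c)
```

With the tonnage split by track type, the fitted jointed rate comes out higher than the CWR rate, which matches engineering expectation even though jointed track sits on low-tonnage lines.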

  21. Challenge 6: Validation • Many relationships based on • Expert judgement (formal elicitation) • Limited data (initially) • Validation • Data gathering and analysis • Benchmarking with other industries • Data sometimes sufficient to show whether in the “right ballpark” • Difficult to validate with good statistical confidence • Often shows that not all the important parameters have been identified • Some relationships look completely random, partly due to lack of good data • Need new data programmes • Specific survey campaigns to collect data (e.g. rail pad condition) • New mandated fields for some databases (e.g. fault management system) • Guidance to maintenance staff on records

  22. Challenge 6 example: annual variability • Number of rail defects varies from year to year on the same track: • As they are repaired and the assets renewed (accounted for) • Number of inspections (not accounted for) • Random variation (not accounted for) • Validation has to account for this
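
One simple way to allow for random year-to-year variation during validation is to compare the observed count with the range a Poisson process would produce around the model's prediction. Treating defect counts as Poisson is an assumption made here for illustration, not something stated in the slides, and the predicted value of 12 defects is invented.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


def poisson_interval(mean, coverage=0.95):
    """Central range [lo, hi] of counts with at least `coverage` probability."""
    tail = (1 - coverage) / 2
    lo = 0
    while poisson_cdf(lo, mean) < tail:
        lo += 1
    hi = lo
    while poisson_cdf(hi, mean) < 1 - tail:
        hi += 1
    return lo, hi


predicted = 12.0  # hypothetical model prediction for one track section
lo, hi = poisson_interval(predicted)
print(f"observed counts between {lo} and {hi} are consistent with the prediction")
```

An observed count outside this range suggests a real discrepancy; a count inside it cannot distinguish model error from chance, and variation in the number of inspections would widen the range further.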

  23. Challenge 6 example: no validation with poor data • Sometimes the data provides no support at all

  24. Challenge 7: Uncertainty • For individual relationships: • Asset information • Degradation rates • Intervention impact • Costs • “Optimal” policy • Large uncertainty in overall result, but could have small uncertainty in differential between policies • Confidence limits for expected expenditure required to achieve targets • Probability of meeting targets given expenditure • Break down uncertainty into: • Poor data • Inaccurate models • Natural variability and random events • Monte Carlo analysis used for WLCC models and Tier 0 model • Requires estimate of individual parameter uncertainty • Estimate of most important contributors to uncertainty to guide further data analysis / model development • Rarely based on good statistical analysis • Parameter correlations hard to estimate and often ignored • Can only address part of the problem: does not account for inaccurate models • Bayesian analysis • Model estimate is the “prior” • Other evidence used to create a posterior
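
The point that the overall result can be very uncertain while the differential between policies is not can be illustrated with a small Monte Carlo run in which both policies share the same uncertain inputs. This is a toy sketch, not the WLCC or Tier 0 model: the cost structure, the normal distributions and every number are invented.

```python
import random
import statistics


def simulate_policies(n_runs=5000, seed=1):
    """Sample total cost for two policies that share one uncertain component."""
    rng = random.Random(seed)
    cost_a, cost_b, difference = [], [], []
    for _ in range(n_runs):
        # Uncertainty common to both policies (e.g. unit costs, asset counts)
        shared = rng.gauss(1000.0, 150.0)
        # Policy-specific components with much smaller uncertainty
        a = shared + rng.gauss(200.0, 10.0)
        b = shared + rng.gauss(170.0, 10.0)
        cost_a.append(a)
        cost_b.append(b)
        difference.append(a - b)
    return cost_a, cost_b, difference


cost_a, cost_b, difference = simulate_policies()
print("policy A std dev:", round(statistics.stdev(cost_a), 1))
print("policy B std dev:", round(statistics.stdev(cost_b), 1))
print("A - B mean:", round(statistics.mean(difference), 1))
print("A - B std dev:", round(statistics.stdev(difference), 1))
```

Because the shared component cancels in each run, the spread of the difference is far narrower than the spread of either total. The sketch still depends on the assumed parameter distributions and says nothing about errors in the model form itself.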

  25. Challenge 7 example: identifying contributors to uncertainty • Identify biggest contributors to uncertainty • Dependent on correct parameter ranges • Correlations important
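
A basic way to rank contributors to uncertainty is a one-at-a-time sensitivity test: move each parameter across its range while holding the others at central values, then sort by the change in output. The model `renewal_cost`, its parameters and their ranges are hypothetical, and the approach shown deliberately ignores correlations between parameters.

```python
def rank_contributors(model, central, ranges):
    """Rank parameters by output swing when each is varied alone."""
    swings = {}
    for name, (low, high) in ranges.items():
        at_low = model(**{**central, name: low})
        at_high = model(**{**central, name: high})
        swings[name] = abs(at_high - at_low)
    return sorted(swings.items(), key=lambda item: item[1], reverse=True)


def renewal_cost(volume, unit_cost, overrun):
    """Hypothetical cost model: volume x unit cost, inflated by an overrun."""
    return volume * unit_cost * (1 + overrun)


# Invented central values and ranges
central = {"volume": 100.0, "unit_cost": 2.0, "overrun": 0.1}
ranges = {
    "volume": (90.0, 110.0),
    "unit_cost": (1.5, 2.5),
    "overrun": (0.0, 0.2),
}

ranking = rank_contributors(renewal_cost, central, ranges)
for name, swing in ranking:
    print(f"{name}: swing {swing:.1f}")
```

The ranking is only as good as the ranges supplied: widening one range can move its parameter to the top. Because each parameter is varied alone, correlated parameters need a joint analysis such as Monte Carlo sampling with an explicit correlation structure.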
