A Scalable Approach to Architectural-Level Reliability Prediction Leslie Cheung Joint work with Leana Golubchik and Nenad Medvidovic
Motivation • Many design decisions are made early in the software development process • These decisions affect software quality • Need to assess software quality early • If problems are discovered later (e.g., after implementation), they may be costly to address
Motivation • In this talk, we focus on assessing software reliability using architectural models • Reliability: the fraction of time that the system operates correctly • Architectural models: describe system structure, behavior, and interactions
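To make the "fraction of time" definition concrete, here is a minimal sketch that assumes a two-state (operational/failed) model with a constant failure rate and a constant recovery rate. The recovery rate and the function name are assumptions added for illustration; the slides do not say how failed components recover.

```python
def steady_state_reliability(failure_rate: float, recovery_rate: float) -> float:
    """Long-run fraction of time spent operational in a two-state up/down model.

    Assumes constant rates (hypothetical; not taken from the slides).
    """
    if failure_rate < 0 or recovery_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and recovery_rate must be > 0")
    return recovery_rate / (failure_rate + recovery_rate)


# Hypothetical rates: 1 failure and 9 recoveries per unit time.
example = steady_state_reliability(failure_rate=1.0, recovery_rate=9.0)  # 9 / (1 + 9) = 0.9
```

With these made-up rates the component is operational 90% of the time; a component that never fails (failure rate 0) yields 1.0.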
Case Study: MIDAS • Measures room temperature and adjusts it according to a user-specified threshold by turning the AC on or off • Sensor: measures temperature and sends the measured data to a Gateway • Gateway: aggregates and translates the data and sends it to a Hub • Hub: determines whether to turn the AC on or off • AC: controls the air conditioner • GUI: displays the current temperature and lets the user change thresholds
Motivations • Existing approaches for concurrent systems keep track of the states of all components • MIDAS Example • State: (Sensor1, Sensor2, Gateway, Hub, GUI, AC)
Motivations • Existing approaches for concurrent systems keep track of the states of all components • MIDAS Example • State: (Sensor1, Sensor2, Gateway, Hub, GUI, AC) • e.g., (Failed!, idle, idle, idle, Processing User Request, idle)
Motivations • Existing approaches for concurrent systems keep track of the states of all components • MIDAS Example • State: (Sensor1, Sensor2, Gateway, Hub, GUI, AC) • e.g., (Taking Measurements, idle, idle, idle, Processing User Request, idle) • Problem: Scalability • e.g., 2 Gateways with 10 Sensors each • >5000 states • What about real-world applications, which may have hundreds of Sensors and Gateways? • The models are too big to solve
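The scalability problem follows from the state being a tuple with one entry per component: the number of possible tuples is at most the product of the per-component state counts. The sketch below illustrates that growth. The count of two states per component is a hypothetical placeholder, and the code does not attempt to reproduce the ">5000 states" figure, which comes from the authors' actual models.

```python
from math import prod


def flat_state_count(states_per_component):
    """Upper bound on the number of states in a flat model: one tuple entry per component."""
    return prod(states_per_component)


# Hypothetical: each of the six MIDAS components
# (Sensor1, Sensor2, Gateway, Hub, GUI, AC) has 2 local states.
six_components = flat_state_count([2] * 6)       # 2**6 = 64
# Adding one more Sensor with 2 local states doubles the bound.
seven_components = flat_state_count([2] * 7)     # 2**7 = 128
```

Each added component multiplies the bound by its number of local states, which is why a flat model grows exponentially with the number of Sensors and Gateways.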
The SHARP Framework • SHARP: Scalable, Hierarchical, Architectural-Level Reliability Prediction Framework • Idea: generate one part of the system model at a time by leveraging use-case scenarios • Solving many smaller models is more efficient than solving one huge model
MIDAS Use-Case Scenarios • MIDAS example • Sensor Measurement • GUI Request • Control AC
The SHARP Framework • Modeling concurrency: instances of scenarios may run simultaneously • MIDAS Example • Processing a GUI request while processing sensor measurements • Sensor Measurement and GUI Request scenarios run simultaneously • Multiple sensors • Multiple instances of the Sensor Measurement scenario
The SHARP Framework • Generate and solve submodels according to the system’s use-case scenarios • Generate and solve a coarser-level model for system reliability • Describe what happens when multiple instances of scenarios are running • Make use of results from the submodels
The SHARP Framework • [Figure, built up over two slides; labels: R1, m1, R2, m2, R3, m3]
The SHARP Framework • Generate and solve submodels according to the system’s use-case scenarios • Generate and solve a coarser-level model for system reliability • Describe the number of active instances of each scenario • Make use of results from the submodels
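A coarser-level state that records only the number of active instances of each scenario can be enumerated as below. This is a sketch under stated assumptions, not the framework's actual construction: the per-scenario instance caps and the function name are hypothetical, and the slides do not spell out how the coarse model is built or solved.

```python
from itertools import product


def coarse_states(max_instances):
    """Enumerate tuples (n_1, ..., n_k), where n_i is the number of active
    instances of scenario i, from 0 up to its cap."""
    return list(product(*(range(cap + 1) for cap in max_instances.values())))


# Hypothetical caps for the three MIDAS scenarios named in the slides.
caps = {"Sensor Measurement": 2, "GUI Request": 1, "Control AC": 1}
states = coarse_states(caps)
state_count = len(states)  # (2 + 1) * (1 + 1) * (1 + 1) = 12
```

Under these assumed caps, the coarse model tracks 12 instance-count tuples instead of the local state of every component, which is the kind of reduction the hierarchical approach relies on.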
The SHARP Framework • [Figure, built up over two slides; labels: m1, R1, m2, R2, m3, R3, then R]
Evaluation • To show… • SHARP has better scalability than a flat model that can be derived from existing approaches, and • SHARP is accurate, using results from the flat model as “ground truth” • Experiments • Computational cost in practice • Sensitivity analysis
Computational cost in practice • Example: MIDAS system, varying the number of Sensor components (x-axis) • y-axis: number of operations needed to solve the model
Sensitivity Analysis • We are primarily interested in what-if analysis • Is Architecture A “better” than Architecture B? • but not • Will my system’s reliability be greater than 90%? • What is the probability that I can run my system for 100 hours without any failure? • Focusing on trends is meaningful at the architectural level
Sensitivity Analysis • “Ground truth”: results from the flat model • Vary Sensor failure rate
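One simple way to check whether two models agree on trends as the Sensor failure rate varies is to compare the direction of change between consecutive points. The sketch below is illustrative only: the two series are computed from textbook formulas (an exact two-state expression and its first-order approximation) as stand-ins, not from the flat model or SHARP, and the rates and function names are hypothetical.

```python
def trend_signs(values):
    """Sign (-1, 0, or 1) of each consecutive change in a series."""
    return [(b > a) - (b < a) for a, b in zip(values, values[1:])]


def same_trend(reference, candidate):
    """True if both series rise and fall at the same points."""
    return trend_signs(reference) == trend_signs(candidate)


RECOVERY_RATE = 1.0                       # hypothetical, fixed during the sweep
failure_rates = [0.01, 0.02, 0.05, 0.1]   # hypothetical sweep values

# Stand-in "ground truth": exact two-state expression mu / (lambda + mu).
reference = [RECOVERY_RATE / (f + RECOVERY_RATE) for f in failure_rates]
# Stand-in approximation: first-order expansion 1 - lambda / mu.
candidate = [1.0 - f / RECOVERY_RATE for f in failure_rates]

agree = same_trend(reference, candidate)
```

The two series differ in absolute value but both decrease as the failure rate grows, so a trend-based comparison treats them as agreeing, which matches the what-if focus described for architectural-level analysis.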
Conclusions • Assessing software quality early is desirable • Scalability is a major challenge in reliability prediction of concurrent systems using architectural models • We address this challenge by leveraging a system’s use-case scenarios in SHARP • Future Work: Contention modeling • Work thus far: assume no contention • However, concurrency leads to contention
Defects • Architectural: Mismatches between architectural models • e.g., an interaction protocol mismatch between two components • System: Limitations of components • e.g., a Sensor has limited power • Allow system designers to evaluate how much reliability will improve if defects are addressed • Cost