Multi-dimensional Robustness Optimization of Embedded Systems & Online Performance Verification

Arne Hamann • Steffen Stein • Rolf Ernst • Institute of Computer and Communication Network Engineering

Part I: Multi-dimensional Robustness Optimization of Embedded Systems • Arne Hamann • Rolf Ernst • Institute of Computer and Communication Network Engineering

Outline • System property variations • Sensitivity Analysis • Stochastic Multi-dimensional Sensitivity Analysis • Robustness Metrics • Hypervolume calculation • Minimum Guaranteed Robustness (MGR) • Maximum Possible Robustness (MPR) • Experiments

System Property Variations • Why do system property variations occur? • Specification changes, late feature requests, product variants, software updates, bug-fixes • Robustness to property variations • decreases design risk, and increases system maintainability and extensibility • Property variations can have severe unintuitive effects on system performance • Sensitivity analysis: achieve robustness without on-line parameter adaptation

Problem Formulation • Find fixed parameter configuration that … • … maximizes system robustness w.r.t. changes of several properties • Robustness = the system can sustain property variations without severe performance degradation • Not included: dynamic parameter adaptations (ongoing work submitted to EMSOFT 2007)

Stochastic Sensitivity Analysis (1) • Problem of exact sensitivity analysis approaches: computational effort grows exponentially with number of considered dimensions • Solution: scalable stochastic analysis able to quickly bound system sensitivity

Stochastic Sensitivity Analysis (2) • Sensitivity analysis formulated as multi-objective optimization problem → Pareto-front of the optimization task corresponds to the sought-after sensitivity front • Use multi-criteria evolutionary algorithms to approximate the sensitivity front • E.g. SPEA2 (ETH Zurich): diversified sensitivity front approximation through Pareto-dominance based selection and density approximation
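A minimal sketch of the Pareto-dominance relation that such a selection builds on. This is not SPEA2 itself (there is no archive and no density estimation); the function names and the convention that larger property values dominate smaller ones are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as large as b in every dimension and larger in one.

    Assumption: a larger tolerated property value is "better".
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative 2D points (property 1, property 2).
candidates = [(18, 16), (20, 12), (26, 10), (17, 12), (20, 10)]
front = pareto_front(candidates)
```

Here `(17, 12)` and `(20, 10)` are both dominated by `(20, 12)`, so only the three remaining points form the front.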

Creation of the Initial Population • Creates a certain number of points representing a first approximation of sensitivity front • Uses 1-dim sensitivity analysis • to bound the search space in each dimension (bounding hypercube) • to generate points representing the extrema of the sought-after sensitivity front • Randomly place the rest of the initial points in bounding hypercube
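The construction above could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `nominal` (the unmodified property values) and `limits` (the largest working value of each property found by 1-dim sensitivity analysis while the others stay nominal) are hypothetical inputs, and the random points are drawn uniformly.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def initial_population(
    nominal: Sequence[float],
    limits: Sequence[float],
    size: int,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Build a first approximation of the sensitivity front.

    The bounding hypercube spans [nominal[i], limits[i]] in each dimension.
    One extreme point per dimension is always generated; the remaining
    points (up to `size`) are placed uniformly at random in the hypercube.
    """
    if len(nominal) != len(limits):
        raise ValueError("nominal and limits must have the same dimension")
    if rng is None:
        rng = random.Random()

    population: List[Point] = []
    # Extreme points: one property at its 1-dim limit, all others nominal.
    for i in range(len(nominal)):
        point = list(nominal)
        point[i] = limits[i]
        population.append(tuple(point))

    # Fill up with random points inside the bounding hypercube.
    while len(population) < size:
        population.append(
            tuple(rng.uniform(lo, hi) for lo, hi in zip(nominal, limits))
        )
    return population


# Illustrative 2D values only.
pop = initial_population((10.0, 6.5), (26.2, 10.85), size=8, rng=random.Random(1))
```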

Initial Population - Example • [Figure: initial points inside the bounding box; axes: Property 1, Property 2]

Bounding the Search Space (1) • Idea: bound search space containing the sought-after sensitivity front • Bounding working Pareto-front Fw • evaluated Pareto-optimal working points • Bounding non-working Pareto-front Fnw • evaluated Pareto-optimal non-working points • Bounding Pareto-fronts can be used to derive multi-dim. robustness metrics (later)

Bounding the Search Space (2) • Space between bounding Pareto-fronts is called relevant region • Variation operators use an algorithm ensuring that generated offspring (points) are situated in the relevant region • Below bounding non-working Pareto-front • Above bounding working Pareto-front → Efficiently focuses exploration effort
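A membership test for the relevant region might look like the sketch below. It assumes monotonic behaviour (every point componentwise below a working point also works, and every point componentwise above a non-working point also fails); the function names and the example fronts are hypothetical.

```python
from typing import Iterable, Sequence


def weakly_below(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a <= b in every dimension."""
    return all(x <= y for x, y in zip(a, b))


def in_relevant_region(
    point: Sequence[float],
    working_front: Iterable[Sequence[float]],
    non_working_front: Iterable[Sequence[float]],
) -> bool:
    """True if the point lies between the two bounding Pareto-fronts.

    A point is outside the relevant region if its status is already known:
    it is covered by a working point, or it covers a non-working point.
    """
    known_working = any(weakly_below(point, w) for w in working_front)
    known_failing = any(weakly_below(n, point) for n in non_working_front)
    return not known_working and not known_failing


# Illustrative fronts: evaluated working and non-working points.
f_w = [(18, 16), (20, 12), (26, 10)]
f_nw = [(19, 17), (22, 14), (27, 11)]
```

A variation operator could call `in_relevant_region` on each candidate offspring and discard or resample those for which evaluation would yield no new information.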

Bounding the Search Space (3) • [Figure: bounding box with the bounding Pareto-fronts; axes: Property 1, Property 2]

Front Convergence Mutate (1) • Heuristic operator adapted to the optimization problem • Strategy: • Determine the X closest points on the opposite Pareto-front • Randomly choose one of these points • Place the offspring point randomly on the straight line connecting the parent point and the chosen point • Increases convergence speed of the bounding Pareto-fronts
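The strategy above can be sketched as follows. The slide does not name the distance metric or the distribution along the line, so Euclidean distance and a uniform position are assumptions here, as is the function name.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def front_convergence_mutate(
    parent: Sequence[float],
    opposite_front: List[Point],
    x: int,
    rng: random.Random,
) -> Point:
    """Create one offspring between the parent and the opposite front."""
    if x < 1 or not opposite_front:
        raise ValueError("need x >= 1 and a non-empty opposite front")
    # 1. Determine the x closest points on the opposite Pareto-front
    #    (Euclidean distance is an assumption).
    nearest = sorted(opposite_front, key=lambda q: math.dist(parent, q))[:x]
    # 2. Choose one of them at random.
    target = rng.choice(nearest)
    # 3. Place the offspring at a uniformly random position on the
    #    straight line from the parent to the chosen point.
    t = rng.random()
    return tuple(a + t * (b - a) for a, b in zip(parent, target))


# Illustrative call: parent on the working front, opposite front non-working.
child = front_convergence_mutate(
    (20.0, 12.0), [(22.0, 14.0), (19.0, 17.0), (27.0, 11.0)], x=1,
    rng=random.Random(0),
)
```

With `x=1` the nearest point `(22, 14)` is always chosen, so the offspring lies on the diagonal segment from `(20, 12)` to `(22, 14)`.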

Front Convergence Mutate (2) • [Figure: bounding box; axes: Property 1, Property 2]

Front Convergence Mutate (3) • [Figure: bounding box; axes: Property 1, Property 2]

Hypervolume Calculation • Hypervolume as basis of the proposed robustness metrics • Hypervolume is defined in a given hypercube and associated to a point set • Two different notions of hypervolume • inner hypervolume: volume of space Pareto-dominated by the given points inside the given hypercube • outer hypervolume: volume of space Pareto-dominated by all points not Pareto-dominating any of the given points

Hypervolume Calculation (2) • 2D-case • inner hypervolume: lower step function (= 66 in the example) • outer hypervolume: upper step function (= 100 in the example) • Example: bounding box [15,28]x[6,18] with points (15,18), (18,16), (20,12), (26,10), (28,6)
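The 2D case can be checked with a short sketch. The two functions below are an illustrative implementation of the step-function view (their names are assumptions, not taken from the slides); applied to the five coordinates and the bounding box [15,28]x[6,18] from the slide, they give the values 66 and 100 shown there.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def _clip(points: Sequence[Point2D], lower: Point2D, upper: Point2D) -> List[Point2D]:
    """Move every point onto or into the bounding box."""
    return [
        (min(max(x, lower[0]), upper[0]), min(max(y, lower[1]), upper[1]))
        for x, y in points
    ]


def inner_hypervolume_2d(points: Sequence[Point2D], lower: Point2D, upper: Point2D) -> float:
    """Area under the lower step function: box region below at least one point."""
    pts = _clip(points, lower, upper)
    area = 0.0
    left = lower[0]
    for right in sorted({x for x, _ in pts}):
        # Height of the strip: highest point that still reaches this far right.
        height = max(y for x, y in pts if x >= right) - lower[1]
        area += (right - left) * height
        left = right
    return area


def outer_hypervolume_2d(points: Sequence[Point2D], lower: Point2D, upper: Point2D) -> float:
    """Area under the upper step function: box minus the region above any point."""
    pts = _clip(points, lower, upper)
    xs = sorted({x for x, _ in pts}) + [upper[0]]
    excluded = 0.0
    for left, right in zip(xs, xs[1:]):
        # Depth of the excluded strip: from the lowest point seen so far to the top.
        depth = upper[1] - min(y for x, y in pts if x <= left)
        excluded += (right - left) * depth
    box = (upper[0] - lower[0]) * (upper[1] - lower[1])
    return box - excluded


slide_points = [(15, 18), (18, 16), (20, 12), (26, 10), (28, 6)]
inner = inner_hypervolume_2d(slide_points, (15, 6), (28, 18))
outer = outer_hypervolume_2d(slide_points, (15, 6), (28, 18))
```

The inner area is 3·10 + 2·6 + 6·4 = 66; the box has area 13·12 = 156, of which 2·2 + 6·6 + 2·8 = 56 lies above the points, leaving 100.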

Robustness Metrics • Given a set of properties … • … use stochastic sensitivity analysis to derive upper and lower robustness bounds • Minimum Guaranteed Robustness (MGR) • Defined as inner hypervolume of the bounding working Pareto-front Fw • Maximum Possible Robustness (MPR) • Defined as outer hypervolume of the bounding non-working Pareto-front Fnw

Robustness Metrics (2) • [Figure: MGR and MPR regions inside the bounding box; axes: Property 1, Property 2] • Obviously: MGR <= Real Robustness <= MPR

Robustness Exploration • Idea: Pareto-optimize MGR and MPR • Advantages • Stochastic sensitivity analysis is scalable → little computational effort necessary to reasonably bound the robustness potential of a given configuration • In-depth analysis can be performed once interesting configurations are identified (i.e. high MGR or high MPR) → Perfectly suited for robustness optimization

Example System • Distributed embedded system • 4 computational resources … • … connected via CAN bus • 3 constrained applications • SensAct • SinSout • CamVout

Approximation Quality (1) • Approximation after 100 evaluations (20 sec) • MGR = 2447 • MPR = 2937 • Approximation after 200 evaluations (40 sec) • MGR = 2580 • MPR = 2813

Approximation Quality (2) • Approximation after 300 evaluations (60 sec) • MGR = 2632 • MPR = 2777 • Result using exact sensitivity analysis (85 sec) • MGR = 2585 • MPR = 2826

3D Robustness Maximization • [Figure: original configuration vs. optimized configuration]

Integration of New Functionality • Integration of a fourth application with lowest priorities • Which combinations of WCET T9 and WCCT C6 are feasible? • Is there optimization potential? • Idea: initially assume WCET T9 and WCCT C6 equal zero • [Figure labels: Sink, C6, T9, Sens 2]

Integration of New Functionality (2) • [Figure: feasibility curves; axes: WCET T9, WCCT C6] • Areas below the curves represent feasible systems

Part II: Online Performance Verification • Steffen Stein • Rolf Ernst • Institute of Computer and Communication Network Engineering

Outline • Motivation • Framework Architecture • In Detail: Global Analysis Layer • System Setup • Approach to Analysis Control • Experimental Results

Future Challenges • In-Field updates • Run-time Reconfigurations • 90% of Innovation in Software • Networked Systems → Not manageable at design time! • [Figure labels: Engine Control SW, Driver Assistance, Multimedia Service]

Approach • Generally speaking: • Make systems clever enough to handle the integration problem themselves • Here: timing properties • To do • Gather performance data during runtime • Evaluate / optimise online • Feed results back into the running system • Result: evolving systems

Architecture: Organic Computing • Single Instance • Multiple Instances • Multiple collaborating Instances • Layered approach • [Figure labels: goals / design rules, selects observation model, observer, controller, reports, observes, controls, System under Observation and Control (SuOC)] • Source: Towards a generic observer/controller architecture for Organic Computing, U. Richter, M. Mnif, J. Branke, C. Müller-Schloer, H. Schmeck, INFORMATIK 2006 -- Informatik für Menschen

Framework • [Figure labels: Global Analysis Layer, Analysis Engine, Analysis Engine Control, Control Plane, Global Controller Layer, Global Observer Layer, Self-Organisation, Local Layer, Observer, Controller, Heterogeneous Networked Embedded System (SuOC); arrow labels: data exchange, gather data, adjust settings, use resources]

Distributed Setup • [Figure: Real System and Global Model; labels: PPC, uC, DSP, ARM, CAN, T0 to T9, S1 to S4]

Analysis Control • Analysis Control: Do { for all not up-to-date Resources: Analyse } Until all Resources are up to date • Distributed Analysis Control: While (true) { if Resource invalidated: analyse Resource } • [Figure labels: T1, T2, T3, T4, T6, Network Tunnel]
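The "do ... until all resources are up to date" loop can be sketched as a fixed-point iteration. The sketch assumes that analysing a resource reports whether its output changed and that a change invalidates the dependent resources; `run_global_analysis`, `analyse`, `dependents` and the toy chain below are hypothetical names and data, not taken from the framework.

```python
from typing import Callable, Dict, List, Sequence


def run_global_analysis(
    resources: Sequence[str],
    analyse: Callable[[str], bool],
    dependents: Dict[str, List[str]],
    max_runs: int = 10_000,
) -> int:
    """Analyse resources until none is out of date; return the number of runs.

    analyse(r) must return True if the output of r changed. A changed
    output invalidates every resource listed in dependents[r].
    Assumption: the iteration converges; max_runs guards against the
    case where it does not.
    """
    stale = set(resources)  # initially nothing is up to date
    runs = 0
    while stale:  # "until all resources are up to date"
        for r in [r for r in resources if r in stale]:
            stale.discard(r)
            runs += 1
            if runs > max_runs:
                raise RuntimeError("analysis did not converge")
            if analyse(r):
                stale.update(dependents.get(r, ()))
    return runs


# Toy model: C feeds B, B feeds A; each output is its input's output plus one.
outputs = {"A": 0, "B": 0, "C": 0}
predecessor = {"A": "B", "B": "C"}


def toy_analyse(r: str) -> bool:
    new = 1 if r not in predecessor else outputs[predecessor[r]] + 1
    changed = new != outputs[r]
    outputs[r] = new
    return changed


total_runs = run_global_analysis(["A", "B", "C"], toy_analyse, {"C": ["B"], "B": ["A"]})
```

The distributed variant on the slide runs the inner step per resource in an endless loop, re-analysing whenever the resource is invalidated; the invalidation messages would then travel over the network instead of through a shared `stale` set.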

Performance of Trivial Approach • [Plot: number of analysis runs (resource level) vs. system size (number of tasks)]

Problem • Exponential increase in the number of necessary analysis runs • [Figure labels: T1 to T9, S1 to S3] • Solution: Caching
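The slide does not say what exactly is cached. One plausible reading, sketched here purely as an assumption, is to store each resource-level analysis result keyed by the resource and its inputs, so that a repeated request with unchanged inputs does not trigger a new analysis run. `CachingAnalyser` and its interface are hypothetical.

```python
from typing import Callable, Dict, Hashable, Tuple


class CachingAnalyser:
    """Wrap a resource-level analysis and reuse results for identical inputs."""

    def __init__(self, analyse: Callable[[str, Hashable], object]) -> None:
        self._analyse = analyse
        self._cache: Dict[Tuple[str, Hashable], object] = {}
        self.runs = 0  # number of real (uncached) analysis runs

    def __call__(self, resource: str, inputs: Hashable) -> object:
        key = (resource, inputs)
        if key not in self._cache:
            self.runs += 1
            self._cache[key] = self._analyse(resource, inputs)
        return self._cache[key]


# Illustrative use with a stand-in analysis that just sums its inputs.
analyser = CachingAnalyser(lambda resource, inputs: sum(inputs))
first = analyser("CAN", (1, 2))
second = analyser("CAN", (1, 2))   # served from the cache
third = analyser("CAN", (1, 3))    # different inputs, analysed again
```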

Performance with Caching • [Plot: number of analysis runs (resource level) vs. system size (number of tasks)]

Conclusion • Distributed Performance Analysis implemented • Suitable as evaluator for online performance control / optimization • Future Work: From System observations to analysable Model.
