Dependability Benchmarking of SoC-embedded control systems. Juan-Carlos Ruiz, Technical University of Valencia, Spain. 3rd SIGDeB Workshop on Dependability Benchmarking, ISSRE 2005, Chicago, Illinois, USA, November 2005.
Outline • Benchmarking Context • Benchmark Specification • Benchmark Prototype • Ongoing Work
SoC-embedded Automotive Control Applications • Electronic Control Units: Powertrain; Entertainment & Communications; Body & Chassis • Example functions: Per-Cylinder Knock, Electronic Fuel Injection, Electronic Valve Timing, Electronic Ignition, Electronically Controlled Automatic Transmission, Electronic Combustion
Any Engine ECU handles (at least) … • Fuel injection: angle (when is the fuel injected?) and timing (how long is the injection?) • Air management: flow shape (how does the air enter the cylinder?) and air mass (how much air enters the cylinder?)
Engine ECU model [figure] • Inputs, read through sensors: throttle and engine internal variables • Electronic Control Unit: control software (control loops), RTOS (optional), µController-/DSP-based hardware • Outputs from the ECU control loops, applied through actuators: fuel injection (timing, angle) and air management (flow shape, air mass)
Measures for powertrain system integrators (I) • ECU control outputs: fuel injection (angle, timing) and air management (volume, flow shape) • ECU failure modes: no new data (stuck-at failure); delayed data (missed deadline); output control deviation, either close to nominal or far from nominal • Powertrain system failure modes: unpredictable (e.g. the engine stops, the engine is damaged); predictable (noise & vibrations, non-optimal behavior)
Measures for powertrain system integrators (II) • [Table: failure modes (value, time) against unsafety levels (+ to -)] • Each table entry = {Number of failures of this type in the considered output / Total number of experiments} • Note: (1) Table referred to a diesel engine PSA DW12
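The ratio given in braces above can be illustrated with a minimal Python sketch. The record layout, the function name `failure_ratios` and the failure mode labels are assumptions made for this example only; they are not part of the benchmark specification.

```python
from collections import Counter


def failure_ratios(experiments, output):
    """Return {failure_mode: failures_of_that_mode / total_experiments}.

    `experiments` is a list of dicts (hypothetical layout) mapping a control
    output name to the failure mode observed in that experiment, or None
    when no failure was observed on that output.
    """
    total = len(experiments)
    if total == 0:
        raise ValueError("at least one experiment is required")
    counts = Counter(
        exp.get(output) for exp in experiments if exp.get(output) is not None
    )
    return {mode: n / total for mode, n in counts.items()}


# Hypothetical campaign of four experiments observing one control output.
campaign = [
    {"fuel_timing": None},
    {"fuel_timing": "missed_deadline"},
    {"fuel_timing": "far_from_nominal"},
    {"fuel_timing": "missed_deadline"},
]
ratios = failure_ratios(campaign, "fuel_timing")
print(ratios)
```

The denominator is the total number of experiments, as in the slide, not the number of experiments that failed.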
System Under Benchmarking (SUB) & Dependability Benchmark Target (DBT) • For economic and safety reasons, the throttle and the engine are replaced by two models • These models must be customised for each specific engine • [Figure: the ECU (sensors, control loops, RTOS, µController-/DSP-based hardware, actuators) is labelled DBT; its outputs are fuel injection (timing, angle) and air management (flow shape, volume); the ECU together with the throttle model and the engine model, which supplies the engine internal variables, is labelled SUB]
Tools supporting the definition of Engine models • If available, we can use a synthetic model of the engine • Otherwise, the model can be obtained from the real engine: • Run the workload • Trace the engine behaviour • The resulting traces define a model of the engine behaviour in the absence of faults
Workload • Workload = Engine internal variables + Throttle • Engine internal variables are generated by the engine • Throttle inputs are computed according to one of the following driving cycles: • Acceleration-Deceleration cycle • Urban Driving Cycle • Extra-Urban Driving Cycle • Reference: emission certification of light duty vehicles in Europe (EEC Directive 90/C81/01)
Workload detail • Urban Driving Cycle: average speed 18.35 km/h, time 13.00 min, distance 3976.11 m, maximum speed 50.00 km/h • Extra-Urban Driving Cycle: average speed 61.40 km/h, time 6.67 min, distance 6822.22 m, maximum speed 120.00 km/h
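As a quick plausibility check of the figures above, distance should be close to average speed multiplied by time. The helper name `distance_m` is hypothetical; the small differences from the listed distances are consistent with rounding of the listed speeds and times.

```python
def distance_m(avg_speed_kmh, duration_min):
    """Distance in metres covered at an average speed (km/h) over a duration (min)."""
    return avg_speed_kmh * 1000.0 * duration_min / 60.0


# Values taken from the slide.
urban = distance_m(18.35, 13.00)
extra_urban = distance_m(61.40, 6.67)

# Differences from the listed distances (3976.11 m and 6822.22 m).
print(f"urban: {urban - 3976.11:+.2f} m, extra-urban: {extra_urban - 6822.22:+.2f} m")
```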
Faultload • “About 80% of hardware faults are transient, and as VLSI implementation use smaller geometries and lower power levels, the importance of transients increases” [Cunha DSN2002] [Somani Computer1997] [DBench ETIE2 report] • Faultload: transient physical (hardware) faults affecting the ECU memory, emulated in software using the single bit-flip fault model
Benchmark Conduct [figure] • A golden run followed by a sequence of experiments (Experiment 1, Experiment 2, …, Experiment N) • Each experiment: start-up, fault injection, observation of the ECU's activity • Event-oriented measures: fault injection, fault activation (error), error detection, failure • Time-oriented measures: error detection latency, observation time
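The time-oriented measures can be derived from the timestamps of the traced events. The sketch below assumes that the error detection latency is the time between fault activation (error) and error detection; the slide does not spell this out, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExperimentEvents:
    """Timestamps, in seconds from start-up, of the events of one experiment."""
    fault_injection: float
    fault_activation: Optional[float] = None  # None: the fault stayed dormant
    error_detection: Optional[float] = None   # None: the error was not detected
    failure: Optional[float] = None           # None: no failure was observed


def error_detection_latency(ev):
    """Time from fault activation to error detection, or None if undefined."""
    if ev.fault_activation is None or ev.error_detection is None:
        return None
    return ev.error_detection - ev.fault_activation


# Hypothetical experiments: one detected error and one dormant fault.
detected = ExperimentEvents(fault_injection=1.0, fault_activation=1.5,
                            error_detection=1.75)
dormant = ExperimentEvents(fault_injection=2.0)
print(error_detection_latency(detected), error_detection_latency(dormant))
```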
Technical considerations • Engine ECUs are typically manufactured as SoCs • Control software is stored and executed inside a chip (observability and controllability issue) • Running faultloads without introducing spatial and temporal intrusion is a challenge • Our advice: • Exploit the on-chip debugging (OCD) features currently supplied by most automotive embedded microcontrollers • On-the-fly memory access • Program and data tracing facilities • To increase portability, select standard and processor-independent OCD mechanisms, like the ones defined by Nexus
Benchmark Prototype [figure] • DBench-ECU • USB link • In-Circuit Debugger for Nexus • Nexus adapter • MPC565 evaluation board (the ECU software runs here)
Benchmark Analysis [figure] • 1. Raw measures from probes: internals (memory position, error handlers) and externals (periodic outputs, non-periodic outputs) • 2. Events: error activation and error detection (internals), failures (externals) • 3. Dependability measures: error latencies, error coverage & distribution (internals); failures (externals), classified as no failure, close to nominal, far from nominal, no new data or missed deadline • Failures induce either an unpredictable engine behavior or a predictable engine behavior (non-optimal, vibrations)
Case Study: Diesel ECUs (DECUs) • Inputs from sensors: intake air pressure, common rail pressure, crankshaft angle, camshaft angle, throttle position (reference speed), current engine speed (in rpm) • Outputs to actuators: common rail compressor discharge valve, swirl valve, waste gate valve, Injector1 … InjectorN • Engine ECU control loops: Fuel Timing (20 ms), Fuel Angle (50 ms), Air Shape (20 ms), …, Air Volume (500 ms) • ECU version 1: • Implemented on an RTOS called μC/OS II • Each control output is computed in a different OS task • Tasks use the semaphores & waiting facilities of the OS • The OS scheduling policy is rate-monotonic • ECU version 2: • Implemented without an OS • Each control output is computed in a different program procedure • The main program schedules the execution of each program procedure • The scheduling policy is computed off-line
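Rate-monotonic scheduling gives higher priority to tasks with shorter periods. The sketch below applies that rule to the four control loop periods listed above. The task names, the convention that 0 is the highest priority, and the tie-break by task name for the two 20 ms loops are assumptions of this example, not details taken from the case study.

```python
def rate_monotonic_priorities(periods_ms):
    """Map each task name to a unique priority, 0 being the highest.

    Shorter period means higher priority; equal periods are ordered by
    task name (an arbitrary tie-break chosen for this sketch).
    """
    ordered = sorted(periods_ms, key=lambda name: (periods_ms[name], name))
    return {name: prio for prio, name in enumerate(ordered)}


# Periods of the control loops, in milliseconds, as listed on the slide.
periods = {"fuel_timing": 20, "fuel_angle": 50, "air_shape": 20, "air_volume": 500}
priorities = rate_monotonic_priorities(periods)
print(priorities)
```

In ECU version 2 the same ordering information would instead feed the schedule computed off-line.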
Some Results: Acc.-Dec. Cycle, Extra-Urban Cycle, Urban Driving Cycle • DECU with RTOS: Failure Ratio 1.775 % (Unpredictable 0.598 %, Noise & Vibrations 0.539 %, Non-Optimal 0.638 %) • DECU without RTOS: Failure Ratio 10.659 % (Unpredictable 2.52 %, Noise & Vibrations 3.517 %, Non-Optimal 4.622 %) • DECU with RTOS: Failure Ratio 5.76 % (Unpredictable 1.35 %, Noise & Vibrations 2.48 %, Non-Optimal 1.93 %) • DECU without RTOS: Failure Ratio 2.38 % (Unpredictable 0.34 %, Noise & Vibrations 1.36 %, Non-Optimal 0.68 %) • DECU with RTOS: Failure Ratio 5.10 % (Unpredictable 2.72 %, Noise & Vibrations 0.00 %, Non-Optimal 2.38 %) • DECU without RTOS: Failure Ratio 5.76 % (Unpredictable 1.28 %, Noise & Vibrations 1.6 %, Non-Optimal 2.88 %) • (Results obtained from a 5-day benchmark execution, 300 experiments per driving cycle)
Practical considerations • The observation time after fault injection is limited by the trace memory of the debugger connected to the debugging ports • The number of probes that can be connected to a debugging port is limited. Thus, obtaining the benchmarking measures may require running the same golden run or experiment several times.
Current Working Context • ARTEMIS workshop, June-July 2005, Paris • Increasing interest of the industrial community in the use of SW components in (SoC-)embedded systems • Need to benchmark other types of components in control systems (RTOS, middleware, etc.) • To what extent can what we know be applied to this type of research?
Ongoing Research • SoC systems = compound of components • Component = Interface + Implementation • Parameter corruption techniques are of major interest to evaluate component robustness • New technique for parameter corruption in SoCs using OCD mechanisms [PRDC11 (to appear)] • The key issue here is not to reinvent the wheel but rather to explore to what extent what exists can be applied to SoCs
Thanks for your attention! Any questions, comments, or suggestions?
Benchmark measures • Failure modes in control outputs • Time failures (control output delivered out of time) • Value failures (no new value, value within tolerable bounds, value out of tolerable bounds) • Impact of failures on the system and users (unsafety levels) • Without consequences • With consequences, but non-catastrophic • With catastrophic consequences • Benchmark performers must correlate, for each control output, failure modes and their impact on the system and users
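The failure modes above can be sketched as a classification of each control output of a faulty run against the golden run. Everything concrete here is an assumption for illustration: the function name, the mode labels, the tolerance parameter, and above all the example correlation table, which in practice the benchmark performer must define for each control output.

```python
def classify_output(golden, observed, delivered_ms, deadline_ms, tolerance):
    """Classify one control output of a faulty run against the golden run.

    `observed` is None when no new value was delivered during the
    observation time (hypothetical convention).
    """
    if observed is None:
        return "no_new_value"
    if delivered_ms > deadline_ms:
        return "time_failure"
    deviation = abs(observed - golden)
    if deviation == 0:
        return "no_failure"
    if deviation <= tolerance:
        return "value_in_tolerable_bounds"
    return "value_out_of_tolerable_bounds"


# Hypothetical correlation between failure modes and unsafety levels for one
# control output; the real table is the benchmark performer's responsibility.
UNSAFETY_LEVEL = {
    "no_failure": "without consequences",
    "value_in_tolerable_bounds": "with consequences, but non-catastrophic",
    "time_failure": "with catastrophic consequences",
    "no_new_value": "with catastrophic consequences",
    "value_out_of_tolerable_bounds": "with catastrophic consequences",
}

mode = classify_output(golden=100, observed=103, delivered_ms=18,
                       deadline_ms=20, tolerance=5)
print(mode, "->", UNSAFETY_LEVEL[mode])
```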
Some Results: Urban Driving Cycle, Extra-Urban Driving Cycle • Number of BEs: 300 • DECU with RTOS: Failure Ratio 5.76 % (Unpredictable 1.35 %, Noise & Vibrations 2.48 %, Non-Optimal 1.93 %) • DECU without RTOS: Failure Ratio 2.38 % (Unpredictable 0.34 %, Noise & Vibrations 1.36 %, Non-Optimal 0.68 %) • Number of BEs: 300 • DECU with RTOS: Failure Ratio 5.1 % (Unpredictable 2.72 %, Noise & Vibrations 0 %, Non-Optimal 2.38 %) • DECU without RTOS: Failure Ratio 5.76 % (Unpredictable 1.28 %, Noise & Vibrations 1.6 %, Non-Optimal 2.88 %)
Experimental Set-up [figure] • Benchmark Manager: Workload Controller, Faultload Controller (fault injection process), SUB Monitor, Experiment Analyzer • System Under Benchmarking (SUB): SoC-embedded components, exercised through the workload interface, the faultload interface and the monitoring interface • COTS software components are the potential Benchmark Targets • Benchmark Target activity monitoring produces experimental measurements, stored in the Experiment Repository • Output: dependability measures
Results • 3000 experiments • SW configuration: RTOS (μC/OS II) • Workload: Acceleration-Deceleration • Detected errors: 26.7 % • Non-detected errors: 73.3 % • Failure ratio: 40 %
Fault Injection Procedure [flowchart] • FI experiment setup: SoC power-up and reset; SoC software is loaded in memory; for a spatial trigger, set a watchpoint; for a temporal trigger, set an external timer (e.g. in a PC); SoC software starts execution • Wait: spatial trigger, watchpoint message? (yes/no); temporal trigger, timer expires? (yes/no) • Fault injection process: read memory, bit-flip, write fault • End
Hardware Fault models • Transient faults (single & multiple bit-flip), at a memory location (e.g. 0x00014B04): 1. Read memory (e.g. bit7 1000.0011 bit0) 2. Bit-flip = Memory XOR Mask (e.g. mask: bit7 0011.1000 bit0) 3. Write fault (e.g. bit7 1011.1011 bit0) • Permanent faults (stuck-at model), with continuous monitoring of the location where the fault must be introduced: stuck-at “1” bit-flip = Memory OR Mask (bits to flip at “1”); stuck-at “0” bit-flip = Memory AND Mask (bits to flip at “0”)
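The three mask operations can be written down directly. This is a minimal sketch operating on plain integers rather than on target memory. For the stuck-at “0” case it assumes the mask holds 0 in the bits to be forced and 1 elsewhere, which is one reading of the slide; the function names are hypothetical.

```python
def bit_flip(memory, mask):
    """Transient fault: invert every bit selected by a 1 in the mask."""
    return memory ^ mask


def stuck_at_1(memory, mask):
    """Permanent fault: force to 1 every bit selected by a 1 in the mask."""
    return memory | mask


def stuck_at_0(memory, mask):
    """Permanent fault: force to 0 every bit that is 0 in the mask (assumed convention)."""
    return memory & mask


# Example from the slide: 1000.0011 XOR 0011.1000 gives 1011.1011.
faulty = bit_flip(0b10000011, 0b00111000)
print(f"{faulty:08b}")
```

A transient bit-flip is written once, whereas the stuck-at operations have to be reapplied whenever the monitored location is overwritten, which is why the slide calls for continuous monitoring.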
Technical considerations • The number of probes that can be connected to a debugging port is limited. Thus, studying the system activity in the presence of faults may require running a fault injection experiment several times • The observation time after a fault injection is limited by the trace memory of the components connected to the debugging ports • [Figure: an FI campaign consists of golden runs and FI experiments (1 fault per experiment): Experiment 1, Experiment 2, …, Experiment N, repeated as Experiment 1´, Experiment 2´, …, Experiment N´; within an experiment: fault injection, fault activation (error), error detection, failure]
INERTE: Integrated NExus-based Real-Time fault injection tool for Embedded systems [figure] • For each fault injection campaign: Configuration File, Experiment Generator Module, Analysis Tool, FI campaign report • For each fault injection experiment: Fault Injector, Trace Repository (golden run trace, FI trace)
Experiment Generator Module • Configuration Files (Where & When faults are injected); in each entry the address gives the "Where" and the time the "When"
521 200000.ms
1 ROD 0x00015FB4 0x40 88265.ms OSUnMapTbl
2 ROD 0x00015F6D 0x10 70262.ms OSUnMapTbl
3 ROD 0x00015FB5 0x40 103116.ms OSUnMapTbl
4 COD 0x00014A85 0x02 57053.ms ConvertirDatosInyeccion
5 COD 0x00014A21 0x80 129717.ms ConvertirDatosInyeccion
6 COD 0x00014B04 0x01 115127.ms ConvertirDatosInyeccion
7 RWD 0x00070B25 0x10 77078.ms ConsignaPresionRail
8 RWD 0x00070B46 0x10 97479.ms ConsignaPresionRail
9 COD 0x00014419 0x02 139488.ms Interp2d
10 COD 0x00014138 0x20 79351.ms Interp2d
11 COD 0x000143F6 0x08 85503.ms Interp2d
12 COD 0x0001457A 0x40 59389.ms Interp2d
13 COD 0x000141D7 0x01 96898.ms Interp2d
14 COD 0x0001416B 0x01 146757.ms Interp2d
15 COD 0x000143C7 0x08 58150.ms Interp2d
16 COD 0x000141C4 0x20 128517.ms Interp2d
17 COD 0x00013FAA 0x80 76006.ms Interp2d
18 COD 0x000140BA 0x04 61788.ms Interp2d
19 COD 0x000140FF 0x08 136874.ms Interp2d
20 COD 0x0001427C 0x08 97722.ms Interp2d
…
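An entry of this listing can be parsed as follows. The field names are guesses made for this sketch: the slide only identifies the address as the "Where" and the time as the "When", so reading the second hexadecimal value as a bit mask and the other fields as an index, a memory area tag and a symbol name is an assumption.

```python
from typing import NamedTuple


class FaultEntry(NamedTuple):
    index: int     # experiment number (assumed meaning)
    area: str      # memory area tag such as ROD, COD or RWD (meaning not given)
    address: int   # where the fault is injected
    mask: int      # bits to flip (assumed meaning)
    time_ms: int   # when the fault is injected, in milliseconds
    symbol: str    # program symbol at that address (assumed meaning)


def parse_entry(line):
    """Parse one whitespace-separated entry of the configuration listing."""
    index, area, address, mask, when, symbol = line.split()
    if not when.endswith(".ms"):
        raise ValueError(f"unexpected time field: {when!r}")
    return FaultEntry(int(index), area, int(address, 16), int(mask, 16),
                      int(when[:-len(".ms")]), symbol)


entry = parse_entry("1 ROD 0x00015FB4 0x40 88265.ms OSUnMapTbl")
# A single bit-flip mask has exactly one bit set.
single_bit = entry.mask != 0 and entry.mask & (entry.mask - 1) == 0
print(entry, single_bit)
```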
Fault Injector [figure] • Commercial Nexus debugging tool from Lauterbach® • Fault injection script (written in PRACTICE) • Golden run processing and fault injection processing • Observed: SoC application inputs & outputs, SoC application tasks, SoC internal registers • For the time being, multi-bit flip is not considered
Analysis Tool (input: Trace Repository) • Example report:
Fault activation vs Non-activation: Error, 1173; No error, 1786
Error syndrome: Detected Errors, 431 (Failure before error detection, 15); No Detected Errors, 742 (Errors not provoking a failure, 454; Errors leading to Failure, 288)
Failures: Data close to expected output, 116; Data far from expected output, 172
Error detection mechanism: IBRK, 0; LBRK, 0; DTLBER, 0; ITLBER, 0; SEE, 234; FPASE, 26; SYSE, 11; FPUVE, 11; ALE, 25; MCE, 97; CHSTP, 31; OTHER, 5
Error detection latency: Min, 0.000008620; Max, 0.002321500; Avg, 0.000097840
Analysis completed: 3000 experiments analyzed; 41 dropped, 11 due multibitflips.
Example trace listing:
B::Trace.List_(-50000.)--(0.)_address_data_ti.back_mark.mark
_____record|address_____|d.l_____|ti.back___|mark
-**********|
-0000001128| D:00070BB8 00000000 ----
-0000001127| D:00070BBC 00000000 0.540us ----
-0000001126| D:00070BC0 00000000 0.700us ----
-0000001125| D:00070BC4 00000000 0.700us ----
-0000001124| D:00070BC8 00000000 0.960us ----
-0000001123| D:00070BCC 00000000 1.040us ----
-0000001122| D:00070BD0 00000000 0.700us ----
-0000001121| D:00070BD4 00000000 0.700us ----
-0000001120| D:00070BB8 000003E8 1.026s ----
-0000001119| D:00070BBA 00000014 1.760us ----
-0000001118| D:00070BBC 0000000F 1.740us ----
-0000001117| 239.200us A---
…
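The counts in the report are mutually consistent and can be turned into ratios, as the sketch below shows. The ratio definitions (activation ratio over analysed experiments, detection coverage over activated errors) and the variable names are assumptions of this example; the report itself only gives the raw counts.

```python
# Counts copied from the report above.
experiments = 3000
dropped = 41
errors = 1173            # faults activated as errors
no_errors = 1786         # faults never activated
detected = 431
not_detected = 742
undetected_without_failure = 454
undetected_with_failure = 288
close_to_expected = 116
far_from_expected = 172

# Consistency checks between the report sections.
analysed = errors + no_errors
assert analysed == experiments - dropped
assert detected + not_detected == errors
assert undetected_without_failure + undetected_with_failure == not_detected
assert close_to_expected + far_from_expected == undetected_with_failure

# Derived ratios (definitions assumed for this sketch).
activation_ratio = errors / analysed
detection_coverage = detected / errors
print(f"activation ratio: {activation_ratio:.1%}, "
      f"detection coverage: {detection_coverage:.1%}")
```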
Anatomy of a SoC-based control system • A SoC is a chip-embedded computer • [Figure: sensors deliver sensor readings as inputs to the SoC (controller or DSP, SoC internal memory); a control component (Task1 … TaskN) uses an RTOS component through the RTOS interface; the outputs are control outputs sent to the actuators]