Programming Safety-Critical Embedded Systems
Work mainly by Sidharta Andalam and Eugene Yip
Main supervisor: Dr. Partha Roop (UoA)
Advisor: Dr. Alain Girault (INRIA)
Outline • Introduction • Synchronous Languages • PRET-C • ForeC
Introduction • Safety-critical systems: • Perform specific real-time tasks. • Comply with strict safety standards [IEC 61508, DO-178]. • Time-predictability is useful in real-time designs. [Figure: the interplay of embedded systems, safety-critical concerns, timing analysis, and timing/functionality requirements] [Paolieri et al 2011] Towards Functional-Safe Timing-Dependable Real-Time Architectures.
Introduction [Figure: languages and frameworks arranged by processor class and domain of application (embedded to desktop). Single-core: C, PRET-C, Protothreads, Esterel, SCADE, Simulink, RTOS (VxWorks). Multicore: Pthreads, OpenMP, OpenCL, ParC, SharC, Grace. Manycore: UPC, X10, Intel Cilk Plus, SHIM, Sigma C, ForkLight, ForeC.]
Outline • Introduction • Synchronous Languages • PRET-C • ForeC
Synchronous Languages • Deterministic concurrency (formal semantics). • Concurrent control behaviours. • Typically compiled away. • Execution model similar to digital circuits. • Threads execute in lock-step to a global clock. • Threads communicate via instantaneous signals. [Figure: inputs sampled and outputs emitted at each of the global ticks 1, 2, 3, 4] [Benveniste et al 2003] The Synchronous Languages 12 Years Later.
Synchronous Languages Must validate: max(reaction time) < min(time for each tick), where the tick length is specified by the system's timing requirements. For example, if the environment demands a reaction every 1 s, the worst-case reaction time must be below 1 s. [Figure: timeline of physical time at 1 s, 2 s, 3 s, 4 s, with each reaction completing within its tick] [Benveniste et al 2003] The Synchronous Languages 12 Years Later.
Synchronous Languages • Esterel, Lustre, Signal • Synchronous extensions to C: • PRET-C • Reactive Shared Variables • Synchronous C • Esterel C Language Retain the essence of C and add deterministic concurrency and thread communication. [Roop et al 2009] Tight WCRT Analysis of Synchronous C Programs. [Boussinot 1993] Reactive Shared Variables Based Systems. [Hanxleden et al 2009] SyncCharts in C - A Proposal for Light-Weight, Deterministic Concurrency. [Lavagno et al 1999] ECL: A Specification Environment for System-Level Design.
Outline • Introduction • Synchronous Languages • PRET-C • ForeC
PRET-C Stages • PRET-C: simple synchronous extension to C (using macros). • TCCFG: intermediate format. • TCCFG': updated after cache analysis. • Model checking: binary search for the WCRT. Example PRET-C source:

    void main() {
        while (1) {
            abort
                PAR(sampler, display);
            when (reset);
            EOT;
        }
    }

[Figure: analysis flow: PRET-C source → TCCFG → cache analysis → TCCFG' → model checker → final output: WCRT]
PRET-C • Simple set of synchronous extensions to C: • Light-weight multi-threading. • Macro-based implementation. • Thread-safe shared memory accesses. • Amenable to timing analysis for ensuring time-predictability.
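As a concrete illustration of these extensions, here is a minimal sketch of a two-thread PRET-C program; the pretc.h header name and the thread bodies are our assumptions, not part of the slides:

    #include "pretc.h"  /* hypothetical header providing the PAR and EOT macros */

    /* Each thread does some work, then ends its local tick with EOT. */
    void sampler(void) {
        while (1) {
            /* read an input ... */
            EOT;  /* end of this thread's local tick */
        }
    }

    void display(void) {
        while (1) {
            /* write an output ... */
            EOT;
        }
    }

    void main(void) {
        PAR(sampler, display);  /* run both threads in lock-step; the global
                                   tick ends when every thread reaches EOT */
    }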
PRET-C The semantics of PRET-C is presented in the structural operational style, along with proofs of reactivity and determinism [IEEE TC 2013 March].
PRET-C Code:

    ...
    PAR(T1, T2)
    ...
    T1: A; EOT; C; EOT
    T2: B; EOT; D; EOT

[Figure: timeline: in global tick 1, T1 runs A and T2 runs B, each ending its local tick at EOT; in global tick 2, T1 runs C and T2 runs D.]
Outline • Introduction • Synchronous Languages • PRET-C • ForeC
Introduction • Safety-critical systems: • Shift from single-core to multicore processors. • Cheaper, with a better power versus execution performance trade-off. [Figure: cores 0 to n connected by a shared system bus to shared resources] [Blake et al 2009] A Survey of Multicore Processors. [Cullmann et al 2010] Predictability Considerations in the Design of Multi-Core Embedded Systems.
Introduction • Parallel programming: • From supercomputers to mainstream computers. • Frameworks designed for systems without resource constraints or safety concerns. • Optimised for average-case performance (FLOPS), not time-predictability. • Threaded programming model: Pthreads, OpenMP, Intel Cilk Plus, ParC, ... • Non-deterministic thread interleaving makes understanding and debugging hard. [Lee 2006] The Problem with Threads.
Introduction • Parallel programming: • Programmer is responsible for managing shared resources. • Concurrency errors: deadlock, race conditions, atomicity violations, order violations. [McDowell et al 1989] Debugging Concurrent Programs. [Lu et al 2008] Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics.
Introduction • Synchronous languages: • Esterel, Lustre, Signal • Synchronous extensions to C: • PRET-C • Reactive Shared Variables • Synchronous C • Esterel C Language These have sequential execution semantics, and compilation produces sequential programs, making them unsuitable for parallel execution. [Roop et al 2009] Tight WCRT Analysis of Synchronous C Programs. [Boussinot 1993] Reactive Shared Variables Based Systems. [Hanxleden et al 2009] SyncCharts in C - A Proposal for Light-Weight, Deterministic Concurrency. [Lavagno et al 1999] ECL: A Specification Environment for System-Level Design.
ForeC (“Foresee”) • C-based, multi-threaded, synchronous language, inspired by PRET-C and Esterel. • Deterministic parallel execution on embedded multicores. • Fork/join parallelism and shared-memory thread communication. • Program behaviour is independent of the chosen thread scheduling.
ForeC • Additional constructs to C: • pause: Synchronisation barrier. Pauses the thread's execution until all threads have paused. • par(st1, ..., stn): Forks each statement to execute as a parallel thread. Each statement is implicitly scoped. • [weak] abort st when [immediate] exp: Preempts the statement st when exp evaluates to a non-zero value. exp is evaluated in each global tick before st is executed.
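A small sketch of how these constructs might compose (ForeC-style pseudocode; the input variable reset and the thread bodies are our assumptions, not from the slides):

    input int reset;   /* sampled from the environment at each global tick */

    void blink(void) {
        while (1) {
            /* toggle an LED ... */
            pause;     /* barrier: wait for the next global tick */
        }
    }

    void watch(void) {
        while (1) {
            /* poll a sensor ... */
            pause;
        }
    }

    void main(void) {
        /* Fork two threads; preempt them when reset becomes non-zero.
           reset is checked at each global tick before the body runs. */
        abort {
            par(blink(), watch());
        } when (reset);
    }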
ForeC • Additional variable type-qualifiers to C: • input: Declares a variable whose value is updated from the environment at each global tick. • output: Declares a variable whose value is emitted to the environment at each global tick.
ForeC • Additional variable type-qualifiers to C: • shared: Declares a shared variable that can be accessed by multiple threads. • Threads make local copies of the shared variables they may use at the start of their local ticks. • Threads only modify their local copies during execution. • When a par statement terminates: modified copies from the child threads are combined (using a commutative and associative function) and assigned to the parent. • When the global tick ends: the modified copies are combined and assigned to the actual shared variables. [Figure: copy-and-combine of a shared variable between two threads]
Execution Example 1

    shared int sum = 1 combine with plus;   /* shared variable */

    /* commutative and associative combine function */
    int plus(int copy1, int copy2) {
        return copy1 + copy2;
    }

    void main(void) {
        par(f(1), f(2));   /* fork-join */
    }

    void f(int i) {
        sum = sum + i;
        pause;             /* synchronisation */
        ...
    }
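By our reading of the copy-and-combine semantics above (a hedged walkthrough; the arithmetic is ours, not stated explicitly on the slides):

    /* Global tick start: f(1) and f(2) each copy sum, which is 1.
       f(1): its copy becomes 1 + 1 = 2;  f(2): its copy becomes 1 + 2 = 3.
       Both threads reach pause, ending the global tick.
       Global tick end: the modified copies are combined, plus(2, 3) = 5,
       so sum is 5 at the start of the next global tick. */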
Execution Example 2 Sum sets of data together in parallel.

    shared int v = 0 combine with plus;
    int data[4]  = {1, 2, 3, 4};
    int data1[4] = {5, 6, 7, 8};

    void main(void) {
        par(f(data), f(data1));
    }

    void f(int *data) {
        par(add(0, data), add(2, data));
    }

    void add(int x, int *data) {
        v = data[x] + data[x + 1];
    }
Execution Example 2 [Figure: thread hierarchy: main forks f and f; each f forks two add threads; all four add threads write copies of the single shared v, which are combined up the tree.]
Execution Example 2

    int data[4]  = {1, 2, 3, 4};
    int data1[4] = {5, 6, 7, 8};

    void main(void) {
        par(f(data), f(data1));
    }

    void f(int *data) {
        shared int v = 0 combine with plus;
        par(add(0, data, &v), add(2, data, &v));
    }

    void add(int x, int *data, shared int *const v combine with +) {
        *v = data[x] + data[x + 1];
    }
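A hedged walkthrough of one global tick under the combine semantics above (again, the arithmetic is ours):

    /* In f(data):  add(0, ...) sets its copy of *v to 1 + 2 = 3;
                    add(2, ...) sets its copy of *v to 3 + 4 = 7;
                    when the inner par joins, plus(3, 7) = 10.
       In f(data1): the copies are 5 + 6 = 11 and 7 + 8 = 15;
                    combined: plus(11, 15) = 26.
       Because each f now declares its own v, the two sums (10 and 26)
       remain separate instead of being combined into one total. */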
Execution Example Shared variables: • Threads modify local copies of shared variables. • Isolation of thread execution allows threads to truly execute in parallel. • Thread interleaving does not affect the program's behaviour. • Prevents most concurrency errors: • Deadlock, race conditions: no locks. • Atomicity and order violations: local copies. • Copies of a shared variable can be split into groups and combined in parallel.
Execution Example Shared variables: • Programmer has to define a suitable combine function for each shared variable. • Must ensure the combine function is indeed commutative and associative. • The notion of “combine functions” is not entirely new: • Intel Cilk Plus: reducer and holder hyperobjects (cilk::reducer_op, cilk::holder_op). • OpenMP: the reduction(operator: var) clause. • MPI: collectives such as MPI_Reduce and MPI_Gather. • UPC, X10: shared variables, collectives, and aggregates. • Esterel: valued signals with combine operators. • Reactive Shared Variables: shared var with a combine operator. [Intel Cilk Plus] http://software.intel.com/en-us/intel-cilk-plus [OpenMP] http://openmp.org [MPI] http://www.mcs.anl.gov/research/projects/mpi/ [Unified Parallel C] http://upc.lbl.gov/ [X10] http://x10-lang.org/ [Berry et al 1992] The Esterel Synchronous Programming Language: Design, Semantics and Implementation. [Boussinot 1993] Reactive Shared Variables Based Systems.
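For comparison, a minimal C/OpenMP sketch of the reduction clause mentioned above (our example, not from the slides):

    #include <stdio.h>

    int main(void) {
        int data[4] = {1, 2, 3, 4};
        int sum = 0;
        /* Each thread accumulates a private copy of sum; OpenMP combines
           the copies with + at the end of the parallel region, much like
           a ForeC combine function. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < 4; i++) {
            sum += data[i];
        }
        printf("%d\n", sum);  /* 10 */
        return 0;
    }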
Shared Variable Design Patterns • Point-to-point • Broadcast • Software pipelining • Divide and conquer • Scatter/Gather • Map/Reduce
Point-to-point

    shared int sum = 0 combine with plus;

    void main(void) {
        par(f(), g());
    }

    void f(void) {
        while (1) {
            sum = comp1();
            pause;
        }
    }

    void g(void) {
        while (1) {
            comp2(sum);
            pause;
        }
    }

The new value of sum is received in the next global tick. The combine operation is not required, since only one thread writes to sum.
Broadcast Multiple receivers.

    shared int sum = 0 combine with plus;

    void main(void) {
        par(f(), g(), g());
    }

    void f(void) {
        while (1) {
            sum = comp1();
            pause;
        }
    }

    void g(void) {
        while (1) {
            comp2(sum);
            pause;
        }
    }

The new value of sum is received in the next global tick. The combine operation is not required.
Software Pipelining Outputs from each stage are buffered: use the delayed behaviour of shared variables to buffer each stage.

    shared int s1 = 0, s2 = 0 combine with plus;

    void main(void) {
        par(stage1(), stage2(), stage3());
    }

    void stage1(void) {
        while (1) {
            s1 = comp1();
            pause;
        }
    }

    void stage2(void) {
        pause;       /* wait one tick for the first s1 */
        while (1) {
            s2 = comp2(s1);
            pause;
        }
    }

    void stage3(void) {
        pause;
        pause;       /* wait two ticks for the first s2 */
        while (1) {
            comp3(s2);
            pause;
        }
    }
Divide and Conquer Count the number of edges in an image.

    input int image[1024];
    shared int edges = 0 combine with plus;

    void main(void) {
        par(analyse(0, 511), analyse(512, 1023));
    }

    void analyse(int start, int end) {
        while (1) {
            edges = 0;
            for (int i = start; i <= end; ++i) {
                ... image[i] ... ;
                edges++;
            }
            pause;
        }
    }

At the end of each global tick, the two threads' copies of edges are combined with plus, giving the count for the whole image.
Scheduling • Light-Weight Static Scheduling: • Take advantage of multicore performance while delivering time-predictability. • Generate code to execute directly on hardware (bare metal/no OS). • Thread allocation and scheduling order on each core decided at compile time by the programmer. • Develop a WCRT-aware scheduling heuristic. • Thread isolation allows for scheduling flexibility. • Cooperative (non-preemptive) scheduling.