
Synthesis of Embedded Software for Reactive Systems

This project focuses on the synthesis of embedded software for reactive systems. It involves the development of a formal design environment using methodologies, formal methods, and modeling techniques. Participants from various institutions and companies contribute to the project.


Presentation Transcript


  1. Synthesis of Embedded Software for Reactive Systems Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona Joint work with: Robert Clarisó, Alex Kondratyev, Luciano Lavagno, Claudio Passerone and Yosinori Watanabe (UPC, Cadence Berkeley Labs, Politecnico di Torino)

  2. Methodology for Platform-based System Design
  [Diagram: a system designer (set-top box: µP, custom logic, comms, graphics, DSP, MPEG2 engine) exchanges requirements specifications and testbenches, and functional and performance models (with agreed interfaces and abstraction levels), with a contents provider (e.g. TV broadcast company), a platform provider (e.g. semiconductor company), an external IP provider (e.g. software modem) and an internal IP provider (e.g. MPEG2 engine).]

  3. Metropolis Project
  • Goal: develop a formal design environment
  • Design methodologies: abstraction levels, design problem formulations
  • EDA: formal methods for automatic synthesis and verification; a modeling mechanism: heterogeneous semantics, concurrency
  • Participants:
    • UC Berkeley (USA): methodologies, modeling, formal methods
    • CMU (USA): formal methods
    • Politecnico di Torino (Italy): modeling, formal methods
    • Universitat Politècnica de Catalunya (Spain): modeling, formal methods
    • Cadence Berkeley Labs (USA): methodologies, modeling, formal methods
    • Philips (Netherlands): methodologies (multi-media)
    • Nokia (USA, Finland): methodologies (wireless communication)
    • BWRC (USA): methodologies (wireless communication)
    • BMW (USA): methodologies (fault-tolerant automotive controls)
    • Intel (USA): methodologies (microprocessors)

  4. Metropolis Framework
  [Diagram: function specification, architecture specification and design constraints enter the framework; Metropolis formal methods perform synthesis/refinement and analysis/verification.]
  • Metropolis infrastructure
  • Design methodology
  • Meta model of computation
  • Base tools: design imports, meta model compiler, simulation

  5. Outline
  • The problem
    • Synthesis of concurrent specifications for sequential processors
    • Compiler optimizations across processes
  • Previous work: dataflow networks
    • Static scheduling of SDF networks
    • Code and data size optimization
  • Quasi-Static Scheduling of process networks
    • Petri net representation of process networks
    • Scheduling and code generation
  • Open problems

  6. Environmental controller
  [Block diagram: temperature and humidity sensors (TSENSOR, HSENSOR) feed the TEMP FILTER and HUMIDITY FILTER processes; these write channels TDATA and HDATA, read by the CONTROLLER, which drives AC-on, DRYER-on and ALARM-on to the air conditioner, dehumidifier and alarm.]

  7. Environmental controller: TEMP-FILTER
  float sample, last;
  last = 0;
  forever {
    sample = READ(TSENSOR);
    if (|sample - last| > DIF) {
      last = sample;
      WRITE(TDATA, sample);
    }
  }

  8. Environmental controller: HUMIDITY-FILTER
  float h, max;
  forever {
    h = READ(HSENSOR);
    if (h > MAX) WRITE(HDATA, h);
  }

  9. Environmental controller: CONTROLLER
  float tdata, hdata;
  forever {
    select(TDATA, HDATA) {
      case TDATA:
        tdata = READ(TDATA);
        if (tdata > TFIRE) WRITE(ALARM-on, 10);
        else if (tdata > TMAX) WRITE(AC-on, tdata-TMAX);
      case HDATA:
        hdata = READ(HDATA);
        if (hdata > HMAX) WRITE(DRYER-on, 5);
    }
  }

  10. Operating system
  [Execution trace: a Tsensor event makes T-FILTER wake up, execute and sleep; an Hsensor event makes H-FILTER wake up, execute, send data to HDATA and sleep; the CONTROLLER then wakes up, executes and reads the data from HDATA; and so on.]

  11. Compiler optimizations (each optimization enables further optimizations at lower levels)
  • Instruction level: a = b*16 ⇒ a = b << 4
  • Basic blocks: common subexpressions, copy propagation
  • Intra-procedural (across basic blocks): loop invariants, induction variables
  • Inter-procedural: inline expansion, parameter propagation
  • Inter-process (?): channel optimizations, OS overhead reduction
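The instruction-level row above can be sanity-checked in a few lines. A minimal Python sketch; the function names are made up for illustration:

```python
# Strength reduction: a multiply by a power of two can be rewritten as a
# left shift. Hypothetical helpers illustrating the slide's example.
def times16_mul(b: int) -> int:
    return b * 16            # original form: a = b * 16

def times16_shift(b: int) -> int:
    return b << 4            # strength-reduced form: a = b << 4

# The two forms agree for every integer, including negatives.
assert all(times16_mul(b) == times16_shift(b) for b in range(-1000, 1000))
```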

  12. Partial evaluation (example)
  Specification: subsets(n,k) = n! / (k! * (n-k)!)
  ________________________________________________
  int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
  int pairs (int n) { return subsets (n,2); }
  ... print (pairs(x+1)) ...
  ... print (pairs(5)) ...
  Partial evaluation (compiler optimizations)

  13. Partial evaluation (example)
  Specification: subsets(n,k) = n! / (k! * (n-k)!)
  ________________________________________________
  int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
  int pairs (int n) { return subsets (n,2); }
  ... print ((x+1)*x / 2) ...
  ... print (pairs(5)) ...
  Partial evaluation (compiler optimizations)

  14. Partial evaluation (example)
  Specification: subsets(n,k) = n! / (k! * (n-k)!)
  ________________________________________________
  int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
  int pairs (int n) { return subsets (n,2); }
  ... print ((x+1)*x / 2) ...
  ... print (10) ...
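The specialization chain on the three slides above can be replayed concretely. A Python sketch: `subsets` and `pairs` follow the slides, while `pairs_specialized` is a made-up name for the residual program a partial evaluator would produce:

```python
from math import factorial

def subsets(n: int, k: int) -> int:
    # Specification from the slide: subsets(n,k) = n! / (k! * (n-k)!)
    return factorial(n) // (factorial(k) * factorial(n - k))

def pairs(n: int) -> int:
    return subsets(n, 2)

def pairs_specialized(x: int) -> int:
    # Residual program for print(pairs(x+1)): k is fixed to 2 and the
    # factorials cancel, leaving (x+1)*x/2; pairs(5) folds to the constant 10.
    return (x + 1) * x // 2

assert pairs(5) == 10                 # the fully constant-folded call
assert all(pairs(x + 1) == pairs_specialized(x) for x in range(1, 30))
```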

  15. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (D, 2); }
  forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
  [Network: input A enters the first process; three fact (x!) actors connect B→E, C→F and D→G; output H delivers pairs(n).]

  16. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (D, 2); }
  forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
  No chances for optimization

  17. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (D, 2); }
  forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
  [Channels D and G carry the constant stream 2...2]

  18. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (G, 2); }
  forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
  [The fact actor on the constant channel disappears: fact(2) = 2, so the producer writes 2 directly to G; only two fact actors remain.]


  20. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (G, *); }
  forever { x = read (E); y = read (F); read (G); write (H, x/(y*2)); }
  • Copy propagation across processes
  • Channel G only synchronizes (token available)

  21. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (G, *); }
  forever { x = read (E); y = read (F); read (G); write (H, x/(y*2)); }
  By scheduling operations properly, FIFOs may become variables (one element per FIFO, at most)

  22. Inter-process partial evaluation
  forever {
    n = read (A);
    v1 = n; v3 = n-2;
    x = v2; y = v4;
    write (H, x/(y*2));
  }
  [FIFOs replaced by variables v1, v2, v3, v4; the two fact actors remain as separate computations: v2 = fact(v1), v4 = fact(v3).]

  23. Inter-process partial evaluation
  forever {
    n = read (A);
    v1 = n;
    v2 = fact (v1);
    x = v2;
    v3 = n-2;
    v4 = fact (v3);
    y = v4;
    write (H, x/(y*2));
  }
  And now we can apply conventional compiler optimizations

  24. Inter-process partial evaluation
  forever {
    n = read (A);
    x = fact (n);
    y = fact (n-2);
    write (H, x/(y*2));
  }
  If some “clever” theorem prover could realize that fact(n) = n*(n-1)*fact(n-2), the following code could be derived ...

  25. Inter-process partial evaluation
  forever { n = read (A); write (H, n*(n-1)/2); }

  26. Inter-process partial evaluation
  forever { n = read (A); write (B,n); write (C, n-2); write (D, 2); }
  forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
  This was the original specification of the system!

  27. Inter-process partial evaluation
  forever { n = read (A); write (H, n*(n-1)/2); }
  This is the final implementation after inter-process optimization:
  • Only one process (no context switching overhead)
  • Channels substituted by variables (no communication overhead)

  28. Goal: improve performance, code size, power consumption, ...
  • Reduce operating system overhead
  • Reduce communication overhead
  • How? Do as much as possible statically and automatically:
    • Scheduling
    • Compiler optimizations

  29. Outline
  • The problem
    • Synthesis of concurrent specifications
    • Compiler optimizations across processes
  • Previous work: dataflow networks
    • Static scheduling of SDF networks
    • Code and data size optimization
  • Quasi-Static Scheduling of process networks
    • Petri net representation of process networks
    • Scheduling and code generation
  • Open problems

  30. Dataflow networks
  • Powerful mechanism for data-dominated systems
  • (Often stateless) actors perform computation
  • Unbounded FIFOs perform communication via sequences of tokens carrying values
    • (matrices of) integers, floats, fixed point
    • images of pixels, ...
  • Determinacy: unique output sequences given unique input sequences
    • Sufficient condition: blocking read (a process cannot test input queues for emptiness)

  31. Intuitive semantics
  • Example: FIR filter
    • single input sequence i(n)
    • single output sequence o(n)
    • o(n) = c1·i(n) + c2·i(n-1)
  [Diagram: i feeds a gain c1 and, through a unit delay with initial token i(-1), a gain c2; an adder produces o.]
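The FIR actor above is easy to state as executable code. A Python sketch that batch-processes a finite token sequence; the coefficient defaults are made up, and a real dataflow actor would block on an unbounded FIFO rather than iterate over a list:

```python
def fir(tokens, c1=0.5, c2=0.5):
    # o(n) = c1*i(n) + c2*i(n-1), with the initial token i(-1) = 0
    # modelling the delay on the c2 arc.
    outputs = []
    prev = 0.0                       # initial token i(-1)
    for x in tokens:                 # one token consumed per firing
        outputs.append(c1 * x + c2 * prev)
        prev = x                     # the delay arc now holds i(n)
    return outputs

assert fir([2.0, 4.0], c1=1.0, c2=1.0) == [2.0, 6.0]
```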

  32. Examples of dataflow actors
  • SDF (Static Dataflow): fixed number of input and output tokens, e.g. an FFT actor consuming 1024 tokens and producing 1024, or an adder with rates 1 and 1
  • BDF (Boolean Dataflow): a control token determines the number of consumed and produced tokens, e.g. the select and merge actors with T/F control inputs

  33. Static scheduling of DF
  • Key property of DF networks: output sequences do not depend on the firing sequence of actors (marked graphs)
  • SDF networks can be statically scheduled at compile-time
    • execute an actor when it is known to be fireable
    • no overhead due to sequencing of concurrency
    • static buffer sizing
  • Different schedules yield different
    • code size
    • buffer size
    • pipeline utilization

  34. Balance equations
  • Number of produced tokens must equal number of consumed tokens on every edge (channel)
  • Repetitions (or firing) vector v of schedule S: number of firings of each actor in S
  • v(A)·np = v(B)·nc must be satisfied for each edge
  [Diagram: an edge from A to B producing np tokens per firing of A and consuming nc per firing of B.]

  35. Balance equations
  [Graph: edge A→B with rates 3:1, edge B→C with rates 1:1, and two edges A→C with rates 2:1.]
  Balance for each edge:
  • 3·v(A) - v(B) = 0
  • v(B) - v(C) = 0
  • 2·v(A) - v(C) = 0
  • 2·v(A) - v(C) = 0

  36. Balance equations
      | 3 -1  0 |
  M = | 0  1 -1 |
      | 2  0 -1 |
      | 2  0 -1 |
  • M·v = 0 iff S is periodic
  • Full rank (as in this case)
    • no non-zero solution
    • no periodic schedule (too many tokens accumulate on A→B or B→C)

  37. Balance equations
      | 2 -1  0 |
  M = | 0  1 -1 |
      | 2  0 -1 |
      | 2  0 -1 |
  • Non-full rank: infinite solutions exist (linear space of dimension 1)
  • Any multiple of v = |1 2 2|T satisfies the balance equations
  • ABCBC and ABBCC are minimal valid schedules
  • ABABBCBCCC is a non-minimal valid schedule

  38. Static SDF scheduling
  • Main SDF scheduling theorem (Lee '86): a connected SDF graph with n actors has a periodic schedule iff its topology matrix M has rank n-1
  • If M has rank n-1, then there exists a unique smallest integer solution v to M·v = 0
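For a graph as small as the slide-37 example, the smallest repetitions vector can even be found by exhaustive search. A sketch under that example's topology matrix; a real scheduler solves M·v = 0 via the rank test instead of enumerating:

```python
from itertools import product

# Topology matrix of the slide-37 example (rows = edges, columns = A, B, C).
M = [[2, -1,  0],
     [0,  1, -1],
     [2,  0, -1],
     [2,  0, -1]]

def balanced(v):
    # v satisfies the balance equations iff M*v = 0, row by row.
    return all(sum(m * x for m, x in zip(row, v)) == 0 for row in M)

def smallest_repetitions(bound=10):
    # Brute force over small positive integer vectors; since the solution
    # space has dimension 1, the minimal vector generates all others.
    best = None
    for v in product(range(1, bound + 1), repeat=len(M[0])):
        if balanced(v) and (best is None or sum(v) < sum(best)):
            best = v
    return best

assert smallest_repetitions() == (1, 2, 2)   # matches v = |1 2 2|T
```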

  39. Deadlock
  • If no actor is fireable in a state before reaching the initial state again, no valid schedule exists (Lee '86)
  [Graph: actors A, B, C with port rates 1, 1, 2, 2, 1, 1; the attempted schedule (2A) B C ends in deadlock.]
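The deadlock test on this slide can be sketched as a simulation: fire fireable actors until the initial marking recurs (a periodic schedule exists) or nothing can fire (deadlock). The two-actor cycle below is a made-up example, not the graph on the slide, chosen because a token-free cycle deadlocks immediately:

```python
# (producer, consumer, tokens produced per firing, tokens consumed per firing)
edges = [("A", "B", 1, 1), ("B", "A", 1, 1)]
initial = {("A", "B"): 0, ("B", "A"): 0}      # no initial tokens on the cycle

def deadlocks(edges, marking, max_steps=1000):
    m = dict(marking)
    actors = {e[0] for e in edges} | {e[1] for e in edges}
    for _ in range(max_steps):
        fireable = [a for a in sorted(actors)
                    if all(m[(p, c)] >= nc for p, c, np_, nc in edges if c == a)]
        if not fireable:
            return True    # nothing can fire: deadlock
        a = fireable[0]
        for p, c, np_, nc in edges:
            if c == a:
                m[(p, c)] -= nc   # consume from input channels
            if p == a:
                m[(p, c)] += np_  # produce on output channels
        if m == marking:
            return False   # initial marking reached again: valid schedule
    return False

assert deadlocks(edges, initial) is True
```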


  45. Code size minimization
  • Assumptions (based on DSP architecture):
    • subroutine calls are expensive
    • fixed iteration loops are cheap (“zero-overhead loops”)
  • Global optimum: single appearance schedule, e.g. ABCBC = A (2BC), ABBCC = A (2B) (2C)
    • may or may not exist for an SDF graph...
  • Buffer minimization relative to single appearance schedules (Bhattacharyya '94, Lauwereins '96, Murthy '97)

  46. Buffer size minimization
  • Assumption: no buffer sharing
  • Example: v = |100 100 10 1|T  [graph: actors A and B feed C, which feeds D]
  • Valid SAS: (100 A) (100 B) (10 C) D
    • requires 210 units of buffer area
  • Better (factored) SAS: (10 (10 A) (10 B) C) D
    • requires 30 units of buffer area, but...
    • requires 21 loop initiations per period (instead of 3)
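The 210-versus-30 comparison can be reproduced by simulating token counts over an expanded schedule. A sketch in which the edge rates (A→C, B→C and C→D, each producing 1 and consuming 10) are reconstructed from the slide's numbers and should be treated as an assumption:

```python
# (producer, consumer) -> (tokens produced per firing, tokens consumed per firing)
edges = {("A", "C"): (1, 10), ("B", "C"): (1, 10), ("C", "D"): (1, 10)}

def buffer_area(firing_sequence):
    # Total buffer area = sum over edges of the peak token count,
    # assuming no buffer sharing (as on the slide).
    tokens = {e: 0 for e in edges}
    peak = {e: 0 for e in edges}
    for actor in firing_sequence:
        for (p, c), (prod, cons) in edges.items():
            if c == actor:
                tokens[(p, c)] -= cons
            if p == actor:
                tokens[(p, c)] += prod
                peak[(p, c)] = max(peak[(p, c)], tokens[(p, c)])
    return sum(peak.values())

flat   = ["A"] * 100 + ["B"] * 100 + ["C"] * 10 + ["D"]   # (100A)(100B)(10C)D
nested = (["A"] * 10 + ["B"] * 10 + ["C"]) * 10 + ["D"]   # (10 (10A)(10B) C) D

assert buffer_area(flat) == 210
assert buffer_area(nested) == 30
```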

  47. Scheduling more powerful DF
  • SDF is limited in modeling power
  • More general DF is too powerful
    • non-static DF is Turing-complete (Buck '93)
    • bounded-memory scheduling is not always possible
  • Boolean Dataflow: Quasi-Static Scheduling of special “patterns”
    • if-then-else, repeat-until, do-while
  • Dynamic Dataflow: run-time scheduling
    • may run out of memory or deadlock at run time
  • Kahn Process Networks: quasi-static scheduling using Petri nets
    • conservative: a schedulable network may be declared unschedulable

  48. Outline
  • The problem
    • Synthesis of concurrent specifications
    • Compiler optimizations across processes
  • Previous work: dataflow networks
    • Static scheduling of SDF networks
    • Code and data size optimization
  • Quasi-Static Scheduling of process networks
    • Petri net representation of process networks
    • Scheduling and code generation
  • Open problems

  49. The problem
  • Given: a network of Kahn processes
    • Kahn process: sequential function + ports
    • communication: port-based, point-to-point, uni-directional, multi-rate
  • Find: a single sequential task
    • functionally equivalent to the original network (modulo concurrency)
    • threads driven by input stimuli (no OS intervention)

  50. Event-driven threads
  Init()   (on Reset)
    last = 0;
  Tsensor()
    sample = READ(TSENSOR);
    if (|sample - last| > DIF) {
      last = sample;
      if (sample > TFIRE) WRITE(ALARM-on, 10);
      else if (sample > TMAX) WRITE(AC-on, sample-TMAX);
    }
  Hsensor()
    h = READ(HSENSOR);
    if (h > MAX) WRITE(DRYER-on, 5);
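The event-driven threads above can be sketched as ordinary callbacks dispatched per stimulus, with the channel state reduced to a plain variable. A Python sketch: the threshold constants and the emit() callback are made up, and HMAX stands in for the slide's MAX:

```python
DIF, TMAX, TFIRE, HMAX = 1.0, 25.0, 50.0, 80.0   # made-up threshold values

class EnvController:
    """Single sequential task: one handler per input stimulus, no OS."""

    def __init__(self, emit):
        self.emit = emit      # actuator output channel (hypothetical callback)
        self.last = 0.0       # was TEMP-FILTER's local state; set by Init()

    def tsensor(self, sample):
        # Thread for a TSENSOR stimulus: filter and controller logic merged.
        if abs(sample - self.last) > DIF:
            self.last = sample
            if sample > TFIRE:
                self.emit("ALARM-on", 10)
            elif sample > TMAX:
                self.emit("AC-on", sample - TMAX)

    def hsensor(self, h):
        # Thread for an HSENSOR stimulus.
        if h > HMAX:
            self.emit("DRYER-on", 5)

events = []
ctl = EnvController(lambda channel, value: events.append((channel, value)))
ctl.tsensor(30.0)
assert events == [("AC-on", 5.0)]
```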
