
HOPES : Embedded Software Development Environment for MPSoC

HOPES : Embedded Software Development Environment for MPSoC. Feb. 14, 2006 Soonhoi Ha Seoul National University. Contents. Introduction Proposed Design Flow Key Techniques Status Conclusion. Deep Submicron Era. Increased chip density


Presentation Transcript


  1. HOPES: Embedded Software Development Environment for MPSoC Feb. 14, 2006 Soonhoi Ha Seoul National University

  2. Contents • Introduction • Proposed Design Flow • Key Techniques • Status • Conclusion

  3. Deep Submicron Era • Increased chip density • MPSoC (Multiprocessor System on Chip), NoC (Network on Chip) • Increased chip density • Increased NRE cost • Increased manufacturing cost • Platform based design • Platform: common (HW/SW) denominator over multiple applications • Pre-built and verified (HW/SW) architecture • SW design and system verification are major challenges

  4. SoC (System-on-a-Chip) • Assembly of “prefabricated component” • Maximize VC(IP) reuse: over 90% • New economics: fast and correct design > optimum design • Design and Verification at the system level • interface between VCs • SW becomes more important [Figure: SoC block diagram; labels: software, up, up, memory, graphics, video unit, DSP, custom, coms]

  5. HOPES Project • Project • Title: Development of Embedded software design and verification techniques for MPSoC • Period: 2005.3.1 - 2008.2.29 • Amount: $4,200,000 • Sponsor: Korean Ministry of Information and Communication • Mission • Develop an integrated embedded software development environment based on model-based programming techniques and multi-stage validation techniques for MPSoC (Multi-Processor System on Chip)

  6. Motivation • Current SoC-related projects (in Korea) focus on hardware design and verification • Software design on MPSoC • Embedded software with timing and resource constraints • Parallel programming for heterogeneous multiprocessors • Fast virtual prototyping

  7. Current Practice (in Korea) • Virtual prototyping • Coware ConvergenSC, ARM(Axys) MaxSim • Manual software and hardware design • TLM (transaction level modeling) simulation: 100K inst./sec • Model-based programming • UML based tool: (ex) Telelogic TAU • Limited capability: documentation, code structure. • No tool is available for parallel programming • Manual programming

  8. HOPES Proposal KPN UML Model-based Programming (PeaCE model) Common Intermediate Code (CIC) Generic API CIC (w/ API) translator Static C code analysis API lib. Per-processor code 2 SW Platform 1 Perf. & power estimation Virtual prototyping Target HW platform Processor ISS

  9. Key Techniques • SW design techniques • Model-based specification • Partitioning and automatic code generation from specification • Parallel programming (task parallelism, data parallelism) • Generic APIs (independent of OS and target architecture) • SW verification techniques: 3-phase verification • At specification level: static analysis of program specification and functional simulation • After code generation: static analysis of C code • At simulation time: debugging on virtual prototyping environment

  10. Contents • Introduction • Proposed Design Flow • Key Techniques • Status • Conclusion

  11. Proposed design flow • Model-based programming • PeaCE model (dataflow + FSM + task model) • ESUML (embedded system UML) model • CIC (Common Intermediate Code) • OpenMP pragma + generic API • Static Analysis • Buffer overrun, memory leak, null dereference, stack size • Virtual prototyping • Performance and power estimation • with source-level debugging capability

  12. Software Module Interface results GUI SW Platform ESUML_Modeler PeaCE_Modeler Document generation Block library DB PeaCE_Partitioner ESUML_Verifier ESUML_Partitioner Common Intermediate Code (CIC): task_graph.cic Generic parallel API Generic API proc. code with generic API parallel API HW component DB OpenMP translator OS API Translator Comm. API Lib. Code Synthesizer C-code static analyzer {processorX.c} proc. code with OS proc. code without OS Verification OS Power analysis module ARM processor simulation Performance analysis module MPI Library virtual prototype system MPSoC {prototype system}

  13. {Application_graph.xml} user command App_classDiagram.xml App_seqDiagram.xml Action semantics results GUI SW Platform Application_graph.xml user command ESUML_Modeler PeaCE_Modeler User_command sched.xml Document gen. sched.xml Task_graph.xml Block perf. DB PeaCE_Partitioner ESUML_Verifier TimeCost.xml ESUML_Partitioner archi.xml Common Intermediate Code (CIC): task_graph.cic

  14. Common Intermediate Code (CIC): task_graph.cic openMP pragma Generic API proc. Code(generic API) parallel API HW component DB OpenMP translator OS API Translator comm API Lib. Code Synthesizer C-code static analyzer {processorX.c} code w/ OS code w/o OS Verification OS processorX.elf Power estimation ARM ISS Perf. estimation MPI Library virtual prototype system Function symbol table Cycle-level execution trace MPSoC {prototype system} processor parameter

  15. Demo 1. HOPES Design Flow • Model-based programming (ex: divx player) • PeaCE Model • Dataflow + (FSM) + task model • Fault simulation • sample rate inconsistency (Y:U:V = 3:1:1) • deadlock error • Functional simulation • Host machine

  16. Demo 1. HOPES Design Flow (2) • C code static analysis • Buffer overrun error detection • Fault simulation: wrong array size in “AviFileReader task” • Runtime verification • Simulation with Realview ARM debugger • Fault simulation • Logical error insertion

  17. Demo 2. HOPES (single task) • Single task example: H.263 decoder • C-code static analysis • Performance and power estimation • Custom-designed ARM Simulator • Per-function estimation • Runtime verification • Source level debugging • Fault simulation : logical error insertion

  18. HOPES GUI Xml files (topology, property) perf. and power estimation PeaCE code gen. - CIC(xml) - CIC(.cic) ARM ISS C code static analysis CIC code with Generic API, OpenMP pragma Executable With target API OpenMP translator CIC code With Generic API CIC xml Generic API transformer Code Synthesizer - C code, Executable C codes With Generic API HOPES flow for single processor Manual for now

  19. Demo 3. UML 2.0-based Modeling • ESUML : Embedded Software modeling with UML 2.0 feedback Target MPSoC Embedded S/W Needs Analysis & Design Verification & Validation Code Gen. & Optimization ESUML Simulator ESUML Static analyzer ESUML Modeler

  20. ESUML • Model – Driven Application Development Requirements Capturing PIM Modeling Structural model Behavioral model Use-Case model Model Verification Static Analysis Model Simulation PSM Modeling H/W architecture model Code Generation Physical model CIC code Java code PIM : Platform Independent Model PSM : Platform Specific Model

  21. ESUML Characteristics • Light-Weight Development Methodology Requirement Capturing PIM and PSM modeling Analysis Code Generation Real-Time Embedded Software Interaction-based Behavior Modeling Large-Scale System Concepts Process ESUML Event-Based Modeling Hierarchical Decomposition Modeling Guidelines and Tips Use Case Diagram Class Diagram Interaction Overview Diagram Sequence Diagram Hardware Architecture Diagram Rules Notations

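  22. ESUML Characteristics • Hierarchical Modeling • Interaction-based Behavior Modeling [Figure: a use case diagram (UCD) with use cases 1-3 and their scenarios; each use case has an interaction overview diagram (IOD of UC1, IOD of UC2, IOD of UC3), combined into global IODs that give a global behavioral view; IOD nodes X, Y, Z are refined by sequence diagrams (SDs of X, SDs of Y, SDs of Z)]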

  23. PSM Modeling (To-Be) • PSM Modeling : Physical Model Automatically Recommended Partition + Modeler Enforced Partition

  24. ESUML Code Gen. (To-Be) • Platform Independent Code Generation [Figure: PIM/PSM Models feed a CIC Code Generator; output languages shown: C, JAVA, C++]

  25. ESUML Verification and Validation • Rule-based consistency analyzer • Intra-model, inter-model, action-language [Figure label: 64 Rules]

  26. ESUML Simulator Simulation result MicroOven.XMI File (4) output (1) input (3) output CFG Generation Simulation Engine (5) save (2) input SimMicroOven.txt (5) modify model (if needed) Define Scenario Feedback to Modeler

  27. Contents • Introduction • Proposed Design Flow • Key Techniques • PeaCE Modeling • CIC & Generic API • Static Analysis • Status • Conclusion

  28. PeaCE Model • Top model: Task model • Computation task model: Dataflow model • Extended SDF model: SPDF (Synchronous Piggybacked Data Flow) and FRDF (Fractional Rate Data Flow) • Control task model: FSM model • Hierarchical and concurrent FSM model: fFSM (flexible Finite State Machine)

  29. PeaCE Project • PeaCE • Ptolemy extension as a Codesign Environment, based on Ptolemy classic • open-source research platform • Officially released at DAC 2005 (version 1.0) • PeaCE home page • http://peace.snu.ac.kr/research/peace

  30. Seamless HW/SW Codesign Flow [Figure: numbered design-flow diagram; labels: Block Lib., Architecture, Func., Algorithm spec. templates, Simulation, Graph Analysis, Profiling, TimeCost.xml, clustered.xml, DB, PE Selection, Partition & mapping, sched.xml, PE Code generation + OS API, VHDL-code, C-code, Comm. arch DSE, Archi.xml, IF Code generation, Coverification & Prototyping]

  31. Divx Player: Top Model

  32. H.264 Decoder if-then-else construct

  33. SPDF Model • Basic SDF Model • A node represents a function block (ex: FIR, DCT) • A block is fireable (executable) when it has received the required number of input samples • Statically scheduled at compile time, which exposes sample rate inconsistency and deadlock conditions • Our extensions • Generic message type data -> code generation only • FRDF: fractional rate data flow • SPDF: synchronous piggybacked dataflow
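The sample rate consistency check mentioned on this slide can be sketched with the standard SDF balance equations: for every arc, firings of the source times samples produced must equal firings of the sink times samples consumed. The code below is a minimal illustration, not PeaCE code. The names `sdf_edge` and `sdf_repetitions` are hypothetical, the graph is assumed to be connected, and deadlock detection is not covered.

```c
#include <assert.h>

#define MAX_NODES 16

/* One SDF arc: src produces `prod` samples per firing, dst consumes `cons`. */
typedef struct { int src, dst, prod, cons; } sdf_edge;

static long gcd_l(long a, long b)
{
    while (b) { long t = a % b; a = b; b = t; }
    return a;
}

/* Solves q[src] * prod == q[dst] * cons for every arc of a connected graph.
 * Returns 1 and fills q[] with the smallest positive integer firing counts,
 * or 0 if the rates are inconsistent or some node is unreachable. */
int sdf_repetitions(int n_nodes, const sdf_edge *e, int n_edges, long q[])
{
    long num[MAX_NODES] = {0}, den[MAX_NODES];
    assert(n_nodes > 0 && n_nodes <= MAX_NODES);
    for (int i = 0; i < n_nodes; i++) den[i] = 1;
    num[0] = 1;                      /* node 0 fires at rate 1/1; num == 0 means unknown */

    for (int changed = 1; changed; ) {
        changed = 0;
        for (int k = 0; k < n_edges; k++) {
            int s = e[k].src, d = e[k].dst;
            if (num[s] && num[d]) {  /* both rates known: verify the balance equation */
                if (num[s] * e[k].prod * den[d] != num[d] * e[k].cons * den[s])
                    return 0;
            } else if (num[s]) {     /* propagate the rate from source to sink */
                long n = num[s] * e[k].prod, m = den[s] * e[k].cons, g = gcd_l(n, m);
                num[d] = n / g; den[d] = m / g; changed = 1;
            } else if (num[d]) {     /* propagate the rate from sink to source */
                long n = num[d] * e[k].cons, m = den[d] * e[k].prod, g = gcd_l(n, m);
                num[s] = n / g; den[s] = m / g; changed = 1;
            }
        }
    }

    long l = 1, g = 0;
    for (int i = 0; i < n_nodes; i++) {
        if (!num[i]) return 0;       /* node not connected to node 0 */
        l = l / gcd_l(l, den[i]) * den[i];
    }
    for (int i = 0; i < n_nodes; i++) { q[i] = num[i] * (l / den[i]); g = gcd_l(g, q[i]); }
    for (int i = 0; i < n_nodes; i++) q[i] /= g;
    return 1;
}
```

For a chain A -> B -> C with rates (2,3) and (1,2), the firing counts come out as 3, 2, 1. Adding an arc A -> C with rates (1,1) makes the equations unsolvable, which is the kind of inconsistency a compile-time check would report.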

  34. Fractional Rate Dataflow • The macro block is 1/99 of the video frame in its size. • For each execution of the ME node, it consumes a macro-block from the input frame, and produces a macro-block output. • Schedule : 99(ME,EN) [Figure: ME and EN nodes; fractional rates of 1/99 on the frame-side ports, rate 1 between ME and EN, macro block size 16x16]

  35. FRDF (Fractional Rate DF) [Figure: dataflow graph with nodes Read From Device, Motion Estimation, Macro Block Encoding, Variable Length Coding, Macro Block Decoding, Motion Compensation, Write To Device; arc rates of 1 and 1/99] • Buffer size: Reference Code 361KB, SDF 686KB, FRDF 225KB • No. of frame-type data: Reference Code 5, SDF 8, FRDF 3

  36. Synchronous Piggybacked Dataflow • Global data structure with limited access • Each data sample is piggybacked with an SU request • A block updates its internal parameters depending on applicability of SU request [Figure: global state table (state name, 1st iteration, 2nd iteration; “gain”: 10, 20, ….); an SU block issues “gain”= 10 and “gain”= 5 requests, which a PD block piggybacks on the data flowing through blocks A, B, and C]
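The piggybacking idea on this slide can be pictured as a data sample that carries an optional state-update (SU) request, which a block applies only if the request names one of its own parameters. The sketch below rests on that reading. All type and function names are hypothetical, and the actual SPDF semantics in PeaCE may differ.

```c
#include <assert.h>
#include <string.h>

/* Optional state-update request travelling with a data sample (hypothetical layout). */
typedef struct {
    int  has_su;      /* 1 if a request is attached */
    char name[8];     /* parameter the request targets, e.g. "gain" */
    int  value;       /* new parameter value */
} su_request;

typedef struct {
    int        data;
    su_request su;
} spdf_sample;

/* A block with one internal parameter. */
typedef struct { int gain; } gain_block;

/* Fires the block once: applies the SU request if it is applicable to this
 * block, then processes the data with the (possibly updated) parameter. */
int gain_block_fire(gain_block *b, const spdf_sample *in)
{
    assert(b && in);
    if (in->su.has_su && strcmp(in->su.name, "gain") == 0)
        b->gain = in->su.value;
    return in->data * b->gain;
}
```

A sample with no request, or with a request for a different parameter, leaves `gain` unchanged, so the update takes effect exactly at the sample it travels with.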

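  37. SPDF (Synchronous Piggybacked DF) • Piggybacking the control variable • Special super-block definition for the body of DC [Figure: a graph with inputs x and c, blocks A and B, and a MUX (select 1/0) producing y, shown next to a variant that uses a piggyback (PB) block]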

  38. fFSM (flexible FSM) Model • Hierarchical FSM • Concurrent FSM [Figure: state-transition diagrams over states A, B, C, D and product states AC, BC, AD, BD, with transition labels such as a, b, c, b/v, c/u, ac, bc, ac’, bc’, a’c, b’c, b’c/u]

  39. start suspend stop 1 text exit fFSM + SPDF suspend==1 H.263 Decoder divxread divx run divx suspend AVI Reader MP3 Decoder script in divxrun {run divx} script in divsuspend {suspend divx} exit start==1 divx scripts in divx {deliver ui filename divxread filename} {run divx} stop==1/ tostart=1 divx text==4 script in divxstop {stop divx} exitDivx ==1 divxstop start script in exit state {get divx exit exitDivx} tostart==1

  40. Task Model • Task execution semantics • periodic, sporadic, functional • Port properties • data rate : static or dynamic • data size : static or variable • port type : queue or buffer Execution Type periodic port semantic H.263 decoder Basic Task Block (SDF or FSM) AviReader MP3 decoder Task Wrapper

  41. CIC Code Generation • Application tasks • Task in task-level specification model -> task_name.cic • Generic API • Partitioning considering data parallelism • openMP programming
      void h263decoder_go(void) {
          ...
          l = MQ_RECEIVE("mq0", (char *)(ld_106->rdbfr), 2048);
          ...
      #pragma omp parallel for
          for (i = 0; i < 99; i++) {
              // thread_main()
              ....
          }
          // display the decoded frame
          dither(frame);
      }
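The data-parallel loop on this slide can be tried in isolation. The sketch below keeps the slide's loop bound of 99 macroblocks and the `#pragma omp parallel for` directive. `decode_macroblock` and `decode_frame` are hypothetical stand-ins for the elided decoder code. Without OpenMP support enabled, the pragma is ignored and the loop runs sequentially with the same result.

```c
#include <assert.h>

#define MB_PER_FRAME 99   /* loop bound taken from the slide */

/* Hypothetical stand-in for per-macroblock work; each call is independent. */
static int decode_macroblock(int mb_index)
{
    return mb_index * 2;
}

/* Processes all macroblocks of one frame. Iterations write disjoint
 * elements of out[], so they can safely run in parallel. */
void decode_frame(int out[MB_PER_FRAME])
{
    int i;
    assert(out);
#pragma omp parallel for
    for (i = 0; i < MB_PER_FRAME; i++)
        out[i] = decode_macroblock(i);
}
```

The loop is a candidate for `parallel for` only because no iteration reads data written by another. A decoder with inter-macroblock dependencies would need a different partitioning.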

  42. CIC Format Architecture Task Code Hardware Kernel code _init() _go() _wrapup() Constraints Structure Generic API Parallel API

  43. DivX player CIC xml (Hardware) <?xml version="1.0" ?> <CIC_XML> <hardware> <processor name="arm926ej-s0"> <index>0</index> <localMem name="lmap0"> <addr>0x0</addr> <size>0x10000</size> <!-- 64KB --> </localMem> <sharedMem name="shmap0"> <addr>0x10000</addr> <size>0x40000</size> <!-- 256KB --> <sharedWith>1</sharedWith> </sharedMem> <OS> <support>TRUE</support> </OS> </processor> <processor name="arm926ej-s1"> <index>0</index> <localMem name="lmap1"> <addr>0x0</addr> <size>0x20000</size> <!-- 128KB --> </localMem> <sharedMem name="shmap1"> <addr>0x20000</addr> <size>0x40000</size> <!-- 256KB --> <sharedWith>0</sharedWith> </sharedMem> <OS> <support>TRUE</support> </OS> </processor> </hardware>

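  44. Hardware Architectures [Figure: two example platforms, one with processors P1, P2 and one with P1, P2, P3, each with local memory (LocalMem, TCM) and shared memories SM1, SM2. Address maps: the first processor has a non-shared region starting at 0x0 (shared=NO) and a shared region from 0x10000 to 0x50000 (shared=YES); the second has a non-shared region starting at 0x0 and a shared region from 0x20000 to 0x60000]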
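The address maps of the two processors follow directly from the base and size values in the hardware section of the CIC xml: 0x10000 + 0x40000 = 0x50000 and 0x20000 + 0x40000 = 0x60000. A small sketch makes the arithmetic checkable. The struct and function names are hypothetical; only the numeric values come from the slides.

```c
#include <assert.h>

typedef enum { REGION_LOCAL, REGION_SHARED, REGION_UNMAPPED } mem_region;

typedef struct {
    unsigned long local_base, local_size;
    unsigned long shared_base, shared_size;
} proc_mem_map;

/* Values copied from the <localMem>/<sharedMem> entries of the two processors. */
const proc_mem_map proc0_map = { 0x0, 0x10000, 0x10000, 0x40000 };
const proc_mem_map proc1_map = { 0x0, 0x20000, 0x20000, 0x40000 };

/* Reports which region of a processor's map an address falls into. */
mem_region classify_addr(const proc_mem_map *m, unsigned long addr)
{
    assert(m);
    if (addr >= m->local_base && addr - m->local_base < m->local_size)
        return REGION_LOCAL;
    if (addr >= m->shared_base && addr - m->shared_base < m->shared_size)
        return REGION_SHARED;
    return REGION_UNMAPPED;
}
```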

  45. DivX player CIC xml (Constraint) <constraints> <memory>16MB</memory>   <power>50mWatt</power> <mode name="default"> <task name="AviReaderI0">   <period>120000</period> <deadline>120000</deadline>   <priority>0</priority> <subtask name="arm926ej-s0">   <execTime>186</execTime>   </subtask>   </task> <task name="H263FRDivxI3">   <period>120000</period> <deadline>120000</deadline>   <priority>0</priority> <subtask name="arm926ej-s0">   <execTime>5035</execTime>   </subtask>   </task> <task name="MADStreamI5">   <period>120000</period> <deadline>120000</deadline> <priority>0</priority> <subtask name="arm926ej-s0">   <execTime>13</execTime>   </subtask>   </task>   </mode> </constraints>
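One use of these constraints is a quick processor load estimate. Assuming `execTime` and `period` are expressed in the same time unit (the slide does not state the unit), the utilization of arm926ej-s0 is (186 + 5035 + 13) / 120000, roughly 0.044. The sketch below computes that sum. The type and function names are hypothetical, and a utilization of at most 1 is only a necessary condition for schedulability, not a guarantee.

```c
#include <assert.h>

/* Per-task timing constraint, mirroring <period> and <execTime>. */
typedef struct {
    const char *name;
    long period;
    long exec_time;
} task_constraint;

/* Values copied from the constraint section; all three tasks map to arm926ej-s0. */
const task_constraint divx_tasks[] = {
    { "AviReaderI0",  120000, 186  },
    { "H263FRDivxI3", 120000, 5035 },
    { "MADStreamI5",  120000, 13   },
};

/* Sums exec_time / period over all tasks (assumes matching time units). */
double processor_utilization(const task_constraint *t, int n)
{
    double u = 0.0;
    assert(t && n >= 0);
    for (int i = 0; i < n; i++) {
        assert(t[i].period > 0);
        u += (double)t[i].exec_time / (double)t[i].period;
    }
    return u;
}
```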

  46. DivX player CIC xml (Structure) <structure> <mode name="default"> <task name="AviReaderI0"> <subtask name="arm926ej-s0">   <procMap>0</procMap>   <fileName>AviReaderI0_arm926ej_s0.cic</fileName>   </subtask>   </task> <task name="H263FRDivxI3"> <subtask name="arm926ej-s0">   <procMap>0</procMap>   <fileName>H263FRDivxI3_arm926ej_s0.cic</fileName>   </subtask>   </task> <task name="MADStreamI5"> <subtask name="arm926ej-s0">   <procMap>0</procMap>   <fileName>MADStreamI5_arm926ej_s0.cic</fileName>   </subtask>   </task>  </mode> <queue>   <name>mq0</name>   <src>AviReaderI0</src>   <dst>H263FRDivxI3</dst>   <size>30000</size>   </queue> <queue>   <name>mq1</name>   <src>AviReaderI0</src>   <dst>MADStreamI5</dst>   <size>30000</size>   </queue> </structure> </CIC_XML>

  47. Task Code: Structure • Task kernel code • _init(): before main loop • _go(): in the main loop • _wrapup(): after main loop (example) h263dec.cic // header file // global declaration and definition // procedure definition h263dec_init() { … } h263dec_go() { … } h263dec_wrapup() { … }
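The three entry points imply a simple driver: call `_init()` once, `_go()` on every iteration of the main loop, and `_wrapup()` once at the end. The sketch below shows one way such a wrapper could look. `cic_task`, `cic_run_task`, and the demo task are hypothetical, and the `void (void)` signatures are an assumption based on the `h263decoder_go` example shown earlier in the deck.

```c
#include <assert.h>

/* The three kernel entry points of one task. */
typedef struct {
    void (*init)(void);
    void (*go)(void);
    void (*wrapup)(void);
} cic_task;

/* Runs a task: init before the main loop, go inside it, wrapup after it. */
void cic_run_task(const cic_task *t, int iterations)
{
    assert(t && t->init && t->go && t->wrapup);
    t->init();
    for (int i = 0; i < iterations; i++)
        t->go();
    t->wrapup();
}

/* Hypothetical demo task that only records how often each hook ran. */
int demo_inits, demo_gos, demo_wrapups;
void demo_init(void)   { demo_inits++; }
void demo_go(void)     { assert(demo_inits == 1); demo_gos++; }
void demo_wrapup(void) { demo_wrapups++; }
const cic_task demo_task = { demo_init, demo_go, demo_wrapup };
```

A real wrapper would decide when to call `_go()` from the task's execution type (periodic, sporadic, or functional) rather than from a fixed iteration count.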

  48. Generic API • Generic API • OS-independent API • API translator to OS-specific APIs • Generic API Definition • Abstraction from IEEE POSIX 1003.1-2004 • OS services: • Process, memory, synchronization, file system, networking, real-time, thread, I/O, timer, etc. • C library functions based on use frequency • stdio.h, stdlib.h, time.h, signal.h

  49. Generic API Definition

  50. OS API Translator • OS API Translator CIC’s Task Code APIs, Patterns, and Rules register modify OS API Translator Translation pattern and rules Code Analysis • Pattern mapping • - 1-to-1, 1-to-m, n-to-m • - initialize/finalize • - acquire/release • parameters • - type, number, value • Variable names • - naming of additional variables Generic APIs Transformation Rules Symbol Table Construction POSIX APIs Non-POSIX APIs Transformation OS-Specific C Code
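The pattern-mapping step can be pictured as a rule table that the translator consults for each generic API call found in the task code. The sketch below covers only the lookup for name mappings and is not the HOPES translator. `MQ_RECEIVE` appears in the deck and `mq_receive` is the POSIX message-queue call, but pairing them, the `MQ_SEND` entry, and all type and function names are assumptions made for illustration. Parameter rewriting and the n-to-m patterns are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kinds of pattern mapping named on the slide (only two are modelled here). */
typedef enum { MAP_ONE_TO_ONE, MAP_ONE_TO_MANY } map_kind;

typedef struct {
    const char *generic;   /* generic API name used in CIC task code */
    const char *target;    /* OS-specific call emitted by the translator */
    map_kind    kind;
} api_rule;

/* Illustrative rule set for a POSIX target. */
static const api_rule posix_rules[] = {
    { "MQ_RECEIVE", "mq_receive", MAP_ONE_TO_ONE },
    { "MQ_SEND",    "mq_send",    MAP_ONE_TO_ONE },  /* hypothetical generic name */
};

/* Returns the rule registered for a generic API name, or NULL if none exists. */
const api_rule *find_rule(const char *generic_name)
{
    assert(generic_name);
    for (size_t i = 0; i < sizeof posix_rules / sizeof posix_rules[0]; i++)
        if (strcmp(posix_rules[i].generic, generic_name) == 0)
            return &posix_rules[i];
    return NULL;
}
```

Keeping the rules in data rather than in code matches the slide's "register" and "modify" arrows: supporting another OS would mean registering a different table, not changing the lookup.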
