With Recommendations for a Multi-Core Power.org TSC
Agenda Content • Environment: • Industry Trends • Embedded Multi-Core Enablement Layers • Multi-Core Ecosystem • Embedded Early Adoption Segments • Background Information • Multi-Core Design: Three Large Questions • Key Multi-Core enablement questions • Embedded Multi-Core Challenges • Areas of Programming/Programming Categories/ • Types of Parallelism • Industry Standards for Embedded Multi-Core • Requirements and Recommendations • Requirements for Embedded Multi-Core • New Programming Models as answer • Known Multi-Core Issues • Summary of Requirements for Multi-Core Enablement • Conclusions/Recommendations/TSC Proposal • Summary
Environment Industry Software Trends for Embedded • Overall Embedded Development Cost: SW Portion Dramatically Growing • 2007: $25.8 billion • 2008: $33.3 billion • 2009: $41.7 billion • 2010: $51.6 billion • 2011: $61.9 billion • Embedded Processors going Multi-Core: see diagram • $372M in 2007 growing to $1.33B in 2009 (VDC Estimates) • Multi-Core issues are generic, but especially visible in Embedded, because: • Standardized software enablement is lagging behind • Embedded systems are heterogeneous and require a small footprint • Embedded Software Readiness scores 2.06 out of 5 (see diagram) • Issues will grow as the number of cores increases • 55% of embedded developers are using or will be using multi-core in the next 12 months (Source: VDC) • Performance improvements required; Multi-Core Benefits: • Lower power consumption while improving overall throughput
Environment Embedded Processors going Multi-Core • [Diagram: embedded multi-core CPU revenues, with $372.1M and $1.33B marked] • Source: A White Paper on: MULTI-CORE COMPUTING IN EMBEDDED APPLICATIONS, VDC 2007 • The first multi-core CPUs (dual-core) reached the embedded market in late 2006 • Multi-core CPU revenues from embedded in 2011 are projected to reach over 6 times 2007 multi-core revenues and over 44 times 2006 levels • Asia-Pacific is the main market and still growing
Environment Embedded Software Readiness scores a 2.06 out of 5 • Source: Multi-Core Enablement Panel, Power Architecture Developer Conference 2007
Environment Embedded Multi-Core Enablement Layers • New, Portable Applications/Converted Legacy Applications/Workflow management • Multi-Core Enablement Tools: Debugger, optimizing compilers, performance tracing, ... • Enabled for Multicore • Possibly new languages, adapting existing languages for multicore • DMA, pthreads, MP, streaming data management • “To thread or not to thread” • Adapting existing OSs/potentially new OSs • Possibly new memory system architectures • Mixture of heterogeneous OSs • Hypervisor/Firmware: Adapted for Multicore • Heterogeneous Multicore System: Types of Cores, Connection of Cores, …
Environment Embedded Multi-Core Enablement Layers • Same enablement layers as on the previous slide, here on top of Power Architecture Multi-Core as the hardware layer • Need agreements on appropriate interfaces to guarantee interoperability
Environment Multi-Core Ecosystem • Power Architecture Multi-Core, SoC and IP Providers • Application Developers • Standards Partners • Embedded System Designers and Developers • RTOS/Linux Providers • Middleware/Development Tool Providers • Interface Agreements, Interoperability Specifications
Environment Embedded Multi-Core Early Adoption Segments • Communication • Aerospace, Defense • Medical • Automotive • Consumer Space, smart phones, … • Networking, Network Processing • Digital Signal Processing • See Diagram
Environment Embedded Early Adoption Segments - Diagram • [Chart: Likelihood of Application Segments as Early Adopters of Multi-Core Processors (Average Rating for each Vertical Market on a Scale of 1 to 5, Where 1 is Least Likely and 5 is Most Likely); chart annotation: Simple data offloading] • Source: A White Paper on: MULTI-CORE COMPUTING IN EMBEDDED APPLICATIONS, VDC 2007
Background Information Three Major Areas of Multi-Core Design • Sizing Resources (HW Design) • Making decisions such as: • adding processors • keeping/increasing cache sizes, … • Connecting Cores (HW Design) • Programming Evolution/Enablement (Software)
Background Information Key Multi-Core Enablement Questions • What is the multi-core technology of the future, • looking ahead 5, 10, 15 years from now? • For SW: how many cores, heterogeneous or homogeneous, … • What are the leading solutions for multi-core debug? • What are the biggest stumbling blocks to making multi-core successful? • And what are their resolutions? • What is the state of the art in software virtual prototyping, • and where is it headed? • Is a new programming paradigm necessary to support parallelism? • Or can it be unleashed with tried-and-true sequential programming models? • To thread or not to thread? • How do I migrate my application to multi-core?
Background Information The Multi-Core Challenge for the Embedded Space • Software Migration to the multi-core environment • Embedded software is typically not multi-core/multi-thread aware • Embedded customers need to re-evaluate existing assets • “Tune” assets for multi-core manually • Development of new Software • Need well-defined APIs, libraries and programming models to make the best use of multi-core processors • The development environment (parallelization tools, debugger, performance profiler, compiler, etc.) is not ready for multi-core • See Software Readiness Scores • Debugging • Debug environments are not ready for multi-core • See Software Readiness Scores • Hardware Development • Assessing how many cores are needed to reach the target performance is getting harder because of the wide variety of applications • How do I optimize my solution for multi-core?
Background Information Types of Parallelism • Load Balancing/Workflow Optimization • embedded vs PC • Dual-Core vs Multi-Core • Multiple Types of Parallelism: technical background • thread-level parallelism (TLP): parallelization • negatives: overhead of spawning, communicating with and synchronizing threads • coarse-grained parallelism, looking across outer loops and large code segments • library and compiler support; explicit programming using • OpenMP (shared memory models) • MPI (distributed memory models) • automatic versus explicit (automatic parallelization focuses on FORTRAN because its aliasing rules are stronger than C's) • data-level parallelism (DLP): simdization (SIMD: single instruction, multiple data) • exploits fine-grained parallelism across innermost loops or local code segments • provided by compilers • How do I make my code fit a parallel environment? • Where can I find software stack modules that fit the parallel processing paradigm?
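As a rough illustration of the two kinds of parallelism on this slide, the sketch below puts an OpenMP work-sharing directive on the outer loop (coarse-grained TLP) and an OpenMP SIMD hint on the innermost loop (fine-grained DLP). The function name `saxpy_rows` and the row-major layout are assumptions made for the example, not something taken from the slides. The `restrict` qualifiers give the C compiler the no-aliasing guarantee that FORTRAN provides by default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[r][c] = a * x[r][c] + y[r][c] for a
 * rows-by-cols matrix stored row-major. x and y must not overlap. */
static void saxpy_rows(size_t rows, size_t cols, float a,
                       const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* Thread-level parallelism: rows are independent, so the outer
     * loop can be split across threads. Without OpenMP support the
     * pragma is ignored and the loop runs sequentially. */
    #pragma omp parallel for
    for (long r = 0; r < (long)rows; r++) {
        const float *xr = x + (size_t)r * cols;
        float *yr = y + (size_t)r * cols;

        /* Data-level parallelism: ask the compiler to use SIMD
         * instructions for the innermost loop. */
        #pragma omp simd
        for (size_t c = 0; c < cols; c++) {
            yr[c] = a * xr[c] + yr[c];
        }
    }
}
```

Built without OpenMP enabled, the same source compiles to plain sequential code, which is one reason directive-based approaches suit legacy C code bases.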
Background Information Industry Standards: Embedded Multi-Core SW Development • Multi-Core Association • Standards Group specifically for the embedded space • Standard Categories: • Resource Management: RAPI (vs pthreads) • Communication: CAPI • Transparent Interprocess Communication: TIPC • Multi-Core Debug Mechanisms • Should Linux adopt POSIX Threads? • pthreads (explicitly using …) • Java threads (explicitly using) • OpenMP: explicit parallel programming for shared memory models • MPI: explicit parallel programming for distributed memory models • Standards needed for communication between cores and OSes
Requirements & Recommendations Requirements for Embedded Multi-Core (1 of 2) • 85% of embedded developers use C/C++ • Embedded took a long time to move from Assembler to C/C++ • Other parallel languages are non-starters: • Not many new languages will be seen in the embedded space in the next 10 years • The embedded space will take a lot longer than the server space to adopt new languages • C/C++ code is hard to optimize for multi-core (hard to parallelize) • BUT C/C++ is used throughout the embedded space • There is a need for a short-term fix to make C/C++ more expressive • as well as a long-term solution with new languages and tools • Applications moving to multi-core will be existing applications • Biggest challenge to the SW developer: how to partition an application • Issues will grow as the number of cores increases • Dual-Core, Quad-Core, “many” core, …
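The partitioning challenge named above has many forms. The simplest is a static split of independent work items across cores, sketched below. The `block_range` helper and the `struct range` type are hypothetical names introduced for this illustration. Real applications usually also have to partition by function or by data ownership.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open interval [begin, end) of work items owned by one core. */
struct range {
    size_t begin;
    size_t end;
};

/* Hypothetical helper: split n_items as evenly as possible over
 * n_cores and return the slice belonging to `core`. When the split is
 * uneven, the first (n_items % n_cores) cores get one extra item. */
static struct range block_range(size_t n_items, size_t n_cores, size_t core)
{
    assert(n_cores > 0 && core < n_cores);

    size_t base = n_items / n_cores;
    size_t extra = n_items % n_cores;

    struct range r;
    r.begin = core * base + (core < extra ? core : extra);
    r.end = r.begin + base + (core < extra ? 1 : 0);
    return r;
}
```

For 10 items on 4 cores this yields the slices [0,3), [3,6), [6,8) and [8,10). The same function keeps working when moving from dual-core to quad-core or beyond, because the core count is a parameter rather than a constant baked into the code.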
Requirements & Recommendations Requirements for Embedded Multi-Core (2 of 2) • Load-balancing loosely coupled applications • Makes for easy exploitation of multi-core • How applicable is this concept in the embedded space? • Only about 6 percent of tools were ready for parallel chips in 2007 • Will only rise to 40 percent in 2011 (Source: VDC) • Some call this the “SW Gap” or even “SW Crisis” • Problem with Threads: “to thread or not to thread” • Sequential/serial approach to parallelization • New programming models such as the Actor Model • Virtual system prototypes could greatly ease embedded multi-core design
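To make "load-balancing loosely coupled applications" concrete, here is a minimal sketch of one well-known heuristic: hand each task, in order, to the core with the least accumulated load. The slides do not prescribe this method, and the function name `assign_least_loaded` and the notion of a known per-task cost are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Greedy balancing sketch (an illustrative heuristic, not the deck's
 * method): give every task to the currently least-loaded core.
 *   cost[t]  - estimated cost of task t (assumed to be known up front)
 *   load[c]  - output: total cost assigned to core c
 *   owner[t] - output: core chosen for task t */
static void assign_least_loaded(const unsigned *cost, size_t n_tasks,
                                unsigned *load, size_t n_cores,
                                size_t *owner)
{
    assert(n_cores > 0);

    for (size_t c = 0; c < n_cores; c++) {
        load[c] = 0;
    }
    for (size_t t = 0; t < n_tasks; t++) {
        size_t best = 0;
        for (size_t c = 1; c < n_cores; c++) {
            if (load[c] < load[best]) {
                best = c;   /* ties keep the lower core index */
            }
        }
        owner[t] = best;
        load[best] += cost[t];
    }
}
```

This only works when tasks are independent of each other, which is what "loosely coupled" means here. Tightly coupled embedded workloads would also need communication and ordering constraints taken into account.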
Requirements & Recommendations New Programming Models • If not new programming languages, can new models be introduced? • Changing techniques could be as hard as developing new tools • Legacy methodologies are as ingrained as legacy code • An object-oriented approach combined with concurrency seems appealing • The Actor model: a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent digital computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received (also: multi-agent systems) • “Spectrum of issues” • sequential vs. concurrent; message passing; concurrent computation and Petri nets • There is really only room for one model of parallel programming • No one can afford to write applications in multiple models • Some of this will happen in Research • In Power.org we should focus on what can be implemented today • Deliverables may conceptually reflect some ideas from new programming models
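The Actor model description above can be made tangible with a very small sketch in plain C. Everything here (the `actor_system` type, the fixed-size mailboxes, the single-threaded round-robin delivery loop) is a simplifying assumption for illustration. A real actor runtime would deliver messages concurrently across cores. The `splitter` behavior shows the capabilities listed on the slide: it creates a new actor, sends it a message, and chooses how to handle its next message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTORS  8
#define MAILBOX_CAP 16

struct actor_system;
struct actor;
typedef void (*behavior_fn)(struct actor_system *sys, struct actor *self, int msg);

struct actor {
    behavior_fn behavior;      /* how to respond to the next message */
    int state;                 /* private; only this actor touches it */
    int mailbox[MAILBOX_CAP];  /* ring buffer of pending messages */
    size_t head, count;
};

struct actor_system {
    struct actor actors[MAX_ACTORS];
    size_t n_actors;
};

/* Create an actor and return its id. */
static int actor_spawn(struct actor_system *sys, behavior_fn b)
{
    assert(sys->n_actors < MAX_ACTORS);
    struct actor *a = &sys->actors[sys->n_actors];
    a->behavior = b;
    a->state = 0;
    a->head = 0;
    a->count = 0;
    return (int)sys->n_actors++;
}

/* Append a message to the target actor's mailbox. */
static void actor_send(struct actor_system *sys, int to, int msg)
{
    assert(to >= 0 && (size_t)to < sys->n_actors);
    struct actor *a = &sys->actors[to];
    assert(a->count < MAILBOX_CAP);
    a->mailbox[(a->head + a->count) % MAILBOX_CAP] = msg;
    a->count++;
}

/* Deliver one message per actor per pass until all mailboxes are empty. */
static void actor_run(struct actor_system *sys)
{
    int delivered;
    do {
        delivered = 0;
        for (size_t i = 0; i < sys->n_actors; i++) {
            struct actor *a = &sys->actors[i];
            if (a->count == 0) {
                continue;
            }
            int msg = a->mailbox[a->head];
            a->head = (a->head + 1) % MAILBOX_CAP;
            a->count--;
            a->behavior(sys, a, msg);
            delivered = 1;
        }
    } while (delivered);
}

/* Local decision only: accumulate received values. */
static void counter(struct actor_system *sys, struct actor *self, int msg)
{
    (void)sys;
    self->state += msg;
}

/* Pass every message on to the actor whose id is kept in state. */
static void forwarder(struct actor_system *sys, struct actor *self, int msg)
{
    actor_send(sys, self->state, msg);
}

/* First message: create a child, send to it, then change behavior. */
static void splitter(struct actor_system *sys, struct actor *self, int msg)
{
    self->state = actor_spawn(sys, counter);
    actor_send(sys, self->state, msg);
    self->behavior = forwarder;
}
```

Because actors share no state and interact only through mailboxes, the same program structure could in principle be spread over several cores without locks in application code, which is part of the model's appeal for multi-core.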
Requirements & Recommendations Multi-Core Enablement: Known Issues, Discussion Points • Multi-Core Environments are Non-Deterministic (“Heisenbugs”) • Inter-processor Communication: too many proposed standards • MPI, TIPC (Multi-Core Association) • Need Standards for communication between Cores and OSes • Lack of integrated tools • Licensing Issues: maybe more of an issue for the server space • per core, per processor • per PVU, per company • Compiler optimization versus binary-level post-link optimization • Target: instruction cache and data cache utilization problems • will increase in future, as L1 instruction and data caches per processor core are not expected to grow • FDPR-Pro: feedback-directed post-link optimizer for Power (AIX/Linux) and Cell • improvement due to better cache utilization • embedded challenge: limited cache environment • similar to a large server application such as a DB engine • the gap between application code/data footprint and available cache size is increasing
Requirements & Recommendations Summary of Requirements for Multi-Core Enablement • From Proprietary Specs to common standards • Standards needed for communication between Cores and OSes • Goal: Heterogeneous embedded distributed systems • Changes in development tools, Runtime Software, Languages • Portability of SW required • ported/recompiled into new generations of the same architecture • needs to be more or less automatic or semi-automatic • Creating Portable Applications • SMP Scaling beyond 6: Dual, quad, many • C/C++ coding • Need to convert legacy apps • Encapsulation, Modularity • Technical standards are needed in specific areas; in others there are too many • Time frames for complex SW development matter a lot!
Requirements & Recommendations Conclusions and Recommendations to Power.org • Build Multi-Core Enablement TSC • Narrow initial scope of TSC, i.e. limit to SW enablement • Can be expanded later by adding HW focus sub-teams • Collaborate with existing TSCs such as • Debug • Bus • SoC • Hypervisor • Simulation • Decide how to distribute Multi-Core work amongst TSCs • Central Multi-Core work should happen in a specialized Multi-Core TSC • Overlapping issues, e.g. multi-core debug, could find a home in either TSC • How to best guide the developer community to re-write their code to fit this environment?
Requirements & Recommendations Suggestion for a Power.org Multi-Core TSC (1 of 2) • Use existing standards whenever possible, such as: • Multi-Core Association Standards (RAPI, CAPI, TIPC), OpenMP, MPI, … • Some possible deliverables for a Power Architecture Multi-Core enablement TSC • Programming guide for Power-based embedded multi-core • Share Power Architecture specific practices • Recommendations on how to partition an application for Power Architecture • Highlight Power Architecture benefits such as out-of-order processing, processor affinity, … • Addressing common issues such as: • Cache coherency issues, false sharing • Run benchmarks, give recommendations on compiler flags etc. to optimize multi-core use • Recommendations on explicit manual programming versus exploiting compiler optimization • We can either all fend for ourselves separately or we can join forces, identify common areas and find solutions that will help make multi-core successful for Power Architecture
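False sharing, listed above as a common issue, occurs when variables written by different cores happen to sit in the same cache line, so the coherency protocol keeps bouncing that line between cores even though no data is logically shared. A programming guide could show the usual remedy, sketched here with C11 alignment: give each core's data its own line. The 128-byte line size, the `MAX_CORES` value and all names below are assumptions for the example. The actual line size depends on the core and should be taken from its manual.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; check the manual of the target core. */
#define CACHE_LINE_BYTES 128
#define MAX_CORES 4

/* Aligning the member to a full cache line also pads the struct to
 * that size, so adjacent array elements never share a line. */
struct padded_counter {
    _Alignas(CACHE_LINE_BYTES) unsigned long value;
};

_Static_assert(sizeof(struct padded_counter) == CACHE_LINE_BYTES,
               "each counter should occupy exactly one cache line");

/* One counter per core; each core writes only its own element. */
static struct padded_counter per_core_hits[MAX_CORES];

static void count_hit(size_t core)
{
    assert(core < MAX_CORES);
    per_core_hits[core].value++;
}

/* Sum the per-core counters; intended to be called after the
 * workers have finished, not concurrently with count_hit(). */
static unsigned long total_hits(void)
{
    unsigned long sum = 0;
    for (size_t c = 0; c < MAX_CORES; c++) {
        sum += per_core_hits[c].value;
    }
    return sum;
}
```

The trade-off is memory: four counters now take four full lines instead of a few bytes, which matters in the limited-cache, small-footprint embedded environments described earlier.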
Requirements & Recommendations Suggestion for a Power.org Multi-Core TSC (2 of 2) • Evaluate existing tools/environment • Performance analysis tools (FDPR-Pro), give tuning hints and tips • Tool chain elements, run-time sub-system elements, integration of tools • Provide a “Framework” for Multi-Core Software development for Power Architecture that • closes gaps identified in the previous step • helps minimize explicit manual coding for Multi-Core • looks into automatic workload balancing specifically for Power • looks into automatic parallelization, focusing on outer loops, and other compiler optimizations • enhances existing standards with specs/standards where needed, specifically • vendor-based accelerator communications APIs such as: • Resource management API (also non-SMP) • Standard message passing API (also non-SMP, based on MCAPI, TIPC, …) • Provide guidance to the Debug and Simulation TSCs regarding multi-core considerations • Collaborate with both TSCs around Multi-Core • Decide where multi-core debug work needs to happen: CDI TSC or “new TSC for multi-core”? • Make recommendations for Multi-Core HW designers (to: PAC, Bus TSC, …) • Decide where (i.e. in which TSC) to work on multi-core HW issues such as: • power management, hardware threading, etc.
Requirements & Recommendations In Short • Power Architecture specific Multi-Core Programming Guide • Compiler optimizations • Evaluation of existing tools, identification of gaps • Power Architecture Framework for Multi-Core • Compiler work • APIs based on standards • Possibly incorporating ideas from new programming models • Recommendation/Collaboration to/with other Power.org “constructs” • Debug TSC, Simulation TSC, Hypervisor TSC, Bus Architecture TSC, SoC TSC, PAC • How do you migrate existing assets to a multi-core environment? • How do you optimize your solution for multi-core? • Where can you find software stack modules that fit the parallel processing paradigm? • How can Power.org guide the development community in all this?
Requirements, Recommendations Additional Thoughts • multi-core: DSP vs GPP. Divergence, convergence, now diverging again because of multi-core • VMX and graphics, other special-purpose processors? • Overarching standards effort, particularly between graphics processor and DSP AND Cell??? • Parallelism at source code level • A new programming model to help developers better understand how to optimize their applications for parallel chips. Such a model would need to automate as much as possible while giving users override options and drill-down mechanisms • Mobile Industry Processor Interface, OpenMP, POSIX and Power.org • I believe there will be new language constructs in C/C++ to support some of the new frameworks people will develop, but even these constructs, if we are not careful, will not be widely adopted • One of the new IMEC tools aims to hide the complexity of new memory hierarchies and interconnect fabrics increasingly used in multicore chips. Another tool quickly shows scaling benefits of a program without requiring the program to be debugged. • enable new styles of parallel optimization • micro-parallelization (shared L2) • Power Multi-Core OS; Independent Framework?? • SW Bundling??? • Linux to adopt POSIX Threads