System Simulation Of 1000-cores Heterogeneous SoCs. Shivani Raghav, Embedded System Laboratory (ESL), Ecole Polytechnique Federale de Lausanne (EPFL).
ESL Work on Energy-Aware Datacenter Design [Figure: datacenter infrastructure. Labels: PMSM: Power/Therm. Manager; IPS; communic. network; Internet; Grid; Price profile 1..N; Load profile 1..N; New server cooling tech.; System Simulation for many-core (SIMinG-1k)]
Emerging Data-Intensive Workloads • Financial Simulations • Medical Imaging • Cloud Servers • Monte Carlo Simulations • Molecular Dynamics • Gene Sequencing • Online Gaming Services
Demand for Hardware Acceleration • Hybrid Cores: AMD Fusion (on-chip) • GPU Clusters (off-chip accelerators) • Tile-based Manycores: Intel SCC, Tile 64 (integrated)
Urgent Need for Simulation of Heterogeneous SoCs • Thermal & Power Evaluations • Design Space Exploration • Simulation • Benchmarking • Profiling • Debugging • Early Software Development
How to Design a Fast and Scalable Many-Core Simulator? • Parallel Target • Parallel Simulator • Parallel Host
Simulating Parallel Target on Parallel Host WWT II Graphite Cotson, OVPSim Flexus RAMP Opportunity Large Parallel Systems FPGA GPGPU is an Old Technology…
Target Architecture [Figure labels: Switch, Memory, Caches, Core] • Data-Parallel Coprocessors • Simple In-order Cores • 1000s of cores in a tile network • Fine-grain parallelism
Solution – Accelerating Simulation using GPGPUs Target Architecture Host Platform A Perfect Match
Outline • Problem Overview: Simulation of Heterogeneous SoCs • Solution: SIMinG-1k, a GPU-accelerated simulator • Evaluation • Summary
Overall Simulation Framework [Figure labels: Application (Sequential Code, Data Parallel Code); Target Architecture (General Purpose CPU, Many-Core Accelerator); SIMinG-1k Simulator; Host Platform]
SIMinG-1k - Features • Instruction Accurate • Inexpensive and Easily Available • Fast Development Cycle • Equation Performance Model • Portability (Target Independent) • Interpretation-based core simulation
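The "interpretation-based core simulation" feature can be illustrated with a minimal sketch. It is not SIMinG-1k's code: the three-instruction ISA, the `Core` class, and the `step` and `run` helpers are all hypothetical, and the plain Python loop over cores only stands in for the per-core parallelism that a GPU host would presumably provide (assumed here to be one host thread per simulated core).

```python
from dataclasses import dataclass, field

# Hypothetical three-instruction ISA, invented for illustration only.
ADDI, BNEZ, HALT = "addi", "bnez", "halt"


@dataclass
class Core:
    regs: list = field(default_factory=lambda: [0] * 4)
    pc: int = 0
    halted: bool = False
    retired: int = 0  # instructions executed so far


def step(core, program):
    """Fetch, decode and execute one instruction on one simulated core."""
    op, a, b = program[core.pc]
    core.retired += 1
    if op == ADDI:      # regs[a] += b
        core.regs[a] += b
        core.pc += 1
    elif op == BNEZ:    # jump to instruction b if regs[a] != 0
        core.pc = b if core.regs[a] != 0 else core.pc + 1
    elif op == HALT:
        core.halted = True


def run(cores, program):
    """Advance every core one instruction per round until all have halted."""
    while not all(c.halted for c in cores):
        for core in cores:  # assumption: a GPU host would run these in parallel
            if not core.halted:
                step(core, program)
    return sum(c.retired for c in cores)


# Each core counts its register r0 down to zero, then halts.
PROGRAM = [
    (ADDI, 0, -1),  # 0: r0 -= 1
    (BNEZ, 0, 0),   # 1: loop back to 0 while r0 != 0
    (HALT, 0, 0),   # 2: stop
]

cores = [Core(regs=[n, 0, 0, 0]) for n in (1, 2, 3)]
total = run(cores, PROGRAM)  # total simulated instructions across all cores
```

Counting retired instructions per core is what makes a simulator of this kind instruction accurate rather than cycle accurate: the count is exact, while timing would have to come from a separate performance model.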
Challenges of using GPU as a host • SIMT (single instruction, multiple threads): divergent code is a problem • Synchronization outside a thread block • Slow CPU-GPU communication • Global memory is slow and limited
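Why divergent code hurts on a SIMT host can be shown with a deliberately simplified cost model. It assumes that the threads of one warp which decode the same opcode run together and that each distinct opcode in the warp is serialized. The function name, the opcode names, and the handler costs are hypothetical placeholders, not measurements.

```python
def warp_issue_slots(opcodes, cost):
    """Issue slots for one warp step under a simplified SIMT model.

    Threads decoding the same opcode execute together, so each distinct
    opcode present in the warp is paid for exactly once.
    """
    return sum(cost[op] for op in set(opcodes))


# Hypothetical cost (in issue slots) of each opcode handler.
COST = {"add": 4, "load": 10, "branch": 6}

# All 32 simulated cores in the warp execute the same opcode.
uniform = ["add"] * 32
# The simulated cores have drifted apart and decode three different opcodes.
mixed = ["add"] * 16 + ["load"] * 8 + ["branch"] * 8

uniform_cost = warp_issue_slots(uniform, COST)
mixed_cost = warp_issue_slots(mixed, COST)
```

Under this model the mixed warp pays for all three handlers even though each thread needs only one of them, which is the overhead an interpreter running on a GPU has to keep under control.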
Results – Architecture 1 Single tile of target accelerator [Figure labels: ARM ISA, Inst Scratchpad, Data Scratchpad] MIPS: millions of simulated instructions per second of host wall-clock time
Speed Up – Architecture 1 Speedup compared to simulation on OVPSim (thousands of ARM cores)
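The two metrics used in the evaluation reduce to simple ratios. The sketch below reads MIPS as millions of simulated instructions per second of host wall-clock time and speedup as the ratio of two such throughputs. The function names are hypothetical, and the numbers in the usage lines are made-up placeholders, not results from the talk.

```python
def simulated_mips(instructions, wall_seconds):
    """Millions of simulated instructions per second of host wall-clock time."""
    return instructions / wall_seconds / 1e6


def speedup(mips_new, mips_baseline):
    """How many times faster the new simulator is than the baseline."""
    return mips_new / mips_baseline


# Placeholder inputs for illustration only.
gpu_mips = simulated_mips(2_000_000_000, 4.0)    # 2e9 instructions in 4 s
baseline_mips = simulated_mips(100_000_000, 4.0)  # 1e8 instructions in 4 s
ratio = speedup(gpu_mips, baseline_mips)
```

Because both throughputs are measured against host wall-clock time, the ratio is only meaningful when both simulators run the same target workload.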
Results – Architecture 2 Single tile of data-parallel accelerator (cores, caches, on-chip interconnect) [Figure labels: Switch, Memory, Caches, Core]
Speed Up – Architecture 2 Speedup compared to serial simulation on QEMU
Conclusion • Challenge: Fast and parallel simulator for heterogeneous SoCs • Solution: Parallelize 1000-core simulation using GPUs • Design: Full-system simulation using QEMU and SIMinG-1k • Results: High scalability and speedup up to 4096 cores Future Work • Extend the simulator for thermal and power evaluations • Complete simulation of Cloud Data Centers
Thanks! Questions?