Single-Chip Heterogeneous Computing: Does the Future Include Custom Logics, FPGA, and GPGPUs? Presented by Kittisak Sajjapongse. Introduction to the study. Objective of the study: observe the trends of integrating unconventional cores (U-cores) into single-chip multicores.
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logics, FPGA, and GPGPUs?
Presented by Kittisak Sajjapongse
Objective of the study
• Observe the trends of integrating unconventional cores (U-cores) into single-chip multicores
• Identify the factors that affect the decision to include U-cores
Introduction to the study
Model in the study
• Symmetric
  - Multiple fast complex cores (FastCore)
  - Highly optimized to minimize single-thread latency
• Asymmetric
  - One fast complex core (FastCore)
  - Multiple simple cores (BCE)
  - Intended for applications that have parallelism
• Heterogeneous
  - One fast complex core (FastCore)
  - U-cores: ASICs, FPGAs, GPGPUs
• This study focuses on U-cores
ASIC, FPGA, and GPGPU
• ASIC (Application-Specific Integrated Circuit)
  - An integrated circuit customized for a specific application domain, e.g. an H.264 codec or a JPEG codec
• FPGA (Field-Programmable Gate Array)
  - A configurable digital integrated circuit capable of supporting hardware architectures
• GPGPU (General-Purpose Graphics Processing Unit)
  - A graphics device that provides APIs (Application Programming Interfaces) for running parallelizable applications
ASIC, FPGA, and GPGPU
They are all used to exploit parallelism!
What is the study about?
• Constraints
  - Power
  - Bandwidth
• Questions posed, under bandwidth and power constraints
  - Would single-chip multicores benefit significantly from U-cores?
  - Would ASICs be the best choice?
What is BCE?
• Baseline Core Equivalent
• Refers to a basic processor core
• Used as the baseline reference for performance and power consumption
Model for U-core
What is BCE?
• Two parameters used later
  - n: total number of BCEs available
  - r: amount of resources dedicated to complex cores (in units of BCE)
Amdahl’s Law
Reference: http://en.wikipedia.org/wiki/Amdahl_law
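For reference, Amdahl's Law in its standard form can be written with the symbols this deck uses later. Here f is the fraction of the work that can be parallelized and n is the number of processors; reading n as the BCE count from the previous slide is an assumption made for this sketch.

```latex
\[
\mathrm{Speedup}_{\mathrm{Amdahl}}(f, n) = \frac{1}{(1 - f) + \dfrac{f}{n}}
\]
```

As n grows, the speedup is bounded by 1 / (1 - f), so the serial fraction dominates.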
Hill & Marty’s extended Amdahl’s Law
Reference: M. D. Hill et al., “Amdahl’s Law in the Multicore Era,” Computer
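Hill and Marty's extension can be sketched in a few lines of Python. This is an illustrative sketch, not code from the presentation: it assumes perf(r) = sqrt(r), the modeling assumption used in Hill and Marty's article, and the function names are hypothetical.

```python
import math


def perf(r: float) -> float:
    """Sequential performance of a core built from r BCEs.

    Assumption: perf(r) = sqrt(r), as in Hill and Marty's article.
    """
    return math.sqrt(r)


def speedup_symmetric(f: float, n: int, r: int) -> float:
    """Chip with n/r identical cores, each built from r BCEs."""
    serial = (1 - f) / perf(r)
    parallel = f * r / (perf(r) * n)
    return 1 / (serial + parallel)


def speedup_asymmetric(f: float, n: int, r: int) -> float:
    """Chip with one r-BCE core plus (n - r) single-BCE cores."""
    serial = (1 - f) / perf(r)
    parallel = f / (perf(r) + (n - r))
    return 1 / (serial + parallel)
```

With r = 1 the symmetric form reduces to plain Amdahl's Law, 1 / ((1 - f) + f / n), which makes a quick sanity check.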
What about heterogeneous architectures?
Speedup_heterogeneous(?) = ?
Under power and bandwidth constraints
Deriving model for U-core
Speedup_Amdahl is a function of (f, n)
Speedup_Hill&Marty is a function of (f, n, r)
Speedup_het(U-core) is a function of (f, n, r, B, P, µ, φ)
New parameters:
• B: memory bandwidth of U-core (in units of BCE compulsory bandwidth)
• P: active power of U-core relative to BCE
• µ: performance of U-core relative to BCE
• φ: power efficiency of U-core relative to BCE
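Deriving model for U-core
\[
\mathrm{Speedup}_{\mathrm{asymmetric}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{\mathrm{perf}(r) + (n-r)}}
\]
\[
\mathrm{Speedup}_{\mathrm{asym(offload)}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{n-r}}
\]
\[
\mathrm{Speedup}_{\mathrm{het(U\text{-}core)}} = \frac{1}{\dfrac{1-f}{\mathrm{perf}(r)} + \dfrac{f}{\mu\,(n-r)}}
\]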
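The three speedup forms named on this slide can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: perf(r) = sqrt(r) as in Hill and Marty's article, the function names are hypothetical, and the power (P, φ) and bandwidth (B) limits on n are deliberately left out.

```python
import math


def perf(r: float) -> float:
    """Assumption: perf(r) = sqrt(r), as in Hill and Marty's article."""
    return math.sqrt(r)


def speedup_asymmetric(f: float, n: int, r: int) -> float:
    """Fast core and (n - r) BCE cores all work on the parallel part."""
    return 1 / ((1 - f) / perf(r) + f / (perf(r) + (n - r)))


def speedup_asym_offload(f: float, n: int, r: int) -> float:
    """Fast core does not contribute to the parallel part."""
    return 1 / ((1 - f) / perf(r) + f / (n - r))


def speedup_heterogeneous(f: float, n: int, r: int, mu: float) -> float:
    """Parallel part runs on (n - r) BCEs' worth of U-core.

    mu is the U-core performance relative to a BCE.
    """
    return 1 / ((1 - f) / perf(r) + f / (mu * (n - r)))
```

Setting mu = 1 makes the heterogeneous form identical to the offload form, which shows that the U-core model only rescales the throughput of the parallel resources.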
Devices & Workload
Device: [device table not preserved]
Workload:
- Dense Matrix Multiplication (MMM)
- Fast Fourier Transform (FFT, with input sizes from 2^4 to 2^20)
- Black-Scholes (BS)
Obtaining µ,φ for U-core
Obtained Parameters
Answering the questions
• Would single-chip multicores benefit significantly from U-cores?
  - Yes, if the application has enough parallelism (>90%) to exploit.
• Would ASICs be the best choice?
  - It depends on the application: if there is not much parallelism, an ASIC might not be worth implementing.
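A small numeric sketch illustrates why the parallel fraction matters so much. It compares the heterogeneous speedup against the asymmetric multicore speedup for a few values of f. The values n = 16, r = 4 and µ = 10 are hypothetical, chosen only for illustration (they are not measurements from the study); perf(r) = sqrt(r) is assumed as in Hill and Marty's article, and power and bandwidth limits are ignored.

```python
import math


def speedup_asymmetric(f, n, r):
    perf_r = math.sqrt(r)  # assumption: perf(r) = sqrt(r)
    return 1 / ((1 - f) / perf_r + f / (perf_r + (n - r)))


def speedup_heterogeneous(f, n, r, mu):
    perf_r = math.sqrt(r)  # assumption: perf(r) = sqrt(r)
    return 1 / ((1 - f) / perf_r + f / (mu * (n - r)))


def ucore_gain(f, n=16, r=4, mu=10.0):
    """Ratio of heterogeneous to asymmetric speedup.

    n, r and mu are hypothetical illustration values.
    """
    return speedup_heterogeneous(f, n, r, mu) / speedup_asymmetric(f, n, r)


if __name__ == "__main__":
    for f in (0.5, 0.9, 0.99):
        print(f"f = {f:.2f}: U-core gain = {ucore_gain(f):.2f}x")
```

Under these assumptions the gain is modest at f = 0.5 and grows quickly as f approaches 1, which is consistent with the qualitative answer on the slide.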
Conclusions
• Sufficient parallelism must exist to obtain a significant performance improvement from U-cores
• Flexible U-cores tend to be competitive with ASICs under limited bandwidth and limited parallelism
• U-cores such as ASICs are useful when power reduction is the primary goal