10 likes | 135 Views
CPU. FPGA/ GPU. Heterogeneous & Reconfigurable Computing Research Group. G1. G3. G4. G2. G1. G3. g5. G2. http://herc.cse.sc.edu College of Engineering and Computing, University of South Carolina, Columbia. G5. G5. G6. Memory. Memory. G4. G6. Co-Processor based System. G4. G1.
E N D
CPU FPGA/ GPU Heterogeneous & Reconfigurable Computing Research Group G1 G3 G4 G2 G1 G3 g5 G2 http://herc.cse.sc.edu College of Engineering and Computing, University of South Carolina, Columbia G5 G5 G6 Memory Memory G4 G6 Co-Processor based System G4 G1 G5 G2 G3 g5 DOMAINS Co-PROCESSOR HOST (GPP) Fluid Dynamics FPGA Accelerated Applications Data Mining Apps GPU Modeling Network Security • Exploiting GPU Architecture as a co-processor • Characterizing GPU’s viability for application development • Simulator development for performance tuning CUDA code operating at intermediate language Computational Biology • Phylogenetic Reconstruction & Genome Analysis • Gene Order Median Computations • Tree search via direct optimization and Bayesian Analysis • Accelerated Gene Rearrangement Analysis Numerical Computations Multi-FPGA Systems • Distributed processing system • Predictive Load Balancing • Reduce network congestion due to packet blocking in order to maintain/ increase the interconnect capacity • Abstracted, Scalable & High Capacity • Effective for linear, fan-in, linear fan-in, diamond traffic patterns • Double Precision Accumulator • Towards solving fundamental accumulation operation without overly complex architecture or data scheduling • Sparse Matrix-Vector Multiplication • For solving large systems of linear equations derived from physical applications such as molecular and fluid dynamics Heterogeneous Computing A heterogeneous computing system is a blend of general purpose computer processors along with specialized co-processors. These co-processors, which include Field Programmable Gate Arrays (FPGAs) and Graphics Processor Units (GPUs), require specialized programming techniques but perform certain types of computations very fast and efficiently. When specific portions of a computer program are adapted to run on these co-processors, the overall system can achieve much higher performance and efficiency even than a traditional high-performance computer system containing racks upon racks of general-purpose processors. Publications • A High-Performance Double Precision Accumulator, ICFPT ’09 • FPGA vs. GPU for Sparse Matrix Vector Multiply, ICFPT ‘09 • An Integrated Reduction Technique for a Double Precision Accumulator, HPRCTA ’09 (SC09) • Exploiting Matrix Symmetry to Improve FPGA-Accelerated Conjugate Gradient, FCCM ‘09 • A Special-Purpose Architecture for Solving the Breakpoint Median Problem, IEEE Trans. On VLSI Dec 2008 • FPGA Acceleration of Phylogeny Reconstruction for Whole Genome Data, BIBE ‘08 • FPGA Acceleration of Gene Rearrangement Analysis, FCCM ‘07 • Predictive Load Balancing for Interconnected FPGAs, FPL ‘06 • A Reconfigurable Distributed Computing Fabric Exploiting Multilevel Parallelism, FCCM ‘06 ARCHITECTURES • Off-load Computationally Intensive portion of apps to • Field Programmable Gate Arrays (FPGAs) • Graphics Processing Units (GPUs) Research Supported by NSF fund #s CCF-0915608 and CCF-0844951