
Evaluating GPU Passthrough in Xen for High Performance Cloud Computing


Presentation Transcript


  1. Evaluating GPU Passthrough in Xen for High Performance Cloud Computing Andrew J. Younge (1), John Paul Walters (2), Stephen P. Crago (2), and Geoffrey C. Fox (1). (1) Indiana University; (2) USC / Information Sciences Institute

  2. Where are we in the Cloud? • Cloud computing spans many areas of expertise • Today, focus only on IaaS and the underlying hardware • Things we do here affect the entire pyramid! http://futuregrid.org

  3. Motivation • Need for GPUs on Clouds • GPUs are becoming commonplace in scientific computing • Great performance-per-watt • Different competing methods for virtualizing GPUs • Remote API for CUDA calls • Direct GPU usage within VM • Advantages and disadvantages to both solutions

  4. Front-end GPU API • Translate all CUDA calls into remote method invocations (sketched below) • Users share GPUs across a node or cluster • Can run within a VM, as no hardware is needed, only a remote API • Many implementations for CUDA: rCUDA, gVirtus, vCUDA, GViM, etc. • Many desktop virtualization technologies do the same for OpenGL & DirectX http://futuregrid.org
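As a rough illustration of how these front-end wrappers work (a hypothetical sketch, not the actual rCUDA/gVirtus/vCUDA code), a guest-side library can intercept CUDA runtime calls and forward them as messages to a daemon on the machine that physically owns the GPU. The transport below is stubbed out and all helper names (send_request, recv_reply, rpc_*) are invented for this sketch.

```cuda
// Hypothetical API-remoting shim: the guest links against wrappers like
// these instead of calling the CUDA runtime directly, so every call
// becomes a remote procedure call to a backend daemon on the GPU host.
#include <cuda_runtime.h>
#include <cstdio>
#include <cstring>

enum RpcOp { RPC_MALLOC = 1, RPC_MEMCPY_H2D = 2 };

// Placeholder transport: a real front end would frame these messages and
// ship them over TCP (or shared memory) to the backend daemon.
static void send_request(RpcOp op, const void *payload, size_t len) {
    std::printf("forwarding op=%d, %zu payload bytes\n", op, len);
}
static void recv_reply(void *reply, size_t len) {
    std::memset(reply, 0, len);   // pretend the backend answered success
}

// Guest-side wrapper for cudaMalloc: the allocation happens on the remote
// GPU, and only an opaque handle comes back to the VM.
cudaError_t rpc_cudaMalloc(void **devPtr, size_t size) {
    send_request(RPC_MALLOC, &size, sizeof(size));
    struct { cudaError_t err; void *handle; } reply;
    recv_reply(&reply, sizeof(reply));
    *devPtr = reply.handle;
    return reply.err;
}

// Guest-side wrapper for a host-to-device copy: the whole buffer must
// cross the network, which is why data-heavy applications suffer here.
cudaError_t rpc_cudaMemcpyH2D(void *dst, const void *src, size_t count) {
    send_request(RPC_MEMCPY_H2D, src, count);
    cudaError_t err;
    recv_reply(&err, sizeof(err));
    return err;
}

int main() {
    void *d_ptr = nullptr;
    float host_data[1024] = {0};
    rpc_cudaMalloc(&d_ptr, sizeof(host_data));
    rpc_cudaMemcpyH2D(d_ptr, host_data, sizeof(host_data));
    return 0;
}
```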

  5. Front-end GPU API http://futuregrid.org

  6. Front-end API Limitations • Can use remote GPUs, but all data goes over the network • Can be very inefficient for applications with non-trivial memory movement • Usually doesn't support the CUDA extensions to C • Have to separate CPU and GPU code • Requires a special decoupling mechanism • Not a drop-in solution for existing applications http://futuregrid.org

  7. Direct GPU Passthrough • Allow VMs to directly access GPU hardware • Enables CUDA and OpenCL code • Utilizes PCI passthrough of the device to the guest VM • Uses hardware-directed I/O virtualization (VT-d or IOMMU) • Provides direct isolation and security of the device • Removes host overhead entirely • Similar to what Amazon EC2 uses http://futuregrid.org
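Because passthrough hands the physical PCI device to the guest, the stock GPU driver and unmodified CUDA code run inside the VM. A minimal device query such as the sketch below (standard CUDA runtime API, nothing passthrough-specific) is enough to confirm that the passed-through GPU appears as an ordinary local device.

```cuda
// Minimal device query: with PCI passthrough this runs unmodified inside
// the guest VM, because the GPU shows up as a normal local PCI device.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA device visible in this guest\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("GPU %d: %s, %d SMs, %.1f GiB memory\n",
                    i, prop.name, prop.multiProcessorCount,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```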

  8. Direct GPU Passthrough http://futuregrid.org

  9. Hardware Setup

  10. SHOC Benchmark Suite • Developed by the Future Technologies Group @ Oak Ridge National Laboratory • Provides 70 benchmarks • Synthetic microbenchmarks • Third-party applications • OpenCL and CUDA implementations • Provides a well-rounded view of GPU performance (a simplified compute microbenchmark is sketched below) http://futuregrid.org
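Synthetic GPU microbenchmarks of this kind are essentially tight arithmetic loops timed on the device. The simplified stand-in below (not SHOC source) measures raw single-precision FMA throughput with CUDA events; this is the compute-only kind of number the later slides report as nearly unaffected by virtualization.

```cuda
// Simplified stand-in for a SHOC-style synthetic FLOPS microbenchmark
// (not the actual SHOC code): a tight fused multiply-add loop timed with
// CUDA events, so only raw GPU arithmetic throughput is measured.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void fma_loop(float *out, int iters) {
    float a = 1.000001f, b = 0.999999f;
    for (int i = 0; i < iters; ++i)
        a = a * b + 0.5f;                 // one multiply + one add per trip
    out[blockIdx.x * blockDim.x + threadIdx.x] = a;
}

int main() {
    const int blocks = 256, threads = 256, iters = 1 << 16;
    float *d_out;
    cudaMalloc(&d_out, blocks * threads * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    fma_loop<<<blocks, threads>>>(d_out, iters);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double flops = 2.0 * iters * blocks * threads;  // FMA = 2 FLOPs
    std::printf("~%.1f GFLOP/s\n", flops / (ms * 1e6));

    cudaFree(d_out);
    return 0;
}
```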

  11.–14. [SHOC benchmark result charts; figures not reproduced in the transcript] http://futuregrid.org

  15. Initial Thoughts • Raw GPU computational ability is impacted by less than 1% in VMs compared to the base system • An excellent sign for supporting GPUs in the Cloud • However, overhead occurs during large transfers between CPU & GPU (a simple way to measure this is sketched below) • Much higher overhead for the Westmere/Fermi test architecture • Around 15% overhead in the worst-case benchmark • Sandy Bridge/Kepler overhead is lower http://futuregrid.org
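That transfer overhead shows up in a minimal pinned-memory bandwidth test like the sketch below (a simplified stand-in for a bus-speed benchmark, not the code used in the study). Running it inside the guest and again on bare metal, then comparing the two numbers, isolates the PCIe cost of passthrough on a given platform.

```cuda
// Minimal host-to-device bandwidth check (simplified stand-in for a
// bus-speed benchmark): times a pinned-memory cudaMemcpy with CUDA events.
// Comparing the guest result against bare metal exposes the PCIe transfer
// overhead that the Westmere/Fermi setup suffers from.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 256 << 20;          // 256 MiB transfer
    void *h_buf, *d_buf;
    cudaHostAlloc(&h_buf, bytes, cudaHostAllocDefault);  // pinned host memory
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    std::printf("Host->Device: %.2f GB/s\n", (bytes / 1e9) / (ms / 1e3));

    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    return 0;
}
```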

  16.–17. [result charts; figures not reproduced in the transcript] http://futuregrid.org

  18. Discussion • GPU passthrough is possible in Xen! • Results show high performance GPU computation is a reality with Xen • Overhead is minimal for GPU computation • Sandy Bridge/Kepler has < 1.2% overall overhead • Westmere/Fermi has < 1% computational overhead, 7-25% PCIe overhead • PCIe overhead is likely not due to the VT-d mechanism itself but to the NUMA configuration of the Westmere CPU architecture • GPU PCI passthrough performs better than front-end remote API solutions http://futuregrid.org

  19. Future Work • Support PCI passthrough in a Cloud IaaS framework – OpenStack Nova • Works for both GPUs and other PCI devices • Show performance better than EC2 • Resolve NUMA issues with the Westmere architecture and Fermi GPUs • Evaluate GPU support in other hypervisors • Support large-scale distributed CPU+GPU computation in the Cloud http://futuregrid.org

  20. Conclusion • GPUs are here to stay in scientific computing • Many Petascale systems use GPUs • Expected GPU Exascale machine (2020-ish) • Providing HPC in the Cloud is key to the viability of scientific cloud computing. • OpenStack provides an ideal architecture to enable HPC in clouds. http://futuregrid.org

  21. Thanks! About Me: Andrew J. Younge, Ph.D. Candidate, Indiana University, Bloomington, IN USA • Email – ajyounge@indiana.edu • Website – http://ajyounge.com Acknowledgements: • NSF FutureGrid project • GPU cluster hardware • FutureGrid team @ IU • USC/ISI APEX research group • Persistent Systems Graduate Fellowship • Xen open source community http://portal.futuregrid.org http://futuregrid.org

  22. Extra Slides http://futuregrid.org

  23. FutureGrid: a Distributed Testbed [testbed diagram; NID = Network Impairment Device; private and public FG networks]

  24. [figure slide; not reproduced in the transcript] http://futuregrid.org

  25. OpenStack GPU Cloud Prototype http://futuregrid.org

  26. [chart slide: ~1.25%]

  27. [chart slide: ~0.64%, ~3.62%]

  28. Overhead in Bandwidth [chart slide]
