Performance Analysis of Virtualization for High Performance Computing • A Practical Evaluation of Hypervisor Overheads • Matthew Cawood, University of Cape Town
Overview • Background • Research Objectives • HPC • Virtualization • Performance Tuning • The Cloud Cluster • Benchmarks • Results • Conclusions
1. Background • BSc (Eng) final year research project • CHPC Advanced Computer Engineering (ACE) Lab • The Cloud Cluster is currently being commissioned • Research focused on evaluating the cluster's hardware and software configurations
2. Research Objectives • Present an in-depth report on the current technologies being developed in the field of High Performance Computing. • Provide a quantitative performance analysis of the costs associated with Virtualization, specifically in the field of HPC.
3. High Performance Computing • HPC data centres are rapidly growing in size and complexity • Emphasis is currently placed on improving efficiency and utilization • Wide selection of applications/requirements • Bioinformatics • Astrophysics • Simulation • Modelling
5. Performance Tuning • Reservation of Linux huge pages in host memory • KVM vCPU pinning to improve NUMA cell awareness (both sketched below)
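A minimal sketch of how these two host-side tunings might be applied (illustrative only, not the exact scripts used in this work; the domain name, huge-page count, and core list are assumed placeholders). Huge pages are reserved through the nr_hugepages knob, and virsh vcpupin binds each guest vCPU to a physical core inside one NUMA cell so that guest memory accesses stay node-local.

    #!/usr/bin/env python3
    # Sketch of host-side tuning on a KVM node: reserve 2 MB huge pages and
    # pin a guest's vCPUs to the physical cores of NUMA cell 0.
    # Assumptions (not from the slides): domain "compute-vm01", 8 vCPUs,
    # NUMA node 0 owns cores 0-7. Check /sys/devices/system/node/ first.
    import subprocess

    HUGEPAGES = 4096               # number of 2 MB huge pages to reserve
    DOMAIN = "compute-vm01"        # hypothetical libvirt domain name
    NODE0_CORES = list(range(8))   # physical cores assumed to sit in NUMA cell 0

    def reserve_hugepages(count):
        """Reserve huge pages on the host via the procfs knob (requires root)."""
        with open("/proc/sys/vm/nr_hugepages", "w") as f:
            f.write(str(count))

    def pin_vcpus(domain, cores):
        """Pin vCPU i of the guest to physical core cores[i] using virsh vcpupin."""
        for vcpu, core in enumerate(cores):
            subprocess.run(["virsh", "vcpupin", domain, str(vcpu), str(core)], check=True)

    if __name__ == "__main__":
        reserve_hugepages(HUGEPAGES)
        pin_vcpus(DOMAIN, NODE0_CORES)

Keeping every vCPU inside a single NUMA cell avoids remote-socket memory traffic, which is the point of the "NUMA cell awareness" bullet above.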
6. The Cloud Cluster Compute Nodes: • 2x Intel Xeon E5-2690, 20 MB L3 cache, 2.90 GHz • 256 GB DDR3-1600, CL11 • Mellanox ConnectX-3 VPI FDR 56 Gb/s HCA • Gigabit Ethernet NIC Switch Infrastructure: • Mellanox SX6036 FDR 36-port InfiniBand switch
6. The Cloud Cluster • CentOS 6.4 • OFED 2.0 (with SR-IOV) • OpenNebula 4.2
7. Performance Benchmarks • HPC Challenge • HPLinpack • MPI Random Access • STREAM • Effective bandwidth & latency (illustrated by the sketch below) • OpenFOAM • 7-million-cell, 5-millisecond transient simulation • snappyHexMesh
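To show what the effective bandwidth and latency measurements capture, a minimal mpi4py ping-pong sketch is given below. It is an assumption-laden stand-in, not the HPC Challenge code actually used: two ranks bounce a message back and forth, reporting half the round-trip time as latency and the bytes moved per second as bandwidth.

    # Minimal MPI ping-pong sketch (mpi4py + NumPy); illustrative only, the
    # study used the HPC Challenge / effective-bandwidth benchmarks instead.
    # Run with: mpirun -np 2 python pingpong.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    REPS = 1000
    SIZE = 1 << 20                       # 1 MiB message
    buf = np.zeros(SIZE, dtype=np.uint8)

    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(REPS):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=0)
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    elapsed = MPI.Wtime() - t0

    if rank == 0:
        rtt = elapsed / REPS             # round-trip time per iteration
        print(f"half round-trip latency: {rtt / 2 * 1e6:.2f} us")
        print(f"bandwidth: {2 * SIZE * REPS / elapsed / 1e6:.1f} MB/s")

Running such a two-rank test once inside virtual machines and once on bare metal gives the kind of native-versus-guest comparison reported in the results that follow.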
8.1 Software Comparison HPLinpack throughput comparison of compiler selection
8.2 Single Node Evaluation • MPI Random Access performance • HPLinpack throughput efficiency of virtual machines • STREAM memory bandwidth
8.3 Cluster Evaluation HPLinpack throughput efficiency of virtual machines
8.3 Cluster Evaluation OpenFOAM runtime efficiency of virtual machines
8.4 Interconnect Evaluation • Native Verbs vs. IP over InfiniBand (IPoIB) • Typical IPoIB latency of virtual machines (a measurement sketch follows below) • Typical Verbs latency of virtual machines
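For the IPoIB path specifically, latency can be approximated with an ordinary TCP round trip over the ib0 interface. The sketch below is illustrative only: the address 10.0.0.2 and the port are assumed placeholders, and the study's own figures come from MPI-level benchmarks rather than this script.

    # Minimal TCP round-trip sketch over an IPoIB address (illustrative only).
    # Start "python ipoib_rtt.py server" on one node and
    # "python ipoib_rtt.py client" on the other.
    import socket
    import sys
    import time

    HOST, PORT = "10.0.0.2", 5000    # hypothetical IPoIB address of the server node
    REPS = 10000
    MSG = b"x" * 8                   # tiny message: exposes latency, not bandwidth

    def server():
        with socket.create_server(("", PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                for _ in range(REPS):
                    conn.sendall(conn.recv(len(MSG)))   # echo the message back

    def client():
        with socket.create_connection((HOST, PORT)) as conn:
            conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            t0 = time.perf_counter()
            for _ in range(REPS):
                conn.sendall(MSG)
                conn.recv(len(MSG))
            rtt = (time.perf_counter() - t0) / REPS
            print(f"average TCP round trip over IPoIB: {rtt * 1e6:.1f} us")

    if __name__ == "__main__":
        server() if sys.argv[1] == "server" else client()

The native Verbs path bypasses the TCP/IP stack entirely and performs RDMA directly through the HCA, which is why it is compared against IPoIB above.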
8.5 Supplementary Tests • Intel® Hyper-Threading: HPLinpack throughput
8.5 Supplementary Tests • Virtual machine scaling: OpenFOAM runtime
9. Conclusions • KVM typically provides good performance for HPC workloads • Tuning is necessary to further improve performance • Efficiency is highly application dependent • SR-IOV for InfiniBand effectively reduced I/O virtualization overheads • Synthetic and real-world benchmark results often disagree