Exploring ARM CPU Benchmarking for High-Performance Computing

A look at benchmarking ARM (Advanced RISC Machine) processors, which are used in applications ranging from smartphones to supercomputers, to evaluate their suitability for high-performance computing.


Presentation Transcript


  1. ARM CPU Benchmarking Robert Reed | University of the Witwatersrand

  2. Introduction • What is an ARM processor? • Advanced RISC Machine • Where is it being used? • 95% of the smartphone market • Consumer products • Supercomputers (K supercomputer, Japan - RISC) • Why use them? • Power efficient • Low capital cost • [Image caption: 2 quad cores in the Samsung S4]

  3. Introduction • How are we going to use ARM processors? • High-throughput supercomputer • Large numbers of ARM processors = many cores = parallel • [Figure labels: Problem, Solution; image caption: China's Tianhe-2]

  4. Benchmark Selection • Characterising the ARM architecture • The main factors to look at: • CPU • Cache • RAM • Connectivity

  5. Benchmark Selection • CoreMark by EEMBC • Supported by ARM Holdings • Uses common algorithms • Strict submission rules
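The CoreMark slide above names the benchmark but not how its figure is derived. As a minimal sketch: CoreMark reports iterations per second, and scores are often divided by clock speed (CoreMark/MHz) to compare cores running at different frequencies. The function names and input values below are hypothetical and are not results from the talk.

```python
def coremark_score(iterations: int, total_seconds: float) -> float:
    """CoreMark score: benchmark iterations completed per second."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return iterations / total_seconds


def coremark_per_mhz(score: float, clock_mhz: float) -> float:
    """Normalise a score by clock frequency to compare different cores."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return score / clock_mhz


# Hypothetical inputs for illustration only; not measurements from the slides.
score = coremark_score(iterations=20000, total_seconds=10.0)
print(score, coremark_per_mhz(score, clock_mhz=1000.0))
```

Normalising by clock speed separates architectural efficiency from raw frequency, which matters when comparing boards such as the A7, A9, and A15 systems listed in the backup slides.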

  6. Results - CoreMark

  7. Benchmark Selection • High-Performance LINPACK (HPL) by Jack Dongarra • Used in the TOP500 list
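For readers unfamiliar with how an HPL result becomes a GFLOPS figure, here is a minimal sketch. HPL solves a dense N x N linear system and credits the run with (2/3)N^3 + (3/2)N^2 floating-point operations, divided by the wall-clock time. The function name and the sample inputs are hypothetical and are not results from the talk.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """GFLOPS for an HPL run of matrix order n that took `seconds` to solve."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Operation count credited by HPL for LU factorisation plus the solve.
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# Hypothetical run for illustration only.
print(f"{hpl_gflops(n=1000, seconds=1.0):.4f} GFLOPS")
```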

  8. Understanding the benchmarks • Improvement: almost 30% • Compile time: approx. 22 hours • Math libraries • Linear algebra package: ATLAS (Automatically Tuned Linear Algebra Software)

  9. Results - HPL

  10. Results - HPL

  11. Conclusion • Great for high throughput • Energy efficient • Need better performance • GPU co-processing • Problem specific

  12. ARM CPU Benchmarking

  13. ARM CPU Benchmarking: Backup Slides

  14. Layout • [Diagram: physical layout. Nodes: 5x Cubieboard (A7), 5x Wandboard (A9), 1x ODroid (A15); other labels: Switch, Server, Internet]

  15. CoreMark Flags

  16. CoreMark Comparisons

  17. Results - HPL Scalability – 4x A9
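Scalability results like the one above are usually summarised as speedup and parallel efficiency. The sketch below shows that arithmetic under the assumption that a single-node GFLOPS figure is available as the baseline; the function names and numbers are hypothetical and are not the measured values from this slide.

```python
def speedup(gflops_single: float, gflops_multi: float) -> float:
    """How many times faster the multi-node run is than one node."""
    return gflops_multi / gflops_single


def parallel_efficiency(gflops_single: float, gflops_multi: float, nodes: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(gflops_single, gflops_multi) / nodes


# Hypothetical figures for a 4-node run; illustration only.
print(speedup(1.0, 3.2), parallel_efficiency(1.0, 3.2, nodes=4))
```

An efficiency well below 1.0 typically points at communication overhead between boards rather than at the cores themselves.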

  18. Results - HPL Block Size

  19. Understanding the benchmarks • NB test for the A9 • Dependent on array size • Larger array = larger NB • Dependent on whole system • Changing block allocation NB
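The NB test above relates the block size to the array size. A minimal sketch of one common setup step, under assumptions not stated in the slides: pick the matrix order N so that the N x N double-precision matrix fills a chosen fraction of memory (0.8 is a widely used rule of thumb, assumed here), then round N down to a multiple of each candidate NB. The function name, memory size, and candidate NB values are hypothetical.

```python
import math


def suggest_problem_size(mem_bytes: int, nb: int, fraction: float = 0.8) -> int:
    """Largest N (a multiple of nb) whose N x N float64 matrix fits in
    `fraction` of the available memory."""
    if nb < 1:
        raise ValueError("nb must be at least 1")
    elements = int(fraction * mem_bytes) // 8  # 8 bytes per double
    n = math.isqrt(elements)
    return (n // nb) * nb


# Hypothetical sweep over block sizes for a node with 1 GiB of RAM.
for nb in (32, 64, 128, 256):
    print(nb, suggest_problem_size(mem_bytes=1024**3, nb=nb))
```

Each (N, NB) pair would then be run through HPL and the best-performing NB kept, which matches the slide's point that the optimum depends on the array size and on the whole system.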

  20. Results - HPL Multi Precision

  21. Results - HPL Power Measurements
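Power measurements are typically combined with the HPL result into a performance-per-watt figure. The following is a minimal sketch assuming power is sampled at regular intervals during the run and averaged; the function name and the sample readings are hypothetical and are not data from the talk.

```python
from statistics import mean


def gflops_per_watt(gflops: float, power_samples_watts: list[float]) -> float:
    """Energy efficiency: sustained GFLOPS divided by mean power draw."""
    if not power_samples_watts:
        raise ValueError("need at least one power sample")
    avg_watts = mean(power_samples_watts)
    if avg_watts <= 0:
        raise ValueError("average power must be positive")
    return gflops / avg_watts


# Hypothetical readings taken during a benchmark run; illustration only.
print(gflops_per_watt(2.5, [4.0, 5.0, 6.0]))
```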
