
High Performance Compute Cluster



  1. High Performance Compute Cluster Abdullah Al Owahid Graduate Student, ECE Auburn University A. Al Owahid: ELEC 5200-001/6200-001

  2. Topic Coverage • Cluster computer • Cluster categories • Auburn’s vSMP HPCC • Software installed • Accessing HPCC • How to run simulations in HPCC • Demo • Performance • Points of Contact A. Al Owahid: ELEC 5200-001/6200-001

  3. Cluster Computer • Multiprocessor, distributed network • A computer cluster is a group of linked computers • The computers work together closely, so that in many respects they form a single machine • Nodes are connected to each other through fast local area networks A. Al Owahid: ELEC 5200-001/6200-001

  4. Cluster Computer – Categories • High-availability (HA) clusters -- operate by having redundant nodes • Load-balancing clusters -- multiple computers linked together to share the computational workload • Compute clusters (HPCC) -- built for computational purposes -- the cluster shares a dedicated network -- a compute job uses one or a few nodes and needs little or no inter-node communication (as in grid computing) -- uses MPI or PVM (Parallel Virtual Machine) A. Al Owahid: ELEC 5200-001/6200-001
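As a quick illustration of MPI spreading processes across cluster nodes (not part of the original slides; it assumes an MPI implementation such as OpenMPI or MPICH2 is already on the PATH):

    # Launch 8 MPI processes; each one prints the name of the node it ran on
    mpirun -np 8 hostname

If the processes land on more than one blade, the output lists several different host names, which is the simplest way to see the cluster acting as one machine.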

  5. HPCC A. Al Owahid: ELEC 5200-001/6200-001

  6. Auburn’s vSMP HPCC – Samuel Ginn College of Engineering Computational Cluster • Dell M1000E blade chassis server platform • 4 M1000E blade chassis fat nodes • 16 M610 half-height Intel dual-socket blades per chassis • 2 quad-core Nehalem 2.80 GHz CPUs per blade • 24 GB RAM and two 160 GB SATA drives per blade • Single operating system image (CentOS) A. Al Owahid: ELEC 5200-001/6200-001

  7. Auburn’s vSMP HPCC (contd..) • Each M610 blade server is connected internally to its chassis via a Mellanox Quad Data Rate (QDR) 40 Gb/s InfiniBand switch, which is used to build the ScaleMP vSMP image • The M1000E fat nodes are interconnected via 10 GbE using M6220 blade-switch stacking modules for parallel clustering with OpenMPI/MPICH2 • Each M1000E fat node also has independent 10 GbE connectivity to the Brocade TurboIron 24X core LAN switch • Each fat node provides 128 Nehalem cores @ 2.80 GHz • Total: 512 cores @ 2.80 GHz, 1.536 TB of shared RAM, and 20.48 TB of raw internal storage A. Al Owahid: ELEC 5200-001/6200-001
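For reference, these totals follow directly from the per-blade figures on the previous slide: 4 chassis × 16 blades × 2 sockets × 4 cores = 512 cores (128 per fat node); 64 blades × 24 GB = 1.536 TB of RAM; and 64 blades × 2 × 160 GB = 20.48 TB of raw storage.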

  8. vSMP (ScaleMP) • ScaleMP specializes in virtualization for high-end computing • Its Versatile SMP (vSMP) architecture aggregates multiple x86 systems into a single virtual x86 system, presenting an industry-standard, high-end symmetric multiprocessing (SMP) computer • vSMP Foundation aggregates up to 16 x86 systems into a single system with 4 to 32 processors (128 cores) and up to 4 TB of shared memory A. Al Owahid: ELEC 5200-001/6200-001

  9. vSMP HPCC Configuration Diagram A. Al Owahid: ELEC 5200-001/6200-001

  10. Network Architecture A. Al Owahid: ELEC 5200-001/6200-001

  11. Software installed • MATLAB (/export/apps/MATLAB) -- Parallel/Distributed Computing Toolbox with 128 workers • Fluent (/export/apps/Fluent.Inc) -- 512 parallel licenses • LS-DYNA (/export/apps/ls-dyna) -- 128 parallel licenses • STAR-CCM+ (/export/apps/starccm) -- 128 parallel licenses • MPICH2 (Argonne National Laboratory) -- /opt/mpich2-1.2.1p1, /opt/mpich2 A. Al Owahid: ELEC 5200-001/6200-001
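A minimal sketch of putting two of these packages on the search path before submitting a job; only the install paths above come from the deck, and the bin subdirectories are an assumption:

    # Add MATLAB and MPICH2 to the PATH (bin locations assumed, not listed on the slide)
    export PATH=/export/apps/MATLAB/bin:$PATH
    export PATH=/opt/mpich2/bin:$PATH
    which matlab mpiexec   # confirm the tools now resolve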

  12. Accessing HPCC http://www.eng.auburn.edu/ens/hpcc/Access_information.html A. Al Owahid: ELEC 5200-001/6200-001
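The access page above has the authoritative login details; purely as an illustration, a session would typically begin with an SSH login such as the following, where the host name is hypothetical and should be replaced with the one given on the access page:

    # Log in with your Auburn user id; the host name below is a placeholder
    ssh au_user_id@hpcc.eng.auburn.edu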

  13. How to run simulations in HPCC • Save a .rhosts file in your home directory • Save a .mpd.conf file in your home directory • Your H:\ drive is already mapped • Add RSA keys by running ssh compute-i and then exit, for i = 1, 2, 3, 4 • mkdir folder_name • In your script file add the line #PBS -d /home/au_user_id/folder_name (the path reported by “pwd”) • Make the script executable: “chmod 744 s_file.sh” • Submit the script: qsub “./script_file.sh” (a sketch of such a script follows this slide) A. Al Owahid: ELEC 5200-001/6200-001
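A minimal sketch of the kind of PBS script this slide describes, assuming an MPICH2 job: the #PBS -d line and the chmod/qsub commands come from the slide, while the job name, resource request, and mpiexec line are assumptions added for illustration.

    #!/bin/sh
    # Hypothetical job name (not specified in the slides)
    #PBS -N my_simulation
    # Working directory, as instructed above; path obtained with "pwd"
    #PBS -d /home/au_user_id/folder_name
    # Assumed resource request: one node, eight cores; adjust to the job
    #PBS -l nodes=1:ppn=8
    # Assumed launch line for an MPI program named mpi_sim
    mpiexec -n 8 ./mpi_sim

Make it executable and submit it exactly as the slide shows (s_file.sh and script_file.sh are the slide's placeholder names):

    chmod 744 s_file.sh
    qsub ./script_file.sh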

  14. Basic commands • showq • runjob job_id • canceljob job_id • pbsnodes -a • pbsnodes compute-1 • ssh compute-1 • ps -ef | grep any_process_you_want_to_see • pkill process_name • kill -9 aberrant_process_id • exit A. Al Owahid: ELEC 5200-001/6200-001
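A brief annotated session showing how these commands fit together; the job id 1234 and the matlab process name are hypothetical examples, not values from the slides:

    showq                    # list queued and running jobs
    qsub ./script_file.sh    # submit a job; prints its job id (e.g. 1234)
    canceljob 1234           # cancel that job by id
    pbsnodes compute-1       # show the state of one fat node
    ssh compute-1            # log in to a node to inspect it
    ps -ef | grep matlab     # look for your processes (matlab is just an example)
    pkill matlab             # kill them by name if they have run away
    exit                     # leave the compute node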

  15. Demo Live demo (25 minutes) • Accessing the cluster • Setting up paths and home space • Modifying the script as required • Submitting multiple jobs • Obtaining the data • Viewing the load • Tracing the processes A. Al Owahid: ELEC 5200-001/6200-001

  16. Performance A. Al Owahid: ELEC 5200-001/6200-001

  17. Performance (contd..) A. Al Owahid: ELEC 5200-001/6200-001

  18. Performance (contd..) A. Al Owahid: ELEC 5200-001/6200-001

  19. Points of Contact • James Clark, Information Technology Master Specialist Email: jclark@auburn.edu • Shannon Price, Information Technology Master Specialist Email: pricesw@auburn.edu • Abdullah Al Owahid Email: azo0012@auburn.edu A. Al Owahid: ELEC 5200-001/6200-001

  20. Thank You Question & Answer A. Al Owahid: ELEC 5200-001/6200-001

  21. References • http://en.wikipedia.org/wiki/Computer_cluster • http://www.eng.auburn.edu/ens/hpcc/index.html A. Al Owahid: ELEC 5200-001/6200-001
