Learn about the Hotfoot High-Performance Computing Cluster expansion in March 2011, with increased nodes, storage, and performance. Explore architecture, networking, and performance evaluations. Contact for inquiries.
Hotfoot HPC Cluster March 31, 2011
Topics • Overview • Execute Nodes • Manager/Submit Nodes • NFS Server • Storage • Networking • Performance
Overview - Hotfoot Pilot • Launched May 2009 • Original Partnership • Astronomy • Statistics • CUIT • Office of the Executive Vice President for Research
Overview - Hotfoot Expansion • Expanded March 2011 • More Nodes • More Storage • Changed Scheduler • New Participant • Social Science Computing Committee (SSCC)
Overview – Cluster Components • 52 Execute Nodes • 520 Total Cores • 2 Manager Nodes • 1 NFS Server (1 Cold Spare) • 52 TB Storage (72 TB Raw)
Manager/Submit Nodes • HP DL360 G5, 4 GB RAM • Torque Resource Manager (OpenPBS descendant) • Maui Cluster Scheduler • User Access via virtual interface (vif) • Failover via Torque High Availability (HA)
NFS Servers • Primary • HP DL360 G7 • 2 x 4 cores • 16 GB RAM • Backup • HP DL360 G5 • 1 x 2 cores • 8 GB RAM
Storage • HP P2000 Storage Array • 32 x 2 TB Drives • RAID 5 • ~52 TB Usable
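The slide does not say how the 32 drives are grouped, so the gap between raw and usable capacity can only be sketched. The helper below applies the standard RAID 5 rule (one drive's worth of capacity per group goes to parity) to a hypothetical layout. The function name, the group count, and the spare count are assumptions for illustration, not the cluster's actual configuration.

```c
#include <assert.h>

/* Usable capacity in TB for `drives` disks of `drive_tb` TB each,
 * with `spares` hot spares and the remaining disks split into
 * `groups` RAID 5 sets. Each RAID 5 set gives up one drive's
 * capacity to parity. Hypothetical helper, not vendor tooling. */
double raid5_usable_tb(int drives, double drive_tb, int groups, int spares)
{
    assert(groups > 0 && spares >= 0);
    /* RAID 5 needs at least 3 drives per set */
    assert(drives - spares >= 3 * groups);
    return (drives - spares - groups) * drive_tb;
}
```

Under these assumptions, a single group with no spares would leave 62 TB from 32 x 2 TB drives. The reported ~52 TB is lower, which may reflect extra parity groups, spares, filesystem overhead, or decimal versus binary units; the slides do not say which.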
Networking • Execute Nodes • Channel-bonding mode 2 (load-balancing and fault tolerance) • 1 Gb connection to chassis switches • Usage records suggested this was sufficient
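In the Linux bonding driver, mode 2 is balance-xor, which chooses the outgoing link by hashing addresses. The sketch below is a simplified form of the default layer 2 policy (source MAC XOR destination MAC, modulo the number of links). The function name is hypothetical, only the last byte of each address is used, and the real driver may mix in additional fields, so treat this as an illustration rather than the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified balance-xor (bonding mode 2) link selection:
 * XOR the last byte of the source and destination MAC addresses,
 * then take the result modulo the number of bonded links.
 * Hypothetical helper for illustration only. */
unsigned select_link(const uint8_t src_mac[6],
                     const uint8_t dst_mac[6],
                     unsigned link_count)
{
    assert(link_count > 0);
    return (unsigned)(src_mac[5] ^ dst_mac[5]) % link_count;
}
```

Because the result depends only on the address pair, traffic between a node and a single peer stays on one link under this policy. Load is spread only when a node talks to several peers.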
Networking Sample Traffic for an Execute Node
Networking • Chassis • Each chassis has four Cisco 3020 switches • 1 Gb connection to Edge switches • Usage records suggested this was sufficient
Networking Sample Traffic for a Chassis Switch
Networking Original Chassis, Showing Network Connections for Two Servers
Performance • Concern about the ability of NFS to handle I/O demands • Reviewed performance of pilot system • Ran tests on expanded system
Performance Memory Usage on Old NFS Server
Performance Load Average on Old NFS Server
Performance Test Program

#include <stdio.h>

#define MILLION 1000000

int main(int argc, char *argv[])
{
    int max, i;

    /* Print 100 million integers, one per line */
    max = 100 * MILLION;
    for (i = 0; i < max; ++i) {
        printf("%d\n", i);
    }
    return 0;
}
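The slides do not show how the test program was run or measured. One plausible approach, sketched here as an assumption, is to write the same sequence to a file on the NFS mount and record how long it takes. The function name, the byte count it returns, and the coarse one-second timer are all hypothetical choices for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Write the integers 0..count-1, one per line, to `out`.
 * Returns the number of bytes written, or -1 on a write error.
 * If `elapsed_sec` is non-NULL it receives the wall-clock time
 * in whole seconds (time() has one-second resolution).
 * Note: fflush hands the data to the OS; it does not guarantee
 * that an NFS server has committed it to disk. */
long long write_sequence(FILE *out, int count, double *elapsed_sec)
{
    assert(out != NULL && count >= 0);

    time_t start = time(NULL);
    long long bytes = 0;

    for (int i = 0; i < count; ++i) {
        int n = fprintf(out, "%d\n", i);
        if (n < 0)
            return -1;
        bytes += n;
    }
    if (fflush(out) != 0)
        return -1;

    if (elapsed_sec != NULL)
        *elapsed_sec = difftime(time(NULL), start);
    return bytes;
}
```

Dividing the returned byte count by the elapsed seconds gives a rough throughput figure. Running several copies at once from different execute nodes would approximate concurrent load on the NFS server, though the slides do not state that this was the method used.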
Questions? • Questions? • Comments? • Contact: roblane@columbia.edu