Introduction to XC Hardware and Software
Yu-Sheng Guo, Account Support Consultant, HP Taiwan
HP Unified Cluster ? HP XC Cluster = A simple, scalable, and flexible Linux cluster solution
Advancing the Power of Computing
[Diagram labels: Computation; HP Integrity & ProLiant Servers; HP Cluster Platforms; Visualization; HP StorageWorks Scalable Data Management; Data Management; HP Workstations; Scalable Visualization]
Advancing the power of computing with:
• Integrated scalable solutions spanning HP, partner, and open source software
• Choice of industry-standard platforms, operating systems, and interconnects
• HP engineered and supported solutions that are easy to manage and use
• Extensive contributions to open source software
• Complete portfolio of HP, partner, and open source development tools and applications
Unified Cluster Portfolio for HPC
[Diagram labels: HP Unified Cluster Portfolio; HPC Cluster Services; HPC Application and Development/Grid Portfolio; HP Scalable Visualization; HP Scalable File Share; Cluster Management; ClusterPack; Partner software; XC Cluster SW; Common Operating Environments; HP-UX; Windows; Linux; HP Cluster Platforms: Integrity/ProLiant with choice of interconnects]
Vision of HP Unified Cluster
[Diagram labels: Users; Inbound connections; Log-ins; XC compute cluster; App nodes; Viz nodes; Service Nodes; Services; Admin; High speed interconnect; Lustre servers; Object storage servers (OST); Meta data servers (MDS); Scalable HA Storage Farm; SEPIA scalable viz; Pixel Network; Multi-Panel Display Device]
What is HP XC?
HP XC Cluster
[Diagram labels: Best-in-class interconnect choice; Linux leadership; AlphaServer SC experience; HP ProLiant & Integrity server nodes; HP configured and supported open source tools & utilities]
HP Cluster Platforms
• Factory pre-assembled hardware solution with optional software installation
• Includes nodes, interconnects, network, racks, etc., integrated and tested
• Configure to order from 5 nodes to 512 nodes (more by request)
• Uniform, worldwide specification and product menus
• Fully integrated, with HP warranty and support
Why XC?
Why XC?
• Easy to order and configure
  • Customer inputs processor type, number of nodes, interconnect, and other key variables
  • HP Cluster Platform configuration tools drive final design, bill of materials, and manufacturing specifications
• Single design and methodology drives quality and provides choice
  • Reference platform for testing and verification by HP and partners
  • Field Service teams trained for one “product”
  • Common platform supports multiple software and networking choices
• Breakthrough price-performance
  • Combining ProLiant and Integrity systems with the latest networking technologies and software solutions
  • Confidently ride the industry-standard price/performance curve
Simplicity
• Feature
  • HP developed and supported HP XC System Software
  • Client support for HP StorageWorks Scalable File Share, HP’s implementation of Lustre™ technology
• Benefit
  • Single-system-image simplicity for unprecedented levels of ease of use, productivity, and scalability, plus a standard, supported, comprehensive software environment
  • High-bandwidth, high-performance, coherent, scalable file system through support for HP StorageWorks Scalable File Share (SFS)
Agility
• Feature
  • Leadership HPC cluster expertise
  • Wide range of 32-bit and 64-bit computing platforms
  • Industry-leading, breakthrough performance with Integrity and ProLiant hardware
  • Integration of the industry’s top-selling resource manager, Platform Computing's LSF
• Benefit
  • Expert implementation, design support, and training
  • Customers choose the best hardware to meet HPTC needs
  • High performance scalability
  • Easy and popular user interface for job scheduling; robust capabilities for system management and resource allocation
Value
• Feature
  • Alliances in open source and commercial application communities
  • High-speed Quadrics ELAN, Myricom Myrinet, Voltaire InfiniBand, and Gigabit Ethernet interconnects
  • Integration of HP-MPI
• Benefit
  • Superior support, maintenance, and optimum performance of the Linux kernel
  • Optimal interconnect performance and throughput of parallel applications
  • Transparent interface to interconnect libraries enables support for multiple applications
HP Delivers a Complete Solution
• HP Services: support and consulting, training, on-site staffing
• Applications: extensive portfolio of tested applications
• Compilers, development tools: validated selection of compilers, math libraries, debuggers, profiling tools
• XC System Software: Linux OS, Cluster Manager, Job Scheduler (LSF), HP MPI, HP MLIB
• HP Cluster Platform: nodes, networks, storage
XC Hardware
HP XC system architecture
[Diagram labels: InfiniBand Interconnect; Control Nodes; Compute Nodes; Storage Network; Root Switches; Branch Switches; Admin Network; Console Network]
XC4000 Server
• ProLiant DL145 G2
• AMD Opteron 2.0 GHz dual-core × 2
• 4 GB memory
• SATA 160 GB
• Dual Gigabit Ethernet
• InfiniBand 4X HCA
• Remote control (Lights-Out 100i)
XC4000 Server
• ProLiant DL385
• AMD Opteron 2.0 GHz dual-core × 2
• 8 GB memory
• SCSI 146 GB × 4 with RAID 1+0
• Dual Gigabit Ethernet
• InfiniBand 4X HCA
• Remote control (iLO)
InfiniBand Switch
• ISR9096 Switch
  • sLB24 24-port Line Board × 4
  • sFB4 Fabric Board × 4
  • sMB Management Board × 2
InfiniBand HCA
• HCA400
  • Dual-port 4X (10 Gbps)
  • PCI-Express or PCI-X
ISR9096 Topology
[Diagram labels: sLB24; sFB4; sLB24; links numbered 1 to 4]
ProCurve Switch 2848
• ProCurve 2848
  • 48-port Gigabit
• ProCurve 2650
  • 48-port 10/100
  • 2-port Gigabit
Private Admin Network
• Flat, not subnet based
[Diagram labels: Head Node; Admin Network]
Admin Network Topology
• Individual I/O nodes are attached to the admin switch starting at the highest numbered port and descending
• Switches (application cabinets) are attached to the admin switch in ascending order starting with port 1
[Diagram labels: Root switch; I/O Cab; App Cab (×4)]
Console Network Topology
• The consoles for individual root nodes are attached to the admin switch starting at the highest numbered port and descending
• Switches (application cabinets) are attached to the console root switch in ascending order starting with port 1
[Diagram labels: Console Switch; Utility Cab; App Cab (×4)]
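The port numbering rule stated for the admin and console networks (cabinet switches ascending from port 1, individually attached nodes descending from the highest port) can be sketched in Python. This is only an illustration: the function name `assign_ports`, the 48-port switch size, and the device names are assumptions, not part of XC.

```python
def assign_ports(num_ports, cabinet_switches, individual_nodes):
    """Return a dict mapping switch port number -> attached device name.

    Cabinet switches take ports 1, 2, 3, ... in ascending order.
    Individually attached nodes take ports num_ports, num_ports - 1, ...
    in descending order. All names here are hypothetical.
    """
    if len(cabinet_switches) + len(individual_nodes) > num_ports:
        raise ValueError("more devices than switch ports")

    ports = {}
    for offset, name in enumerate(cabinet_switches):
        ports[1 + offset] = name          # ascending from port 1
    for offset, name in enumerate(individual_nodes):
        ports[num_ports - offset] = name  # descending from the top port
    return ports


if __name__ == "__main__":
    layout = assign_ports(
        48,
        ["app-cab-1", "app-cab-2", "app-cab-3"],
        ["io-node-1", "io-node-2"],
    )
    for port in sorted(layout):
        print(port, layout[port])
```

Filling the switch from both ends keeps the two kinds of device in predictable positions even when cabinets or nodes are added later.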
Branch Switch
[Diagram labels: Connection to top-level Admin switch; ProCurve 2848 (Admin / GbE); ProCurve 2650 (Console / 10/100); Port numbering 1 2 3 ... n and n ... 3 2 1; Server nodes]
XC Software
Base Linux OS
• XC System Software includes an OS that is compatible with Red Hat Enterprise Linux Advanced Server 3.0
• Tracks Red Hat progress for enhancements in the base OS, such as support for IPv6
HP XC Management Stack
[Diagram labels: Installation; Configuration; Monitoring; Distributed Command; Kickstart; XCConfig; SystemImager; Nagios; Nagios plug-ins; Syslog-ng; SuperMon; pdsh; XC Database (MySQL); Red Hat Compatible Distribution]
Orange components are open tools configured / adapted for use by XC.
XC Software Stack Overview
Cluster System Management Objectives
• XC Installation
  • Provide an automated installation with minimal user interaction
  • Provide a default installation for the user
  • Provide for easy and repeatable installations
• XC Configuration
  • Provide initial cluster configuration and subsequent run-time updates of all nodes
  • Provide for maximum automation and minimal user intervention
  • Provide for easily and readily repeatable configurations
• XC Monitoring
  • Monitoring of system performance metrics such as CPU and memory utilization, as well as environmental data such as fan, temperature, and power-supply status
  • Parallelized service checks
  • Scalable logging facility
XC Installation
[Diagram labels: System Files; Head Node; SystemImager Golden Client; XC Distribution; XCDB; Admin Network; Kickstart; SystemImager Propagation]
• Distribution
  • Distribution delivered on one DVD
  • Default Kickstart file included on DVD
• Discovery
  • Initial cluster discovery
  • All required configuration data gathered up front, via a simple UI
  • Configuration data written to the system database, for subsequent retrieval by various utilities
• Head node configuration
  • User-chosen roles, plus typically set up as the initial DHCP server and DB server
  • Node replication environment (SystemImager) configured
    • “Golden client”
    • Image server
XC Configuration
• Node configuration
  • All other nodes are image installed in a two-phase process
    • Phase 1: the node is generically imaged
    • Phase 2: the per-node personality is applied using configuration data from a cached DB copy sent with each image
  • At the end of this process, all nodes have been rebooted and configured with their respective personality
  • A local copy of Linux on each node allows for very scalable, fast booting
  • A large machine should boot in less than half an hour
• Actual node booting
  • Starting of services: slurm, nagios, syslog-ng
  • Mounting of SFS file systems
XC Monitoring
[Diagram labels: Mgmt directives; DB; Head Node; Proxy daemons; Mgmt Hub (×3); Communication API; Database API; Plug-in Modules (Fans, Power, Metrics, …); Plug-in shared code; Managed Nodes]
• Other integrations:
  • Diagnostics
  • Storage
  • Console
  • Service tools
  • Events
  • Command invocation
XC Monitoring - Nagios
• An open source host, service, and network monitoring program
• The monitoring daemon runs intermittent checks on the hosts and services specified for XC, using external "plugins" that return status information to Nagios
• When problems are encountered, the daemon can send notifications to administrative contacts
• Current status information, historical logs, and reports can all be accessed via a web browser
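To illustrate how an external plugin returns status information to Nagios, here is a minimal Python sketch. It follows the standard Nagios plugin convention of one status line on standard output plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The temperature check, its thresholds, and the names `check_temperature` and `main` are hypothetical and are not taken from the XC plug-ins.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_temperature(celsius, warn=70.0, crit=85.0):
    """Return (exit_code, status_line) for one temperature reading.

    The thresholds are illustrative assumptions, not XC defaults.
    """
    if celsius is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    if celsius >= crit:
        code = CRITICAL
    elif celsius >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"TEMP {LABELS[code]} - {celsius:.1f} C"


def main(argv):
    """Print the status line and return the exit code Nagios would read."""
    try:
        reading = float(argv[1])
    except (IndexError, ValueError):
        reading = None
    code, line = check_temperature(reading)
    print(line)
    return code

# A real plugin script would end with: sys.exit(main(sys.argv))
```

Nagios only needs the exit code and the first line of output, which is why a plugin can be written in any language.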
XC Monitoring
[Diagram labels: System Files; Head Node; XCDB; Nagios; Syslog-ng; Supermon; Admin Network]
Global Commands - pdsh
• High-performance, parallel remote shell utility
• On an XC system it is layered on ssh
• Data aggregation
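One plausible reading of "data aggregation" is that identical output from many nodes is collapsed into a single entry. The Python sketch below shows that idea on already collected results. It does not call pdsh, and the function name `aggregate_output` and the node names are assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_output(results):
    """Group node names by identical command output.

    results: dict mapping node name -> output text from that node.
    Returns a list of (sorted_node_names, output) pairs, ordered by the
    first node name in each group.
    """
    groups = defaultdict(list)
    for node, output in results.items():
        groups[output].append(node)

    merged = [(sorted(nodes), output) for output, nodes in groups.items()]
    merged.sort(key=lambda item: item[0][0])
    return merged


if __name__ == "__main__":
    # Hypothetical output of the same command run on four nodes.
    sample = {
        "n1": "kernel 2.4.21",
        "n2": "kernel 2.4.21",
        "n3": "kernel 2.4.20",
        "n4": "kernel 2.4.21",
    }
    for nodes, output in aggregate_output(sample):
        print(",".join(nodes), "->", output)
```

On a large cluster this turns hundreds of identical lines into one, so the node that differs stands out.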
Centralized logging: syslog-ng
• Scalable hierarchical scheme
• Filtering to allow selective logging of items of interest
• syslog-ng configured for distributed logging
• Define a strategy for rotation and viewing
• syslog-ng can provide aggregation as events are received
Power Management and Console
• Cluster startup and shutdown
• Power control for specified nodes
• Power status for specified nodes
• Node identification capability
• Access to the system console of a specified node
XC User / Application Stack
[Diagram labels: Network Inbound; Job Scheduling; Resource Management; Launching; Network Outbound; MPI; Math Library; LVS; LSF; Slurm; NAT; HP-MPI; MLIB; Red Hat Compatible Distribution]
Orange components are open tools configured / adapted for use by XC.
XC Resource Management
• SLURM and LSF
• Scalability
• Handling of STDIO, signals, etc.
• LSF is a scheduler that manages the user workload
• SLURM allocates the cluster resources for use by LSF
What is SLURM?
• Simple Linux Utility for Resource Management
• Arbitrates requests by managing a queue of pending work
• Allocates access to compute nodes within a cluster
• Launches parallel jobs and manages them (I/O, signals, limits, etc.)
• NOT a comprehensive cluster administration or monitoring package
• NOT a sophisticated scheduling system
• An external entity can manage the SLURM queues via a plugin
Why SLURM?
• Simple
  • Scheduling complexity is external to SLURM (LSF, PBS)
• Open source: GPL
• Fault-tolerant
  • For SLURM daemons and its jobs
• Secure
  • Restricted user access to compute nodes
• System administrator friendly
  • Simple configuration file
• Scalable to thousands of nodes
SLURM Architecture
[Diagram labels: slurmd (one daemon per node); slurmctld primary and slurmctld backup (cluster-wide control daemon); User and administrator tools: srun, sinfo, squeue, scontrol, scancel]
XC LSF Job Management
• Optimal resource usage under fairshare
• Advance reservations
• Resource allocation limits
What is the “Top 500”?
• International register of the world's fastest supercomputers
• Located at http://www.top500.org
• Updated twice per year (June and November)
• Position based on Linpack results (Rmax, not Rpeak)
• #1 is currently 136.8 Tflops (65536 IBM P4 processors!)
• #500 is currently 1.166 Tflops
• HP's first entry is number 12 (LANL at 13.88 Tflops)
• 26.2% of entries are HP; HP is No. 2 by number of entries
• HP and IBM together account for 78% of the list
• Very important to some customers!
What is a Flop?
• Floating Point OPerations per second
• Industry standard for measuring a server's raw floating-point mathematical power
• Flops = (clock speed in Hz) × (number of CPUs) × (floating-point operations per cycle)
• Xeon and Opteron: 2 operations per cycle; Itanium: 4 operations per cycle
• For example, a DL360 with dual 3.2 GHz processors equates to 3.2 GHz × 2 CPUs × 2 operations = 12.8 Gflops Rpeak
• A Gigaflop is 1,000,000,000 flops
• A Teraflop is 1,000 Gflops
• Rating of a cluster = node Rpeak × number of nodes
• The formula gives theoretical (Rpeak) performance only
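The Rpeak formula above can be checked with a few lines of Python. This is a sketch of the slide's arithmetic only: the function name `rpeak_gflops` and the 128-node cluster size are illustrative assumptions, while the per-cycle figure (2 for Xeon) and the DL360 example are taken from the slide.

```python
def rpeak_gflops(clock_ghz, cpus_per_node, flops_per_cycle, nodes=1):
    """Theoretical peak (Rpeak) in Gflops.

    clock in GHz x CPUs per node x floating-point operations per cycle
    gives the node Rpeak; multiplying by the node count gives the
    cluster rating.
    """
    return clock_ghz * cpus_per_node * flops_per_cycle * nodes


if __name__ == "__main__":
    # The slide's example: dual 3.2 GHz DL360, 2 operations per cycle.
    node = rpeak_gflops(3.2, 2, 2)
    print(f"node Rpeak: {node:.1f} Gflops")

    # Hypothetical 128-node cluster built from the same node.
    cluster = rpeak_gflops(3.2, 2, 2, nodes=128)
    print(f"cluster Rpeak: {cluster / 1000:.2f} Tflops")
```

Because the result is a theoretical ceiling, it says nothing about what an application or benchmark will actually sustain.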
What is Linpack?
• The Linpack Benchmark is a measure of a computer’s floating-point rate of execution
• It is determined by running a computer program that solves a dense system of linear equations
• The benchmark result gives us the Rmax of the cluster
• Rmax (actual) divided by Rpeak (theoretical) gives us the cluster efficiency
• Results are affected by many things, including:
  • Processor architecture
  • Server architecture
  • Available memory
  • Interconnect speed and latency
  • Compiler efficiency
  • Math library efficiency
  • Etc.
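Cluster efficiency is the ratio of measured Linpack performance (Rmax) to theoretical peak (Rpeak). The sketch below makes the direction of that ratio explicit. The function name `linpack_efficiency` and the sample Rmax value are hypothetical, not measured results.

```python
def linpack_efficiency(rmax, rpeak):
    """Return cluster efficiency as a fraction between 0 and 1.

    rmax:  measured Linpack result
    rpeak: theoretical peak, in the same unit as rmax
    """
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return rmax / rpeak


if __name__ == "__main__":
    # Hypothetical node: 12.8 Gflops Rpeak, 9.6 Gflops measured.
    eff = linpack_efficiency(9.6, 12.8)
    print(f"efficiency: {eff:.0%}")
```

Since Rmax cannot exceed Rpeak, a value above 1 indicates that the two inputs were swapped or are in different units.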
HP Web Links
• HP HPTC: http://www.hp.com/go/hptc
• XC Information: http://h20311.www2.hp.com/HPC/cache/275435-0-0-0-121.html
• XC Manual: http://docs.hp.com/en/linuxhpc.html
• EVA: http://h50007.www5.hp.com/enterprise/product/storage/arraysystem.asp