Quick Overview of NPACI Rocks
Philip M. Papadopoulos, Associate Director, Distributed Computing, San Diego Supercomputer Center
Seed Questions
• Do you buy in installation services? From the supplier or a third-party vendor?
  • We integrate. It is easier to have the vendor integrate larger clusters.
• Do you buy pre-configured systems or build your own configuration?
  • Rocks is adaptable to many configurations.
• Do you upgrade the full cluster at one time or in rolling mode?
  • We suggest all at once (very quick with Rocks); reinstallation can be run as a batch job.
  • Rolling upgrades can be supported, if desired.
• Do you perform formal acceptance or burn-in tests?
  • Unfortunately, no. We need more automated testing.
Installation/Management
• Need a strategy for managing cluster nodes
• Pitfalls
  • Installing each node “by hand”
    • Difficult to keep software on nodes up to date
  • Disk-imaging techniques (e.g., VA Disk Imager)
    • Difficult to handle heterogeneous nodes
    • Treats the OS as a single monolithic image
  • Specialized installation programs (e.g., IBM’s LUI or RWCP’s multicast installer)
    • Better to let the Linux packaging vendors do their job
• Penultimate: RedHat Kickstart
  • Define the packages needed for the OS on each node; Kickstart gives a reasonable measure of control (see the sketch below)
  • Need to fully automate to scale out (Rocks gets you there)
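For illustration, the Kickstart description of a node is just a short text file. A minimal sketch with placeholder values; this is not the actual Rocks-generated kickstart file:

```
# Minimal Kickstart sketch -- all values are placeholders
install
url --url http://frontend/install/redhat/7.1   # hypothetical distribution URL on the frontend
rootpw --iscrypted XXXXXXXXXXXX                # hash elided
clearpart --all
part /    --size 4096
part swap --size 512

%packages
@ Base              # a RedHat package group
openssh-server
pbs-mom             # hypothetical locally packaged batch-system RPM

%post
# per-node configuration commands run here after package installation
```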
Scaling out
• Evolve to management of “two” systems
• The front end(s)
  • Login host
  • Users’ home areas, passwords, groups
  • Cluster configuration information
• The compute nodes
  • Disposable OS image
  • Let software manage node heterogeneity
  • Parallel (re)installation
  • Data partitions on cluster drives untouched during re-installs
• Cluster-wide configuration files derived through reports from a MySQL database (DHCP, hosts, PBS nodes, …); see the sketch below
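A minimal sketch of such a report, assuming a hypothetical nodes table with name and ip columns (the real Rocks schema and report scripts differ):

```python
#!/usr/bin/env python
"""makehosts-style report sketch: regenerate /etc/hosts from the cluster
database. Table and column names here are assumptions, not the actual
Rocks schema."""
import MySQLdb  # MySQL-python driver

db = MySQLdb.connect(host="localhost", user="cluster", db="cluster")
cur = db.cursor()
cur.execute("SELECT name, ip FROM nodes ORDER BY name")

lines = ["127.0.0.1\tlocalhost"]
for name, ip in cur.fetchall():
    lines.append("%s\t%s" % (ip, name))

open("/etc/hosts", "w").write("\n".join(lines) + "\n")
```

The same pattern (query the database, emit a legacy config file) covers dhcpd.conf and the PBS node list.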
NPACI Rocks Toolkit – rocks.npaci.edu
• Techniques and software for easy installation, management, monitoring, and update of clusters
• Installation
  • Bootable CD + floppy that contain all the packages and site configuration info to bring up an entire cluster
• Management and update philosophies
  • Trivial to completely reinstall any (or all) nodes
  • Nodes are 100% automatically configured
    • Use of DHCP and NIS for configuration
  • Use RedHat’s Kickstart to define the set of software that defines a node
  • All software is delivered as RedHat Packages (RPMs)
    • Encapsulate configuration for a package (e.g., Myrinet)
    • Manage dependencies
  • Never try to figure out whether node software is consistent
    • If you ever ask yourself this question, reinstall the node
Rocks Current State – Ver. 2.1
• Now tracking RedHat 7.1
  • 2.4 kernel
  • “Standard tools” – PBS, Maui, MPICH, GM, SSH, SSL, …
  • Could support other distros, but we don’t have the staff for this
• Designed to take bare hardware to a full cluster in a short period of time
  • Linux upgrades are often “forklift-style”; Rocks supports this as the default mode of administration
• Bootable CD
  • Kickstart file for the frontend created from the Rocks webpage
  • Use the same CD to boot nodes; integration is automated
  • “Legacy Unix config files” derived from the MySQL database
• Re-installation (with a single 100 Mbit HTTP server)
  • One node: 10 minutes
  • 32 nodes: 13 minutes
  • Use multiple HTTP servers + IP-balancing switches to scale further
More Rocksisms
• Leverage widely used (standard) software wherever possible
  • Everything is in RedHat Packages (RPMs)
  • RedHat’s Kickstart installation tool
  • SSH, Telnet (only during installation), existing open-source tools
  • Write only the software that we need to write
• Focus on simplicity
  • Commodity components
    • For example: x86 compute servers, Ethernet, Myrinet
  • Minimal
    • For example: no additional diagnostic or proprietary networks
• Rocks is a collection point of software for people building clusters
  • It is evolving to include cluster software and packaging from more than just SDSC and UCB
  • <[your-software.i386.rpm] [your-software.src.rpm] here>
Rocks-dist
• Integrates RedHat packages from
  • RedHat (mirror) – base distribution + updates
  • Contrib directory
  • Locally produced packages
  • Local contrib (e.g., commercially bought code)
  • Packages from rocks.npaci.edu
• Produces a single updated distribution that resides on the front end
  • It is a RedHat distribution with patches and updates applied; see the sketch below
• The Kickstart (RedHat) file is a text description of what’s on a node. Rocks automatically produces frontend and node files.
• Different Kickstart files and different distributions can co-exist on a front end, adding flexibility in configuring nodes.
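The merge idea, in miniature. A toy sketch only, with an assumed directory layout; rocks-dist itself also resolves package versions and rebuilds the installable tree:

```python
#!/usr/bin/env python
"""Toy rocks-dist merge: overlay several RPM source directories into one
distribution tree, letting later (higher-priority) sources shadow earlier
ones. Directory names are assumptions, not the real Rocks layout."""
import os, shutil

# lowest to highest priority: base mirror, updates, contrib, local builds
sources = ["/mirror/redhat/RedHat/RPMS", "/mirror/updates",
           "/mirror/contrib", "/usr/src/redhat/RPMS/i386"]
dist = "/home/install/rocks-dist/RedHat/RPMS"

if not os.path.isdir(dist):
    os.makedirs(dist)
for src in sources:
    for rpm in os.listdir(src):
        if rpm.endswith(".rpm"):
            # a later source overwrites an earlier copy of the same file;
            # real version comparison is more involved than this
            shutil.copy(os.path.join(src, rpm), os.path.join(dist, rpm))
```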
insert-ethers
• Used to populate the “nodes” MySQL table
• Parses a file (e.g., /var/log/messages) for DHCPDISCOVER messages
  • Extracts the MAC address and, if it is not already in the table, adds the MAC address and a hostname to the table (see the sketch below)
• For every new entry:
  • Rebuilds /etc/hosts and /etc/dhcpd.conf
  • Reconfigures NIS
  • Restarts DHCP and PBS
• Hostname is <basename>-<cabinet>-<chassis>
  • Configurable to change the hostname, e.g., when adding new cabinets
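The discovery step boils down to scanning syslog for dhcpd’s DHCPDISCOVER lines. A simplified sketch; the naming scheme follows the slide, and a set stands in for the database:

```python
#!/usr/bin/env python
"""insert-ethers discovery sketch: scan syslog for ISC dhcpd
DHCPDISCOVER lines and assign names to new MAC addresses.
A set stands in for the MySQL nodes table."""
import re

# typical dhcpd syslog line:
#   ... dhcpd: DHCPDISCOVER from 00:50:8b:a0:89:12 via eth0
pattern = re.compile(r"DHCPDISCOVER from ([0-9a-f:]{17})")

known = set()
basename, cabinet = "compute", 0

for line in open("/var/log/messages"):
    m = pattern.search(line)
    if m and m.group(1) not in known:
        known.add(m.group(1))
        name = "%s-%d-%d" % (basename, cabinet, len(known) - 1)
        print("%s -> %s" % (m.group(1), name))
        # the real insert-ethers would now rebuild /etc/hosts and
        # /etc/dhcpd.conf, reconfigure NIS, and restart DHCP and PBS
```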
Configuration Derived from Database
• Diagram: nodes 0…N send DHCP requests to the frontend, where insert-ethers performs automated node discovery and populates the MySQL DB; makehosts, makedhcp, and pbs-config-sql then generate /etc/hosts, /etc/dhcpd.conf, and the PBS node list from the database.
Remote re-installation: Shoot-node and eKV
• Rocks provides a simple method to remotely reinstall a node (see the sketch below)
  • CD/floppy used to install the first time
• By default, hard power cycling will cause a node to reinstall itself
  • Addressable PDUs can do this on generic hardware
• With no serial (or KVM) console, we are able to watch a node as it installs (eKV), but…
  • Can’t see BIOS messages at boot-up
• Syslog for all nodes is sent to a log host (and to local disk)
  • Can look at what a node was complaining about before it went offline
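Because reinstall-on-boot is the default, “shooting” a node amounts to rebooting it. A toy sketch; the real shoot-node also attaches the eKV console, and error handling is omitted:

```python
#!/usr/bin/env python
"""Toy shoot-node: reboot listed compute nodes over ssh. Since a Rocks
node reinstalls itself on the next boot by default, a reboot is enough
to trigger reinstallation."""
import os, sys

for node in sys.argv[1:]:   # e.g.: ./shoot-node.py compute-0-0 compute-0-1
    print("reinstalling %s" % node)
    os.system("ssh %s /sbin/shutdown -r now" % node)
```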
Remote re-installation: Shoot-node and eKV
• Screenshot: eKV views of two nodes (192.168.254.254 and 192.168.254.253) as reinstallation is started remotely.
Monitoring your cluster
• PBS has a GUI called xpbsmon that gives a nice graphical view of the up/down state of nodes
• SNMP status
  • Use the extensive SNMP MIB defined by the Linux community to find out many things about a node (see the sketch below)
    • Installed software
    • Uptime
    • Load
  • Slow
• Ganglia (UCB) – IP multicast-based monitoring system
  • 20+ different health measures
• I think we’re still weak here – learning about other activities in this area (e.g., ngop, CERN activities, City Toolkit)
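For instance, the uptime and load figures above can be pulled from a node’s SNMP agent with the stock net-snmp command-line tools. A sketch, not the monitoring code Rocks ships; the node names are hypothetical and the agents are assumed to answer the usual “public” read community:

```python
#!/usr/bin/env python
"""Poll uptime and 1-minute load average from each node via SNMP,
shelling out to the net-snmp snmpget tool. Assumes snmpd runs on the
nodes with the 'public' read community."""
import os

nodes = ["compute-0-0", "compute-0-1"]      # hypothetical node names
oids = {
    "uptime": "sysUpTime.0",                # SNMPv2-MIB system uptime
    "load1":  ".1.3.6.1.4.1.2021.10.1.3.1", # UCD-SNMP-MIB 1-min load average
}

for node in nodes:
    for label, oid in oids.items():
        print("== %s %s ==" % (node, label))
        os.system("snmpget -v1 -c public %s %s" % (node, oid))
```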
CERN
• Cern.ch/hep-proj-grid-fabric
• Installation tools: wwwinfo.cern.ch/pdp