Scalable Cluster Management: Frameworks, Tools, and Systems
David A. Evensky, Ann C. Gentile, Pete Wyckoff, Robert C. Armstrong, Robert L. Clay, Ron Brightwell
Sandia National Laboratories
Lilith: a tool framework for very large clusters
• Most current cluster tools are designed as monolithic programs that do one task well.
• If you need a new task, you need a new tool.
• The Lilith framework lets users easily construct new tools within a component framework.
Control of large distributed systems
• System administration
• Auditing & job control by users
• Interrogation of processes
• Simple applications
[Chart: a 1 sec program on 1000 nodes: ~10 sec vs. ~16 min (serially, 1000 × 1 s ≈ 16.7 min).]
Lilith: scalable component framework
• Lilith spans a tree of machines executing user-defined code.
• User code (Lilim/Lilly) provides component functionality on a single node.
• The framework provides scalable distribution and result collection.
Component Methods • MO[] distributeOnTree(MO, int[]) • data distribution down the tree • MO onTree(MO) • component action on the node • MO collateOnTree(MO[]) • result collection and condensation
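The three component methods above can be sketched as a Java class; a minimal sketch, assuming "MO" is an opaque message-object type. The payload type and the sample counting component are illustrative assumptions, not the real Lilith API.

```java
import java.util.Arrays;

// "MO" (message object) and the three method names come from the slide;
// everything else here is an illustrative assumption.
interface MO { }

class IntMO implements MO {                   // hypothetical integer payload
    final int value;
    IntMO(int value) { this.value = value; }
}

// A toy component that counts live nodes: each node reports 1, parents sum.
class CountComponent {
    // Data distribution down the tree: copy the input to each child.
    MO[] distributeOnTree(MO in, int[] children) {
        MO[] out = new MO[children.length];
        Arrays.fill(out, in);
        return out;
    }

    // Component action on this node.
    MO onTree(MO in) {
        return new IntMO(1);
    }

    // Result collection and condensation up the tree.
    MO collateOnTree(MO[] results) {
        int sum = 0;
        for (MO r : results) sum += ((IntMO) r).value;
        return new IntMO(sum);
    }
}
```

With this shape, the framework owns the tree traversal while the component supplies only per-node behavior and the reduction.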
Security: method invocation
• Uses purely Java 2 mechanisms (LilithHost, keys, policy) at this time
• User sends a credential with the call
• LilithHost creates a ProtectionDomain from the user credential
• LilithHost calls checkPermission
• Sandbox is set up similarly, using the user credential and PolicyFile
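The per-call check described above can be sketched with the standard Java 2 security classes; a minimal sketch in which the credential-to-permission mapping ("admin" gets AllPermission) is a placeholder assumption, not the real LilithHost policy.

```java
import java.security.*;

// Build a ProtectionDomain from the caller's credential, then test a
// permission against it. The mapping below is a placeholder assumption.
class LilithHostSketch {
    static ProtectionDomain domainFor(String credential) {
        Permissions perms = new Permissions();
        if ("admin".equals(credential)) {
            perms.add(new AllPermission());   // placeholder policy
        }
        CodeSource cs =
            new CodeSource(null, (java.security.cert.Certificate[]) null);
        return new ProtectionDomain(cs, perms);
    }

    // True if a caller with this credential may perform the guarded action.
    static boolean checkPermission(String credential, Permission p) {
        AccessControlContext ctx =
            new AccessControlContext(new ProtectionDomain[] { domainFor(credential) });
        try {
            ctx.checkPermission(p);           // throws if the domain denies
            return true;
        } catch (AccessControlException denied) {
            return false;
        }
    }
}
```

AccessControlContext.checkPermission succeeds only if every domain in the context implies the permission, which is why an empty Permissions collection denies everything.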
Prototypical tools
• System monitoring tool to track the state of a cluster of machines
• PS-tool to get sortable process information from selected nodes of the cluster
Lilith Lights tool
• Snake toy app: a demo that draws a snake across the front panel
• No global repository for state; all info is distributed
• The snake's movement was limited to the left half of the machine
• A program error in the declaration of drand48() biased the results
Who serves whom?
• Programmers adapt to:
• the OS that runs on the machine
• the system configuration chosen by the admins
• changing system environments
• Economics drive heterogeneous distributed computing
• Why can't the user dictate the software environment as a resource request?
DASE • Dynamically Adaptive Software Environment • Provide multi-OS/multi-environment capability • Manage multiple SW environments • “save” user environment for reuse later • Integration with SW component architectures
DASE Service Object Model
[Diagram: an App Space (App Object with resource spec and data/map objects; Solver, Mesher, Partitioner, Visualizer) is mapped via a Resource Request and Scheduler onto a Resource Space: a logical partitioning, or "system" model, of the physical system.]
Scalable Unit
[Diagram: one Scalable Unit: compute nodes on 16-port Myrinet switches (8 Myrinet LAN cables), a service node (sss0), power controllers, terminal servers, and 100BaseT hubs, with Myrinet, power, serial, and Ethernet links to the system support network.]
System Support Hierarchy
[Diagram: sss1 provides admin access and holds the master copy of the system software; each Scalable Unit's sss0 holds an in-use copy, and the SU's nodes NFS-mount root from sss0.]
Hardware Management
• Discovery and control
• Perl scripts that
• control individual devices (power controller, terminal server, machine, switch)
• build a database of configuration info (MAC and IP addresses, serial numbers, etc.)
• Roles
• the database is augmented with each component's role in the system (compute, sss0, terminal server, etc.)
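The discovery database described above can be sketched as a keyed record store; a minimal sketch in Java rather than the Perl the slide mentions, with all class and field names being illustrative assumptions.

```java
import java.util.*;

// Hypothetical per-device record: configuration info from discovery,
// augmented later with the device's role in the system.
class Device {
    final String mac, ip, serial;
    String role;                                  // e.g. "compute", "sss0"
    Device(String mac, String ip, String serial) {
        this.mac = mac; this.ip = ip; this.serial = serial;
    }
}

class ConfigDB {
    private final Map<String, Device> byMac = new HashMap<>();

    // Discovery phase: record a device keyed by its MAC address.
    void discovered(Device d) { byMac.put(d.mac, d); }

    // Role phase: augment an existing record with its system role.
    void assignRole(String mac, String role) { byMac.get(mac).role = role; }

    // Query, e.g. "all compute nodes".
    List<Device> withRole(String role) {
        List<Device> out = new ArrayList<>();
        for (Device d : byMac.values())
            if (role.equals(d.role)) out.add(d);
        return out;
    }
}
```

Keying on the MAC address mirrors the two-phase flow on the slide: hardware is discovered before anyone decides what role it will play.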
“Virtual Machines”
• Allow arbitrary grouping of scalable units that use the same system software
• Operations to update system software and boot nodes, scalable units, or machines
• Update system software on an SU in 1 min.
• Update system software on 24 SUs in 1.5 min.
• Boot an SU in 5 min. (staged for power drain)
• Boot 24 SUs in 10 min.
“Virtual Machines” (production configuration)
[Diagram: the SU configuration database uses rdist to push system software down; e.g. machine "Alpha" (via sss1) and machine "Beta" (Linux 2.3) each group Scalable Units whose sss0 nodes hold the in-use copy of the system software and whose nodes NFS-mount root from sss0.]
http://dancer.ca.sandia.gov
http://www.cplant.ca.sandia.gov
http://www.cs.sandia.gov/cplant