OpenSSI: A Single System Image Linux Cluster Project
openssi.org
Dr. Bruce J. Walker, HP Fellow
Keerthi Bhushan K N, Senior Systems Engineer, HP
Why Clusters, Why Not SMP?
• Performance
• Availability
• Price/performance
• Scaling
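Properties of Clusters
[Figure: clusters compared along four axes (availability, scalability, manageability, usability) on a log scale marked "Really BIG" and "HUGE". Plotted: SMP, HA cluster, the OpenSSI Linux Cluster Project, and the ideal/perfect cluster, which is ideal in all dimensions.]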
What Is SSI?
• Cluster-wide view of files, IPC objects, processes, users, devices and services (single root)
• Strong membership
• Single management domain
• Single external view of the cluster
• Consistent kernel on each node
OpenSSI Cluster Project - Availability
• Automatic failover/restart of services in the event of hardware or software failure
• Automatic failover/restart of applications in the event of node failure
• Automatic filesystem failover
• Automatic “root” node failover is being worked on
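The failover behavior listed above can be pictured with a small simulation. This is a minimal sketch, not OpenSSI's implementation: the `Cluster` class and its `place` and `node_down` methods are hypothetical names, and restarting on the lowest-numbered surviving node is a policy assumed purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model (hypothetical): which node each service currently runs on."""

    nodes: set[int]
    placement: dict[str, int] = field(default_factory=dict)

    def place(self, service: str, node: int) -> None:
        """Start a service on a node that is currently a member."""
        if node not in self.nodes:
            raise ValueError(f"node {node} is not a cluster member")
        self.placement[service] = node

    def node_down(self, failed: int) -> list[str]:
        """Drop a failed node and restart its services on a survivor.

        Returns the names of the services that were restarted.
        """
        self.nodes.discard(failed)
        if not self.nodes:
            raise RuntimeError("no surviving nodes to fail over to")
        target = min(self.nodes)  # assumed policy: lowest surviving node
        moved = sorted(s for s, n in self.placement.items() if n == failed)
        for service in moved:
            self.placement[service] = target
        return moved
```

Calling `node_down(2)` on a three-node cluster moves only the services that were placed on node 2; services on other nodes keep their placement.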
OpenSSI Cluster Project - Price/Performance and Scalability
• More usable memory, processors, networking, devices
• More resources to share
• Clones can co-ordinate easily
• Load balancing of connections and processes
• Can mix hardware
• Distributed OS algorithms written to scale to hundreds of nodes (and successfully demonstrated on 133 blades and on 27 Itanium SMP nodes)
OpenSSI Linux Cluster - Manageability
• Single installation
• Online addition of new nodes
• Standard standalone system tools for cluster administration
• Cluster-wide resource visibility
• New cluster-specific commands
• Linux commands extended for the cluster
OpenSSI Linux Cluster - Ease of Use
• Can run anything anywhere with no setup
• Install applications only once
• Service failover/restart is trivial
• Automatic or manual load balancing
• libcluster.a - cluster APIs
How Does SSI Clustering Work?
[Diagram: two nodes, each a uniprocessor or SMP, drawn as identical stacks. From top to bottom: users, applications, and systems management; standard OS kernel calls plus extensions; standard Linux 2.4 kernel with SSI hooks plus modular kernel extensions; devices. The nodes are joined to each other and to other nodes by an IP-based interconnect.]
Component Contributions to OpenSSI Cluster Project
[Diagram: components feeding into the OpenSSI Cluster Project: Lustre, Appl. Avail., CLMS, OGFS, Mosix, Beowulf, Vproc, DLM, LVS, OCFS, IPC, DRBD, CFS, ICS, EVMS, Load Leveling. Legend: HP contributed; open source and integrated; to be integrated.]
OpenSSI components - communication: CLMS, ICS
• Cluster Membership Subsystem (CLMS)
  - "Node-down" detection and processing
  - Membership management
• Internode Communication Subsystem (ICS)
  - Kernel-to-kernel transport
  - Provides service daemons and queues for all subsystems
  - RPC, request/response and messaging interfaces
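The slide does not say how CLMS detects a node going down. As a rough illustration of node-down detection feeding membership management, here is a minimal sketch that assumes heartbeat timeouts; the `Membership` class and its methods are hypothetical and are not CLMS interfaces.

```python
from __future__ import annotations


class Membership:
    """Toy membership tracker driven by heartbeat timestamps (assumed design)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: dict[int, float] = {}

    def heartbeat(self, node: int, now: float) -> None:
        """Record that `node` was heard from at time `now`; new nodes join."""
        self.last_seen[node] = now

    def sweep(self, now: float) -> list[int]:
        """Remove and return nodes silent for longer than the timeout."""
        down = sorted(
            n for n, seen in self.last_seen.items() if now - seen > self.timeout
        )
        for n in down:
            del self.last_seen[n]
        return down

    def members(self) -> list[int]:
        """Current members, in node-number order."""
        return sorted(self.last_seen)
```

Times are passed in explicitly rather than read from a clock, which keeps the sketch deterministic; the list returned by `sweep` is where "node-down" processing such as service failover would be triggered.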
OpenSSI components - file system: CFS, OpenGFS, Lustre
• CFS layered over a physical file system (ext2, ext3, xfs, optionally on volume management)
• Single “/”
• Cluster-wide mounts
• CFS layered over tmpfs
• OpenGFS and Lustre support also available
OpenSSI components - process subsystem: Vproc
• Single pid space, but pids are allocated locally
• Transparent access to all processes on all nodes
• Processes can migrate during execution
• /proc/<pid>/goto and migrate system call
• rfork, rexec, onnode, onall, fastnode commands
• Process part of /proc is systemwide (so ps and debuggers “just work” systemwide)
• Implemented via a virtual process (Vproc) architecture
• Work in progress on providing a “view” for a process
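The /proc/<pid>/goto entry mentioned above suggests that a migration can be requested by writing to a file. The sketch below assumes the file accepts the target node number as decimal text, which the slide does not state; `request_migration` and its `proc_root` parameter are hypothetical helpers, with `proc_root` there only so the function can be tried against a scratch directory.

```python
from pathlib import Path


def request_migration(pid: int, node: int, proc_root: str = "/proc") -> Path:
    """Ask for process `pid` to move to `node` by writing to its goto file.

    Assumption: the goto file takes the node number as decimal text.
    """
    goto = Path(proc_root) / str(pid) / "goto"
    goto.write_text(f"{node}\n")
    return goto
```

On a kernel without the SSI extensions the goto file does not exist, so the write raises an `OSError` rather than migrating anything.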
OpenSSI components - load levelling: OpenMOSIX, LVS
• Connection load levelling (LVS)
• Process load levelling (OpenMOSIX algorithms)
• “migrate” command
• /proc/cluster/loadlevellist
• Decentralized load-balancing decisions
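To make "decentralized load-balancing decisions" concrete, the sketch below has each node decide for itself, using only its own view of cluster load. It is not the OpenMOSIX algorithm: the function name, the load values, and the fixed `threshold` are all assumptions chosen for illustration.

```python
from __future__ import annotations


def pick_migration_target(
    my_node: int,
    loads: dict[int, float],
    threshold: float = 1.0,
) -> int | None:
    """Return a node to migrate a process to, or None to stay put.

    `loads` is this node's (possibly stale) view of per-node load,
    including its own entry. No central coordinator is consulted.
    """
    my_load = loads[my_node]
    others = {n: load for n, load in loads.items() if n != my_node}
    if not others:
        return None
    # Least-loaded peer; ties are broken by the lower node number.
    best = min(others, key=lambda n: (others[n], n))
    # Only migrate when the gap is large enough to be worth the move.
    if my_load - others[best] > threshold:
        return best
    return None
```

The threshold keeps two nearly balanced nodes from bouncing a process back and forth, a common concern in any scheme where every node decides independently.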
OpenSSI components - IPC
• IPC objects (including pipes, FIFOs and ptys) are created locally
• Cluster-wide IPC namespace (nameserver)
• Cluster-wide IPC object access
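The three points above fit together as follows: an object lives on the node that created it, a nameserver records which node that is, and any node can then reach the object through that record. The sketch below models this with in-memory message queues; `IpcNameServer` and `Node` are hypothetical classes, and a direct method call stands in for the remote operation.

```python
from __future__ import annotations


class IpcNameServer:
    """Toy cluster-wide registry mapping IPC keys to the owning node."""

    def __init__(self) -> None:
        self._owner: dict[int, int] = {}

    def register(self, key: int, node: int) -> None:
        """Claim a key for a node; a key is unique across the cluster."""
        if key in self._owner:
            raise KeyError(f"IPC key {key} already exists in the cluster")
        self._owner[key] = node

    def lookup(self, key: int) -> int:
        """Return the number of the node that owns `key`."""
        return self._owner[key]


class Node:
    """A node that creates queues locally but can send to any queue."""

    def __init__(
        self, number: int, names: IpcNameServer, cluster: dict[int, Node]
    ) -> None:
        self.number = number
        self.names = names
        self.cluster = cluster
        self.local: dict[int, list[str]] = {}
        cluster[number] = self

    def create_queue(self, key: int) -> None:
        self.names.register(key, self.number)  # register first so duplicates fail
        self.local[key] = []  # the object itself stays on this node

    def send(self, key: int, message: str) -> None:
        owner = self.cluster[self.names.lookup(key)]
        owner.local[key].append(message)  # stands in for a remote operation
```

A queue created on node 1 with key 42 can be written to from node 2 without node 2 ever holding a copy of it.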
Component information
• LVS - Linux Virtual Server: http://www.LinuxVirtualServer.org
• DLM - Distributed Lock Manager: http://sourceforge.net/projects/opendlm
• Beowulf / MPICH / LAMPI / PBS / MAUI / …: http://www.beowulf.org
• DRBD - Distributed Replicated Block Device: http://drbd.cubit.at
• EVMS - Enterprise Volume Management System: http://sourceforge.net/projects/evms/
Conclusions
• OpenSSI provides a common cluster framework for all forms of clustering
• OpenSSI simultaneously addresses availability, scalability, manageability and usability
• OpenSSI is usable in production now
Come see a demo in the HP booth