
Lecture 11: Unix Clusters

  1. Lecture 11: Unix Clusters Assoc. Prof. Guntis Barzdins, Assist. Girts Folkmanis, University of Latvia, Dec 10, 2004

  2. Moore’s Law - Density

  3. Moore's Law and Performance • The performance of computers is determined by architecture and clock speed. • Clock speed doubles roughly every 3 years due to on-chip scaling laws. • Processors using identical or similar architectures gain performance directly as a function of Moore's Law. • Improvements in internal architecture can yield gains beyond what Moore's Law alone predicts (see the worked example below).
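
As a back-of-the-envelope worked example (added here; it assumes performance tracks clock speed and the 3-year doubling period stated above):

```latex
% Added illustration: clock-driven performance growth with a 3-year doubling period
P(t) = P_0 \cdot 2^{\,t/3},
\qquad
\frac{P(9\ \text{years})}{P_0} = 2^{9/3} = 8
```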

  4. Future of Moore’s Law • Short-Term (1-5 years) • Will keep operating (prototypes already exist in the lab) • Fabrication cost will go up rapidly • Medium-Term (5-15 years) • Exponential growth rate will likely slow • Trillion-dollar industry is motivated • Long-Term (>15 years) • May need new technology (chemical or quantum) • We can do better (e.g., the human brain) • I would not close the patent office

  5. Different kinds of PC clusters • High Performance Computing Cluster • Load Balancing • High Availability

  6. High Performance Computing Cluster (Beowulf) • Started in 1994 • Donald Becker of NASA assembled the world’s first such cluster from 16 DX4 PCs and 10 Mb/s Ethernet • Also called a Beowulf cluster • Built from commodity off-the-shelf hardware • Applications include data mining, simulations, parallel processing, weather modelling, computer graphics rendering, etc.

  7. Examples of Beowulf cluster • Scyld Cluster O.S. by Donald Becker • http://www.scyld.com • ROCKS from NPACI • http://www.rocksclusters.org • OSCAR from open cluster group • http://oscar.sourceforge.net • OpenSCE from Thailand • http://www.opensce.org

  8. Cluster Sizing Rule of Thumb • System software (Linux, MPI, filesystems, etc.) scales from 64 nodes up to at most about 2048 nodes for most HPC applications • Limits include: maximum socket connections • Direct-access message tag lists & buffers • NFS / storage system clients • Debugging • Etc. • It is probably hard to rewrite MPI and all Linux system software for O(100,000)-node clusters (see the estimate below)
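
As a rough illustration of why node counts hurt (an added estimate, not from the slide, assuming a fully connected node set where each pair keeps a socket open):

```latex
% Added estimate: pairwise connections in a fully connected cluster of N nodes
\text{connections}(N) = \binom{N}{2} = \frac{N(N-1)}{2},
\qquad
\binom{2048}{2} \approx 2.1\times 10^{6},
\qquad
\binom{100000}{2} \approx 5.0\times 10^{9}
```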

  9. Apple Xserve G5 with Xgrid Environment • Alternative to a Beowulf PC cluster • Server node + 10 compute nodes • Dual-CPU G5 processors (2 GHz, 1 GB memory) • Gigabit Ethernet interconnect • 3 TB Xserve RAID array • Xgrid offers an ‘easy’ pool-of-processors computing model • MPI available for legacy code

  10. Xgrid Computing Environment • Suitable for loosely coupled distributed computing • Controller distributes tasks to agent processors (tasks include data and code) • Collects results when agents finish • Distributes more chunks to agents as they become free and join the cluster/grid • (Figure: Xgrid client → Xgrid controller with server storage → Xgrid agents; the task-farm pattern is sketched below)
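
A schematic sketch of that controller/agent (task-farm) pattern (added illustration in plain C; it does not use Apple's actual Xgrid API, and run_on_agent is a hypothetical stand-in for shipping a chunk of code and data to an agent):

```c
/* Task-farm sketch of the controller/agent pattern (illustrative only). */
#include <stdio.h>

#define NCHUNKS 8
#define NAGENTS 3

/* Stand-in for sending code+data to an agent and collecting its result. */
static int run_on_agent(int agent, int chunk)
{
    return chunk * chunk;   /* pretend the agent computed something */
}

int main(void)
{
    int results[NCHUNKS];

    /* Round-robin over agents; in the real system an agent receives more
     * work whenever it becomes free or newly joins the grid. */
    for (int chunk = 0; chunk < NCHUNKS; chunk++) {
        int agent = chunk % NAGENTS;
        results[chunk] = run_on_agent(agent, chunk);
        printf("chunk %d done on agent %d -> %d\n", chunk, agent, results[chunk]);
    }
    return 0;
}
```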

  11. Xgrid Work Flow

  12. Cluster Status • Offline → turned off • Unavailable → turned on, but busy with other non-cluster tasks • Working → computing on this cluster job • Available → waiting to be assigned cluster work

  13. Cluster Status Displays: Rocky’s Tachy Tach • The tachometer illustrates the total processing power available to the cluster at any time • The level will change if running on a cluster of desktop workstations, but will stay steady if monitoring a dedicated cluster

  14. Load Balancing Cluster • A PC cluster can deliver load-balanced performance • Commonly used for busy FTP and web servers with a large client base • A large number of nodes share the load

  15. High Availability Cluster • Avoids downtime of services • Avoids a single point of failure • Always built with redundancy • Almost all load-balancing clusters also have HA capability

  16. Examples of Load Balancing and High Availability Cluster • RedHat HA cluster • http://ha.redhat.com • Turbolinux Cluster Server • http://www.turbolinux.com/products/tcs • Linux Virtual Server Project • http://www.linuxvirtualserver.org/

  17. High Availability Approach: Redundancy + Failover • Redundancy eliminates Single Points Of Failure (SPOF) • Automatic detection of failures (hardware, network, applications) • Automatic recovery from failures (no human intervention); a minimal failure-detector sketch follows below
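
A minimal failure-detector sketch of the detect-then-recover idea (added illustration; it is not Linux-HA or TSA code, and PEER_IP and failover.sh are hypothetical placeholders):

```c
/* Heartbeat-style failure detector: probe the peer node and, after several
 * missed checks, run a local takeover script. Illustrative sketch only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define PEER_IP    "192.168.1.2"   /* assumed address of the redundant peer */
#define PEER_PORT  80
#define MISS_LIMIT 3               /* declare failure after 3 missed heartbeats */

/* Return 1 if a TCP connection to the peer succeeds, 0 otherwise. */
static int peer_alive(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return 0;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(PEER_PORT);
    inet_pton(AF_INET, PEER_IP, &addr.sin_addr);

    int ok = (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0);
    close(fd);
    return ok;
}

int main(void)
{
    int missed = 0;
    for (;;) {
        if (peer_alive()) {
            missed = 0;                       /* peer answered, reset counter */
        } else if (++missed >= MISS_LIMIT) {
            fprintf(stderr, "peer unreachable, starting failover\n");
            system("./failover.sh");          /* hypothetical takeover script */
            return 0;
        }
        sleep(5);                             /* heartbeat interval */
    }
}
```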

  18. Real-Time Disk Replication: DRBD (Distributed Replicated Block Device)

  19. IBM Supported Solutions • Tivoli System Automation (TSA) for Multi-Platform • Proprietary IBM solution • Used across all eServers, ia32 from any vendor • Available on Linux, AIX, OS/400 • Rules-based recovery system • Over 1000 licenses since 2003 • Linux-HA (Heartbeat) • Open source project • Multi-platform solution for IBM eServers, Solaris, BSD • Packaged with several Linux distributions • Strong focus on ease-of-use, security, simplicity, low cost • >10K clusters in production since 1999

  20. HPCC Cluster and parallel computing applications • Message Passing Interface (a minimal example follows below) • MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/) • LAM/MPI (http://lam-mpi.org) • Mathematical • fftw (fast Fourier transform) • pblas (parallel basic linear algebra subprograms) • atlas (a collection of mathematical libraries) • sprng (scalable parallel random number generator) • MPITB -- MPI toolbox for MATLAB • Quantum chemistry software • gaussian, qchem • Molecular dynamics solvers • NAMD, gromacs, gamess • Weather modelling • MM5 (http://www.mmm.ucar.edu/mm5/mm5-home.html)
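
For reference, a minimal MPI program in C (an added sketch, not from the lecture; it builds with either MPICH or LAM/MPI using mpicc and runs under mpirun):

```c
/* Estimate pi by numerically integrating 4/(1+x^2) on [0,1].
 * Each rank handles an interleaved subset of the intervals; rank 0 sums them. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    const int n = 1000000;              /* number of integration intervals */
    double h, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / n;
    for (int i = rank; i < n; i += size) {   /* interleave work across ranks */
        double x = h * (i + 0.5);            /* midpoint of interval i */
        local += h * 4.0 / (1.0 + x * x);
    }

    /* Combine the partial sums on rank 0. */
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi ~= %.12f (computed on %d processes)\n", pi, size);

    MPI_Finalize();
    return 0;
}
```

Typical usage: `mpicc cpi.c -o cpi && mpirun -np 4 ./cpi`.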

  21. MOSIX and openMosix • MOSIX: a software package that enhances the Linux kernel with cluster capabilities. The enhanced kernel supports clusters of any size built from x86/Pentium-based boxes. MOSIX allows automatic and transparent migration of processes to other nodes in the cluster, while standard Linux process-control utilities such as 'ps' show all processes as if they were still running on the node they originated from. • openMosix: a spin-off of the original MOSIX. The first version of openMosix is fully compatible with the last version of MOSIX, but it will go its own direction.

  22. MOSIX architecture (3/9) Preemptive process migration: any user process, transparently and at any time, can migrate to any available node. The migrating process is divided into two contexts: • the system context (deputy), which may not be migrated from the “home” workstation (UHN); • the user context (remote), which can be migrated to a diskless node

  23. MOSIX architecture (4/9) Preemptive process migration (figure: master node and diskless node)

  24. Multi-CPU Servers

  25. Benchmark - Memory (both systems: 4x DIMM 1 GB DDR266, Avent Techn.)

                                          1x Stream        2x Stream       4x Stream
  2x Opteron, 1.8 GHz, HyperTransport:    1006–1671 MB/s   975–1178 MB/s   924–1133 MB/s
  2x Xeon, 2.4 GHz, 400 MHz FSB:          1202–1404 MB/s   561–785 MB/s    365–753 MB/s
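
The ranges above report memory bandwidth with (presumably) one, two, or four concurrent Stream instances. As a rough sketch of what such a kernel measures (a simplified stand-in, not McCalpin's actual STREAM benchmark):

```c
/* Simplified STREAM-style "triad" kernel: a[i] = b[i] + scalar * c[i].
 * Reports an approximate memory bandwidth figure in MB/s. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (10 * 1000 * 1000)   /* ~240 MB across three double arrays */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    if (!a || !b || !c) return 1;

    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    clock_t t0 = clock();
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];          /* triad: read b and c, write a */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Three arrays of N doubles move through memory once each. */
    printf("triad bandwidth: %.0f MB/s\n", 3.0 * N * sizeof(double) / secs / 1e6);

    free(a); free(b); free(c);
    return 0;
}
```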

  26. Sybase DBMS Performance

  27. Multi-CPU Hardware and Software

  28. Service Processor (SP) • Dedicated SP on-board • PowerPC based • Own IP name/address • Front panel • Command line interface • Web-server • Remote administration • System status • Boot/Reset/Shutdown • Flash the BIOS

  29. Unix Scheduling

  30. Process Scheduling • When to run the scheduler: (1) process creation, (2) process exit, (3) process blocks, (4) system interrupt • Non-preemptive – a process runs until it blocks or gives up the CPU (cases 1-3) • Preemptive – a process runs for some time unit, then the scheduler selects the next process to run (cases 1-4); a toy round-robin sketch follows below
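
A toy illustration of preemptive, round-robin scheduling (an added sketch, not actual kernel code): each process gets a fixed quantum, is preempted when it expires, and exits when its work is done.

```c
/* Round-robin scheduling simulation with a fixed time quantum. */
#include <stdio.h>

#define NPROC   3
#define QUANTUM 2   /* time units per turn */

int main(void)
{
    int remaining[NPROC] = {5, 3, 7};   /* CPU time still needed by each process */
    int now = 0, runnable = NPROC;

    while (runnable > 0) {
        for (int p = 0; p < NPROC; p++) {
            if (remaining[p] == 0) continue;            /* already exited */
            int run = remaining[p] < QUANTUM ? remaining[p] : QUANTUM;
            printf("t=%2d: run P%d for %d unit(s)\n", now, p, run);
            now += run;
            remaining[p] -= run;
            if (remaining[p] == 0) {                    /* event 2: process exit */
                printf("t=%2d: P%d exits\n", now, p);
                runnable--;
            }
            /* otherwise the quantum expired (preemption) and P%d rejoins the queue */
        }
    }
    return 0;
}
```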

  31. Solaris Overview • Multithreaded, Symmetric Multi-Processing • Preemptive kernel - protected data structures • Interrupts handled using threads • MP support - per-CPU dispatch queues, one global kernel preempt queue • System threads • Priority inheritance (see the pthreads sketch below) • Turnstiles rather than wait queues
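
The slide lists priority inheritance as a kernel feature; as an added aside, the same idea is exposed to applications through POSIX thread mutexes (a sketch using the standard pthreads API, not Solaris kernel internals):

```c
/* A mutex with PTHREAD_PRIO_INHERIT: a low-priority thread holding the lock
 * temporarily inherits the priority of the highest-priority waiter, which
 * avoids priority inversion. Build with: cc pi.c -lpthread */
#define _GNU_SOURCE          /* ensures PTHREAD_PRIO_INHERIT is visible on older glibc */
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0)
        fprintf(stderr, "priority inheritance not supported on this system\n");

    pthread_mutex_init(&lock, &attr);

    pthread_mutex_lock(&lock);
    puts("critical section protected by a priority-inheriting mutex");
    pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}
```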

  32. Linux • Today Linux scales very well in SMP systems up to 4 CPUs. • Linux on 8 CPUs is still competitive, but between 4-way and 8-way systems the price per CPU increases significantly. • For SMP systems with more than 8 CPUs, classic Unix systems are the best choice. • With Oracle Real Application Clusters (RAC), small 4- or 8-way systems can be clustered to overcome today’s Linux limitations. • Commodity, inexpensive 4-way Intel boxes, clustered with Oracle 9i RAC, help to reduce TCO.
