This project aims to make fundamental changes in designing and constructing large-scale systems to keep up with the rapidly growing performance requirements. It explores novel system design concepts in the cluster paradigm, leveraging a single-chip "Killer Switch" for fast and scalable communication. The project also focuses on intelligent network interfaces, parallel applications, and distributed file management.
Berkeley Cluster Projects
David E. Culler
culler@cs.berkeley.edu
http://now.cs.berkeley.edu/
11/23/1998
Goals
• Make a fundamental change in how we design and construct large-scale systems
  • market reality: 50%/year performance growth => cannot allow 1-2 year engineering lag
  • technological opportunity: single-chip “Killer Switch” => fast, scalable communication
• Highly integrated building-wide, campus-wide systems
• Explore novel system design concepts in this new “cluster” paradigm
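The "market reality" bullet can be made concrete with a little arithmetic. The following is a minimal sketch, assuming performance compounds at a fixed annual rate; the function name `lag_penalty` is hypothetical and does not come from the slides.

```python
def lag_penalty(annual_growth: float, lag_years: float) -> float:
    """Factor by which current technology outruns a design frozen lag_years ago.

    Assumes performance compounds at a constant annual rate (an assumption).
    """
    return (1.0 + annual_growth) ** lag_years


# 50%/year growth, as stated on the slide, with a 1-year and a 2-year lag.
for years in (1, 2):
    factor = lag_penalty(0.5, years)
    print(f"{years}-year lag: current parts are {factor:.2f}x faster")
```

Under this assumption, a system that ships with a 2-year engineering lag is built from parts that are already a factor of 2.25 behind the state of the art.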
Fast Communication Challenge
• Fast processors and fast networks
• The time is spent in crossing between them
[Figure: several Killer Platform nodes, each with a Comm. Software layer over Network Interface Hardware, joined by the Killer Switch; time-scale labels: ns, µs, ms]
Opening: Intelligent Network Interfaces
• Dedicated processing power and storage embedded in the Network Interface
• An I/O card today
• Tomorrow on chip?
[Figure: Sun Ultra 170 nodes (components labeled P, $, and M), each with a Myricom NIC on the I/O bus (S-Bus, 50 MB/s), attached to the Myricom network (160 MB/s)]
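The figure labels on this slide give two bandwidths: 160 MB/s for the Myricom network and 50 MB/s for the S-Bus I/O bus. One way to read them is that sustained throughput through the NIC cannot exceed the slowest stage on the path. The sketch below assumes steady-state streaming and ignores per-message overhead; the helper name `bottleneck` is hypothetical.

```python
def bottleneck(stages: dict[str, float]) -> tuple[str, float]:
    """Return (name, bandwidth) of the slowest stage on a data path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Bandwidths in MB/s, taken from the slide's figure labels.
path_mb_per_s = {
    "Myricom network link": 160.0,
    "I/O bus (S-Bus)": 50.0,
}

stage, limit = bottleneck(path_mb_per_s)
print(f"Limited to about {limit:.0f} MB/s by the {stage}")
```

With these numbers the I/O bus, not the network link, sets the ceiling, which is consistent with the slide's question about moving the interface on chip.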
NOW System Architecture
[Figure: layered architecture, top to bottom]
• Parallel Apps; Large Seq. Apps
• Sockets, Split-C, MPI, HPF, vSM
• Global Layer UNIX: Process Migration, Distributed Files, Network RAM, Resource Management
• UNIX Workstations (four shown), each with Comm. SW and Net Inter. HW
• Fast Commercial Switch (Myrinet)
Communication Performance: Direct Network Access
• LogP: Latency, Overhead, and Bandwidth
• Active Messages: lean layer supporting programming models
[Figure: plot with labels Latency and 1/BW]
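The LogP bullet names a cost model with four parameters: latency L, per-message processor overhead o, gap g (the reciprocal of per-processor message bandwidth), and processor count P. The following is a minimal sketch of the standard small-message formulas under that model; the class layout and the parameter values are illustrative assumptions, not measurements from the NOW project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    L: float  # network latency for one small message
    o: float  # processor overhead to send or to receive one message
    g: float  # gap: minimum interval between consecutive messages (1/bandwidth)
    P: int    # number of processors

    def one_way(self) -> float:
        """Send overhead + network latency + receive overhead."""
        return self.o + self.L + self.o

    def round_trip(self) -> float:
        """Request followed by a reply of the same size."""
        return 2 * self.one_way()

    def stream(self, n: int) -> float:
        """Time until the last of n back-to-back small messages is received."""
        if n < 1:
            raise ValueError("n must be at least 1")
        return self.o + (n - 1) * max(self.g, self.o) + self.L + self.o


# Hypothetical values in microseconds, chosen only for illustration.
model = LogP(L=5.0, o=2.0, g=4.0, P=4)
print(model.one_way(), model.round_trip(), model.stream(10))
```

The model makes the slide's point visible: the overhead term o is paid on every message by the processor itself, so a lean layer such as Active Messages matters even when the network latency L is small.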
World-Record Disk-to-Disk Sort
• Sustain 500 MB/s disk bandwidth and 1,000 MB/s network bandwidth
Massive Cheap Storage
• Basic unit: 2 PCs double-ending four SCSI chains
Currently serving Fine Art at http://www.thinker.org/imagebase/
Cluster of SMPs (CLUMPS)
• Four Sun E5000s
  • 8 processors
  • 3 Myricom NICs
• Multiprocessor, Multi-NIC, Multi-Protocol
Information Servers
• Basic Storage Unit:
  • Ultra 2, 300 GB RAID, 800 GB tape stacker, ATM
  • scalable backup/restore
• Dedicated Info Servers
  • web,
  • security,
  • mail, …
• VLANs project into dept.
Millennium Computational Community
[Figure: campus units connected by Gigabit Ethernet: Business, SIMS, BMRC, Chemistry, C.S., E.E., Biology, Astro, NERSC, M.E., Physics, N.E., Math, IEOR, Transport, Economy, C. E., MSME]
Millennium PC Clumps
• Inexpensive, easy-to-manage cluster
• Replicated in many departments
• Prototype for very large PC cluster
Proactive Infrastructure
[Figure labels: Information appliances; Stationary desktops; Scalable Servers]