The CC – GRID? Era Infinite processing, storage, and bandwidth @ zero cost and latency

Gordon Bell (gbell@microsoft.com), Bay Area Research Center, Microsoft Corporation

Déjà vu
• ARPAnet: c1969
• To use remote programs & data
• Got FTP & mail. Machines & people overloaded.
• NREN: c1988
• BW => Faster FTP for images, data
• Latency => Got http://www…
• Tomorrow => Gbit communication BW, latency
• <’90: Mainframes, minis, PCs/WSs
• >’90: very large, dep’t, & personal clusters
• VAX: c1979, one computer/scientist
• Beowulf: c1995, one computer/scientist
• 1960s batch: opti-use; allocate, schedule, $
• 2000s GRID: opti-use; allocate, schedule, $ (… security, management, etc.)

Some observations
• Clusters are purchased, managed, and used as a single, one-room facility.
• Clusters are the “new” computers. They present unique, interesting, and critical problems… then Grids can exploit them.
• Clusters & Grids have little to do with one another… Grids use clusters!
• Clusters should be a good simulation of tomorrow’s Grid.
• Distributed PCs: Grids or Clusters?
• Perhaps some clusterable problems can be solved on a Grid… but it’s unlikely.
• Lack of understanding of clusters & variants
• Socio-, political, eco- wrt Grid.

Some observations
• GRID was/is an exciting concept …
• They can/must work within a community, organization, or project. What binds it?
• “Necessity is the mother of invention.”
• Taxonomy… interesting vs necessity
• Cycle scavenging and object evaluation (e.g. seti@home, QCD)
• File distribution/sharing aka IP theft (e.g. Napster, Gnutella)
• Databases &/or programs and experiments (astronomy, genome, NCAR, CERN)
• Workbenches: web workflow chem, bio…
• Single, large problem pipeline… e.g. NASA.
• Exchanges… many sites operating together
• Transparent web access aka load balancing
• Facilities managed PCs operating as cluster!

Grids: Why?
• The problem or community dictates a Grid
• Economics… thief or scavenger
• Research funding… that’s where the problems are

In 5-10 years we can/will have:
• more powerful personal computers
• processing 10-100x; multiprocessors-on-a-chip
• 4x resolution (2K x 2K) displays to impact paper
• Large, wall-sized and watch-sized displays
• low-cost storage of one terabyte for personal use
• adequate networking? PCs now operate at 1 Gbps
• ubiquitous access = today’s fast LANs
• Competitive wireless networking
• One-chip, networked platforms, e.g. light bulbs, cameras
• Some well-defined platforms that compete with the PC for mind (time) and market share: watch, pocket, body implant, home (media, set-top)
• Inevitable, continued cyberization… the challenge… interfacing platforms and people.

SNAP … c1995
Scalable Network And Platforms
A View of Computing in 2000+
We all missed the impact of WWW!
Gordon Bell, Jim Gray

How Will Future Computers Be Built?
Thesis: SNAP: Scalable Networks and Platforms
• Upsize from desktop to world-scale computer
• based on a few standard components
Because:
• Moore’s law: exponential progress
• Standardization & Commoditization
• Stratification and competition
When: Sooner than you think!
• Massive standardization gives massive use
• Economic forces are enormous

Volume drives simple, cost to standard platforms
[Figure: chart with a performance axis; label: stand-alone desktop PCs]

[Figure: SNAP diagram. Labels: Legacy mainframe & minicomputer servers & terminals; Portables; Computing SNAP built entirely from PCs; Wide-area global network; Mobile Nets; Wide & Local Area Networks for: terminal, PC, workstation, & servers; Person servers (PCs); scalable computers built from PCs; A space, time (bandwidth), & generation scalable environment; Centralized & departmental uni- & mP servers (UNIX & NT); Centralized & departmental servers built from PCs; ???; TC=TV+PC home ... (CATV or ATM or satellite)]

SNAP Architecture

Modern scalable switches … also hide a supercomputer
• Scale from <1 to 120 Tbps
• 1 Gbps Ethernet switches scale to 10s of Gbps, scaling upward
• SP2 scales from 1.2

Interesting “cluster” in a cabinet
• 366 servers per 44U cabinet
• Single processor
• 2 - 30 GB/computer (24 TBytes)
• 2 - 100 Mbps Ethernets
• ~10x perf*, power, disk, I/O per cabinet
• ~3x price/perf
• Network services… Linux based
*42, 2 processors, 84 Ethernet, 3 TBytes

ISTORE Hardware Vision
• System-on-a-chip enables computer, memory, without significantly increasing size of disk
• 5-7 year target:
• MicroDrive: 1.7” x 1.4” x 0.2”
• 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
• 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
• Integrated IRAM processor
• 2x height
• Connected via crossbar switch
• growing like Moore’s law
• 16 Mbytes; 1.6 Gflops; 6.4 Gops
• 10,000+ nodes in one rack! 100/board = 1 TB; 0.16 Tf
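The 2006 MicroDrive figures and the per-board totals on this slide can be reproduced by compounding the quoted yearly growth factors. The following is a minimal sketch, not part of the original deck: the helper name `project`, the seven-year span (1999 to 2006), and the reading of "100/board" as 100 nodes each carrying one 9 GB drive and 1.6 Gflops are assumptions made for illustration.

```python
def project(value, rate_per_year, years):
    """Compound a starting value by a fixed yearly growth factor."""
    return value * rate_per_year ** years


# Assumption: growth compounds over 7 full years (1999 -> 2006).
YEARS = 2006 - 1999

# Slide figures: 340 MB at 1.6X/yr capacity, 5 MB/s at 1.4X/yr bandwidth.
capacity_mb = project(340, 1.6, YEARS)
bandwidth_mb_per_s = project(5, 1.4, YEARS)

# Assumption: one 9 GB drive and 1.6 Gflops per node, 100 nodes per board,
# decimal units (1 TB = 1000 GB, 1 Tflops = 1000 Gflops).
NODES_PER_BOARD = 100
board_tb = NODES_PER_BOARD * 9 / 1000
board_tflops = NODES_PER_BOARD * 1.6 / 1000

print(f"2006 capacity:  {capacity_mb / 1000:.1f} GB")
print(f"2006 bandwidth: {bandwidth_mb_per_s:.1f} MB/s")
print(f"per board:      {board_tb:.2f} TB, {board_tflops:.2f} Tflops")
```

Under these assumptions the projection lands close to the slide's rounded 9 GB and 50 MB/s, and the board totals round to the quoted 1 TB and 0.16 Tf.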

The Disk Farm? or a System On a Card?
[Figure: disc card with a 14" dimension label]
• The 500 GB disc card
• An array of discs
• Can be used as: 100 discs, 1 striped disc, 50 FT discs, ... etc.
• LOTS of accesses/second of bandwidth
• A few disks are replaced by 10s of Gbytes of RAM and a processor to run Apps!!

The virtuous cycle of bandwidth supply and demand
[Figure: cycle diagram. Labels: Increased Demand; Increase Capacity (circuits & bw); Create new service; Lower response time; Standards. Services shown: Telnet & FTP, EMAIL, WWW, Audio, Video, Voice!]

Map of Gray Bell Prize results
• single-thread, single-stream tcp/ip via 7 hops, desktop-to-desktop … Win 2K out of the box performance*
[Figure: map. Labels: Redmond/Seattle, WA; New York; Arlington, VA; San Francisco, CA; 5626 km; 10 hops]

The Promise of SAN/VIA: 10x in 2 years
http://www.ViArch.org/
• Yesterday:
• 10 MBps (100 Mbps Ethernet)
• ~20 MBps tcp/ip saturates 2 cpus
• round-trip latency ~250 µs
• Now:
• Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet,…
• Fast user-level communication
• tcp/ip ~100 MBps at 10% cpu
• round-trip latency is 15 µs
• 1.6 Gbps demoed on a WAN
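The "10x" claim can be checked against the slide's own numbers by converting link rates from bits to bytes and taking ratios. This is a rough sketch, not part of the original deck: the helper name `mbps_to_mbytes_per_s` is hypothetical, and "Gbps Ethernet" is assumed to mean a 1000 Mbps link.

```python
def mbps_to_mbytes_per_s(mbps):
    """Convert a link rate in megabits/s to megabytes/s (8 bits per byte)."""
    return mbps / 8


# Raw wire rates; the slide's 10 MBps and ~100 MBps are rounded-down,
# achievable figures rather than these raw values.
raw_fast_ethernet = mbps_to_mbytes_per_s(100)
# Assumption: "Gbps Ethernet" means a 1000 Mbps link.
raw_gigabit = mbps_to_mbytes_per_s(1000)

# Improvement ratios from the slide's "Yesterday" and "Now" figures.
wire_gain = raw_gigabit / raw_fast_ethernet
latency_gain = 250 / 15  # round-trip microseconds, yesterday vs now

print(f"raw wire rates: {raw_fast_ethernet} -> {raw_gigabit} MBps")
print(f"wire speedup:   {wire_gain:.0f}x")
print(f"latency gain:   {latency_gain:.1f}x")
```

On these figures the wire speedup matches the stated 10x, while the round-trip latency improves by a somewhat larger factor.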

1st, 2nd, 3rd, or New Paradigm for science? Labscape

Labscape

Courtesy of Dr. Thomas Sterling, Caltech

Lessons from Beowulf
• An experiment in parallel computing systems
• Established vision: low-cost, high-end computing
• Demonstrated effectiveness of PC clusters for some (not all) classes of applications
• Provided networking software
• Provided cluster management tools
• Conveyed findings to broad community
• Tutorials and the book
• Provided design standard to rally community!*
• Standards beget: books, trained people, software … virtuous cycle*
*observations
Courtesy, Thomas Sterling, Caltech.

The End
How can GRIDs become a non-ad hoc computer structure?
Get yourself an application community!
