Explore the impact of Grids and Clusters in the era of evolving technologies through insights, examples, and future considerations for improved performance and transparency. Discover the potential benefits, challenges, and investment opportunities in utilizing Grid platforms efficiently.
The CC – GRID? Era • CCGSC 2002 • Gordon Bell (gbell@microsoft.com) • Bay Area Research Center, Microsoft Corporation
Observations from a mostly Grid workshop • Clusters. Let’s finish the job! • Grids generally. • Grids as arbitrary cluster platforms…why? • Examples of Grid-types, especially web services • Summary…
Blades aka a “cluster in a cabinet” • 366 servers per 44U cabinet • Single processor • 2 - 30 GB/computer (24 TBytes) • 2 - 100 Mbps Ethernets • ~10x perf*, power, disk, I/O per cabinet • ~3x price/perf • Network services… Linux based *42, 2 processors, 84 Ethernet, 3 TBytes
Clusters aren’t as bad as programs make them out to be, but we need to make them work better and be more transparent. • Everything is becoming a cluster. Certainly all of 500! • 64 bit addressing will cause more change! • Future nodes should bet on CLMP smP’s (p = 4-32). Utilize existing and emerging smP’s nodes versus assuming lcd PM-pairs & MPI. • Massive gains from compiler and runtime. ES has set a new standard of efficiency and system transparency for “clusters”. • Expand the MPI programming model: • Full transparency of MPI needs to be the goal • Objectify for greater flexibility and greater insulation from latency
Grids: If they are the solution what’s the problem? • Economics… thief, scavenger, power, efficiency or resource sharing? • Research funding… that’s where the money is • Are they where the problems lie? • Does massive collaboration that the Grids enable, create massive overhead and generally less output? Unless the output is for a community! • Is funding and middleware a good investment?
Same observations as 2000 X • GRID was/is an exciting concept … • They can/must work within a community, organization, or project. Apps need to drive. • “Necessity is the mother of invention.” • Taxonomy… interesting vs necessity • Cycle scavenging and object evaluation (e.g. seti@home, QCD) • File distribution/sharing for IP theft e.g. Napster • Databases &/or programs for a community (astronomy, bioinformatics, CERN, NCAR) • Workbenches: web workflow chem, bio… • Exchanges… many sites operating together • Single, large objectified pipeline… e.g. NASA. • Grid as a cluster platform! Transparent & arbitrary access including load balancing Web SVCs
Grid nj. An arbitrary distributed, cluster platform A geographical and multi-organizational collection of diverse computers dynamically configured as cluster platforms responding to arbitrary, ill-defined jobs “thrown” at it. • Costs are not necessarily favorable e.g. disks are less expensive than cost to transfer data. • Latency and bandwidth are non-deterministic, thereby changing cluster characteristics • Once a large body of data exists for a job, it is inherently bound to (set into) fixed resources. • Large datasets & I/O bound programs need to be with their data or be database accesses… • But are there resources there to share? • Bound to cost more?
Bright spots… near term, user focus, a lesson for Grid suppliers • Tony Hey apps-based funding. Web services based Grid & data orientation. • David Abramson - Nimrod. • Parameter scans… other low hanging fruit • Encapsulate apps! “Excel”-- language/control mgmt. • “Legacy apps are programs that users just want, and there’s no time or resources to modify code …independent of age, author, or language e.g. Java.” • Andrew Grimshaw - Avaki • Making Legion vision real. A reality check. • Lip 4 pairs of “web services” based apps • Gray et al Skyservice and Terraservice • Goal: providing a web service must be as easy as publishing a web page…and will occur!!!
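The "parameter scans" bullet can be illustrated with a minimal sketch: an unmodified application is run once per point in a parameter grid, and the runs are independent, so they can be farmed out. This is not Nimrod's interface. `run_app`, the parameter names, and the grid values are hypothetical stand-ins, and a local thread pool stands in for whatever machines would actually run the jobs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_app(pressure, temperature):
    """Hypothetical stand-in for an encapsulated legacy application."""
    return pressure * temperature


def parameter_scan(app, grid, max_workers=4):
    """Run app once per combination of the values in grid.

    grid maps parameter name -> list of values. Returns a list of
    (parameters, result) pairs in grid order.
    """
    names = list(grid)
    points = [dict(zip(names, values)) for values in product(*grid.values())]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with points.
        results = list(pool.map(lambda p: app(**p), points))
    return list(zip(points, results))


if __name__ == "__main__":
    grid = {"pressure": [1, 2], "temperature": [10, 20, 30]}
    for params, result in parameter_scan(run_app, grid):
        print(params, result)
```

The application itself is never edited; only the wrapper knows about the grid, which is the sense in which the app is "encapsulated".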
SkyServer: delivering a web service to the astronomy community. Prototype for other sciences? Gray, Szalay, et al First paper on the SkyServer http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.pdf http://research.microsoft.com/~gray/Papers/MSR_TR_2001_77_Virtual_Observatory.doc Later, more detailed paper for database community http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.pdf http://research.microsoft.com/~gray/Papers/MSR_TR_01_104_SkyServer_V1.doc
What can be learned from Sky Server? • It’s about data, not about harvesting flops • 1-2 hr. query programs versus 1 wk programs based on grep • 10 minute runs versus 3 day compute & searches • Database viewpoint. 100x speed-ups • Avoid costly re-computation and searches • Use indices and PARALLEL I/O. Read / Write >>1. • Parallelism is automatic, transparent, and just depends on the number of computers/disks. • Limited experience and talent to use dbases.
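The "use indices" point can be sketched with SQLite from the Python standard library. This is not the SkyServer schema or its database engine. The `objects` table, its columns, and the synthetic coordinates are assumptions made for illustration. The sketch asks the query planner how it would answer the same range query before and after an index exists.

```python
import sqlite3


def build_catalog(n_objects=10000):
    """Create an in-memory table of hypothetical sky objects."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER, ra REAL, dec REAL)")
    rows = [(i, (i * 0.036) % 360.0, (i % 180) - 90.0) for i in range(n_objects)]
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    return conn


def query_plan(conn, sql):
    """Return SQLite's plan description for a query as one string."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in plan)


if __name__ == "__main__":
    conn = build_catalog()
    sql = "SELECT id FROM objects WHERE ra BETWEEN 10.0 AND 10.5"
    print("before index:", query_plan(conn, sql))
    conn.execute("CREATE INDEX idx_ra ON objects (ra)")
    print("after index: ", query_plan(conn, sql))
```

Without the index the planner reports a scan of the whole table, the database equivalent of grep. With it, the planner reports a search through `idx_ra`, so only the matching range is read.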
Heuristics for building communities that need to share data & programs • Always go from working to working • Do it by induction in time and space (Why version 3 is pretty good.) • Put ONE database in place that’s useful by itself in terms of UI, content, & queries • Invent and demo 10-20 instances of use • Get two working in a single location • Extend to include a second community, with an appropriate superset capability
Some science is hitting a wall: FTP and GREP are not adequate (Jim Gray) • You can GREP 1 GB in a minute • You can GREP 1 TB in 2 days • You can GREP 1 PB in 3 years. • You can FTP 1 MB in 1 sec. • You can FTP 1 GB / min. • … 2 days and 1K$ • … 3 years and 1M$ • 1PB ~10,000 >> 1,000 disks • At some point you need indices to limit search; parallel data search and analysis • Goal using dbases. Make it easy to • Publish: Record structured data • Find data anywhere in the network • Get the subset you need! • Explore datasets interactively • Database becomes the file system!!!
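The GREP figures on this slide are round numbers, and a short calculation shows what sequential rate each one implies and why a linear scan stops scaling. The sketch below uses decimal units (1 GB = 10^9 bytes). The 10 MB/s per-disk rate and the 1,000-disk count in the parallel example are illustrative assumptions, not figures from the slide.

```python
SECONDS_PER_DAY = 86400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

GB = 1e9
TB = 1e12
PB = 1e15


def implied_rate(n_bytes, seconds):
    """Sequential rate (bytes/s) implied by scanning n_bytes in seconds."""
    return n_bytes / seconds


def parallel_scan_seconds(n_bytes, bytes_per_second_per_disk, n_disks):
    """Scan time when the data is spread evenly over n_disks read in parallel."""
    return n_bytes / (bytes_per_second_per_disk * n_disks)


if __name__ == "__main__":
    claims = [
        ("1 GB in a minute", GB, 60),
        ("1 TB in 2 days", TB, 2 * SECONDS_PER_DAY),
        ("1 PB in 3 years", PB, 3 * SECONDS_PER_YEAR),
    ]
    for label, size, seconds in claims:
        print(f"{label}: {implied_rate(size, seconds) / 1e6:.1f} MB/s")

    # Assumed: 1,000 disks at 10 MB/s each, all scanning at once.
    days = parallel_scan_seconds(PB, 10e6, 1000) / SECONDS_PER_DAY
    print(f"1 PB over 1000 disks: {days:.2f} days")
```

A single stream takes time proportional to the data size, so going from GB to PB multiplies the wait by a million. Spreading the scan across disks divides it by the disk count, and an index avoids most of the scan altogether.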
Network concerns • Very high cost • $(1 + 1) / GByte to send on the net; Fedex and 160 GByte shipments are cheaper • DSL at home is $0.15 - $0.30 • Disks cost less than $2/GByte to purchase • Low availability of fast links (last mile problem) • Labs & universities have DS3 links at most, and they are very expensive • Traffic: Instant messaging, music stealing • Performance at desktop is poor • 1- 10 Mbps; very poor communication links • Manage: trade-in fast links for cheap links!!
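The cost bullets above reduce to simple arithmetic, sketched here under stated assumptions. The $(1 + 1) per GByte network price comes from the slide. The DS3 line rate of 44.736 Mbps is the standard figure for that link type. Links are assumed to run at full line rate with no protocol overhead, which flatters the network. Shipping prices are not modeled because the slide does not give them.

```python
SECONDS_PER_DAY = 86400

NET_DOLLARS_PER_GB = 1 + 1        # slide figure: $(1 + 1) / GByte
DS3_BITS_PER_SECOND = 44.736e6    # standard DS3 line rate
DESKTOP_BITS_PER_SECOND = 10e6    # upper end of the slide's 1-10 Mbps


def network_cost_dollars(gigabytes, dollars_per_gb=NET_DOLLARS_PER_GB):
    """Cost to send the data over the net at a flat per-GByte price."""
    return gigabytes * dollars_per_gb


def transfer_days(gigabytes, bits_per_second):
    """Days to move the data at full line rate (decimal GB, no overhead)."""
    return gigabytes * 8e9 / bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for gb in (160, 1000):
        print(f"{gb} GB: ${network_cost_dollars(gb):,.0f} over the net, "
              f"{transfer_days(gb, DS3_BITS_PER_SECOND):.2f} days on a DS3, "
              f"{transfer_days(gb, DESKTOP_BITS_PER_SECOND):.2f} days at 10 Mbps")
```

At these prices, sending a terabyte costs about as much as buying the disks to hold it at the slide's upper bound of $2/GByte, which is the basis for the sneakernet argument that follows.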
Gray’s $2.4 K, 1 TByte Sneakernet aka Disk Brick [Tables not recovered: Cost to move a Terabyte; Cost, time, and speed to move a Terabyte; Cost of a “Sneaker-Net” TB] • We now ship NTFS/SQL disks. • Not good format for Linux. • Ship NFS/CIFS/ODBC servers (not disks). • Plug “disk” into LAN. • DHCP then file or DB serve… • Web Service in long term Courtesy of Jim Gray, Microsoft Bay Area Research
Cost, time of Sneaker-net vs Alts Courtesy of Jim Gray, Microsoft Bay Area Research
Grids: Real and “personal” Two carrots, one downside. A bet. • Bell will match any Gordon Bell Prize (parallelism, performance, or performance/cost) winner’s prize that is based on “Grid Platform Technology”. • I will bet any individual or set of individuals of the Grid Research community up to $5,000 that a Grid application will not win the above by SC2005.
The End • How can GRIDs become a real, useful, computer structure? • Get a life. Adopt an application community! • Success if CCGSC2004 is the last… by making Grids ubiquitous.