Distributed Computing Economics
Jim Gray
Microsoft Research
gray@microsoft.com
Presentation to Microsoft Venture Capital Summit, 28 April 2004

Distributed Computing Economics
• Why is Seti@Home a great idea?
• Why is Napster a great deal?
• Why is the Computational Grid uneconomic?
• When does computing on demand work?
• What is the “right” level of abstraction?
• Is the Access Grid the real killer app?
Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech Report, March 2003, MSR-TR-2003-24
http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Computing Is Free
• Computers cost 1k$ (if you shop right) (yes, there are 1μ$ to 1M$ computers, but...)
• So 1 cpu day = 1$ (computers last 3 years)
• If you pay the phone bill, Internet bandwidth costs 50…500$/Mbps/month (not including routers and management)
• So 1 GB costs 1$ to send and 1$ to receive
Caveat: All numbers rounded to nearest factor of 3.
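
The unit costs on this slide can be checked with a little arithmetic. The sketch below is only a back-of-envelope illustration: the function names are made up for this example, and it assumes a 30-day month and a link that is kept fully busy, which real links are not.

```python
# Back-of-envelope check of the slide's unit costs (illustrative sketch).
COMPUTER_PRICE_USD = 1_000          # slide: "computers cost 1k$"
LIFETIME_DAYS = 3 * 365             # slide: "computers last 3 years"
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def cpu_day_cost(price_usd=COMPUTER_PRICE_USD, lifetime_days=LIFETIME_DAYS):
    """Purchase price spread evenly over the machine's lifetime, in $/day."""
    return price_usd / lifetime_days


def wan_cost_per_gb(usd_per_mbps_month):
    """$/GB for a 1 Mbps link, assuming it is 100% utilized all month."""
    bytes_per_month = 1e6 / 8 * SECONDS_PER_MONTH  # 1 Mbps = 125,000 bytes/s
    gb_per_month = bytes_per_month / 1e9
    return usd_per_mbps_month / gb_per_month


if __name__ == "__main__":
    print(f"1 cpu day  ~ {cpu_day_cost():.2f} $")
    print(f"1 GB (low) ~ {wan_cost_per_gb(50):.2f} $")
    print(f"1 GB (high)~ {wan_cost_per_gb(500):.2f} $")
```

At full utilization the 50…500$ range works out to well under 1$ up to about 1.5$ per GB; since links are rarely saturated, rounding to 1$/GB is consistent with the slide's factor-of-3 caveat.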

Why Is Seti@Home A Good Deal?
• Send 300 KB: costs 3e-4$
• User computes for ½ day: benefit 5e-1$
• ROI: 1500:1
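
A minimal sketch of the ROI arithmetic, using the talk's unit costs of 1$ per GB sent and 1$ per cpu day. The function name and default arguments are hypothetical, chosen only for this illustration.

```python
# Seti@Home return on investment, using the talk's rounded unit costs.
USD_PER_GB_SENT = 1.0   # talk: 1 GB costs 1$ to send
USD_PER_CPU_DAY = 1.0   # talk: 1 cpu day = 1$


def seti_roi(work_unit_bytes=300e3, cpu_days=0.5):
    """Return (cost to ship a work unit, value of donated cpu time, ratio)."""
    cost = work_unit_bytes / 1e9 * USD_PER_GB_SENT
    benefit = cpu_days * USD_PER_CPU_DAY
    return cost, benefit, benefit / cost


if __name__ == "__main__":
    cost, benefit, roi = seti_roi()
    print(f"cost={cost:.1e} $, benefit={benefit:.1e} $, ROI ~ {roi:.0f}:1")
```

The unrounded ratio comes out somewhat above the slide's 1500:1, which is within the talk's stated rounding.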

Seti@Home: The world's most powerful computer
• 67 TF is the sum of the top 4 of the Top 500
• 67 TF is 9x the number 2 system
• 67 TF is more than the sum of systems 2...10

Why Was Napster A Good Deal?
• Send 5 MB: costs 5e-3$ (½ a penny per song)
• Both sender and receiver can afford it
• Same logic powers web sites (Yahoo!...)
  • 1e-3$/page view advertising revenue
  • 1e-5$/page view cost of serving web page
  • 100:1 ROI
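
The same arithmetic applies to a song transfer and to a page view. This is a small illustrative sketch; the helper names are assumptions made for the example, and the 1$/GB figure is the talk's rounded cost.

```python
# Per-song transfer cost and per-page-view ROI from the slide's figures.
USD_PER_GB = 1.0  # talk: 1 GB costs 1$ to send (and 1$ to receive)


def transfer_cost_usd(n_bytes):
    """Cost for one side (sender or receiver) to move n_bytes over the WAN."""
    return n_bytes / 1e9 * USD_PER_GB


def roi(revenue_usd, cost_usd):
    """Simple revenue-to-cost ratio."""
    return revenue_usd / cost_usd


if __name__ == "__main__":
    print(f"5 MB song: {transfer_cost_usd(5e6):.3f} $ per side")
    print(f"page view ROI: about {roi(1e-3, 1e-5):.0f}:1")
```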

Computing Equivalents: 1$ buys
• 1 day of cpu time
• 4 GB (fast) ram for a day
• 1 GB of network bandwidth
• 1 GB of disk storage for 3 years
• 10 M database accesses
• 10 TB of disk access (sequential)
• 10 TB of LAN bandwidth (bulk)
• 10 kWh == 4 days of computer time
Depreciating over 3 years, and there are about 1k days in 3 years.

Some Consequences
• Beowulf networking is 10,000x cheaper than WAN networking: factors of 10^5 matter
• The cheapest and fastest way to move Terabytes cross country is sneakernet: 24 hours = 4 MB/s; 50$ shipping vs 1,000$ WAN cost
• Sending 10 PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them
TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange. Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg. Microsoft Technical Report, May 2002, MSR-TR-2002-54
http://research.microsoft.com/research/pubs/view.aspx?tr_id=569
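
A rough sketch of the cost comparison behind the sneakernet claim. It assumes the 1,000$ WAN figure refers to about 1 TB at the talk's 1$/GB, and it treats the 50$ shipping charge as a flat fee; both are assumptions of this example, and the function name is hypothetical.

```python
# WAN transfer cost vs. shipping disks, at the talk's rounded unit costs.
USD_PER_GB_WAN = 1.0   # talk: 1 GB costs 1$ to send
SHIPPING_USD = 50.0    # slide: 50$ shipping (assumed flat fee per package)


def wan_cost_usd(n_bytes):
    """Cost of pushing n_bytes across the WAN."""
    return n_bytes / 1e9 * USD_PER_GB_WAN


if __name__ == "__main__":
    one_tb = wan_cost_usd(1e12)
    cern = wan_cost_usd(10e15)
    print(f"1 TB over WAN:  {one_tb:,.0f} $ vs {SHIPPING_USD:,.0f} $ shipping")
    print(f"10 PB over WAN: {cern:,.0f} $")
```

Under these assumptions 1 TB costs 1,000$ on the wire, twenty times the shipping fee, and 10 PB costs on the order of 10 M$.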

Computational Grid Economics
• To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing
• To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs
• The Internet is not the cpu backplane
• An alternate reality: nearly free networking
  • Telcos go bankrupt and price=cost=0
  • Taxpayers pay your phone bill so price=0 and telcos receive a BIG government subsidy

When To Export A Task
IF instruction density > 100,000 instructions/byte
AND remote computer is free (costs you nothing)
THEN ROI > 0
ELSE ROI < 0
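
The rule above can be written as a small predicate. This is a sketch, not the talk's own code: the function names are hypothetical, and the 1e9 instructions/second cpu speed used to show where a threshold of this size could come from is an assumption of this example.

```python
# The slide's "when to export a task" rule as a predicate (illustrative).
BREAK_EVEN_INSTR_PER_BYTE = 100_000  # threshold stated on the slide
SECONDS_PER_DAY = 86_400


def implied_break_even(instr_per_sec=1e9):
    """Instructions per byte at which 1$ of cpu equals 1$ of WAN traffic.

    Uses the talk's 1 cpu day = 1$ and 1 GB = 1$; the cpu speed is an
    assumption made for this example.
    """
    instr_per_usd = instr_per_sec * SECONDS_PER_DAY
    bytes_per_usd = 1e9
    return instr_per_usd / bytes_per_usd


def should_export(instructions, bytes_moved, remote_is_free):
    """True when the slide's rule predicts ROI > 0."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    density = instructions / bytes_moved
    return remote_is_free and density > BREAK_EVEN_INSTR_PER_BYTE


if __name__ == "__main__":
    print(f"implied break-even ~ {implied_break_even():,.0f} instr/byte")
    # Seti@Home-like: half a cpu day of work on a 300 KB work unit.
    print(should_export(0.5 * SECONDS_PER_DAY * 1e9, 300e3, True))
    # Data-analysis-like: 1,000 instructions per byte over 1 TB.
    print(should_export(1e15, 1e12, True))
```

With the assumed cpu speed the implied break-even lands near the slide's 100,000 instructions/byte; a Seti@Home-style job clears it by orders of magnitude, while a scan over a terabyte at 1,000 instructions per byte does not.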

Computing On Demand
• Was called outsourcing/service bureaus in my youth. CSC and IBM did it
• It is not a new way of doing things: think payroll. Payroll is a standard outsourced service
• Now Hotmail, Salesforce.com, Oracle.com,…
• Works for standard apps
• COD works for commoditized services
• Airlines outsource reservations. Banks outsource ATMs
• But Amazon, Amex, Wal-Mart, eTrade, eBay... can’t outsource their core competence

What’s The Right Abstraction Level For Internet Scale Distributed Computing?
• Disk block? No, too low
• File? No, too low
• Database? No, too low
• Application? Yes, of course
  • Blast search
  • Google search
  • Send/Get eMail
  • Portals that federate astronomy archives (http://skyQuery.Net/)
• Web Services (.NET, EJB, OGSA) give this abstraction level

Access Grid
• Q: What comes after the telephone?
• A: eMail?
• A: Instant messaging?
• Both seem retro: text & emoticons
• Access Grid could revolutionize human communication
• But, it needs a new idea
• Q: What comes after the telephone?

Supercomputers You Use
• Hotmail, Yahoo!, Google: ~10k servers
• Amazon, Barnes&Noble
• Expedia, Orbitz
• Dell, HP,…
• Service-oriented architectures
• Not computing on demand, but information on demand!

Poll
• Is there a market for Supercomputers? Yes: Google, Expedia, Hotmail,…
• Is Computing On Demand a high-margin business? I think not
• Do you know the equivalent high-margin business? Information on demand

Take Aways
• Computing on demand is a service business; probably not high margin; questionable economics; think LoudCloud
• Distributed computing is coming, but it is probably via Service Oriented Architecture (SOA)
• Web Services is the way to do SOA

Outline
• Overview of Microsoft Research
• Distributed Computing Economics
• Q&A

The Cost Of Computing: Computers are NOT free!
• IBM, HP, Dell make $billions
• Capital cost of a TPC-C system is mostly storage and storage software (database)
• IBM 32 cpu, 512 GB ram, 2,500 disks, 43 TB (680,613 tpmC @ 11.13 $/tpmC, available 11/08/03)
  http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf
• A 7.5M$ super-computer
• Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)