Scaleable Computing. Jim Gray, Microsoft Corporation, Gray@Microsoft.com


Presentation Transcript


  1. Scaleable Computing • Jim Gray • Microsoft Corporation • Gray@Microsoft.com

  2. Thesis: Scaleable Servers • Scaleable Servers • Commodity hardware allows new applications • New applications need huge servers • Clients and servers are built of the same “stuff” • Commodity software and • Commodity hardware • Servers should be able to • Scale up (grow node by adding CPUs, disks, networks) • Scale out (grow by adding nodes) • Scale down (can start small) • Key software technologies • Objects, Transactions, Clusters, Parallelism

  3. 1987: 256 tps Benchmark • 14 M$ computer (Tandem) • A dozen people • False floor, 2 rooms of machines • Hardware: a 32-node processor array; a 40 GB disk array (80 drives); simulate 25,600 clients • Staff: manager, auditor, admin expert, hardware experts, network expert, performance expert, OS expert, DB expert

  4. 1988: DB2 + CICS Mainframe: 65 tps • IBM 4391 • Simulated network of 800 clients • 2 M$ computer • Staff of 6 to do benchmark • Hardware: 2 x 3725 network controllers; refrigerator-sized CPU; 16 GB disk farm (4 x 8 x .5 GB)

  5. 1997: 10 years later, 1 Person and 1 box = 1250 tps • 1 Breadbox ~ 5x 1987 machine room • 23 GB is hand-held • One person does all the work • Cost/tps is 1,000x less: 25 micro dollars per transaction • Hardware: 4 x 200 MHz CPU; 1/2 GB DRAM; 12 x 4 GB disk; 3 x 7 x 4 GB disk arrays • One person is the hardware expert, OS expert, net expert, DB expert, and app expert

  6. What Happened? • Moore’s law: Things get 4x better every 3 years (applies to computers, storage, and networks) • New Economics: Commodity (class: price in $/mips, software in k$/year) • mainframe: 10,000 $/mips, 100 k$/year • minicomputer: 100 $/mips, 10 k$/year • microcomputer: 10 $/mips, 1 k$/year • GUI: Human-computer tradeoff: optimize for people, not computers [Figure: price vs. time for mainframe, mini, micro]

  7. What Happens Next • Last 10 years: 1000x improvement • Next 10 years: ???? • Today: text and image servers are free; 25 µ$/hit => advertising pays for them • Future: video, audio, … servers are free • “You ain’t seen nothing yet!” [Figure: performance vs. time, 1985, 1995, 2005]

  8. Kinds Of Information Processing • Immediate (network): point-to-point: conversation, money; broadcast: lecture, concert • Time-shifted (database): point-to-point: mail; broadcast: book, newspaper • It’s ALL going electronic • Immediate is being stored for analysis (so ALL database) • Analysis and automatic processing are being added

  9. Why Put Everything In Cyberspace? • Low rent: min $/byte • Shrinks time: now or later • Shrinks space: here or there • Automate processing: knowbots • Network: point-to-point OR broadcast; immediate OR time-delayed • Database: locate, process, analyze, summarize

  10. Magnetic Storage Cheaper Than Paper • File cabinet: cabinet (four drawer) 250$; paper (24,000 sheets) 250$; space (2x3 @ 10$/ft²) 180$; total 700$; 3¢/sheet • Disk: disk (4 GB) = 800$ • ASCII: 2 mil pages, 0.04¢/sheet (80x cheaper) • Image: 200,000 pages, 0.4¢/sheet (8x cheaper) • Store everything on disk
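The per-sheet figures on this slide come from dividing total cost by capacity. A minimal Python sketch using only the slide's numbers (the function and variable names are illustrative, not from the talk); the exact ratios land a little below the slide's rounded 80x and 8x:

```python
def cents_per_sheet(total_dollars, sheets):
    """Cost of storing one sheet, in cents."""
    return 100.0 * total_dollars / sheets

# Inputs taken from the slide; nothing here is measured.
paper = cents_per_sheet(700, 24_000)          # four-drawer file cabinet
ascii_disk = cents_per_sheet(800, 2_000_000)  # 4 GB disk holding ASCII pages
image_disk = cents_per_sheet(800, 200_000)    # same disk holding page images

print(f"paper: {paper:.2f} cents/sheet")
print(f"ASCII on disk: {ascii_disk:.2f} cents/sheet, {paper / ascii_disk:.0f}x cheaper")
print(f"image on disk: {image_disk:.2f} cents/sheet, {paper / image_disk:.1f}x cheaper")
```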

  11. Databases: Information at Your Fingertips™, Information Network™, Knowledge Navigator™ • All information will be in an online database (somewhere) • You might record everything you • Read: 10 MB/day, 400 GB/lifetime (eight tapes today) • Hear: 400 MB/day, 16 TB/lifetime (three tapes/year today) • See: 1 MB/s, 40 GB/day, 1.6 PB/lifetime (maybe someday)

  12. Databases Store ALL Data Types • The old world: • Millions of objects • 100-byte objects • The new world: • Billions of objects • Big objects (1 MB) • Objects have behavior (methods) • Paperless office • Library of Congress online • All information online • Entertainment • Publishing • Business • WWW and Internet [Figure: old People table with columns Name, Address (David, NY; Mike, Berk; Won, Austin); new People table with columns Name, Address, Papers, Picture, Voice]

  13. Billions Of Clients • Every device will be “intelligent” • Doors, rooms, cars… • Computing will be ubiquitous

  14. Billions Of Clients Need Millions Of Servers • All clients networked to servers • May be nomadic or on-demand • Fast clients want faster servers • Servers provide • Shared Data • Control • Coordination • Communication [Figure: clients (mobile clients, fixed clients) connected to servers (server, super server)]

  15. Thesis: Many little beat few big • Smoking, hairy golf ball • How to connect the many little parts? • How to program the many little parts? • Fault tolerance? [Figure labels: processor classes mainframe, mini, micro, nano, pico; prices $1 million, $100 K, $10 K; storage levels 10 pico-second ram, 10 nano-second ram, 10 microsecond ram, 10 millisecond disc, 10 second tape archive; capacities 1 MB, 100 MB, 10 GB, 1 TB, 100 TB; disk sizes 14", 9", 5.25", 3.5", 2.5", 1.8"; 1 mm^3; 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk ram; event-horizon on chip; VM reincarnated; multiprogram cache, on-chip SMP]

  16. Future Super Server: 4T Machine • Array of 1,000 4B machines • 1 Bips processors • 1 BB DRAM • 10 BB disks • 1 Bbps comm lines • 1 TB tape robot • A few megabucks • Challenge: • Manageability • Programmability • Security • Availability • Scaleability • Affordability • As easy as a single system • Future servers are CLUSTERS of processors, discs • Distributed database techniques make clusters work [Figure: Cyber Brick, a 4B machine: CPU, 5 GB RAM, 50 GB disc]

  17. The Hardware Is In Place… And then a miracle occurs? • SNAP: scaleable network and platforms • Commodity-distributed OS built on: • Commodity platforms • Commodity network interconnect • Enables parallel applications

  18. Thesis: Scaleable Servers • Scaleable Servers • Commodity hardware allows new applications • New applications need huge servers • Clients and servers are built of the same “stuff” • Commodity software and • Commodity hardware • Servers should be able to • Scale up (grow node by adding CPUs, disks, networks) • Scale out (grow by adding nodes) • Scale down (can start small) • Key software technologies • Objects, Transactions, Clusters, Parallelism

  19. Scaleable Servers: BOTH SMP And Cluster • Grow up with SMP; 4xP6 is now standard • Grow out with cluster • Cluster has inexpensive parts [Figure: personal system, departmental server, SMP superserver; cluster of PCs]

  20. SMPs Have Advantages • Single system image: easier to manage, easier to program; threads in shared memory, disk, Net • 4x SMP is commodity • Software capable of 16x • Problems: • >4 not commodity • Scale-down problem (starter systems expensive) • There is a BIGGEST one [Figure: personal system, departmental server, SMP superserver]

  21. Building the Largest Node • There is a biggest node (size grows over time) • Today, with NT, it is probably 1 TB • We are building it (with help from DEC and SPIN2) • 1 TB GeoSpatial SQL Server database • (1.4 TB of disks = 320 drives). • 30K BTU, 8 KVA, 1.5 metric tons. • Will put it on the Web as a demo app. • 10 meter image of the ENTIRE PLANET. • 2 meter image of interesting parts (2% of land) • One pixel per meter = 500 TB uncompressed. • Better resolution in US (courtesy of USGS). [Figure: 1-TB home page, www.SQL.1TB.com; 1-TB SQL Server DB of satellite and aerial photos, plus support files]

  22. What’s a TeraByte? • 1 Terabyte: • 1,000,000,000 business letters (150 miles of book shelf) • 100,000,000 book pages (15 miles of book shelf) • 50,000,000 FAX images (7 miles of book shelf) • 10,000,000 TV pictures (mpeg) (10 days of video) • 4,000 LandSat images (16 earth images, 100m) • 100,000,000 web pages (10 copies of the web HTML) • Library of Congress (in ASCII) is 25 TB • 1980: $200 million of disc (10,000 discs); $5 million of tape silo (10,000 tapes) • 1997: 200 k$ of magnetic disc (48 discs); 30 k$ nearline tape (20 tapes) • Terror Byte!

  23. TB DB User Interface

  24. TPC-C Web-Based Benchmarks • Client is a Web browser (7,500 of them!) • Submits • Order • Invoice • Query to server via Web page interface • Web server translates to DB • SQL does DB work • Net: • easy to implement • performance is GREAT! [Figure: browser to Web server (IIS) over HTTP; Web server to SQL over ODBC]

  25. Grow UP and OUT • Cluster: • a collection of nodes • as easy to program and manage as a single node • 1 Terabyte DB • 1 billion transactions per day [Figure: personal system, departmental server, SMP superserver]

  26. Clusters Have Advantages • Clients and servers made from the same stuff • Inexpensive: • Built with commodity components • Fault tolerance: • Spare modules mask failures • Modular growth • Grow by adding small modules • Unlimited growth: no biggest one

  27. Windows NT Clusters • Microsoft & 60 vendors defining NT clusters • Almost all big hardware and software vendors involved • No special hardware needed - but it may help • Fault-tolerant first, scaleable second • Microsoft, Oracle, SAP giving demos today • Enables • Commodity fault-tolerance • Commodity parallelism (data mining, virtual reality…) • Also great for workgroups!

  28. Billion Transactions per Day Project • Building a 20-node Windows NT Cluster (with help from Intel), > 800 disks • All commodity parts • Using SQL Server & DTC distributed transactions • Each node has 1/20th of the DB • Each node does 1/20th of the work • 15% of the transactions are “distributed”
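One way to picture "each node has 1/20th of the DB" is a partitioning function that maps every key to a node; a transaction is then "distributed" when its keys land on more than one node. The sketch below is illustrative only: the slide does not say how the database is partitioned, so the modulo scheme and all names here are assumptions.

```python
NODES = 20  # cluster size from the slide

def node_for(key: int) -> int:
    """Hypothetical partitioning rule: integer key modulo the node count."""
    return key % NODES

def nodes_touched(keys):
    """Set of nodes that hold at least one of the given keys."""
    return {node_for(k) for k in keys}

def is_distributed(keys) -> bool:
    """A transaction is distributed if it touches more than one node."""
    return len(nodes_touched(keys)) > 1

print(is_distributed([3, 23, 43]))  # all three keys map to node 3
print(is_distributed([3, 4]))       # keys map to nodes 3 and 4
```

Local transactions commit on one node; only the distributed ones need a cross-node commit protocol.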

  29. How Much Is 1 Billion Transactions Per Day? • 1 Btpd = 11,574 tps (transactions per second) ~ 700,000 tpm (transactions/minute) • AT&T • 185 million calls (peak day worldwide) • Visa ~20 M tpd • 400 M customers • 250,000 ATMs worldwide • 7 billion transactions / year (card+cheque) in 1994 [Figure: bar chart of millions of transactions per day (Mtpd), log scale from 0.1 to 1,000, for AT&T, Visa, BofA, NYSE, and 1 Btpd]
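The rate conversion on this slide is plain division by the number of seconds or minutes in a day. A short Python check (names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

def tps(transactions_per_day):
    """Average transactions per second for a given daily volume."""
    return transactions_per_day / SECONDS_PER_DAY

def tpm(transactions_per_day):
    """Average transactions per minute for a given daily volume."""
    return transactions_per_day / MINUTES_PER_DAY

one_btpd = 1_000_000_000
print(round(tps(one_btpd)))  # 11574
print(round(tpm(one_btpd)))  # 694444, which the slide rounds to ~700,000
```

These are averages over a full day; peak rates would be higher.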

  30. Parallelism: The OTHER aspect of clusters • Clusters of machines allow two kinds of parallelism • Many little jobs: online transaction processing • TPC-A, B, C… • A few big jobs: data search and analysis • TPC-D, DSS, OLAP • Both give automatic parallelism

  31. Kinds of Parallel Execution • Pipeline: any sequential program feeds any sequential program • Partition: outputs split N ways, inputs merge M ways, around any sequential program (Jim Gray & Gordon Bell: VLDB 95 Parallel Database Systems Survey)
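The pipeline half of this slide can be sketched with Python generators: each stage is an ordinary sequential program that consumes one stream and produces another. This shows only the dataflow shape; in a single Python thread the stages interleave rather than run in parallel, and the stage names are illustrative.

```python
def scan(rows):
    """Source stage: emit rows one at a time."""
    for row in rows:
        yield row

def select(rows, predicate):
    """Filter stage: pass along only rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row

def project(rows, fn):
    """Transform stage: apply fn to each row."""
    for row in rows:
        yield fn(row)

# Each stage starts producing output before its input is exhausted.
pipeline = project(select(scan(range(6)), lambda x: x % 2 == 0), lambda x: x * x)
result = list(pipeline)
print(result)  # [0, 4, 16]
```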

  32. Partitioned Execution • Spreads computation and IO among processors • Partitioned data gives NATURAL parallelism (Jim Gray & Gordon Bell: VLDB 95 Parallel Database Systems Survey)
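A minimal sketch of partitioned execution, assuming a round-robin split and a thread pool standing in for separate processors (both are choices made for this example, not something the slide specifies). The same sequential program runs unchanged on every partition, and the partial results are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def split(records, n):
    """Split the input n ways (round-robin)."""
    return [records[i::n] for i in range(n)]

def sequential_program(partition):
    """Any sequential program; here, a sum of squares over one partition."""
    return sum(x * x for x in partition)

def run_partitioned(records, n):
    """Run the same program on n partitions, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(sequential_program, split(records, n)))
    return sum(partials)

print(run_partitioned(list(range(10)), 4))  # 285
```

The merge step works here because addition is associative; an operator without that property would need a different merge.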

  33. N x M way Parallelism • N inputs, M outputs, no bottlenecks. • Partitioned data • Partitioned and pipelined data flows (Jim Gray & Gordon Bell: VLDB 95 Parallel Database Systems Survey)

  34. The Parallel Law Of Computing • Grosch's Law: 2x $ is 4x performance (1 MIPS for 1 $; 1,000 MIPS for 32 $, or .03 $/MIPS) • Parallel Law: 2x $ is 2x performance (1 MIPS for 1 $; 1,000 MIPS for 1,000 $) • Needs: • Linear speedup and linear scale-up • Not always possible
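The two laws on this slide can be written as cost functions. Under Grosch's Law performance grows with the square of price, so cost grows with the square root of performance; under the Parallel Law cost is linear. The sketch below normalizes both to 1 MIPS for 1 $, as the slide does; the function names are illustrative.

```python
import math

def grosch_cost(mips):
    """Grosch's Law: performance ~ cost**2, so cost ~ sqrt(performance)."""
    return math.sqrt(mips)

def grosch_mips(dollars):
    """Inverse view of Grosch's Law: doubling dollars quadruples MIPS."""
    return dollars ** 2

def parallel_cost(mips):
    """Parallel Law: cost is linear in performance (needs linear scale-up)."""
    return float(mips)

print(round(grosch_cost(1000)))            # 32 $ for 1,000 MIPS
print(round(grosch_cost(1000) / 1000, 2))  # 0.03 $/MIPS
print(parallel_cost(1000))                 # 1000.0 $ for 1,000 MIPS
```

The linear law only holds when speedup and scale-up are linear, which the slide notes is not always possible.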

  35. Thesis: Scaleable Servers • Scaleable Servers • Commodity hardware allows new applications • New applications need huge servers • Clients and servers are built of the same “stuff” • Commodity software and • Commodity hardware • Servers should be able to • Scale up (grow node by adding CPUs, disks, networks) • Scale out (grow by adding nodes) • Scale down (can start small) • Key software technologies • Objects, Transactions, Clusters, Parallelism

  36. The BIG Picture: Components and transactions • Software modules are objects • Object Request Broker (a.k.a., Transaction Processing Monitor) connects objects (clients to servers) • Standard interfaces allow software plug-ins • Transaction ties execution of a “job” into an atomic unit: all-or-nothing, durable, isolated [Figure: Object Request Broker]

  37. Linking And Embedding: Objects are data modules; transactions are execution modules • Link: pointer to object somewhere else • Think URL in Internet • Embed: bytes are here • Objects may be active; can call back to subscribers

  38. Objects Meet Databases: The basis for universal data servers, access, & integration • Object-oriented (COM oriented) programming interface to data • Breaks DBMS into components • Anything can be a data source • Optimization/navigation “on top of” other data sources • A way to componentize a DBMS • Makes an RDBMS an O-RDBMS (assumes optimizer understands objects) [Figure: DBMS engine over data sources: database, spreadsheet, photos, mail, map, document]

  39. The Three Tiers • Web client: HTML, VB, Java plug-ins, VBScript, JavaScript • Middleware: ORB, TP monitor, Web server...; VB or Java script engine; VB or Java virtual machine; object server pool • Object & data server; legacy gateways (LU6.2, IBM) • Links: HTTP + DCOM over the Internet; DCOM (OLE DB, ODBC, ...)

  40. Server Side Objects: Easy Server-Side Execution • Give simple execution environment • Object gets • start • invoke • shutdown • Everything else is automatic • Drag & Drop Business Objects [Figure: a server supplying network, receiver, queue management, connections, security, context, configuration, thread pool, service logic, synchronization, shared data]
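The start / invoke / shutdown contract can be sketched as a tiny host that owns the thread pool and lifecycle, so the business object only supplies service logic. Everything here (class name, `host` function, pool size) is hypothetical and stands in for the much richer environment the slide lists.

```python
from concurrent.futures import ThreadPoolExecutor

class Greeter:
    """Hypothetical business object: implements only start/invoke/shutdown."""

    def start(self):
        self.prefix = "hello "   # set up state before any request arrives

    def invoke(self, request):
        return self.prefix + request  # the service logic

    def shutdown(self):
        self.prefix = None       # release state after the last request

def host(obj, requests, workers=4):
    """Toy server: handles lifecycle and threading on the object's behalf."""
    obj.start()
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(obj.invoke, requests))
    finally:
        obj.shutdown()

print(host(Greeter(), ["ann", "bob"]))  # ['hello ann', 'hello bob']
```

A real host would also cover the queueing, security, and configuration that the slide says are automatic.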

  41. A new programming paradigm • Develop objects on the desktop • Better yet: download them from the Net • Script work flows as method invocations • All on desktop • Then, move work flows and objects to server(s) • Gives • desktop development • three-tier deployment • Software Cyberbricks

  42. Transactions Coordinate Components (ACID) • Transaction properties • Atomic: all or nothing • Consistent: old and new values • Isolated: automatic locking or versioning • Durable: once committed, effects survive • Transactions are built into modern OSs • MVS/TM, Tandem TMF, VMS DEC-DTM, NT-DTC
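Atomicity ("all or nothing") can be shown with Python's standard `sqlite3` module, used here purely as a stand-in for the transaction managers named on the slide. The table, account names, and helper functions are made up for the example. A failed transfer leaves both balances exactly as they were.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        (left,) = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                               (src,)).fetchone()
        if left < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
```

Calling `transfer(conn, "a", "b", 500)` on a fresh bank raises `ValueError`, and the debit that was already applied is rolled back.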

  43. Transactions & Objects • Application requests transaction identifier (XID) • XID flows with method invocations • Object Managers join (enlist) in transaction • Distributed Transaction Manager coordinates commit/abort
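The four steps on this slide can be sketched as a toy transaction manager. The slide does not name the commit protocol; this sketch assumes a two-phase commit (prepare, then commit or abort), and every class and method name is hypothetical rather than the DTC API.

```python
import itertools

class ObjectManager:
    """Hypothetical resource that can vote on and apply a transaction."""
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit, self.state = name, can_commit, "idle"
    def prepare(self, xid):
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit
    def commit(self, xid):
        self.state = "committed"
    def abort(self, xid):
        self.state = "aborted"

class TransactionManager:
    """Toy coordinator: hands out XIDs, tracks enlistment, decides outcome."""
    def __init__(self):
        self._ids = itertools.count(1)
        self._enlisted = {}
    def begin(self):
        xid = next(self._ids)       # application requests an XID
        self._enlisted[xid] = []
        return xid
    def enlist(self, xid, manager):
        self._enlisted[xid].append(manager)  # object manager joins
    def commit(self, xid):
        managers = self._enlisted.pop(xid)
        votes = [m.prepare(xid) for m in managers]  # phase 1: collect all votes
        if all(votes):
            for m in managers:
                m.commit(xid)                       # phase 2: commit everywhere
            return True
        for m in managers:
            m.abort(xid)                            # phase 2: abort everywhere
        return False
```

If any enlisted manager refuses in phase 1, every manager is told to abort, which is what makes the outcome all-or-nothing across nodes.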

  44. Distributed Transactions Enable Huge Throughput • Each node capable of 7 KtpmC (7,000 active users!) • Can add nodes to cluster (to support 100,000 users) • Transactions coordinate nodes • ORB / TP monitor spreads work among nodes
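A back-of-envelope sizing for the numbers on this slide, assuming capacity scales linearly with node count (an idealization; coordination overhead would push the real figure up). The function name is illustrative.

```python
import math

def nodes_needed(target_users, users_per_node):
    """Smallest node count whose combined capacity covers the target."""
    return math.ceil(target_users / users_per_node)

print(nodes_needed(100_000, 7_000))  # 15
```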

  45. Distributed Transactions Enable Huge DBs • Distributed database technology spreads data among nodes • Transaction processing technology manages nodes

  46. Thesis: Scaleable Servers • Scaleable Servers Built from Cyberbricks • Allow new applications • Servers should be able to • Scale up, out, down • Key software technologies • Clusters (tie the hardware together) • Parallelism (uses the independent CPUs, stores, wires) • Objects (software CyberBricks) • Transactions (mask errors)
