1 / 60

Introduction to cloud computing

Introduction to cloud computing. Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net. Yahoo ! Cloud computing . Yahoo! Cloud Stack. EDGE. Horizontal Cloud Services. YCS. YCPI . Brooklyn. …. WEB. Horizontal Cloud Services. VM/OS. yApache. PHP.

ellison
Download Presentation

Introduction to cloud computing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net

  2. Yahoo! Cloud computing

  3. Yahoo! Cloud Stack EDGE Horizontal Cloud Services YCS YCPI Brooklyn … WEB Horizontal Cloud Services VM/OS yApache PHP App Engine APP Provisioning (Self-serve) Monitoring/Metering/Security Horizontal Cloud Services VM/OS Serving Grid … Data Highway STORAGE Horizontal Cloud Services Sherpa MOBStor … BATCH Horizontal Cloud Services Hadoop …

  4. Web Data Management • CRUD • Point lookups and short scans • Index organized table and random I/Os • $ per latency • Scan oriented workloads • Focus on sequential disk I/O • $ per cpu cycle Structured record storage (PNUTS/Sherpa) Large data analysis (Hadoop) • Object retrieval and streaming • Scalable file storage • $ per GB Blob storage (SAN/NAS)

  5. The World Has Changed • Web serving applications need: • Scalability! • Preferably elastic • Flexible schemas • Geographic distribution • High availability • Reliable storage • Web serving applications can do without: • Complicated queries • Strong transactions

  6. PNUTS / SHERPA To Help You Scale Your Mountains of Data

  7. Yahoo! Serving Storage Problem • Small records – 100KB or less • Structured records – lots of fields, evolving • Extreme data scale - Tens of TB • Extreme request scale - Tens of thousands of requests/sec • Low latency globally - 20+ datacenters worldwide • High Availability - outages cost $millions • Variable usage patterns - as applications and users change 9

  8. What is PNUTS/Sherpa? A 42342 E A 42342 E B 42521 W B 42521 W C 66354 W D 12352 E F 15677 E A 42342 E E 75656 C B 42521 W C 66354 W C 66354 W D 12352 E D 12352 E E 75656 C E 75656 C F 15677 E F 15677 E CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Structured, flexible schema Geographic replication Parallel database Hosted, managed infrastructure 11

  9. A 42342 E A 42342 E A 42342 E B 42521 W B 42521 W B 42521 W C 66354 W C 66354 W C 66354 W D 12352 E D 12352 E D 12352 E E 75656 C E 75656 C E 75656 C F 15677 E F 15677 E F 15677 E What Will It Become? Indexes and views

  10. Design Goals Consistency Per-record guarantees Timeline model Option to relax if needed Multiple access paths Hash table, ordered table Primary, secondary access Hosted service Applications plug and play Share operational cost Scalability Thousands of machines Easy to add capacity Restrict query language to avoid costly queries Geographic replication Asynchronous replication around the globe Low-latency local access High availability and fault tolerance Automatically recover from failures Serve reads and writes despite failures 14

  11. Technology Elements Applications Tabular API PNUTS API • PNUTS • Query planning and execution • Index maintenance • Distributed infrastructure for tabular data • Data partitioning • Update consistency • Replication YCA: Authorization • YDOT FS • Ordered tables • YDHT FS • Hash tables • Tribble • Pub/sub messaging • Zookeeper • Consistency service 15

  12. Data Manipulation Per-record operations Get Set Delete Multi-record operations Multiget Scan Getrange 16

  13. Tablets—Hash Table Name Description Price 0x0000 $12 Grape Grapes are good to eat $9 Limes are green Lime $1 Apple Apple is wisdom $900 Strawberry Strawberry shortcake 0x2AF3 $2 Orange Arrgh! Don’t get scurvy! $3 Avocado But at what price? Lemon How much did you pay for this lemon? $1 $14 Is this a vegetable? Tomato 0x911F $2 The perfect fruit Banana $8 Kiwi New Zealand 0xFFFF 17

  14. Tablets—Ordered Table Name Description Price A $1 Apple Apple is wisdom $3 Avocado But at what price? $2 Banana The perfect fruit $12 Grape Grapes are good to eat H $8 Kiwi New Zealand Lemon $1 How much did you pay for this lemon? Limes are green Lime $9 $2 Orange Arrgh! Don’t get scurvy! Q $900 Strawberry Strawberry shortcake $14 Is this a vegetable? Tomato Z 18

  15. Flexible Schema

  16. Detailed Architecture Remote regions Local region Clients REST API Routers Tribble Tablet Controller Storage units 20

  17. Tablet Splitting and Balancing Storage unit Tablet Each storage unit has many tablets (horizontal partitions of the table) Storage unit may become a hotspot Tablets may grow over time Overfull tablets split Shed load by moving tablets to other servers 21

  18. QUERY PROCESSING 22

  19. Accessing Data Record for key k Get key k Record for key k 1 2 3 4 Get key k SU SU SU 23

  20. Bulk Read {k1, k2, … kn} Get k1 Get k2 Get k3 Scatter/ gather server 1 2 SU SU SU 24

  21. Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1 Grapefruit…Pear? Grapefruit…Lime? Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1 Lime…Pear? Router Storage unit 1 Storage unit 2 Storage unit 3 Range Queries in YDOT • Clustered, ordered retrieval of records Apple Avocado Banana Blueberry Canteloupe Grape Kiwi Lemon Lime Mango Orange Strawberry Tomato Watermelon Apple Avocado Banana Blueberry Strawberry Tomato Watermelon Lime Mango Orange Canteloupe Grape Kiwi Lemon

  22. Updates Write key k SU SU SU 6 5 2 4 1 8 7 3 Write key k Sequence # for key k Routers Message brokers Write key k Sequence # for key k SUCCESS Write key k 26

  23. ASYNCHRONOUS REPLICATION AND CONSISTENCY 27

  24. Asynchronous Replication 28

  25. Goal: Make it easier for applications to reason about updates and cope with asynchrony What happens to a record with primary key “Alice”? Consistency Model Record inserted Delete Update Update Update Update Update Update Update v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Time Generation 1 As the record is updated, copies may get out of sync. 29

  26. Example: Social Alice East Record Timeline West ___ Busy Free Free

  27. Consistency Model Read Stale version Current version Stale version v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Generation 1 In general, reads are served using a local copy 31

  28. Consistency Model Read up-to-date Stale version Current version Stale version v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Generation 1 But application can request and get current version 32

  29. Consistency Model Read ≥ v.6 Stale version Current version Stale version v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Generation 1 Or variations such as “read forward”—while copies may lag the master record, every copy goes through the same sequence of changes 33

  30. Consistency Model Write Stale version Current version Stale version v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Generation 1 Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors) 34

  31. Consistency Model Write if = v.7 ERROR Stale version Current version Stale version v. 2 v. 5 v. 1 v. 3 v. 4 v. 6 v. 7 v. 8 Time Generation 1 Test-and-set writes facilitate per-record transactions 35

  32. Consistency Techniques • Per-record mastering • Each record is assigned a “master region” • May differ between records • Updates to the record forwarded to the master region • Ensures consistent ordering of updates • Tablet-level mastering • Each tablet is assigned a “master region” • Inserts and deletes of records forwarded to the master region • Master region decides tablet splits • These details are hidden from the application • Except for the latency impact!

  33. Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W Tablet master C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E 37

  34. Bulk Insert/Update/Replace • Client feeds records to bulk manager • Bulk loader transfers records to SU’s in batches • Bypass routers and message brokers • Efficient import into storage unit Client Bulk manager Source Data

  35. Bulk Load in YDOT • YDOT bulk inserts can cause performance hotspots • Solution: preallocate tablets

  36. Index Maintenance • How to have lots of interesting indexes and views, without killing performance? • Solution: Asynchrony! • Indexes/views updated asynchronously when base table updated

  37. SHERPAIN CONTEXT 41

  38. Types of Record Stores • Query expressiveness S3 PNUTS Oracle Simple Feature rich Object retrieval Retrieval from single table of objects/records SQL

  39. Types of Record Stores • Consistency model S3 PNUTS Oracle Best effort Strong guarantees Eventual consistency Timeline consistency ACID Program centric consistency Object-centric consistency

  40. Types of Record Stores • Data model PNUTS CouchDB Oracle Flexibility, Schema evolution Optimized for Fixed schemas Object-centric consistency Consistency spans objects

  41. Types of Record Stores • Elasticity (ability to add resources on demand) PNUTS S3 Oracle Inelastic Elastic Limited (via data distribution) VLSD (Very Large Scale Distribution /Replication)

  42. User-partitioned SQL stores Microsoft Azure SDS Amazon SimpleDB Multi-tenant application databases Salesforce.com Oracle on Demand Mutable object stores Amazon S3 Versus PNUTS More expressive queries Users must control partitioning Limited elasticity Highly optimized for complex workloads Limited flexibility to evolving applications Inherit limitations of underlying data management system Object storage versus record management Data Stores Comparison

  43. Application Design Space Get a few things Sherpa MObStor YMDB MySQL Oracle Filer BigTable Scan everything Hadoop Everest Files Records 47

  44. Alternatives Matrix Consistency model Structured access Global low latency SQL/ACID Availability Operability Updates Elastic Sherpa Y! UDB MySQL Oracle HDFS BigTable Dynamo Cassandra 48

  45. Further Reading Efficient Bulk Insertion into a Distributed Ordered Table (SIGMOD 2008) Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan PNUTS: Yahoo!'s Hosted Data Serving Platform (VLDB 2008) Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni Asynchronous View Maintenance for VLSD Databases, Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava and Raghu Ramakrishnan SIGMOD 2009 (to appear) Cloud Storage Design in a PNUTShell Brian F. Cooper, Raghu Ramakrishnan, and Utkarsh Srivastava Beautiful Data, O’Reilly Media, 2009 (to appear)

  46. QUESTIONS? 50

  47. Hadoop

  48. Problem How do you scale up applications? Run jobs processing 100’s of terabytes of data Takes 11 days to read on 1 computer Need lots of cheap computers Fixes speed problem (15 minutes on 1000 computers), but… Reliability problems In large clusters, computers fail every day Cluster size is not fixed Need common infrastructure Must be efficient and reliable

  49. Solution Open Source Apache Project Hadoop Core includes: Distributed File System - distributes data Map/Reduce - distributes application Written in Java Runs on Linux, Mac OS/X, Windows, and Solaris Commodity hardware

  50. Hardware Cluster of Hadoop Typically in 2 level architecture Nodes are commodity PCs 40 nodes/rack Uplink from rack is 8 gigabit Rack-internal is 1 gigabit

More Related