
RAMCloud Overview


maxine

Presentation Transcript


  1. RAMCloud Overview
     • Storage for datacenters
     • 1000-10000 commodity servers
     • 32-64 GB DRAM/server
     • All data always in RAM
     • Durable and available
     • Performance goals:
       • High throughput: 1M ops/sec/server
       • Low-latency access: 5-10 µs RPC
     [Figure: Application Servers and Storage Servers in a Datacenter]
     CS 142 Lecture Notes: Large-Scale Web Applications

  2. Example Configurations
     For $100-200K today:
     • One year of Amazon customer orders
     • One year of United flight reservations

  3. RAMCloud Motivation: Latency
     • Large-scale apps struggle with high latency
     • Facebook: can only make 100-150 internal requests per page
     [Figure: Traditional Application (UI, App. Logic, Data Structures on a single machine; << 1 µs latency) vs. Web Application (UI, Bus. Logic on Application Servers, plus Storage Servers, in a datacenter; 0.5-10 ms latency)]

  4. RAMCloud Motivation: Latency
     • RAMCloud goal: large scale and low latency
     • Enable new class of applications:
       • Crowd-level collaboration
       • Large-scale graph algorithms
     [Figure: Traditional Application (UI, App. Logic, Data Structures on a single machine; << 1 µs latency) vs. Web Application (UI, Bus. Logic on Application Servers, plus Storage Servers, in a datacenter; 0.5-10 ms latency, with 5-10 µs shown as the RAMCloud figure)]
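The latency figures on these two slides can be turned into a rough request budget. The sketch below counts how many back-to-back storage requests fit into a fixed time budget per page. The 100 ms budget and the function name are assumptions made for illustration; the slides give only the latencies and the 100-150 requests observed at Facebook.

```python
def sequential_requests(budget_us: int, latency_us: int) -> int:
    """How many strictly sequential requests fit into a time budget.

    Both arguments are in microseconds so the division stays in integers.
    """
    return budget_us // latency_us


PAGE_BUDGET_US = 100_000  # assumed 100 ms per page; not a figure from the slides

# Datacenter storage today: 0.5-10 ms per request (from the slides).
today_best = sequential_requests(PAGE_BUDGET_US, 500)      # 0.5 ms each
today_worst = sequential_requests(PAGE_BUDGET_US, 10_000)  # 10 ms each

# RAMCloud goal: 5-10 µs per RPC (from the slides).
goal_best = sequential_requests(PAGE_BUDGET_US, 5)
goal_worst = sequential_requests(PAGE_BUDGET_US, 10)

print(today_worst, today_best)  # 10 200
print(goal_worst, goal_best)    # 10000 20000
```

Under this assumed budget, millisecond-scale storage allows tens to a few hundred sequential requests, the same order as the 100-150 quoted for Facebook, while the 5-10 µs goal allows roughly 100 times more.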

  5. RAMCloud Motivation: Technology
     Disk access rate not keeping up with capacity:
     • Disks must become more archival
     • More information must move to memory

  6. RAMCloud Research Issues
     • Data durability/availability
     • Fast RPCs
     • Data model, concurrency/consistency model
     • Data distribution, scaling
     • Automated management
     • Multi-tenancy
     • Client-server functional distribution
     • Node architecture

  7. Data Durability/Availability
     • Data must be durable and available when write RPC returns
     • Unattractive approaches:
       • Replicate in other memories (too expensive)
       • Synchronous disk write (100-1000x too slow)
     • Our approach: buffered logging
     [Figure: Storage Servers, each with DRAM and disk; a write is sent as log entries ("log") to other servers, and the logs reach disk "async, batch"]
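The buffered-logging idea on this slide can be sketched in a few lines. This is a toy, single-process model, not RAMCloud's implementation: the class names `Master` and `Backup`, the `flush` method, and the use of plain lists for the buffer and disk are all assumptions made for illustration. The point it shows is that a write returns once each backup has the entry in its in-memory buffer, and the disk write happens later in a batch.

```python
class Backup:
    """Hypothetical backup server: buffers log entries, flushes them later."""

    def __init__(self):
        self.buffer = []  # log entries held in memory, not yet on disk
        self.disk = []    # stand-in for the on-disk log

    def append(self, entry):
        # Returns as soon as the entry is buffered; no disk I/O on this path.
        self.buffer.append(entry)

    def flush(self):
        # In a real system this would run asynchronously; here it is called
        # explicitly. All buffered entries go to disk in one batch.
        self.disk.extend(self.buffer)
        self.buffer.clear()


class Master:
    """Hypothetical master: keeps the data in memory and logs to backups."""

    def __init__(self, backups):
        self.table = {}
        self.backups = backups

    def write(self, key, value):
        self.table[key] = value
        for backup in self.backups:
            backup.append((key, value))
        return "ok"  # the write RPC returns without waiting for any disk


backups = [Backup(), Backup()]
master = Master(backups)
master.write("user:1", "alice")
for backup in backups:
    backup.flush()  # later, off the write path
```

Between `write` returning and `flush` running, the entry exists only in memory on the master and the backups, which is why the next slide raises power loss as a potential problem.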

  8. Buffered Logging, cont'd
     • Potential problem: power loss
       • Per-server battery backup?
       • Nonvolatile memory on disk controllers?
     • Potential problem: crash recovery
       • If master crashes, data unavailable until recovered from disks on backups
       • Read 64 GB from one disk? 10 minutes
       • Our goal: recover in 1-2 seconds
     • Solution: take advantage of system scale
       • Scatter backup data across many servers
       • Recover in parallel

  9. Recovery, First Try
     • Scatter log segments randomly across all servers
     • After crash, all backups read disks in parallel (64 GB / 1000 backups @ 100 MB/sec = 0.6 sec)
     • Collect all backup data on replacement master (64 GB / 1 GB/sec ~ 60 sec: too slow!)
     [Figure: Backups ... all sending to one Replacement Master]
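The estimates on this slide and the previous one follow from a single division. The sketch below reproduces them, assuming decimal units (1 GB = 1000 MB) and that work divides evenly across backups; the helper name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(data_mb: float, rate_mb_per_s: float, parallelism: int = 1) -> float:
    """Time to move data_mb when `parallelism` devices each run at rate_mb_per_s."""
    return data_mb / (rate_mb_per_s * parallelism)


MASTER_DATA_MB = 64_000  # 64 GB of data on the crashed master
DISK_MB_PER_S = 100      # per-disk read rate from the slide
NIC_MB_PER_S = 1_000     # 1 GB/sec into the replacement master

one_disk = transfer_seconds(MASTER_DATA_MB, DISK_MB_PER_S)
all_backups = transfer_seconds(MASTER_DATA_MB, DISK_MB_PER_S, parallelism=1000)
one_nic = transfer_seconds(MASTER_DATA_MB, NIC_MB_PER_S)

print(f"one disk:           {one_disk / 60:.1f} min")
print(f"1000 backups:       {all_backups:.2f} s")
print(f"one replacement NIC: {one_nic:.0f} s")
```

Reading from one disk takes about 640 seconds (the slide's "10 minutes"), 1000 disks in parallel take about 0.64 seconds, and funneling everything through one replacement master's network link takes about 64 seconds, so the network link becomes the bottleneck.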

  10. Recovery, Second Try
     • Divide each master's data into partitions
     • Recover each partition on a separate server:
       • 100 partitions, 640 Mbytes each
       • 1 GB/sec NIC per replacement master
     • Recovery time < 1 sec
     [Figure: Dead Master; Backups ... sending to multiple Replacement Masters]
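The numbers on this slide can be checked the same way. The sketch below splits the master's data evenly across partitions and computes how long each replacement master needs to receive its share. It assumes decimal units, perfectly even partitions, and that the network link is the only limit; the function name is hypothetical.

```python
def partitioned_recovery(data_mb: float, partitions: int, nic_mb_per_s: float):
    """Return (MB per partition, seconds for one replacement master to receive it)."""
    per_partition_mb = data_mb / partitions
    return per_partition_mb, per_partition_mb / nic_mb_per_s


size_mb, seconds = partitioned_recovery(64_000, partitions=100, nic_mb_per_s=1_000)
print(f"{size_mb:.0f} MB per partition, {seconds:.2f} s per replacement master")
```

With 100 partitions, each replacement master receives 640 MB in about 0.64 seconds, consistent with the slide's "Recovery time < 1 sec". Because the partitions recover in parallel, the total time is set by one partition rather than by the sum.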
