RAMCloud: a Low-Latency Datacenter Storage System

Presentation Transcript


  1. RAMCloud: a Low-Latency Datacenter Storage System
     John Ousterhout
     Stanford University
     ouster@cs.stanford.edu

  2. Introduction
     • RAMCloud: new class of datacenter storage
       • All data always in DRAM
     • Large scale:
       • 100 - 10,000 storage servers
       • 100 TB - 1 PB total capacity
     • Low latency:
       • 5 µs remote access time from anywhere in datacenter
     • Durable/available
     • Overall goal: enable a new class of applications
     Does this make sense for radio astronomy apps?
     Exascale Radio Astronomy Conference

  3. Traditional Storage Choices

  4. RAMCloud Architecture
     [Diagram: 1000 – 100,000 application servers (each an application plus library) connected through the datacenter network to a coordinator and 1000 – 10,000 storage servers (commodity servers, 64-256 GB per server, each shown with a master and a backup).]
     • High-speed networking:
       • 5 µs round-trip
       • Full bisection bandwidth

  5. Data Model: Key-Value Store
     [Diagram: tables of objects; each object has a key (≤ 64KB), a version (64 bits), and a blob (≤ 1MB).]
     • Basic operations:
       • read(tableId, key) => blob, version
       • write(tableId, key, blob) => version
       • delete(tableId, key)
     • Other operations:
       • cwrite(tableId, key, blob, version) => version (only overwrite if version matches)
       • Enumerate objects in table
       • Efficient multi-read, multi-write
       • Atomic increment
     • Under development:
       • Secondary indexes
       • Atomic updates of multiple objects
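A minimal in-process sketch of the slide 5 data model may help make the version semantics concrete. This is not the RAMCloud client library: `ToyStore`, `VersionMismatch`, and `increment` are hypothetical names, the single version counter and the byte-size constants are assumptions, and the retry loop only illustrates how a conditional write can support read-modify-write. It makes no claim about how RAMCloud implements its own atomic increment.

```python
MAX_KEY_BYTES = 64 * 1024      # slide: key <= 64KB (assuming KB = 1024 bytes)
MAX_BLOB_BYTES = 1024 * 1024   # slide: blob <= 1MB (assuming MB = 1024 * 1024 bytes)


class VersionMismatch(Exception):
    """Raised by cwrite when the stored version differs from the expected one."""


class ToyStore:
    """In-process stand-in for the table/key/blob/version model (hypothetical)."""

    def __init__(self):
        self._tables = {}    # tableId -> {key: (blob, version)}
        self._version = 0    # assumption: one counter, so versions never repeat

    def read(self, table_id, key):
        # Returns (blob, version); raises KeyError if the object is absent.
        return self._tables[table_id][key]

    def write(self, table_id, key, blob):
        if len(key) > MAX_KEY_BYTES or len(blob) > MAX_BLOB_BYTES:
            raise ValueError("key or blob exceeds the size limit")
        self._version += 1
        self._tables.setdefault(table_id, {})[key] = (blob, self._version)
        return self._version

    def delete(self, table_id, key):
        self._tables.get(table_id, {}).pop(key, None)

    def cwrite(self, table_id, key, blob, version):
        # Conditional write: only overwrite if the stored version matches.
        _, current = self.read(table_id, key)
        if current != version:
            raise VersionMismatch(f"expected {version}, found {current}")
        return self.write(table_id, key, blob)


def increment(store, table_id, key, delta=1):
    """Read-modify-write loop built on cwrite; retries if the version moved."""
    while True:
        blob, version = store.read(table_id, key)
        value = int.from_bytes(blob, "big") + delta
        try:
            store.cwrite(table_id, key, value.to_bytes(8, "big"), version)
            return value
        except VersionMismatch:
            continue
```

A caller holding a stale version gets `VersionMismatch` from `cwrite` instead of silently overwriting a newer value, which is the behavior the "only overwrite if version matches" note describes.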

  6. Data Durability
     • One copy of data in DRAM
     • Multiple copies on disk/flash
     • Each master's backup data scattered across cluster
     • Fast crash recovery
       • Remaining servers work together to recover lost data
       • Typical recovery time: 1-2 seconds
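The scattering idea on slide 6 can be sketched as follows. This is an illustration only: the unit of placement ("chunk"), the random placement policy, the three copies, and the names `scatter_replicas` and `backups_holding` are all assumptions, not details given in the slides.

```python
import random


def scatter_replicas(chunk_ids, backups, copies, rng):
    """Pick `copies` distinct backups at random for every chunk of one master."""
    if copies > len(backups):
        raise ValueError("not enough backups for the requested number of copies")
    return {chunk: rng.sample(backups, copies) for chunk in chunk_ids}


def backups_holding(placement):
    """Set of backups that hold at least one chunk of this master's data."""
    return {backup for replicas in placement.values() for backup in replicas}


backups = [f"backup-{i}" for i in range(100)]      # hypothetical cluster
placement = scatter_replicas(range(1000), backups, copies=3, rng=random.Random(0))
print(len(backups_holding(placement)), "backups hold part of this master's data")
```

Because the copies are spread over many backups, many servers can each read a small share of the lost data at once after a crash, which is one way to read the slide's "remaining servers work together" point.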

  7. RAMCloud Performance
     • Using Infiniband networking (24 Gb/s, kernel bypass)
       • Other networking also supported, but slower
     • Reads:
       • 100B objects: 5µs
       • 10KB objects: 10µs
       • Single-server throughput (100B objects): 700 Kops/sec.
       • Small-object multi-reads: 1-2M objects/sec.
     • Durable writes:
       • 100B objects: 16µs
       • 10KB objects: 40µs
       • Small-object multi-writes: 400-500K objects/sec.
     Note on slide: 1 client, 1 server
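As a rough back-of-the-envelope reading of the slide 7 read numbers, the arithmetic below relates latency to throughput. It assumes the 5 µs latency also holds while the server is loaded, which the slide does not state; the variable names are illustrative.

```python
read_latency_s = 5e-6          # slide: a 100B read takes 5 µs
server_ops_per_s = 700_000     # slide: single-server throughput, 100B objects

# One client issuing reads back to back, each waiting for the previous one.
reads_per_s_one_at_a_time = 1 / read_latency_s

# Little's law: requests in flight = throughput x latency.
# Assumption: latency under load stays at the 5 µs figure.
requests_in_flight = server_ops_per_s * read_latency_s

print(f"{reads_per_s_one_at_a_time:,.0f} reads/s one at a time")
print(f"{requests_in_flight:.1f} requests in flight at 700 Kops/s")
```

Under that assumption, a single synchronous requester tops out near 200,000 reads per second, so reaching 700 Kops/sec on one server implies about 3.5 requests outstanding at a time on average.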

  8. Comparisons

  9. RAMCloud Status
     • Ongoing research project at Stanford
     • Goal: production-quality system
     • Source code freely available
     • Version 1.0 tagged in January 2014 (first version suitable for real applications)
     • Starting to work with early adopters
     • System requirements:
       • x86 servers (minimum cluster size: 10-20 servers)
       • Linux operating system
       • Need networking with kernel-bypass NICs
       • Built-in support for Mellanox Infiniband
       • Driver for SolarFlare 10 Gb/s Ethernet NICs under development

  10. Is RAMCloud Right for You?
      Issues to consider:
      • Remote access data model
      • Sparse vs. bulk
      • Key-value store
      • Durability

  11. Large-Scale Applications
      [Diagrams: computation nodes (C) and storage nodes (D) separated by a network; computation and data (C, D) colocated on nodes joined by a network.]
      • Remote data access
        • Works best for:
          • Sparse and unpredictable data accesses
          • No locality
          • Performance dominated by latency
        • Example: transactional Web applications (Facebook)
      • Computation, data colocated
        • Works best for:
          • Bulk processing (touch all data)
          • High locality of access
          • Performance dominated by bandwidth
        • Examples: analytics

  12. Conclusion
      • RAMCloud: general-purpose DRAM-based storage
        • Scale
        • Latency
      • Goals:
        • Harness full performance potential of DRAM-based storage
        • Enable new applications: intensive manipulation of large-scale data
      • What could you do with:
        • 1M cores
        • 1 petabyte data
        • 5-10µs access time