
FAWN: A Fast Array of Wimpy Nodes

Presented by: Clint Sbisa & Irene Haque


Presentation Transcript


  1. FAWN: A Fast Array of Wimpy Nodes
     Presented by: Clint Sbisa & Irene Haque

  2. Motivation
     • Large-scale data-intensive applications
         • Facebook, LinkedIn, Dynamo
     • CPU-I/O gap
         • Storage, network, and memory bottlenecks
         • Low CPU utilization
     • CPU power
         • Slower CPUs execute more queries per second per Watt
         • 1 billion vs. 100 million instructions per Joule (slow vs. fast CPUs)
         • Inefficient energy-saving techniques
     • Memory power

  3. FAWN
     • Data-intensive, computationally simple workloads
     • Small objects: 100 B to 1 KB
     • Cluster of embedded CPUs using flash storage
         • Efficient
         • Fast random reads
         • Slow random writes
     • FAWN-KV
         • Key-value storage
         • Consistent hashing
     • FAWN-DS
         • Data store
         • Log-structured

  4. FAWN-DS
     • Log-structured key-value store
     • Contains all values in a key range for each virtual ID
     • Maps 160-bit keys
         • Hash Index bucket = i low-order index bits
         • Key fragment = next 15 low-order bits
     • 6-byte in-memory Hash Index entry stores the key fragment and a pointer
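
The key-splitting rule on this slide can be sketched in a few lines. This is only an illustration: the use of SHA-1 to obtain a 160-bit key, the function names, and the exact 6-byte layout (15-bit fragment plus one valid bit in two bytes, followed by a 4-byte pointer) are assumptions, not details taken from the slides.

```python
import hashlib
import struct

FRAG_BITS = 15  # key-fragment width given on the slide


def split_key(key: bytes, index_bits: int) -> tuple[int, int]:
    """Return (bucket, fragment) for a key, using its 160-bit hash."""
    digest = hashlib.sha1(key).digest()  # assumption: SHA-1 supplies the 160-bit key
    h = int.from_bytes(digest, "big")
    bucket = h & ((1 << index_bits) - 1)                    # i low-order bits
    fragment = (h >> index_bits) & ((1 << FRAG_BITS) - 1)   # next 15 low-order bits
    return bucket, fragment


def pack_entry(fragment: int, offset: int, valid: bool = True) -> bytes:
    """Pack one index entry into 6 bytes (assumed layout, see lead-in)."""
    return struct.pack(">HI", (fragment << 1) | int(valid), offset)


def unpack_entry(entry: bytes) -> tuple[int, bool, int]:
    """Inverse of pack_entry: return (fragment, valid, offset)."""
    head, offset = struct.unpack(">HI", entry)
    return head >> 1, bool(head & 1), offset
```

Storing only a fragment keeps the index small; a lookup would compare the fragment first and read the full key from the log only when the fragment matches.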

  5. FAWN-DS
     • Basic functions:
         • Store
         • Lookup
         • Delete
     • Virtual node maintenance:
         • Split
         • Merge
         • Compact
     • Concurrent operations
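
To make the basic functions concrete, here is a toy append-only store with an in-memory index. It is a minimal sketch under simplifying assumptions: the class name `LogStore`, the Python list standing in for the on-flash log, and the `None` delete marker are all hypothetical, and Split and Merge are left out.

```python
class LogStore:
    """Toy append-only key-value store with an in-memory index."""

    def __init__(self):
        self.log = []    # append-only records: (key, value); value None marks a delete
        self.index = {}  # key -> position of the newest record for that key

    def store(self, key, value):
        # Writes never update in place; they append and repoint the index.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def lookup(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]

    def delete(self, key):
        if key in self.index:
            self.log.append((key, None))  # delete marker appended to the log
            del self.index[key]

    def compact(self):
        # Rewrite the log keeping only live records, in their original order.
        live = sorted(self.index.items(), key=lambda kv: kv[1])
        self.log = [(k, self.log[pos][1]) for k, pos in live]
        self.index = {k: i for i, (k, _) in enumerate(self.log)}
```

Because every mutation is a sequential append, the store avoids the slow random writes that the earlier slide attributes to flash; Compact reclaims the space left by overwritten and deleted records.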

  6. FAWN-KV
     • Consistent hashing of back-end VIDs
     • Management node
         • Assigns each front-end a part of the circular key space
     • Front-end nodes
         • Manage their key space
         • Forward out-of-range requests
     • Back-end nodes (VIDs)
         • Contact a front-end when joining
         • Own a key range
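
The idea of placing back-end VIDs on a circular key space can be sketched as a generic consistent-hashing ring. The names `Ring` and `h160`, the `node#n` scheme for deriving VID positions, and the default of two VIDs per node are assumptions made for illustration; front-ends and the management node are not modeled.

```python
import bisect
import hashlib


def h160(data: bytes) -> int:
    """Map bytes to a point on a 160-bit ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class Ring:
    def __init__(self):
        self.points = []  # sorted VID positions on the ring
        self.owner = {}   # VID position -> physical node name

    def join(self, node: str, vids: int = 2):
        # Each physical node contributes several virtual IDs.
        for v in range(vids):
            p = h160(f"{node}#{v}".encode())
            bisect.insort(self.points, p)
            self.owner[p] = node

    def leave(self, node: str):
        self.points = [p for p in self.points if self.owner[p] != node]
        self.owner = {p: n for p, n in self.owner.items() if n != node}

    def lookup(self, key: bytes) -> str:
        # A key belongs to the first VID at or after its hash, wrapping around.
        i = bisect.bisect_left(self.points, h160(key)) % len(self.points)
        return self.owner[self.points[i]]
```

A useful property of this scheme is that a join or leave only moves the keys in the ranges adjacent to the affected VIDs; every other key keeps its owner.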

  7. FAWN-KV
     • Chain replication
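
The slide gives only the name of the technique. As chain replication is generally described, writes enter at the head of a chain, are forwarded replica by replica, and are acknowledged by the tail, which also answers reads. The sketch below follows that general description; the `Replica` class and `make_chain` helper are hypothetical, and how FAWN-KV maps chains onto the ring is not shown.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.next = None  # successor in the chain; None for the tail

    def put(self, key, value):
        self.data[key] = value
        if self.next is not None:
            return self.next.put(key, value)  # forward down the chain
        return self.name                      # the tail acknowledges the write

    def get(self, key):
        return self.data.get(key)


def make_chain(names):
    """Link replicas in order and return (head, tail)."""
    replicas = [Replica(n) for n in names]
    for a, b in zip(replicas, replicas[1:]):
        a.next = b
    return replicas[0], replicas[-1]
```

For example, with `head, tail = make_chain(["A", "B", "C"])`, a call to `head.put("k", "v")` is acknowledged by `"C"`, after which `tail.get("k")` returns `"v"`. Reading from the tail means a client only sees values that every replica in the chain has already stored.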

  8. FAWN-KV
     • Join
         • Split key range
         • Pre-copy
         • Chain insertion
         • Log flush
     • Leave
         • Merge key range
         • Join into each chain
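
One way to read the last three join steps is: copy a snapshot to the new node while the chain keeps serving, insert the node into the chain, then replay the updates that arrived during the copy. The toy function below illustrates that reading only. The function name, the use of dicts as replicas, the shared update log, and inserting the new node at the tail are all assumptions, and the key-range split is not modeled.

```python
def join_chain(chain, log, writes_during_copy):
    """Add a new replica to `chain` and return it.

    chain: list of dict replicas, head first (toy stand-in for back-end nodes)
    log: list of (key, value) updates already applied to every replica
    writes_during_copy: updates that arrive while the pre-copy is running
    """
    new = {}

    # Pre-copy: bulk-copy a snapshot of the existing data to the new node.
    snapshot_len = len(log)
    for key, value in log[:snapshot_len]:
        new[key] = value

    # Meanwhile the old chain keeps accepting writes the new node has not seen.
    for key, value in writes_during_copy:
        log.append((key, value))
        for replica in chain:
            replica[key] = value

    # Chain insertion: the new node becomes part of the chain (at the tail here).
    chain.append(new)

    # Log flush: replay everything written after the snapshot was taken.
    for key, value in log[snapshot_len:]:
        new[key] = value
    return new
```

Splitting the work this way keeps the bulk of the copying off the critical path, so the chain is only briefly affected while the short tail of recent updates is flushed.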

  9. Individual Node Performance
     • Lookup speed
     • Bulk store speed: 23.2 MB/s, or 96% of raw speed

  10. Individual Node Performance
     • Put speed
     • BerkeleyDB for comparison: 0.07 MB/s, which shows the necessity of log-based filesystems

  11. Individual Node Performance
     • Read- and write-intensive workloads

  12. System Benchmarks
     • System throughput and power consumption

  13. Impact of Ring Membership Changes
     • Query throughput during node join and maintenance operations

  14. Impact of Ring Membership Changes
     • Query latency

  15. Alternative Architectures
     • Large dataset, low query rate → FAWN+Disk
     • Small dataset, high query rate → FAWN+DRAM
     • Middle range → FAWN+SSD

  16. Conclusion
     • Fast and energy-efficient processing of random read-intensive workloads
     • Over an order of magnitude more queries per Joule than traditional disk-based systems
