Design Principles and Architecture of Distributed File Systems

Explore the requirements, architecture, and read/write operations of distributed file systems like HDFS and GFS, along with research directions on block popularity, failure handling, and data-center networks.


Presentation Transcript


  1. HDFS/GFS

  2. Outline
  • Requirements for a Distributed File System
  • HDFS
    • Architecture
    • Read/Write
  • Research Directions
    • Popularity
    • Failures
    • Network

  3. Properties of a Data Center
  • Servers are built from commodity devices
    • Failure is extremely common
    • Servers have only a limited amount of HDD space
  • The network is over-subscribed
    • Bandwidth between servers differs by location
  • Applications are demanding
    • High throughput, low latency
  • Resources are grouped into failure zones
    • Independent units of failure

  4. Data-Center Architecture
  [Figure: typical data-center network tree with 10, 25, and 100 Gb/s links at different tiers]

  6. Data-Center Architecture
  [Figure: the same topology divided into Failure Domain 1 and Failure Domain 2]

  9. Goals for a Data Center File System
  • Reliable
    • Overcome server failures
  • High-performing
    • Provide good performance to applications
  • Aware of network disparities
    • Keep data local to the applications

  10. Common Design Principles
  • For performance: partition the data (see the sketch after this list)
    • Split data into chunks and distribute them across nodes
    • Provides high throughput: many clients can read chunks in parallel, which beats everyone reading the same file from one server
  • For reliability: replicate the data
    • Overcome failures by making copies across nodes
    • At least one copy should always be online
  • For network disparity: rack-aware allocation
    • Read from the closest block
    • Write to the closest location
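
To make the partitioning idea concrete, here is a minimal sketch, assuming a fixed 128 MB chunk size and a simple round-robin assignment (both illustrative choices, not something HDFS mandates for any particular cluster):

```python
# Illustrative sketch of data partitioning: split a file into fixed-size
# chunks and assign each chunk to a node round-robin. The chunk size and
# placement policy are assumptions for illustration only.
CHUNK_SIZE = 128 * 1024 * 1024  # 128 MB, a common HDFS block size

def partition(path, nodes):
    """Yield (chunk_index, node, data) for each chunk of one file."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            yield index, nodes[index % len(nodes)], data  # round-robin
            index += 1
```

Because each chunk lands on a different node, N readers can each pull a different chunk at full disk speed instead of queuing behind one server.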

  13. Outline
  • Requirements for a Distributed File System
  • HDFS
    • Architecture
    • Read/Write
  • Research Directions
    • Popularity
    • Failures
    • Network

  14. HDFS Architecture
  • Name Node: the master (only one per cluster); a toy sketch of its bookkeeping follows
    • All reads/writes go through the master
    • Manages the data nodes and tracks their status
    • Tracks the block-to-node mapping and the location of every block
    • Detects failures and triggers re-replication
    • Tracks performance and rebalances the data center
    • Orchestrates reads/writes
  • Data Node: one per server
    • Stores the blocks
    • Tracks the status of its blocks
    • Ensures the integrity of its blocks
  [Figure: one Name Node coordinating Data Nodes that hold block B and its replicas B`]
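
A minimal sketch of the master's bookkeeping, assuming a toy in-memory design (real HDFS keeps far richer metadata, an edit log, and more):

```python
import time
from collections import defaultdict

class NameNode:
    """Toy master: tracks which data nodes hold which blocks."""

    def __init__(self):
        self.block_locations = defaultdict(set)  # block_id -> {node_id}
        self.last_heartbeat = {}                 # node_id -> timestamp

    def heartbeat(self, node_id, block_ids):
        # Data nodes periodically report in, listing the blocks they store.
        self.last_heartbeat[node_id] = time.time()
        for b in block_ids:
            self.block_locations[b].add(node_id)

    def locate(self, block_id):
        # Clients ask the master where a block lives, then transfer data
        # directly with a data node; data never flows through the master.
        return self.block_locations[block_id]
```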

  15. What is a Distributed FS Write?
  • HDFS
    • For high performance: make N copies of the data to be written
    • Default N = 3
  [Figure: a client sends a write of block B to the HDFS master, which creates three copies]

  16. What is a Distributed FS Write?
  • HDFS
    • For fault tolerance: place copies in two different fault domains
    • 2 copies in the same rack, 1 in a different rack (sketched below)
  [Figure: Zone 1 holds two copies of block B; Zone 2 holds the third]
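
A sketch of the placement rule the slide describes: two copies in one rack, one in another. The rack map and the random choices are illustrative; real HDFS also weighs the writer's location and node load.

```python
import random

def place_replicas(racks):
    """Pick 3 nodes: two in one rack, one in a different rack.

    `racks` maps rack_id -> list of node_ids; assumes at least two
    racks and at least two nodes in the first rack chosen.
    """
    rack_a, rack_b = random.sample(list(racks), 2)  # two distinct racks
    two_local = random.sample(racks[rack_a], 2)     # 2 copies, same rack
    one_remote = random.choice(racks[rack_b])       # 1 copy, other rack
    return two_local + [one_remote]

# Example:
# place_replicas({"r1": ["n1", "n2", "n3"], "r2": ["n4", "n5"]})
# -> e.g. ["n2", "n1", "n5"]
```

Keeping two copies in one rack makes the second write cheap (intra-rack traffic), while the third copy survives the loss of an entire rack.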

  17. What is a Distributed FS Write?
  • HDFS
    • For network awareness: currently does nothing; it simply picks two random racks

  18. What is a Distributed FS Read?
  • HDFS
    • For network awareness/performance: pick the closest copy to read from (sketched below)
    • Nothing specific for reliability
  [Figure: the Name Node directs a read of block B to the nearest of its three replicas across Zone 1 and Zone 2]
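
A sketch of distance-aware replica selection, assuming a toy three-level distance: same node < same rack < remote rack.

```python
def network_distance(reader, node, rack_of):
    """0 = same node, 1 = same rack, 2 = different rack."""
    if reader == node:
        return 0
    return 1 if rack_of[reader] == rack_of[node] else 2

def pick_replica(reader, replicas, rack_of):
    # Read from whichever copy is topologically closest to the reader.
    return min(replicas, key=lambda n: network_distance(reader, n, rack_of))

# Example:
# pick_replica("n1", ["n2", "n4"], {"n1": "r1", "n2": "r1", "n4": "r2"})
# -> "n2" (same rack as the reader)
```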

  19. Implications of Read/Write Semantics
  • One application write == 3 HDFS writes
    • Writes are costly!
  • HDFS is optimized for write-once/read-many workloads
  • So what is an update/edit? Rewrite the blocks?
  [Figure: a client asks the Name Node to modify block B, which has three replicas across Zone 1 and Zone 2]

  20. Implications of Read/Write Semantics
  • One application write == 3 HDFS writes
    • Writes are costly!
  • HDFS is optimized for write-once/read-many workloads
  • An update/edit = delete the old data + write the new data (a toy version follows)
  [Figure: the three replicas of B are deleted and three replicas of the new block B` are written]
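
A toy version of such an edit, reusing the sketch NameNode above; `data_nodes` is a stand-in dict for real on-disk storage, and the versioned block id is a naive illustration:

```python
def update_block(name_node, block_id, new_data, data_nodes):
    """Toy 'edit' = delete every old replica + write the new data
    as a brand-new block. `data_nodes` maps node_id -> {block_id: bytes}."""
    # 1. Delete: drop all replicas of the old block.
    for node in name_node.block_locations.pop(block_id, set()):
        data_nodes[node].pop(block_id, None)
    # 2. Write: store the new version as a fresh block, 3 copies.
    new_id = f"{block_id}-v2"
    for node in list(data_nodes)[:3]:
        data_nodes[node][new_id] = new_data
        name_node.block_locations[new_id].add(node)
    return new_id
```

Even a one-byte edit therefore costs three deletes plus three full block writes, which is why HDFS targets write-once data.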

  21. Interesting Challenges
  • What happens with more popular blocks? Or less popular ones?
  • What happens during server failures? Can you lose data?
  • What happens if you have a better network, with no oversubscription?

  22. Outline
  • Requirements for a Distributed File System
  • HDFS
    • Architecture
    • Read/Write
  • Research Directions
    • Popularity
    • Failures
    • Network

  23. Popularity in HDFS
  • Not all files are equivalent
    • E.g., more people search for bball than hockey
  • More popular blocks will have more contention
    • Contention leads to slower performance
    • A search for bball will be slower

  24. Popularity in HDFS
  • # of copies of a block = function(popularity); a toy policy follows
    • If 50 people search for bball, then make 50 copies of its blocks
    • If only 3 search for hockey, then make 3
  • You want as many copies of a block as there are readers
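
A toy policy for the idea, with all constants assumed for illustration: scale copies with concurrent readers, but never drop below the reliability minimum.

```python
def target_replicas(concurrent_readers, base=3):
    """Toy popularity policy: roughly one copy per concurrent reader,
    but never fewer than the `base` copies kept for fault tolerance."""
    return max(base, concurrent_readers)

# 50 readers of "bball" -> 50 copies; 3 readers of "hockey" -> 3 copies.
```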

  26. Popularity in HDFS
  • As data gets old, fewer people care about it
    • E.g., last year's weather versus today's weather
  • When a block becomes old (older than a week), reduce the number of copies (a toy aging rule follows)
    • In Facebook data centers, only one copy of old data is kept
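
A toy aging rule matching the slide, with the one-week threshold taken from it and everything else assumed:

```python
import time

WEEK_SECONDS = 7 * 24 * 3600

def replicas_for_age(created_at, hot_copies=3, cold_copies=1, now=None):
    """Keep full replication for young blocks; once a block is older
    than a week, shrink toward a single copy (as the slide says
    Facebook does for cold data)."""
    now = time.time() if now is None else now
    return hot_copies if now - created_at < WEEK_SECONDS else cold_copies
```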

  27. Failures in a Data Center
  • Do servers fail? Constantly
    • Facebook: 1% of servers fail after a reboot
    • Google: at least one server fails every day
  • Recovery (sketched below):
    • A failed node stops sending heartbeats
    • The Name Node determines which blocks were on the failed node
    • It starts re-replicating them from the surviving copies
  [Figure: the Name Node notices a dead Data Node and re-replicates its blocks B and B` onto healthy nodes]
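
A sketch of heartbeat-based failure handling under the toy NameNode above; the 30-second timeout and the `copy_block` helper are assumptions:

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds; an assumed value, not an HDFS default

def find_dead_nodes(last_heartbeat, now=None):
    """Nodes that have not sent a heartbeat recently are presumed dead."""
    now = time.time() if now is None else now
    return {n for n, t in last_heartbeat.items()
            if now - t > HEARTBEAT_TIMEOUT}

def re_replicate(dead_nodes, block_locations, live_nodes, copy_block):
    """For each block that lost replicas, copy it from a survivor to a
    fresh live node. `copy_block(src, dst, block)` is hypothetical;
    assumes a spare live node always exists."""
    for block, holders in block_locations.items():
        survivors = holders - dead_nodes
        if not survivors:
            continue  # every replica lost: this is how you lose data
        for _ in holders & dead_nodes:      # one new copy per lost copy
            src = next(iter(survivors))
            dst = next(n for n in live_nodes
                       if n not in survivors and n not in dead_nodes)
            copy_block(src, dst, block)
            survivors = survivors | {dst}   # dst now holds the block
```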

  30. Problems with Locality-Aware DFS
  • Ignores contention on the servers
    • I/O contention greatly impacts performance
  • Ignores contention in the network
    • Causes similar performance degradation

  31. Types of Network Topologies
  • Current networks: uneven bandwidth everywhere
  • Future networks: even bandwidth everywhere
  [Figure: left, a current tree with 10/25/100 Gb/s links at different tiers; right, a future full-bisection network with 100 Gb/s links throughout]

  32. Implications of Network Topologies
  • Blocks can be spread out more!
    • No need to keep two blocks within the same rack
  • Same bandwidth everywhere, so no need for locality-aware placement (sketched below)
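
A sketch of the implication, under the same toy rack model as before: when the network offers full bisection bandwidth, placement can ignore racks entirely.

```python
import random

def choose_targets(nodes, racks, uniform_bandwidth, n=3):
    """If bandwidth is uniform everywhere, any n distinct nodes will do;
    otherwise fall back to rack-aware placement (2 + 1 across racks).
    Purely illustrative assumptions throughout."""
    if uniform_bandwidth:
        return random.sample(nodes, n)  # spread anywhere, maximally even
    rack_a, rack_b = random.sample(list(racks), 2)
    return random.sample(racks[rack_a], 2) + [random.choice(racks[rack_b])]
```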

  33. Summary
  • Properties for a DFS
  • Research challenges
    • Popularity
    • Failure
    • Data placement

  34. Un-discussed Topics
  • Cluster rebalancing
    • Move blocks around based on utilization
  • Data integrity
    • Use checksums to detect corrupted data (a sketch follows)
  • Staging + the write pipeline
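
For the data-integrity point, a minimal sketch: HDFS checksums block data (small per-chunk CRCs); SHA-256 here is just an illustrative stand-in.

```python
import hashlib

def block_checksum(data: bytes) -> str:
    # Real HDFS uses small per-chunk CRCs; a cryptographic hash is
    # overkill but keeps this sketch short.
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, stored_checksum: str) -> bool:
    """On read, recompute and compare; a mismatch means this replica is
    corrupt and the client should fetch another copy while the bad one
    is re-replicated."""
    return block_checksum(data) == stored_checksum
```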
