Lecture 24: GFS


bina



  1. Lecture 24: GFS

  2. Google File System • Why yet another file system? • Who are the “clients”? • Design objectives • Architecture • Cache consistency • Location transparent/independent? • Stateful or stateless?

  3. Google File System • Key background: • New workload => new filesystem (why?) • Extreme scale: 100 TB over 1000s of disks on >1000 machines • New API • Architecture? • Who are the clients? • Four problems to solve: • Fault tolerance: with this many components, failures are routine, so recovery must be automatic. • Huge files (are they?) • Most common operation is append, not random writes • Most files are write-once, and read-only after that • e.g. web snapshots, intermediate files in a pipeline, archival files • implies that streaming is much more important than block caching (and LRU would be a bad choice) • Customize the API to enable optimization and flexibility

  4. GFS: New Workload • Few million large files, rather than billions of small files • Large streaming reads, small random reads • Large streaming writes, very rare random writes • Concurrency for append is critical • also producer/consumer concurrency • Focus on throughput (read/write bandwidth), not latency • Why?

  5. GFS Architecture

  6. GFS Architecture • single master, multiple chunkservers, multiple clients • fixed-size chunks (giant blocks) (64MB) • 64-bit ids for each chunk • clients read/write chunks directly from chunkservers • unit of replication? • master maintains all metadata • namespace and access control • map from filenames to chunk ids • current locations for each chunk • no caching for chunks • metadata is cached at clients
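
The lookup path on this slide can be sketched in a few lines of Python. This is a toy model, not GFS code: the class names `Master` and `Client`, the method names, and the sample file and server names are all hypothetical. It only shows how a byte offset maps to a chunk index with fixed 64 MB chunks, and how a client-side metadata cache keeps repeat lookups off the master.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64 MB chunks, as on the slide


def chunk_index(byte_offset):
    """Map a byte offset within a file to the index of the chunk holding it."""
    return byte_offset // CHUNK_SIZE


class Master:
    """Hypothetical master: holds only metadata, never chunk contents."""

    def __init__(self):
        self.file_chunks = {}      # filename -> ordered list of chunk ids
        self.chunk_locations = {}  # chunk id -> list of chunkserver names

    def lookup(self, filename, index):
        chunk_id = self.file_chunks[filename][index]
        return chunk_id, list(self.chunk_locations[chunk_id])


class Client:
    """Hypothetical client: caches metadata, not chunk data."""

    def __init__(self, master):
        self.master = master
        self.cache = {}        # (filename, chunk index) -> (chunk id, locations)
        self.master_calls = 0  # counts round trips to the master

    def locate(self, filename, byte_offset):
        key = (filename, chunk_index(byte_offset))
        if key not in self.cache:
            self.master_calls += 1
            self.cache[key] = self.master.lookup(*key)
        return self.cache[key]


master = Master()
master.file_chunks["/logs/crawl"] = [0xA1, 0xA2]
master.chunk_locations[0xA1] = ["cs1", "cs2", "cs3"]
master.chunk_locations[0xA2] = ["cs2", "cs4", "cs5"]

client = Client(master)
first = client.locate("/logs/crawl", 0)            # asks the master
again = client.locate("/logs/crawl", 1000)         # same chunk, served from cache
second = client.locate("/logs/crawl", CHUNK_SIZE)  # next chunk, asks the master
```

After the three calls the client has contacted the master twice. The actual reads and writes would then go directly to one of the returned chunkservers, which this sketch does not model.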

  7. GFS Master • Single master: • Claim: simple, but good enough • Enables good chunk placement (centralized decision) • Scalability is a concern. • What is the GFS solution? • Clients cache (file name -> chunk id, replica locations) entries, which eventually expire • Large chunk size reduces master RPC interaction and space overhead for metadata • All metadata is in memory (why?) • metadata is less than 64 B per chunk
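
A quick back-of-the-envelope calculation suggests why keeping all metadata in memory is plausible. The sketch below takes the 100 TB figure from slide 3 and treats the slide's 64 B per chunk as an upper bound. It assumes binary units, completely full chunks, and counts only per-chunk metadata (no namespace entries); the function name is hypothetical.

```python
TB = 2**40
MB = 2**20


def chunk_metadata_bytes(total_data_bytes, chunk_size=64 * MB, bytes_per_chunk=64):
    """Return (number of chunks, per-chunk metadata bytes) for a given data size."""
    num_chunks = -(-total_data_bytes // chunk_size)  # ceiling division
    return num_chunks, num_chunks * bytes_per_chunk


num_chunks, meta_bytes = chunk_metadata_bytes(100 * TB)
print(num_chunks)        # 1638400 chunks
print(meta_bytes // MB)  # 100 (MB of chunk metadata)
```

Under these assumptions, 100 TB of data needs on the order of 100 MB of chunk metadata, which fits comfortably in one machine's RAM. A smaller chunk size would multiply both the chunk count and the metadata footprint.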

  8. Question: Does GFS implement Location Transparency? What about Location Independence?

  9. Cache Consistency in GFS • Caching? • Of data? • Of metadata? • Metadata changes? • Namespace changes are atomic and serializable (easy since they go through one place: the master) • Replicas: • “consistent”: all replicas of a region have the same value • “defined”: the region is “consistent” and reflects a mutation in its entirety • Concurrent writes will leave the region consistent, but not necessarily defined; some updates may be lost • “region”? • A failed write leaves the region inconsistent • To ensure regions are defined, GFS must apply writes in the same order at all replicas: How?
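
One way to make "consistent" and "defined" concrete is the small sketch below. It is an illustration only: the function names and byte strings are hypothetical, and treating a region as defined when its contents equal exactly one mutation's payload is a simplifying assumption.

```python
def is_consistent(replica_regions):
    """True if every replica holds the same bytes for the region."""
    return len(set(replica_regions)) == 1


def is_defined(replica_regions, mutations):
    """True if the region is consistent and holds exactly what one mutation wrote."""
    return is_consistent(replica_regions) and replica_regions[0] in mutations


def describe(replica_regions, mutations):
    if not is_consistent(replica_regions):
        return "inconsistent"
    if is_defined(replica_regions, mutations):
        return "defined"
    return "consistent but undefined"


# Two clients write to the same 4-byte region of a chunk with three replicas.
mutations = [b"AAAA", b"BBBB"]

serial = [b"BBBB", b"BBBB", b"BBBB"]         # writes applied one after the other
mingled = [b"AABB", b"AABB", b"AABB"]        # fragments of both writes, same on all replicas
failed = [b"AAAA", b"AAAA", b"AA\x00\x00"]   # one replica missed part of the write

print(describe(serial, mutations))   # defined
print(describe(mingled, mutations))  # consistent but undefined
print(describe(failed, mutations))   # inconsistent
```

The `mingled` case is the one the slide warns about: every replica agrees, yet no client sees its own write in full.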

  10. Writes and replica updates in GFS

  11. Consistency in GFS (cont.) • consistency is ensured via chunk version numbers • stale replicas are not used and are garbage-collected • clients' cached chunk locations may point to stale replicas, so reads may return stale data! • this is OK for append-only files, but not for random writes • primary replica decides the write order (it holds a lease for this right from the master) • lease extensions are piggybacked on heartbeat messages • master can revoke a lease, but only with the replica's agreement; otherwise it has to wait for expiration • client pushes data out to all replicas using a data flow tree • after all replicas have received the data, the client notifies the primary to issue the write, which then gets ordered and can be executed • client retries if the write fails • writes across chunks: transactional? Consequences?
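
The ordering and version-number ideas on this slide can be sketched as follows. This is a toy model under stated assumptions: the classes `Replica` and `Primary` and the function `current_replicas` are hypothetical names, each mutation is modeled as a log entry, and the version number is assumed to be bumped by the master when it grants a new lease. Data push, lease expiry, and failure handling are left out.

```python
class Replica:
    """Hypothetical chunk replica that applies mutations in the order it is told."""

    def __init__(self, name, version=1):
        self.name = name
        self.version = version  # chunk version number known to this replica
        self.log = []           # (serial number, payload) in application order

    def apply(self, serial, payload):
        self.log.append((serial, payload))


class Primary(Replica):
    """Hypothetical lease holder: picks one serial order for all mutations."""

    def __init__(self, name, secondaries, version=1):
        super().__init__(name, version)
        self.secondaries = secondaries
        self.next_serial = 0

    def write(self, payload):
        serial = self.next_serial
        self.next_serial += 1
        self.apply(serial, payload)            # apply locally first
        for replica in self.secondaries:       # then forward the same order
            replica.apply(serial, payload)
        return serial


def current_replicas(replicas, master_version):
    """Replicas whose version matches the master's; the rest are stale."""
    return [r for r in replicas if r.version == master_version]


s1, s2 = Replica("s1"), Replica("s2")
primary = Primary("p", [s1, s2])
primary.write(b"from client A")
primary.write(b"from client B")
same_order = primary.log == s1.log == s2.log

# Assume s2 was down when the master granted a new lease and bumped the version.
master_version = 2
primary.version = 2
s1.version = 2
usable = [r.name for r in current_replicas([primary, s1, s2], master_version)]
```

Here `same_order` is true because only the primary assigns serial numbers, and `usable` excludes `s2`, whose version lags the master's. A stale replica like `s2` would be a candidate for garbage collection rather than for serving reads.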

  12. GFS: stateful or stateless?
