
A Low-bandwidth Network File System

A Low-bandwidth Network File System. Athicha Muthitacharoen, Benjie Chen, and David Mazieres MIT Laboratory for Computer Science and NYU Department of Computer Science. Presented by: Khaled Elmeleegy. Overview. LBFS is a network file system designed for low-bandwidth networks.


Presentation Transcript


  1. A Low-bandwidth Network File System Athicha Muthitacharoen, Benjie Chen, and David Mazieres MIT Laboratory for Computer Science and NYU Department of Computer Science Presented by: Khaled Elmeleegy

  2. Overview • LBFS is a network file system designed for low-bandwidth networks. • LBFS provides traditional file system semantics and consistency. • To reduce its bandwidth requirements, LBFS exploits cross-file similarities.

  3. LBFS Design • Persistent file cache at the client. • For a modified file, the client must transmit the changes to the server. • Divides the files it stores into chunks and indexes the chunks by hash value. • Avoids transmitting the chunks the recipient already has.

  4. File Chunking and Indexing • Files are divided into non-overlapping chunks. • LBFS selects the boundaries between chunks using Rabin fingerprints. • LBFS indexes the chunks of each file so that it can recognize identical chunks.
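The idea of choosing chunk boundaries from the file contents can be sketched as follows. This is a minimal illustration, not the LBFS implementation: it uses a simple polynomial rolling hash as a stand-in for Rabin fingerprints, and the window size, boundary mask, and size limits (`WINDOW`, `MASK`, `min_size`, `max_size`) are assumed values chosen to keep the demo small.

```python
import random

WINDOW = 48             # bytes covered by the rolling hash (assumed value)
MASK = (1 << 6) - 1     # boundary when the low 6 hash bits are all ones (assumed)
BASE = 257              # polynomial base for the stand-in rolling hash
MOD = (1 << 61) - 1     # large prime modulus


def chunk(data: bytes, min_size: int = 16, max_size: int = 1024) -> list:
    """Split data into non-overlapping chunks at content-defined boundaries."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the oldest byte so h covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        size = i + 1 - start
        # A boundary depends only on nearby bytes, not on the file offset.
        if (size >= min_size and (h & MASK) == MASK) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(rng.getrandbits(8) for _ in range(20000))
    edited = b"XYZ" + original  # insert three bytes at the front
    shared = set(chunk(original)) & set(chunk(edited))
    print(len(chunk(original)), "chunks,", len(shared), "unchanged after the edit")
```

Because a boundary is decided by the bytes in the window rather than by position, an insertion near the start of a file shifts the later chunks without changing their contents, so they can still be recognized as identical.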

  5. File Chunking and Indexing (cont’d) • If the client and server both have a chunk that produces the same SHA-1 hash, the two chunks are assumed to be identical and the chunk is not transferred.
     1. C1 C2 C3 C4 C5 C6 C7
     2. C1 C2 C8 C4 C5 C6 C7
     3. C1 C2 C8 C4 C9 C10 C6 C7
     4. C11 C8 C4 C9 C10 C6 C7
     Fig. Chunks of a file after various edits
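A small sketch of the hash index, using the chunk labels from the figure as stand-ins for chunk contents. The helper names `sha1` and `chunks_to_send` are hypothetical and only illustrate the lookup; they are not part of LBFS.

```python
import hashlib


def sha1(chunk: bytes) -> str:
    """Hex SHA-1 digest used as the index key for a chunk."""
    return hashlib.sha1(chunk).hexdigest()


def chunks_to_send(new_chunks, receiver_index):
    """Return only the chunks whose SHA-1 hash the receiver does not hold."""
    return [c for c in new_chunks if sha1(c) not in receiver_index]


# Labels stand in for real chunk data (versions 1 and 3 of the figure).
v1 = [b"C1", b"C2", b"C3", b"C4", b"C5", b"C6", b"C7"]
v3 = [b"C1", b"C2", b"C8", b"C4", b"C9", b"C10", b"C6", b"C7"]

index = {sha1(c): c for c in v1}     # receiver already stores version 1
missing = chunks_to_send(v3, index)  # sender wants to deliver version 3
print(missing)  # [b'C8', b'C9', b'C10']
```

Only three of the eight chunks in version 3 need to cross the network; the other five are found in the receiver's index.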

  6. File Consistency • Whenever a client makes any RPC on a file in LBFS, it gets back a read lease on the file. • When a user opens a file, if the lease on the file has not expired, then the open succeeds immediately with no messages sent to the server.

  7. File Consistency (cont’d) • If the lease has expired, the client asks the server for the attributes of the file and is implicitly granted a new lease on the file. • If the file is the same as when it was stored in the cache, the client uses the cached version. • If the file has been modified, the client must transfer the new contents from the server.
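The open path described on these two slides can be sketched as below. This is a simplified model under stated assumptions: a single `version` number stands in for the file attributes, the lease length `LEASE_SECONDS` is an invented value, and `Server`, `CacheEntry`, and `open_file` are hypothetical names. The current time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass

LEASE_SECONDS = 60.0  # assumed lease duration, not taken from the slides


@dataclass
class CacheEntry:
    data: bytes
    version: int          # stands in for the file attributes (assumption)
    lease_expiry: float


class Server:
    def __init__(self):
        self.files = {}    # path -> (version, data)
        self.messages = 0  # number of RPCs received, for the demo

    def get_attributes(self, path, now):
        """Return the attributes; the reply implicitly grants a lease."""
        self.messages += 1
        version, _ = self.files[path]
        return version, now + LEASE_SECONDS

    def fetch(self, path, now):
        """Return the file contents; this RPC also grants a lease."""
        self.messages += 1
        version, data = self.files[path]
        return version, data, now + LEASE_SECONDS


def open_file(cache, server, path, now):
    entry = cache.get(path)
    if entry is not None and now < entry.lease_expiry:
        return entry.data                  # lease valid: no messages sent
    if entry is not None:
        version, expiry = server.get_attributes(path, now)
        if version == entry.version:       # unchanged: reuse cached copy
            entry.lease_expiry = expiry
            return entry.data
    version, data, expiry = server.fetch(path, now)  # new or modified file
    cache[path] = CacheEntry(data, version, expiry)
    return data
```

With one file on the server, a first open costs one message, a second open inside the lease costs none, an open after expiry costs one attribute request, and an open after the file has changed costs an attribute request plus a transfer.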

  8. File Reads • Client: the file is not in the cache, so the client sends GETHASH(..). • Server: breaks the file into chunks and returns their hashes. • Client: the first hash is not in its DB, so it sends READ(..) for chunk #1; the second hash is not in its DB, so it sends READ(..) for chunk #2. • Server: returns chunk #1, then chunk #2. • Client: puts hash #1 and hash #2 in its DB, reconstructs the file, and returns it to the user. Fig. Reading a file using LBFS
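The read exchange in the figure can be modelled in a few lines. This is a toy in-memory sketch, not the LBFS RPC interface: the method names mirror GETHASH and READ, but their signatures are assumptions (for instance, `read` takes a chunk position rather than a byte range), and the server stores files as pre-split chunks to keep the example short.

```python
import hashlib


def sha1(chunk: bytes) -> str:
    return hashlib.sha1(chunk).hexdigest()


class Server:
    def __init__(self, files):
        self.files = files  # path -> list of chunks (pre-split for the demo)

    def gethash(self, path):
        """Return the hash of every chunk in the file, in order."""
        return [sha1(c) for c in self.files[path]]

    def read(self, path, position):
        """Return the data of one chunk (simplified READ)."""
        return self.files[path][position]


class Client:
    def __init__(self, server):
        self.server = server
        self.chunk_db = {}      # hash -> chunk data already held locally
        self.bytes_fetched = 0  # chunk bytes pulled over the network

    def read_file(self, path):
        parts = []
        for position, digest in enumerate(self.server.gethash(path)):
            if digest not in self.chunk_db:        # hash not in DB
                data = self.server.read(path, position)
                self.bytes_fetched += len(data)
                self.chunk_db[digest] = data       # put hash in DB
            parts.append(self.chunk_db[digest])
        return b"".join(parts)                     # file reconstructed


server = Server({"a": [b"hello ", b"world"], "b": [b"hello ", b"there"]})
client = Client(server)
client.read_file("a")
client.read_file("b")
print(client.bytes_fetched)  # 16
```

Reading the second file fetches only the 5-byte chunk that differs, because the shared chunk is already in the client's database from the first read.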

  9. File Writes • File updates are atomic, using a temp file. • In case of multiple concurrent writes to a file, the last writer wins.

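  10. File Writes • Client: the user closes the file; the client sends MKTMPFILE, breaks the file into chunks, and sends the hash of each chunk in a CONDWRITE. • Server: creates the tmp file and replies OK. • Server: the first hash is in its DB, so it writes the data into the tmp file and replies OK. • Server: the second hash is not in its DB, so it replies HASHNOTFOUND. • Client: the server has hash #1 but needs hash #2, so the client sends the data in a TMPWRITE. • Server: puts hash #2 into its database, writes the data into the tmp file, and replies OK. • Client: the server has everything, so the client sends COMMITTMP. • Server: no error, so it copies the data from the tmp file into the target file and replies OK. • Client: the file is closed; control returns to the user. Fig. Writing a file using LBFS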
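The write exchange can be modelled the same way. Again this is a toy in-memory sketch under assumptions: the method names follow the RPC names in the figure (MKTMPFILE, CONDWRITE, TMPWRITE, COMMITTMP), but the argument lists, the use of chunk positions instead of byte offsets, and the `write_file` helper are hypothetical.

```python
import hashlib


def sha1(chunk: bytes) -> str:
    return hashlib.sha1(chunk).hexdigest()


class Server:
    def __init__(self):
        self.chunk_db = {}    # hash -> chunk data
        self.files = {}       # path -> committed file contents
        self.tmp_files = {}   # tmp id -> {chunk position: data}
        self.next_tmp_id = 0

    def mktmpfile(self) -> int:
        tmp_id = self.next_tmp_id
        self.next_tmp_id += 1
        self.tmp_files[tmp_id] = {}
        return tmp_id

    def condwrite(self, tmp_id, position, digest) -> str:
        """Write the chunk from the server's own DB if the hash is known."""
        if digest not in self.chunk_db:
            return "HASHNOTFOUND"
        self.tmp_files[tmp_id][position] = self.chunk_db[digest]
        return "OK"

    def tmpwrite(self, tmp_id, position, data) -> str:
        """Receive chunk data the server did not have."""
        self.chunk_db[sha1(data)] = data
        self.tmp_files[tmp_id][position] = data
        return "OK"

    def committmp(self, tmp_id, path) -> str:
        """Replace the target file with the tmp file in one step."""
        parts = self.tmp_files.pop(tmp_id)
        self.files[path] = b"".join(parts[i] for i in sorted(parts))
        return "OK"


def write_file(server, path, chunks) -> int:
    """Write chunks to path; return the number of chunk bytes sent."""
    tmp_id = server.mktmpfile()
    sent = 0
    for position, data in enumerate(chunks):
        if server.condwrite(tmp_id, position, sha1(data)) == "HASHNOTFOUND":
            server.tmpwrite(tmp_id, position, data)
            sent += len(data)
    server.committmp(tmp_id, path)
    return sent


server = Server()
print(write_file(server, "a", [b"hello ", b"world"]))  # 11
print(write_file(server, "b", [b"hello ", b"there"]))  # 5
```

The target file only changes at the commit step, which is what makes the update atomic from the point of view of other readers.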

  11. Bandwidth Consumption Normalized bandwidth consumed by three workloads. The first four bars of each workload show upstream bandwidth, the second four downstream bandwidth.

  12. Performance Vs Bandwidth/Latency • Performance of the gcc workload over various bandwidths with a fixed round-trip time of 10 ms. • Performance of the gcc workload over a range of round-trip times with fixed 1.5 Mbit/sec symmetric links.

  13. Performance Vs Loss Rate Performance of a shortened ed benchmark over various loss rates, on a network with fixed 1.5 Mbit/sec symmetric links and a fixed round-trip time of 10 ms.
