
JOURNALING VERSUS SOFT UPDATES: ASYNCHRONOUS META-DATA PROTECTION IN FILE SYSTEMS

Margo I. Seltzer, Harvard; Gregory R. Ganger, CMU; M. Kirk McKusick; Keith A. Smith, Harvard; Craig A. N. Soules, CMU; Christopher A. Stein, Harvard


Presentation Transcript


  1. JOURNALING VERSUS SOFT UPDATES: ASYNCHRONOUS META-DATA PROTECTION IN FILE SYSTEMS • Margo I. Seltzer, Harvard • Gregory R. Ganger, CMU • M. Kirk McKusick • Keith A. Smith, Harvard • Craig A. N. Soules, CMU • Christopher A. Stein, Harvard

  2. INTRODUCTION • Paper discusses the two most popular approaches for improving the performance of metadata operations and recovery: • Journaling • Soft Updates • Journaling systems record metadata operations on an auxiliary log (Hagmann) • Soft Updates uses ordered writes (Ganger & Patt)

  3. Metadata Operations • Metadata operations modify the structure of the file system • Creating, deleting, or renaming files, directories, or special files • Data must be written to disk in such a way that the file system can be recovered to a consistent state after a system crash

  4. Metadata Integrity • FFS uses synchronous writes to guarantee the integrity of metadata • Any operation modifying multiple pieces of metadata will write its data to disk in a specific order • These writes will be blocking • Guarantees integrity and durability of metadata updates

  5. Deleting a file (I) • Diagram: directory entries “abc”, “def”, and “ghi” point to i-node-1, i-node-2, and i-node-3 • Assume we want to delete file “def”

  6. Deleting a file (II) • Diagram: i-node-2 has been deleted while directory entry “def” remains and points to nothing (“?”) • Cannot delete i-node before directory entry “def”

  7. Deleting a file (III) • Correct sequence is • Write to disk directory block containing deleted directory entry “def” • Write to disk i-node block containing deleted i-node • Leaves the file system in a consistent state

  8. Creating a file (I) • Diagram: directory entries “abc” and “ghi” point to i-node-1 and i-node-3 • Assume we want to create new file “tuv”

  9. Creating a file (II) • Diagram: new directory entry “tuv” has been written but points to an i-node that does not exist yet (“?”) • Cannot write directory entry “tuv” before i-node

  10. Creating a file (III) • Correct sequence is • Write to disk i-node block containing new i-node • Write to disk directory block containing new directory entry • Leaves the file system in a consistent state
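
The ordering rules for deleting and creating a file can be checked with a small crash simulation. The sketch below is only an illustration, assuming a toy model in which the disk holds a directory (name to i-node number) and a set of allocated i-nodes. The helper names (`consistent`, `crash_states`, `always_consistent`) and i-node number 4 are hypothetical, not taken from the paper. The model treats a directory entry pointing to a missing i-node as inconsistent, and an i-node with no entry as a recoverable leak.

```python
# Toy model (an assumption for illustration): the on-disk state is a
# directory mapping names to i-node numbers plus a set of allocated i-nodes.

def consistent(directory, inodes):
    """True if every directory entry points to an allocated i-node."""
    return all(ino in inodes for ino in directory.values())

def crash_states(directory, inodes, writes):
    """Yield the disk state after each prefix of the ordered writes,
    i.e. every state a crash could leave behind."""
    d, i = dict(directory), set(inodes)
    yield dict(d), set(i)
    for kind, arg in writes:
        if kind == "add_entry":
            name, ino = arg
            d[name] = ino
        elif kind == "del_entry":
            d.pop(arg)
        elif kind == "alloc_inode":
            i.add(arg)
        elif kind == "free_inode":
            i.discard(arg)
        yield dict(d), set(i)

def always_consistent(directory, inodes, writes):
    """True if no crash point leaves a dangling directory entry."""
    return all(consistent(d, i) for d, i in crash_states(directory, inodes, writes))

start_dir = {"abc": 1, "def": 2, "ghi": 3}
start_inodes = {1, 2, 3}

# Deleting "def": directory entry first, then the i-node.
good_delete = [("del_entry", "def"), ("free_inode", 2)]
bad_delete = [("free_inode", 2), ("del_entry", "def")]

# Creating "tuv": i-node first, then the directory entry.
good_create = [("alloc_inode", 4), ("add_entry", ("tuv", 4))]
bad_create = [("add_entry", ("tuv", 4)), ("alloc_inode", 4)]

for label, writes in [("good_delete", good_delete), ("bad_delete", bad_delete),
                      ("good_create", good_create), ("bad_create", bad_create)]:
    print(label, always_consistent(start_dir, start_inodes, writes))
```

Only the two orderings given in the slides survive a crash at every point; the reversed orderings each expose a state with a dangling directory entry.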

  11. Synchronous Updates • Used by FFS to guarantee consistency of metadata: • All metadata updates are done through blocking writes • Increases the cost of metadata updates • Can significantly impact the performance of whole file system

  12. SOFT UPDATES • Use delayed writes (write back) • Maintain dependency information about cached pieces of metadata, for example: this i-node block must be updated before/after this directory entry • Guarantee that metadata blocks are written to disk in the required order
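
The idea of delayed writes ordered by dependency information can be sketched as a topological sort over dirty blocks. This is a simplified illustration, not the Soft Updates implementation: it tracks dependencies between whole blocks, and the class name `DelayedWriteCache`, its methods, and the block names are all hypothetical. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

class DelayedWriteCache:
    """Hypothetical write-back cache that remembers write-order constraints."""

    def __init__(self):
        # block -> set of blocks that must reach disk before it
        self.must_follow = {}
        self.disk_order = []

    def mark_dirty(self, block, after=()):
        """Record that `block` is dirty and must be written after `after`."""
        self.must_follow.setdefault(block, set()).update(after)
        for dep in after:
            self.must_follow.setdefault(dep, set())

    def flush(self):
        """Write all dirty blocks in an order that respects every constraint."""
        for block in TopologicalSorter(self.must_follow).static_order():
            self.disk_order.append(block)   # stands in for the real disk write
        self.must_follow.clear()
        return self.disk_order

cache = DelayedWriteCache()
# Creating a file: the i-node block must be on disk before the directory block.
cache.mark_dirty("dir_block_7", after=["inode_block_3"])
# Deleting a file elsewhere: the directory block goes before the i-node block.
cache.mark_dirty("inode_block_9", after=["dir_block_12"])
order = cache.flush()
print(order)
```

`TopologicalSorter` raises `CycleError` when the constraints are circular, which is exactly the situation that block-level tracking cannot handle on its own.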

  13. First Problem • Synchronous writes guaranteed that metadata operations were durable once the system call returned • Soft Updates guarantee that file system will recover into a consistent state but not necessarily the most recent one • Some updates could be lost

  14. Second Problem • Cyclical dependencies: • Same directory block contains entries to be created and entries to be deleted • These entries point to i-nodes in the same block

  15. Example (I) • Diagram: block A (directory block) holds the entry “def” and the new entry “xyz”; block B (i-node block) holds i-node-2 and the new i-node-3 • We want to delete file “def” and create new file “xyz”

  16. Example (II) • Cannot write block A before block B: • Block A contains a new directory entry pointing to block B • Cannot write block B before block A: • Block A contains a deleted directory entry pointing to block B

  17. The Solution (I) • Roll back metadata in one of the blocks to an earlier, safe state (safe state does not contain new directory entry) • Diagram: block A’, the rolled-back version of block A, with entry “def” deleted and without the new entry

  18. The Solution (II) • Write first block with metadata that were rolled back (block A’ of example) • Write blocks that can be written after first block has been written (block B of example) • Roll forward block that was rolled back • Write that block • Breaks the cyclical dependency but must now write block A twice
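
The rollback sequence can be checked against the two naive write orders. The sketch below is an illustration under assumptions: block A is modeled as a dictionary of directory entries, block B as a set of allocated i-node numbers, and the helper names (`consistent`, `disk_states`, `safe`) are hypothetical. Each write replaces a whole block, and every intermediate disk state is tested for dangling directory entries.

```python
def consistent(dir_block, inode_block):
    """Every directory entry must point to an allocated i-node."""
    return all(ino in inode_block for ino in dir_block.values())

def disk_states(writes):
    """Apply whole-block writes in order; return every intermediate disk state."""
    disk = {"A": {"def": 2}, "B": {2}}      # state before the two operations
    states = [(disk["A"], disk["B"])]
    for block, contents in writes:
        disk[block] = contents
        states.append((disk["A"], disk["B"]))
    return states

def safe(writes):
    """True if a crash after any write leaves a consistent file system."""
    return all(consistent(d, i) for d, i in disk_states(writes))

new_A = {"xyz": 3}     # cached block A: "def" deleted, "xyz" created
new_B = {3}            # cached block B: i-node-2 freed, i-node-3 allocated
rolled_back_A = {}     # block A': "def" deleted, "xyz" not yet added

a_first = [("A", new_A), ("B", new_B)]
b_first = [("B", new_B), ("A", new_A)]
with_rollback = [("A", rolled_back_A), ("B", new_B), ("A", new_A)]

print(safe(a_first), safe(b_first), safe(with_rollback))  # False False True
```

Writing A first exposes “xyz” pointing to an unallocated i-node, and writing B first exposes “def” pointing to a freed i-node; only the three-write sequence with the rolled-back block stays consistent throughout.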

  19. The Solution (III) • First, block A’ (entry “def” deleted, new entry “xyz” not yet present) • Then, block B (new i-node-3) • Last, block A (new entry “xyz”)

  20. JOURNALING (I) • Journaling systems maintain an auxiliary log that records all meta-data operations • Write-ahead logging ensures that the log is written to disk before any blocks containing data modified by the corresponding operations • After a crash, can replay the log to bring the file system to a consistent state
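
Write-ahead logging and log replay can be sketched with a toy journal. This is a minimal illustration, not LFFS: the class `JournaledFS` and its methods are hypothetical, the log is assumed to reach disk synchronously on every operation, and replay is redo-only with idempotent records.

```python
class JournaledFS:
    """Toy metadata journal: log first, update cached blocks second."""

    def __init__(self):
        self.log = []      # on-disk log (survives a crash)
        self.disk = {}     # on-disk metadata: file name -> i-node number
        self.cache = {}    # in-memory copy of the metadata (lost in a crash)

    def create(self, name, ino):
        self.log.append(("create", name, ino))   # log record goes out first
        self.cache[name] = ino                   # block update stays cached

    def delete(self, name):
        self.log.append(("delete", name, None))
        self.cache.pop(name, None)

    def crash(self):
        self.cache = None                        # cached updates are gone

    def recover(self):
        """Replay the log over the on-disk metadata."""
        state = dict(self.disk)
        for op, name, ino in self.log:
            if op == "create":
                state[name] = ino
            else:
                state.pop(name, None)
        self.disk = state
        self.cache = dict(state)
        self.log.clear()

fs = JournaledFS()
fs.create("abc", 1)
fs.create("def", 2)
fs.delete("def")
fs.crash()            # no metadata block was ever written back
fs.recover()
print(fs.disk)        # {'abc': 1}
```

No metadata block reaches disk before the crash, yet replaying the log reconstructs the state produced by all three operations.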

  21. JOURNALING (II) • Log writes are performed in addition to the regular writes • Journaling systems incur log write overhead but • Log writes can be performed efficiently because they are sequential • Metadata blocks do not need to be written back after each update

  22. JOURNALING (III) • Journaling systems can provide • the same durability semantics as FFS if the log is forced to disk after each meta-data operation • the laxer semantics of Soft Updates if log writes are buffered until entire buffers are full • Will discuss two implementations • LFFS-file • LFFS-wafs

  23. LFFS-file (I) • Maintains a circular log in a pre-allocated file in the FFS (about 1% of file system size) • Buffer manager uses a write-ahead logging protocol to ensure proper synchronization between regular file data and the log

  24. LFFS-file (II) • Buffer header of each modified block in cache identifies the first and last log entries describing an update to the block • System uses • First item to decide which log entries can be purged from log • Second item to ensure that all relevant log entries are written to disk before the block is flushed from the cache

  25. Example • Diagram: log entries numbered from 17 to 25, divided into recycled, written to disk, and not yet written regions • Buffer block (4 KB of data) with First = 19 and Last = 23 • Log entries 19 and above cannot be recycled until block is saved • Block cannot be saved until all log entries up to 23 are saved
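
The two rules in this example can be written as small predicates over the buffer header. This is a sketch only: the `BufferHeader` type, its field names, and the two helper functions are hypothetical, chosen to mirror the First/Last values on the slide.

```python
from dataclasses import dataclass

@dataclass
class BufferHeader:
    first_lsn: int   # oldest log entry describing an update to this block
    last_lsn: int    # newest log entry describing an update to this block

def can_flush_block(header, highest_lsn_on_disk):
    """Write-ahead rule: every log entry up to last_lsn must already be on disk."""
    return highest_lsn_on_disk >= header.last_lsn

def oldest_needed_lsn(dirty_headers):
    """Log entries below this number describe only saved blocks and may be
    recycled. Returns None when no dirty block remains."""
    return min((h.first_lsn for h in dirty_headers), default=None)

block = BufferHeader(first_lsn=19, last_lsn=23)

print(can_flush_block(block, highest_lsn_on_disk=22))  # False: entry 23 not on disk yet
print(can_flush_block(block, highest_lsn_on_disk=23))  # True
print(oldest_needed_lsn([block]))                      # 19
```

Once the block is saved it leaves the dirty list, so the recycling boundary moves up to the smallest First value among the remaining dirty blocks.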

  26. LFFS-file (III) • LFFS-file maintains its log asynchronously • Maintains file system integrity, but does not guarantee durability of updates

  27. LFFS-wafs (I) • Implements its log in an auxiliary file system: Write Ahead File System (WAFS) • Can be mounted and unmounted • Can append data • Can return data by sequential or keyed reads • Keys for keyed reads are log-sequence-numbers (LSNs) that correspond to logical offsets in the log

  28. LFFS-wafs (II) • Log is implemented as a circular buffer within the physical space allocated to the file system • Buffer header of each modified block in cache contains LSNs of first and last log entries describing an update to the block • LFFS-wafs uses the same checkpointing scheme and the same write-ahead logging protocol as LFFS-file
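
A circular log keyed by LSNs that act as logical offsets can be sketched as follows. The slides give only the general idea, so everything concrete here is an assumption: the class `CircularLog`, the byte-string record format, the tiny capacity, and the `checkpoint` method that advances the reclaimable boundary. The physical position of a byte is its logical offset modulo the capacity.

```python
class CircularLog:
    """Hypothetical append-only log stored in a fixed-size circular buffer."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.head = 0    # LSN at which the next record will be appended
        self.tail = 0    # oldest LSN still needed (advanced by checkpoints)

    def append(self, record: bytes) -> int:
        """Append a record and return its LSN (logical offset)."""
        if self.head + len(record) - self.tail > self.capacity:
            raise RuntimeError("log full: checkpoint needed to reclaim space")
        lsn = self.head
        for i, byte in enumerate(record):
            self.buf[(lsn + i) % self.capacity] = byte
        self.head += len(record)
        return lsn

    def read(self, lsn: int, length: int) -> bytes:
        """Keyed read: return `length` bytes starting at logical offset `lsn`."""
        if not (self.tail <= lsn and lsn + length <= self.head):
            raise KeyError("LSN range not in the live part of the log")
        return bytes(self.buf[(lsn + i) % self.capacity] for i in range(length))

    def checkpoint(self, new_tail: int):
        """Declare everything below new_tail reclaimable."""
        self.tail = max(self.tail, min(new_tail, self.head))

log = CircularLog(capacity=8)
first = log.append(b"abcde")     # LSN 0
second = log.append(b"xyz")      # LSN 5, buffer is now full
log.checkpoint(second)           # records below LSN 5 may be overwritten
third = log.append(b"12")        # LSN 8, wraps to physical offset 0
print(first, second, third, log.read(third, 2))
```

LSNs keep growing even though the physical space is reused, which is what lets a buffer header refer to log entries without knowing where they sit on disk.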

  29. LFFS-wafs (III) • Major advantage of WAFS is additional flexibility: • Can put WAFS on separate disk drive to avoid I/O contention • Can even put it in NVRAM • LFFS-wafs normally uses synchronous writes • Metadata operations are persistent upon return from the system call • Same durability semantics as FFS

  30. LFFS Recovery • Superblock has address of last checkpoint • LFFS-file has frequent checkpoints • LFFS-wafs has much less frequent checkpoints • First recover the log • Then read the log from its logical end (backward pass) and undo all aborted operations • Do forward pass and reapply all updates that have not yet been written to disk
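
The backward and forward passes can be sketched over a toy log. This is an illustration under loud assumptions, not the LFFS recovery code: the record layout (`("update", op_id, block, old, new)` and `("commit", op_id)`) is invented here, an operation is treated as aborted when the log holds no commit record for it, and the forward pass simply reapplies every committed update because reapplying one that already reached disk is harmless.

```python
def recover(disk, log):
    """disk: dict block -> value as found after the crash.
    log: records written since the last checkpoint (hypothetical layout)."""
    committed = set()

    # Backward pass: starting at the logical end of the log, undo the
    # updates of every operation that never logged a commit.
    for rec in reversed(log):
        if rec[0] == "commit":
            committed.add(rec[1])
        elif rec[1] not in committed:
            _, _, block, old, _ = rec
            disk[block] = old

    # Forward pass: reapply the updates of committed operations.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, block, _, new = rec
            disk[block] = new
    return disk

log = [
    ("update", 1, "dir", "d0", "d1"),
    ("update", 1, "ino", "i0", "i1"),
    ("commit", 1),
    ("update", 2, "dir", "d1", "d2"),   # operation 2 never committed
]
# At the crash, the uncommitted "dir" update had reached disk, while
# the committed "ino" update had not.
disk = {"dir": "d2", "ino": "i0"}
print(recover(disk, log))   # {'dir': 'd1', 'ino': 'i1'}
```

Scanning backward sees each commit record before the updates it covers, so one pass is enough to tell aborted operations from committed ones.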

  31. OTHER APPROACHES (I) • Using non-volatile cache (Network Appliance) • Ultimate solution: can keep data in cache forever • Additional cost of NVRAM • Simulating NVRAM with • Uninterruptible power supplies • Hardware-protected RAM (Rio): cache is marked read-only most of the time

  32. OTHER APPROACHES (II) • Log-structured file systems • Not always possible to write all related meta-data in a single disk transfer • Sprite-LFS adds small log entries to the beginning of segments • BSD-LFS makes segments temporary until all metadata necessary to ensure the recoverability of the file system are on disk

  33. SYSTEM COMPARISON • Compared the performance of • Standard FFS • FFS mounted with the async option • FFS mounted with Soft Updates • FFS augmented with a file log using asynchronous log writes • FFS augmented with a WAFS log using • Synchronous/asynchronous log writes • WAFS log on same/different drive

  34. Microbenchmark • Results chart (not reproduced), annotated “Best”, “???: Can explain it”, and “Worst”

  35. Comments • FFS-async performs best • Original FFS, LFFS-wafs-2sync and LFFS-wafs-1sync perform worst • Synchronous log updates are costly • LFFS-file outperforms LFFS-wafs-2async and LFFS-wafs-1async • LFFS-file uses bigger block clusters for log writes

  36. Ssh benchmark

  37. Netnews benchmark

  38. CONCLUSIONS • Journaling alone is not sufficient to “solve” the meta-data update problem • Cannot realize its full potential when synchronous semantics are required • When that condition is relaxed, journaling and Soft Updates perform comparably in most cases
