Ensuring File-System Integrity with Declarative Invariants
Daniel Fryer, Jack Sun, Mike Qin, Ashvin Goel, Angela Demke Brown
University of Toronto
Metadata Integrity is Crucial
You don’t know what you’ve got ’til it’s gone…
[Figure: storage stack with the file system in the kernel, the block layer, and storage; blocks are labeled M (metadata) and D (data)]
File Systems Have Bugs
Why can’t existing solutions handle this problem?
“Solutions”
Existing approaches assume file systems are correct
[Figure: storage stack (kernel file system, block layer, storage) annotated with “Journals?”, “Checksums?”, and “RAID?”]
None of these protect against bugs in file systems
Offline Checking
• Check consistency offline, e.g., fsck
• Consistency properties necessary for correctness
  FS1: No double allocation
  FS2: Refcount-based sharing
[Figure: metadata blocks (M) pointing to data blocks (D); one data block is labeled Ref: 2]
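To illustrate why a property such as FS1 is global, here is a minimal Python sketch that looks for double allocation by scanning every block pointer in a toy file system. The `inodes` mapping and the function name are assumptions made for this example; they do not reflect how fsck is implemented.

```python
from collections import defaultdict


def find_double_allocations(inodes):
    """Return {block: [inode numbers]} for blocks referenced more than once.

    `inodes` is a hypothetical toy model: inode number -> list of block numbers.
    """
    owners = defaultdict(list)
    for ino, blocks in inodes.items():   # full scan: every inode ...
        for blk in blocks:               # ... and every block pointer
            owners[blk].append(ino)
    return {blk: inos for blk, inos in owners.items() if len(inos) > 1}


toy_fs = {5: [7, 8], 6: [9], 12: [8]}
print(find_double_allocations(toy_fs))   # {8: [5, 12]}
```

The cost grows with the total number of pointers in the file system, which is what makes this kind of check impractical on every write.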
Problems with Offline Checking
• Slow, getting slower with larger disks
• Requires taking file system offline
• No one uses it until something is obviously wrong
• After the fact, repair is error prone
[Figure: metadata (M) and data (D) blocks]
Outline
• Problem
  • Metadata can be corrupted by bugs
  • Existing techniques are inadequate
• Key idea and design
  • Check metadata consistency at runtime
  • Allows protecting metadata from arbitrary kernel bugs
• Evaluation
Runtime Consistency Checking
• Ensure every update results in a consistent FS
  • Stop corrupting updates
  • Makes repair unnecessary!
  • “What happens in DRAM stays in DRAM”
• Problems
  • Consistency properties are global
    • E.g., no two pointers in the file system point to the same block
  • Global properties require a full scan
  • We can’t run fsck at every write
Consistency Invariants
• We transform global consistency properties to fast, local consistency invariants
• Assume initial consistent state
  • New file system is clean
  • Use checksums/redundancy to handle errors below FS
• At runtime, check only what is changing
  • Do so before changes become persistent
  • Resulting new state is consistent
Example: Block Allocation in Ext3
Ext3 maintains a block bitmap
Every allocated block is marked in the bitmap
[Figure: block bitmap covering blocks 5 to 9 and an inode (size, time, block pointers); blocks 7 and 8 are shown being updated]
Example: Block Allocation in Ext3
• Consistency Invariant
  Bitmap bit X flips from “0” to “1” if and only if a block pointer is set to X
  • Invariant fails if either update is missing
    • Should not mark allocated without setting block pointer
    • Should not set block pointer without marking allocated
• Can any consistency property be transformed?
  • File systems need to maintain consistency efficiently
  • Allows checking invariants efficiently
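The two-way nature of this invariant can be sketched in a few lines of Python. The two input sets are a hypothetical summary of one transaction's changes, invented for this example; the slides do not specify how Recon represents them internally.

```python
def check_block_allocation(bitmap_flips, pointers_set):
    """Check the two-way block allocation invariant for one transaction.

    bitmap_flips: set of block numbers whose bitmap bit went from 0 to 1
    pointers_set: set of block numbers newly stored in a block pointer
    Both inputs are hypothetical summaries of the transaction's changes.
    Returns a list of (violation_kind, block); an empty list means consistent.
    """
    violations = []
    # A pointer was set, but the block was never marked allocated.
    for blk in sorted(pointers_set - bitmap_flips):
        violations.append(("pointer_without_bitmap", blk))
    # A block was marked allocated, but no pointer refers to it.
    for blk in sorted(bitmap_flips - pointers_set):
        violations.append(("bitmap_without_pointer", blk))
    return violations


print(check_block_allocation({8}, {8}))    # []
print(check_block_allocation({8}, set()))  # [('bitmap_without_pointer', 8)]
```

Only the blocks touched by the transaction are examined, so the cost depends on the size of the update rather than the size of the disk.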
The Recon Design
[Figure: the file system sits above the block/VMM layer, where Recon holds a metadata write cache (write, then check & commit) and a metadata read cache (read), above the disk]
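The "check & commit" path in this design can be approximated with a toy write cache in Python. Everything here is an assumption made for illustration: the `WriteCache` class, the shape of an invariant as a callable, and the choice to discard a failing batch are not taken from Recon itself.

```python
class WriteCache:
    """Toy stand-in for a metadata write cache: buffer, check, then commit."""

    def __init__(self, disk, invariants):
        self.disk = disk              # dict: block number -> contents ("the disk")
        self.invariants = invariants  # callables: (old_blocks, new_blocks) -> bool
        self.pending = {}

    def write(self, block_no, contents):
        # Buffer the write; nothing reaches the disk yet.
        self.pending[block_no] = contents

    def commit(self):
        # Original versions of the blocks that are about to change.
        old = {b: self.disk.get(b) for b in self.pending}
        ok = all(inv(old, self.pending) for inv in self.invariants)
        if ok:
            self.disk.update(self.pending)
        # Assumption of this sketch: a failing batch is simply dropped.
        self.pending = {}
        return ok


def no_empty_blocks(old, new):
    """Hypothetical invariant: no block may be written as None."""
    return all(v is not None for v in new.values())


disk = {1: "a"}
cache = WriteCache(disk, [no_empty_blocks])
cache.write(1, "b")
print(cache.commit(), disk)   # True {1: 'b'}
cache.write(2, None)
print(cache.commit(), disk)   # False {1: 'b'}
```

Because the check runs before anything is made persistent, a rejected update never has to be repaired on disk.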
Checking Invariants in Recon
[Figure: checking pipeline. Inputs: the modified block from the metadata write cache and the original block from the metadata read cache. Stages: metadata interpretation (determine block type), logical change generation (compare blocks), check invariants]
Logical change record: [type, id, field, old, new]
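Logical change generation can be pictured as a field-by-field comparison of the original and modified versions of a block. The Python sketch below assumes each block has already been interpreted into a dictionary of named fields; the `Change` tuple mirrors the record shape on the slide, while `diff_blocks` and the field names are hypothetical.

```python
from typing import Any, NamedTuple


class Change(NamedTuple):
    """Logical change record: [type, id, field, old, new]."""
    type: str
    id: int
    field: str
    old: Any
    new: Any


def diff_blocks(block_type, block_id, original, modified):
    """Compare two versions of an interpreted block and emit change records.

    `original` and `modified` map field names (strings) to values.
    A field missing from one side is reported with None on that side.
    """
    changes = []
    for field in sorted(set(original) | set(modified)):
        old, new = original.get(field), modified.get(field)
        if old != new:
            changes.append(Change(block_type, block_id, field, old, new))
    return changes


before = {"size": 4096, "blkptr[1]": 0}
after = {"size": 8192, "blkptr[1]": 8}
for change in diff_blocks("inode", 12, before, after):
    print(change)
```

Invariants can then be written against these records instead of raw block contents.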
Invariant Example
Transaction appends a new block to inode 12
Bitmap bit X flips from “0” to “1” if and only if a block pointer is set to X
Invariants in Datalog
• Invariants are typically 3-4 lines each and independent of one another

Block allocation
    block_violation(IN, BN) :- block_allocated(IN, BN),
                               not(change(b_freemap, _, _, BN, _, 1)).
    freemap_violation(BN) :- change(b_freemap, _, _, BN, _, 1),
                             not(block_allocated(_, BN)).

Directory cycle detection
    path(X, P) :- dir_get_parent(X, P).
    path(X, A) :- dir_get_parent(X, P), path(P, A).
    cycle(X) :- path(X, X).
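For readers less familiar with Datalog, the directory cycle rules can be read as a transitive closure followed by a self-reachability test. The Python sketch below evaluates the same three rules naively over a toy `parent` mapping; the function name and data are assumptions, and a real Datalog engine would evaluate the rules far more efficiently.

```python
def find_cycles(parent):
    """Return the sorted list of directories X for which path(X, X) holds.

    `parent` is a toy set of dir_get_parent facts: directory -> parent directory.
    """
    # path(X, P) :- dir_get_parent(X, P).
    path = set(parent.items())
    changed = True
    while changed:                      # iterate until no new facts appear
        changed = False
        # path(X, A) :- dir_get_parent(X, P), path(P, A).
        for x, p in parent.items():
            for q, a in list(path):
                if q == p and (x, a) not in path:
                    path.add((x, a))
                    changed = True
    # cycle(X) :- path(X, X).
    return sorted(x for x, a in path if x == a)


print(find_cycles({2: 1, 3: 2}))              # []
print(find_cycles({2: 1, 3: 2, 4: 5, 5: 4}))  # [4, 5]
```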
Corruption Detection
[Figure: corruption detection results, with categories inode (blkptr), inode (stat), inode (others), ibm, dir, bbm, random, bgd]
Recon matches e2fsck
Performance Evaluation
For reasonable cache sizes, performance impact is modest
Conclusion
• All consistency properties for ext3 can be enforced during updates without a full disk scan
• Checking can be done entirely outside the file system
• Runtime invariants can be expressed declaratively
• Preventing corruption from being committed is a huge win over after-the-fact repair!
Ongoing Work
• Avoid memory overhead from Recon caches
• Apply to newly developed file systems, e.g., btrfs
• Ensure that blocks are never overwritten on disk
• Verify runtime invariants match fsck code
• Further
  • Develop tools to simplify file-system-specific metadata interpretation and invariant generation
  • Apply ideas to applications, e.g., databases
Thanks!