Using Declarative Invariants for Protecting File-System Integrity. By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown University of Toronto. Motivation. File systems have bugs Cause corruption and/or data loss
Using Declarative Invariants for Protecting File-System Integrity By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown University of Toronto
Motivation • File systems have bugs • Cause corruption and/or data loss • Existing reliability techniques, e.g., journaling, RAID, don’t help • Existing recovery solutions • Restore from backup, but slow, risks loss of data • Offline checker (i.e. fsck), but too slow • Can we do better?
Possible Alternatives? • Eliminate Bugs • Static Analysis • Does not scale • Bugs may be input dependent • Tolerate Bugs • N-version programming (e.g., EnvyFS) • High overheads (performance and storage) • Can only check features common to all versions • Micro-reboot of file system (e.g., Membrane) • Requires detectable failures • Many corruption bugs are fail-silent
Our Approach • Verify correctness at runtime • Benefit: Make silent failures detectable • What are we checking? • Same thing fsck checks • When are we checking for consistency? • When file system claims to be consistent • Leverage transactions provided for crash recovery • I.e. check at commit time • How are we doing the checks? • Convert global fsck checks into local checks on transactions
No Really… How? • File systems have consistency properties • E.g. all in-use data blocks are marked in the block allocation bitmap • This is a global property that fsck checks • For each property, we derive an invariant • Invariant must hold in any transaction to preserve the corresponding property • E.g. when Block N is allocated, bit N must be set in the allocation bitmap, in the same transaction • Invariants operate on changes in transactions, requiring local checks
Recon Data Flow • The focus of this work… • Change records encode the updates in a transaction • [Figure labels: Read Cache (Original Block), Write Cache (Modified Block), Logical Difference Engine, Change Record, Invariants, Invariant Checker, Violation?]
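A minimal Python sketch of what a logical difference engine might do, assuming each block can be viewed as a mapping from field to value. The `ChangeRecord` layout and the `diff_blocks` helper are hypothetical names chosen for illustration; they are not taken from Recon.

```python
from typing import Any, NamedTuple


class ChangeRecord(NamedTuple):
    """Hypothetical change-record layout: one record per changed field."""
    block_type: str   # e.g. "b_freemap"
    block_id: int     # which block of that type
    field: Any        # field (here: bit index) that changed
    old: Any          # value in the original block (read cache)
    new: Any          # value in the modified block (write cache)


def diff_blocks(block_type, block_id, original, modified):
    """Compare two field->value views of one block and emit change records."""
    records = []
    for field in sorted(set(original) | set(modified)):
        old, new = original.get(field), modified.get(field)
        if old != new:
            records.append(ChangeRecord(block_type, block_id, field, old, new))
    return records


# Allocation bitmap before and after a transaction that sets bit 7.
original_bitmap = {7: 0, 8: 0}
modified_bitmap = {7: 1, 8: 0}
records = diff_blocks("b_freemap", 0, original_bitmap, modified_bitmap)
```

Only the field that differs produces a record, so the invariant checker sees the transaction's updates rather than whole blocks.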
Change Record • Example: write 8 bytes of data into an empty file with inode #1 • [Table of change records; annotations: block 7 allocated, direct block, bit set]
Datalog Invariant Checking • Change records are trivially converted to Datalog facts • R1_violation(IN, BN) :- block_allocated(IN, BN), not(change(b_freemap, _, _, BN, _, 1)). • R2_violation(BN) :- change(b_freemap, _, _, BN, _, 1), not(block_allocated(_, BN)).
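The two rules can be read as set differences over the facts of one transaction. The Python sketch below mirrors them under a simplifying assumption: the facts have already been extracted into a set of `(inode, block)` allocations and a set of block numbers whose bitmap bit changed to 1. The function names are illustrative, not part of Recon.

```python
def r1_violations(block_allocated, freemap_set):
    """R1: a block was allocated to an inode, but its bitmap bit was not set."""
    return sorted((inode, bn) for inode, bn in block_allocated
                  if bn not in freemap_set)


def r2_violations(block_allocated, freemap_set):
    """R2: a bitmap bit was set, but no inode allocated that block."""
    allocated = {bn for _, bn in block_allocated}
    return sorted(bn for bn in freemap_set if bn not in allocated)


# Consistent transaction: inode 1 gets block 7 and bit 7 is set.
ok_alloc, ok_bits = {(1, 7)}, {7}

# Buggy transaction: inode 1 gets block 7, but bit 8 is set instead.
bad_alloc, bad_bits = {(1, 7)}, {8}
```

In the buggy transaction both rules fire: R1 reports `(1, 7)` and R2 reports block 8.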
Invariant Checking • On each transaction • Add facts from change records into Datalog knowledge base • Check all invariants on Datalog facts • Problem • Set of facts grows over time • Facts need to persist across reboots • Slows invariant checking • Introduces more consistency problems • Insight • After commit, all facts in the transaction are incorporated in file system
Querying File System State • FS state is available in Recon caches • We provide Datalog primitives to access caches • Can discard all facts after transaction commit • [Figure labels: Change Record … Fact, Datalog Interpreter, Invariants, Primitives, Read Cache, Write Cache, Disk, Violation?]
Using Primitives • Example: Directory Cycle Detection • path(X, P) :- dir_get_parent(X, P). • path(X, A) :- dir_get_parent(X, P), path(P, A). • cycle(X) :- path(X, X). • dir_get_parent is a primitive • No cycle for this tree! • [Figure: directory tree / → a → b → c]
Current Status • Implemented for a simple test file system • TestFS implemented at user level • Designed to be a simplified version of Ext3 • All TestFS invariants are applicable to Ext3 • TestFS has 12 Datalog invariants • Ext3 has 33 invariants in C • Invariants are independent • Total number of lines of invariant code is 38
Future Work • Datalog invariants for the Ext3 and Btrfs file systems • Currently, Ext3/Btrfs Recon is implemented in the OS • We plan to implement it in a hypervisor to provide a strong fault model • Don’t need to port Datalog to the kernel! • Customize the Datalog interpreter • Optimize for file-system-specific operations
Conclusion • The Recon framework allows detecting arbitrary metadata corruption through runtime consistency checking • When a transaction commits, Recon checks invariants to ensure file system consistency • Invariants can be expressed in Datalog clearly and concisely
Using Declarative Invariants for Protecting File-System Integrity By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown • Questions?
Evaluation • Workload • ~203K commands • e.g. mkdir, rmdir, rm, touch, cd, write to file
Directory Cycle Detection • Example: move /a into /a/b/c • [Figure: directory tree before and after the move, with a legend distinguishing child entries from parent entries]
Directory Cycle Detection • Change records for move /a into /a/b/c
Invariant Checking • Rules: • path(IN, PIN) :- dir_get_parent(IN, PIN). • path(IN, AIN) :- dir_get_parent(IN, PIN), path(PIN, AIN). • cycle(IN) :- path(IN, IN). • Evaluation of cycle(3), writing parent for dir_get_parent: • cycle(3). • path(3, 3). • parent(3, 3). • parent(3, ?), path(?, 3). • parent(3, 2), path(2, 3). • parent(3, 2), parent(2, 3). • parent(3, 2), parent(2, ?), path(?, 3). • parent(3, 2), parent(2, 1), path(1, 3). • parent(3, 2), parent(2, 1), parent(1, 3). • We have a match, i.e., a violation! • [Figure: inode numbers a = 1, b = 2, c = 3]
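The same evaluation can be sketched in Python by following parent links until the walk ends or revisits an inode. This assumes each directory has a single parent, stored in a plain dictionary; the helper names and the root inode number 0 are hypothetical, and the root is given no parent entry so that the walk terminates.

```python
def ancestors(inode, parent_of):
    """Follow parent links from inode, stopping at the root or on a repeat."""
    seen = []
    current = parent_of.get(inode)
    while current is not None and current not in seen:
        seen.append(current)
        current = parent_of.get(current)
    return seen


def has_cycle(inode, parent_of):
    """cycle(IN) :- path(IN, IN): inode is its own ancestor."""
    return inode in ancestors(inode, parent_of)


# Before the move: / (0) -> a (1) -> b (2) -> c (3).
before = {1: 0, 2: 1, 3: 2}

# After moving /a into /a/b/c: the parent of a (1) becomes c (3).
after = {1: 3, 2: 1, 3: 2}
```

With `after`, the walk from inode 3 visits 2, then 1, then 3 again, which corresponds to the final matching step of the trace.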
Primitives • change(dir_block, 3, 1, φ, ‘a’). • change(dir_block, 1, 3, φ, ‘..’). • Problem: • The set of change records that we have is insufficient. • From the transaction alone, we know only that the parent of ‘a’ is ‘c’; we cannot deduce the parents of ‘c’ and ‘b’. • Solution: • Primitives are predicates written in C that can query the read and write caches in Recon • [Figure: inode numbers a = 1, b = 2, c = 3]
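A rough Python stand-in for such a primitive (the slides say the real ones are written in C). It assumes, purely for illustration, that each cache can be modeled as a dictionary from directory inode to parent inode, that the write cache holds the transaction's updates and the read cache holds the pre-transaction state, and that the root has inode 0. None of these names or layouts come from Recon.

```python
def make_dir_get_parent(write_cache, read_cache):
    """Build a primitive that prefers the transaction's view of a directory."""
    def dir_get_parent(inode):
        # Check the write cache first so updated entries win.
        for cache in (write_cache, read_cache):
            if inode in cache:
                return cache[inode]
        return None  # unknown directory or the root
    return dir_get_parent


# Pre-transaction state: / (0) -> a (1) -> b (2) -> c (3).
read_cache = {1: 0, 2: 1, 3: 2}

# The transaction only rewrites the '..' entry of a (1) to point at c (3).
write_cache = {1: 3}

dir_get_parent = make_dir_get_parent(write_cache, read_cache)

# Walk upward from c (3) for three steps.
chain = []
node = 3
for _ in range(3):
    node = dir_get_parent(node)
    chain.append(node)
```

The parents of c and b come from the read cache, while the parent of a comes from the write cache; together they supply the facts that the change records alone do not contain.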