Mining Windows Kernel API Rules Jinlin Yang jinlin@cs.virginia.edu 09/28/2005 CS696
My Background
• Bounded exhaustive testing, 09/2001-01/2004
  • D. Coppit, J. Yang, S. Khurshid, W. Le, and K. Sullivan. Software Assurance by Bounded Exhaustive Testing. IEEE Transactions on Software Engineering, April 2005.
  • K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software Assurance by Bounded Exhaustive Testing. ISSTA ’04.
• Temporal property inference, 01/2004-present
  • J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE ’04.
  • J. Yang and D. Evans. Automatically Inferring Temporal Properties for Program Evolution. ISSRE ’04.
  • J. Yang and D. Evans. Automatically Discovering Temporal Properties for Program Verification. Submitted to FMSD.
  • J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Terracotta: Mining Temporal API Rules from Imperfect Traces. Submitted to ICSE ’06.
Jinlin Yang, CS696
Overview
• Problem: specifications are rarely available, which is a major obstacle to defect detection
• Solution: automatically infer specifications from execution traces
• Benefits: better understanding of legacy code and an opportunity to find more defects
• Experiments on finding kernel API rules
  • Found one previously unknown bug in Windows
  • Found interesting properties that should have been checked
Problem
• Defect detection techniques
  • Generic properties
    • E.g. pointer and buffer usage
    • PREfix [Bush et al., SP&E ’00], PREfast
    • Very effective
  • Application-specific properties
    • E.g. lock/unlock, resource creation/deletion
    • SLAM/SDV [Ball et al., SPIN ’01], ESP [Das et al., PLDI ’02]
    • Where do we get such properties?
My Approach
[Figure: inference pipeline. Program → Instrumentation → Instrumented Program → Running (with Test Suite) → Execution Traces → Inference (with Property Templates) → Inferred Properties → Post-processing → Report]
J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE ’04.
An Example
• Alternating template: (PS)*, P ≠ S, where P and S are placeholders for events
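To make the Alternating template concrete, here is a minimal sketch of checking one candidate pair against a trace. It assumes a trace is a list of event names and that events other than P and S are ignored; the function name and the event names (`lock`, `unlock`, `read`, `write`) are hypothetical and not taken from Terracotta.

```python
def satisfies_alternating(trace, p, s):
    """Return True if trace, restricted to the events p and s, matches (p s)*."""
    if p == s:
        raise ValueError("P and S must be distinct events")
    expected = p  # the only event from {p, s} allowed next
    for event in trace:
        if event not in (p, s):
            continue  # events outside {P, S} do not affect the template
        if event != expected:
            return False
        expected = s if expected == p else p
    # A full match of (PS)* must end after an S, i.e. be waiting for P again.
    return expected == p


# Hypothetical trace, for illustration only.
trace = ["lock", "read", "unlock", "lock", "write", "unlock"]
print(satisfies_alternating(trace, "lock", "unlock"))
print(satisfies_alternating(trace, "unlock", "lock"))
```

The first call instantiates P = lock, S = unlock and succeeds; the second swaps the placeholders and fails on the first event, which shows that the template is order-sensitive.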
Implementation
• Terracotta
  • Inference engine
  • Context-aware trace analysis
  • Heuristics for prioritizing and presenting properties
• Performance is linear in the length of the trace and the number of distinct events
• More information: http://www.cs.virginia.edu/terracotta
Lessons
• Interesting properties are missed
  • The original algorithm requires 100% satisfaction
  • The real world is never perfect
    • Traces collected by sampling
    • Object information unavailable
    • Imperfect programs
  • Can we develop better inference to handle this?
• Too much noise in the results
  • Interesting properties are buried among uninteresting ones
  • Can we develop heuristics to select the interesting ones?
Refinement of Inference
• How can we detect interesting properties in the face of imperfect traces?
• Example
  • PS PS PS PS PS PS PS PS PS PPP
  • The dominant behavior is that P and S alternate
  • 10 subtraces, 90% of which satisfy Alternating
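The example above can be reproduced with a short sketch. It assumes the trace has already been partitioned into subtraces exactly as written on the slide (how the partition is computed is not shown here), and it assumes a property is accepted when the satisfaction ratio is at least the threshold; the 0.90 value is the PAL threshold reported in the kernel results. The function and constant names are hypothetical.

```python
import re

# A subtrace satisfies Alternating if it is a whole match of (PS)*.
ALTERNATING = re.compile(r"(PS)*")


def satisfaction_ratio(subtraces):
    """Fraction of subtraces that fully match the Alternating template."""
    if not subtraces:
        return 0.0
    satisfied = sum(1 for t in subtraces if ALTERNATING.fullmatch(t))
    return satisfied / len(subtraces)


# The ten subtraces from the slide: nine "PS" and one "PPP".
subtraces = "PS PS PS PS PS PS PS PS PS PPP".split()
ratio = satisfaction_ratio(subtraces)

THRESHOLD = 0.90  # assumed to be an inclusive cutoff
accepted = ratio >= THRESHOLD
print(len(subtraces), ratio, accepted)
```

Under strict 100% satisfaction the single `PPP` subtrace would cause the property to be rejected; with the ratio-based check it survives as the dominant pattern.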
Refinement of Inference (2)
• How do we pick out interesting properties?
• Which one is more likely to be interesting?

  Case 1
    void A(){ ... B(); ... }
    void KeSetTimer(){ KeSetTimerEx(); }

  Case 2
    void x(){ C(); ... D(); }
    void x(){ ExAcquireFastMutexUnsafe(&m); ... ExReleaseFastMutexUnsafe(&m); }

• Heuristic: CD is often more interesting
• Compute the call graph for Windows binaries
• Keep AB only if B is not reachable from A
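The reachability heuristic can be sketched as a breadth-first search over a call graph. This is an illustration under stated assumptions, not Terracotta's implementation: the call graph is a plain dictionary from caller to callees, and the tiny graph below is made up from the two cases on the slide rather than extracted from a real Windows binary.

```python
from collections import deque


def reachable(call_graph, src, dst):
    """Return True if dst can be reached from src by following call edges."""
    seen = {src}
    queue = deque([src])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee == dst:
                return True
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def filter_properties(call_graph, properties):
    """Keep a property (A, B) only if B is not reachable from A."""
    return [(a, b) for a, b in properties if not reachable(call_graph, a, b)]


# Hypothetical call graph built from the two cases on the slide.
call_graph = {
    "KeSetTimer": ["KeSetTimerEx"],
    "x": ["ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"],
}
properties = [
    ("KeSetTimer", "KeSetTimerEx"),
    ("ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"),
]
kept = filter_properties(call_graph, properties)
print(kept)
```

The KeSetTimer pair is dropped because the second function is simply called by the first (Case 1), while the acquire/release pair is kept because neither function calls the other (Case 2).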
Refinement of Inference (3)
• Heuristic: the more similar the names of two events are, the more likely the property is interesting
• Relative edit distance between A and B
  • Partition A and B into words
  • A has wA words, B has wB words, and they share w common words
• For example:
  • Ke Acquire In Stack Queued Spin Lock
  • Ke Release In Stack Queued Spin Lock
  • Similarity = 85.7% (6 of 7 words in common)
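A rough sketch of the word-based similarity follows. The slide does not show the formula, so this assumes similarity = 2w / (wA + wB), which reproduces the 85.7% figure for the example pair; other formulas (such as w divided by the larger word count) would give the same number here. The word-splitting rule, a regular expression over capitalization boundaries, is also an assumption, and the function names are hypothetical.

```python
import re
from collections import Counter


def split_words(name):
    """Split an identifier like KeAcquireSpinLock into capitalized words."""
    return re.findall(r"[A-Z][a-z0-9]+|[A-Z]+(?![a-z])|[a-z0-9]+", name)


def similarity(a, b):
    """Assumed formula: 2w / (wA + wB), with w the number of shared words."""
    words_a, words_b = split_words(a), split_words(b)
    total = len(words_a) + len(words_b)
    if total == 0:
        return 0.0
    # Multiset intersection, so a repeated word is only matched as often
    # as it appears in both names.
    common = sum((Counter(words_a) & Counter(words_b)).values())
    return 2 * common / total


a = "KeAcquireInStackQueuedSpinLock"
b = "KeReleaseInStackQueuedSpinLock"
print(split_words(a))
print(round(similarity(a, b) * 100, 1))
```

Both names split into seven words and share six of them, so the score is 12/14. Pairs scoring above a cutoff (the kernel experiment uses 0.5) would then be kept for inspection.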
Results: Kernel
• Approximation
  • PAL threshold = 0.90
  • 7611 properties
• Call-graph and edit-distance based reduction
  • Use the call graph of ntoskrnl.exe, edit dist > 0.5
  • 142 properties, a 53-fold reduction
  • Small enough for manual inspection
• 56 apparently interesting properties (40%)
  • Locking discipline
  • Resource allocation and deletion
Results: Kernel (2)
• Found interesting properties that should be checked
  • Several types of kernel SpinLock
  • The Static Driver Verifier should have checked them
• ESP found one previously unknown bug in ntfs.sys
  • Double-acquire of FastMutex
  • Confirmed and fixed by the responsible developers
Static Driver Verifier: Finding Bugs in Device Drivers at Compile-Time. WinHEC, April 2004.
M. Das, S. Lerner, and M. Seigle. ESP: Path-Sensitive Program Verification in Polynomial Time. PLDI ’02.
Summary of Experiments
• We inferred interesting rules about kernel APIs
  • SDV already encodes some properties: http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/SDV-intro.doc
  • We inferred undocumented ones too
• Inference scales well to realistic traces
• Approximation is effective at tolerating imperfect traces and detecting dominant patterns
• Call-graph and edit-distance based reduction is very effective
• Checking the inferred properties with a defect detection tool is promising
• Other experiments: Vulcan APIs, Daisy file system
Conclusion
• Constructing interesting properties is important and difficult
• Automatic inference from execution traces is lightweight and effective
• Practical value
  • Helps developers understand legacy code
  • Gives us the opportunity to leverage sophisticated static analysis tools to find application-specific defects
Q & A
• For more information
  • jinlin@cs.virginia.edu
  • http://www.cs.virginia.edu/terracotta
• Great collaborators
  • UVa: David Evans, Ed Mitchell
  • Microsoft: Stephen Adams, Deepali Bhardwaj, Thirumalesh Bhat, Manuvir Das, Damian Hasse, Marne Staples, Rick Vicik, Jason Yang, Zhe Yang