Spring 2013
Program Analysis and Verification
Lecture 1: Introduction
Roman Manevich, Ben-Gurion University
30GB Zunes all over the world fail en masse (December 31, 2008)
Zune bug
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
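Zune bug
Suggested solution: wait for tomorrow
On December 31, 2008, days is 366 and 2008 is a leap year. The loop condition holds, but neither branch changes days, so the loop never terminates:
    while (366 > 365) {
        if (IsLeapYear(2008)) {
            if (366 > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }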
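One way to make the loop terminate is to compare days against the length of the current year and stop as soon as it fits. The sketch below is an illustration only, not the fix that was actually shipped; the names is_leap_year and normalize_days are hypothetical, and days is assumed to be a 1-based day count starting at January 1 of *year.

```c
#include <stdbool.h>

/* Gregorian leap-year rule. */
bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/*
 * days is a 1-based day count starting at January 1 of *year.
 * On return, *year is the calendar year containing that day and the
 * return value is the 1-based day within that year.
 * Every iteration that does not return strictly decreases days,
 * so the loop terminates.
 */
int normalize_days(int days, int *year) {
    for (;;) {
        int year_length = is_leap_year(*year) ? 366 : 365;
        if (days <= year_length) {
            return days;
        }
        days -= year_length;
        *year += 1;
    }
}
```

Calling normalize_days(366, &year) with year set to 2008 returns immediately, which is exactly the case in which the original loop spins forever.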
Patriot missile failure (February 25, 1991)
On the night of the 25th of February, 1991, a Patriot missile system operating in Dhahran, Saudi Arabia, failed to track and intercept an incoming Scud. The Iraqi missile impacted into an army barracks, killing 28 U.S. soldiers and injuring another 98.
Patriot bug: rounding error
Suggested solution: reboot every 10 hours
• Time measured in 1/10 seconds
• Binary expansion of 1/10: 0.0001100110011001100110011001100...
• 24-bit register: 0.00011001100110011001100
• Error of 0.0000000000000000000000011001100... binary, or ~0.000000095 decimal
• After 100 hours of operation the error is 0.000000095 × 100 × 3600 × 10 = 0.34 seconds
• A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time
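The arithmetic on this slide can be reproduced with a few lines of C. This is a sketch under one assumption: the truncated value keeps 23 bits after the binary point, which is the number of digits in the register value shown on the slide. The function names are hypothetical.

```c
/* 1/10 truncated to frac_bits bits after the binary point.
 * Integer division chops the expansion exactly like the register does. */
double chopped_tenth(int frac_bits) {
    long scale = 1L << frac_bits;
    return (double)(scale / 10) / (double)scale;
}

/* Accumulated clock error in seconds after the given uptime,
 * with one clock tick every 1/10 of a second. */
double clock_drift_seconds(double hours, int frac_bits) {
    double error_per_tick = 0.1 - chopped_tenth(frac_bits);
    double ticks = hours * 3600.0 * 10.0;
    return error_per_tick * ticks;
}
```

With 23 fractional bits, clock_drift_seconds(100.0, 23) comes out at roughly a third of a second. Multiplying by the 1,676 m/s Scud speed gives a tracking error of more than 500 meters, while after 10 hours the drift is about ten times smaller, which is why periodic reboots masked the problem.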
Billy Gates why do you make this possible ? Stop making money and fix your software!!
(W32.Blaster.Worm, August 13, 2003)
Windows exploit(s): Buffer Overflow
    void foo(char *x) {
        char buf[2];
        strcpy(buf, x);
    }
    int main(int argc, char *argv[]) {
        foo(argv[1]);
    }

    ./a.out abracadabra
    Segmentation fault
Stack contents after strcpy, in the order drawn on the slide (the slide also marks the direction of increasing memory addresses and the direction in which the stack grows):
    Previous frame    br
    Return address    da
    Saved FP          ca
    char* x           ra
    buf[2]            ab
Buffer overrun exploits
    #include <stdio.h>
    #include <string.h>

    int check_authentication(char *password) {
        int auth_flag = 0;
        char password_buffer[16];
        /* Unchecked copy: input longer than 15 characters overruns the buffer */
        strcpy(password_buffer, password);
        if (strcmp(password_buffer, "brillig") == 0)
            auth_flag = 1;
        if (strcmp(password_buffer, "outgrabe") == 0)
            auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if (check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else
            printf("\nAccess Denied.\n");
    }
(source: “Hacking: The Art of Exploitation, 2nd Ed.”)
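For contrast, here is a minimal sketch of a bounds-checked variant of check_authentication. It is one possible hardening, not the book's code: the length check, the PASSWORD_BUFFER_SIZE macro, and the early returns are assumptions made for illustration.

```c
#include <string.h>

#define PASSWORD_BUFFER_SIZE 16

/* Returns 1 if the password is accepted, 0 otherwise. */
int check_authentication(const char *password) {
    char password_buffer[PASSWORD_BUFFER_SIZE];
    size_t len;

    if (password == NULL) {
        return 0;
    }
    len = strlen(password);
    if (len >= sizeof password_buffer) {
        return 0; /* too long to be a valid password; never copy it */
    }
    memcpy(password_buffer, password, len + 1); /* copy includes the '\0' */

    if (strcmp(password_buffer, "brillig") == 0) {
        return 1;
    }
    if (strcmp(password_buffer, "outgrabe") == 0) {
        return 1;
    }
    return 0;
}
```

Because the copy happens only after the length check, no input can write past the end of password_buffer, so there is no adjacent stack data for an attacker to overwrite.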
(In)correct usage of APIs
• Application trend: increasing number of libraries and APIs
  • Non-trivial restrictions on permitted sequences of operations
• Typestate: temporal safety properties
  • Which sequences of operations are permitted on an object?
  • Encoded as a DFA, e.g. “Don’t use a Socket unless it is connected”
[Figure: DFA with states init, connected, closed, and err; edges labeled connect(), close(), getInputStream(), getOutputStream(), and * (any operation)]
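A typestate property of this kind can be written down as a small transition function. The sketch below is one plausible reading of the Socket automaton, not a transcription of the figure: the exact edges are assumptions, and any transition the slide does not clearly show (for example connect() on an already connected socket) is sent to the error state. All identifiers are hypothetical.

```c
typedef enum { INIT, CONNECTED, CLOSED, ERR } SockState;
typedef enum { OP_CONNECT, OP_GET_INPUT, OP_GET_OUTPUT, OP_CLOSE } SockOp;

/* One step of the typestate automaton. */
SockState step(SockState s, SockOp op) {
    switch (s) {
    case INIT:
        if (op == OP_CONNECT) return CONNECTED;
        if (op == OP_CLOSE) return CLOSED;
        return ERR; /* stream access before connect() */
    case CONNECTED:
        if (op == OP_GET_INPUT || op == OP_GET_OUTPUT) return CONNECTED;
        if (op == OP_CLOSE) return CLOSED;
        return ERR; /* assumption: reconnecting is an error */
    case CLOSED:
        return ERR; /* assumption: any use after close() is an error */
    default:
        return ERR; /* err is absorbing: every operation stays in err */
    }
}

/* Runs a sequence of n operations from the initial state. */
SockState run(const SockOp *ops, int n) {
    SockState s = INIT;
    for (int i = 0; i < n; i++) {
        s = step(s, ops[i]);
    }
    return s;
}
```

A typestate checker has to decide, for every Socket object in the program, whether some execution can drive this automaton into ERR.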
Challenges
    class SocketHolder {
        Socket s;
    }
    Socket makeSocket() {
        return new Socket(); // A
    }
    void open(Socket l) {
        l.connect();
    }
    void talk(Socket s) {
        s.getOutputStream().write("hello");
    }
    main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while (…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }
Testing is not enough
• Observe some program behaviors
• What can you say about other behaviors?
• Concurrency makes things worse
• Smart testing is useful
  • Requires the techniques that we will see in the course
Static analysis definition
Reason statically (at compile time) about the possible runtime behaviors of a program
“The algorithmic discovery of properties of a program by inspection of its source text [1]” -- Manna, Pnueli
[1] Does not literally have to be the source text; it just means without running the program
Is it at all doable?
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
• Bad news: problem is generally undecidable
Central idea: use approximation
[Figure: inside the universe, an over-approximation contains the exact set of configurations/behaviors, which in turn contains an under-approximation]
Goal: exploring program states
[Figure: initial states, reachable states, bad states]
Technique: explore abstract states
[Figure: initial states, reachable states, bad states]
Sound: cover all reachable states
[Figure: initial states, reachable states, bad states]
Unsound: miss some reachable states
[Figure: initial states, reachable states, bad states]
Imprecise abstraction: false alarms
[Figure: initial states, reachable states, bad states]
A sound message
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
• Assertion may be violated
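To see how an analysis might arrive at this message, consider a constant-propagation domain in which each variable is either unreachable, a known constant, or unknown. The sketch below is illustrative only: the domain, the names, and the assumption that foo() does not modify y are not from the slides.

```c
#include <stdbool.h>

/* Flat constant lattice: BOTTOM (unreachable), CONST (one known value),
 * TOP (any value). */
typedef enum { BOTTOM, CONST, TOP } Kind;
typedef struct { Kind kind; int value; } AbsInt;

AbsInt abs_const(int v) {
    AbsInt a = { CONST, v };
    return a;
}

/* Least upper bound: merges the facts from two control-flow paths. */
AbsInt abs_join(AbsInt a, AbsInt b) {
    if (a.kind == BOTTOM) return b;
    if (b.kind == BOTTOM) return a;
    if (a.kind == CONST && b.kind == CONST && a.value == b.value) return a;
    AbsInt top = { TOP, 0 };
    return top;
}

/* True only if the abstract value proves the concrete value is `expected`.
 * BOTTOM proves it vacuously because the program point is unreachable. */
bool proves_equal(AbsInt a, int expected) {
    return a.kind == BOTTOM || (a.kind == CONST && a.value == expected);
}

/* Abstract value of y after the if/else, assuming foo() leaves y alone. */
AbsInt analyze_y_after_branch(void) {
    AbsInt then_y = abs_const(42);
    AbsInt else_y = abs_const(73);
    return abs_join(then_y, else_y);
}
```

Joining 42 and 73 yields TOP, so proves_equal fails and the analysis reports that the assertion may be violated. The report is sound: it covers the execution through the else branch, where the assertion really does fail.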
Precision
    UselessAnalysis(Program p) {
        printf("assertion may be violated\n");
    }
Always reporting a possible violation is sound but says nothing about the program.
• Avoid useless results
• Low false alarm rate
• Understand where precision is lost
Static Driver Verifier
[Figure: Static Driver Verifier takes rules (precise API usage rules written in SLIC), an environment model, and the driver’s source code in C; it reports defects with 100% path coverage]
Bill Gates’ Quote
"Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we’re building tools that can do actual proof about the software and how it works in order to guarantee the reliability."
Bill Gates, April 18, 2002. Keynote address at WinHEC 2002
The Astrée Static Analyzer
Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, Xavier Rival
ENS France
Objectives of Astrée
• Prove absence of errors in safety-critical C code
• Astrée was able to prove, completely automatically, the absence of any run-time error (RTE) in the primary flight control software of the Airbus A340 fly-by-wire system
  • A program of 132,000 lines of C was analyzed
[Image credit: Lasse Fuss (own work), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0), via Wikimedia Commons]
A little about me
• History
  • Studied B.Sc., M.Sc., Ph.D. at Tel-Aviv University
  • Research in program analysis with IBM and Microsoft
  • Post-doc at UCLA and at UT Austin
  • Joined Ben-Gurion University this year
• Example research challenges
  • What’s a good algorithm for automatically discovering (with no hints) that a program generates a binary tree where all leaves are connected in a list?
  • What’s a good algorithm for automatically proving that a parallel program behaves “well”?
  • How can we automatically synthesize parallel code that is both correct and efficient?
Why study program analysis?
• Challenging and thought provoking
  • An approach for dealing with computationally hard (usually undecidable) problems
  • Treat programs as mathematical objects
• Understand how to systematically
  • Design optimizations
  • Reason about correctness / find bugs (security)
• Some techniques may be applied in other domains
  • Computational learning
  • Analysis of biological systems
What do you get in this course?
• Learn basic principles of static analysis
  • Understand jargon/papers
• Learn a few advanced techniques
  • A principled way of developing an analysis
  • Develop one in a small-scale project
• Put into practice what you learned in logic, automata, programming
My role
• Teach you theory and practice
• Teach you how to think of new techniques
• E-mail: romanm@cs.bgu.ac.il
• Office hours: Wednesday 13:00-15:00
• Course web-page
  • Announcements
  • Forum
  • …
Requirements
• Summarize one lecture: 10% of grade
  • Submit initial summary
  • Get corrections/suggestions
  • Submit revised summary
• Theoretical assignments and programming assignments: 50%
  • About 8 (some very small)
  • Must submit all
  • Must solve all questions
    • Otherwise re-submit (and get a lower grade)
• Final project: 40%
  • Implement a program analyzer for a given component
How to succeed in this course
Joe (a day before the assignment deadline): “I don’t really understand what you want from me in this assignment, can you help me/extend the deadline?”
• Attend all classes
• Make sure you understand the material in class
  • Engage by asking questions and raising ideas
• Be on top of assignments
  • Submit on time
  • Don’t get stuck or give up on exercises: get help, ask me
  • Don’t start working on assignments the day before
• Be ethical
The static analysis approach
• Formalize software behavior in a mathematical model (semantics)
• Prove properties of the mathematical model
  • Automatically, typically with approximation of the formal semantics
• Develop theory and tools for program correctness and robustness
Kinds of static analysis
• Spans a wide range
  • Type checking … up to full functional verification
• General safety specifications
• Security properties (e.g., information flow)
• Concurrency correctness conditions (e.g., absence of data races, absence of deadlocks, atomicity)
• Correct usage of libraries (e.g., typestate)
• Underapproximations useful for bug-finding, test-case generation, …
Static analysis techniques
• Abstract interpretation
• Dataflow analysis
• Constraint-based analysis
• Type and effect systems
Static analysis for verification
[Figure: an analyzer takes a program and a specification and outputs either “valid” or an abstract counterexample]
Relation to program verification
Static analysis
• Fully automatic
• Applicable to a programming language
• Can be very imprecise
• May yield false alarms
Program verification
• Requires specification and loop invariants
• Program specific
• Relatively complete
• Provides counterexamples
• Provides useful documentation
• Can be mechanized using theorem provers
Verification challenge
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Abstract Interpretation
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Solution: compute a bounded representation of (a superset of) program states
Recipe
1. Abstraction
2. Transformers
3. Exploration
1) Abstraction
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
• concrete state $\sigma : \mathit{Var} \to \mathbb{Z}$
• abstract state (sign) $\sigma^\# : \mathit{Var} \to \{+, 0, -, ?\}$
…
2) Transformers
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
• concrete transformer for y = y + 1
• abstract transformer for y = y + 1
3) Exploration
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y + 1;
        } while (--i > 0);
        assert(0 < x + y);
    }
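The three steps of the recipe can be put together in a small sign analysis of this program. The sketch is a hand-written illustration for this one loop, not a general analyzer: it tracks only x and y, treats i as unknown so the loop may run any number of times, and ignores integer overflow. All identifiers are hypothetical, and UNK stands for the "?" element of the sign domain.

```c
/* Abstraction: the sign domain, plus BOT for "no value yet". */
typedef enum { BOT, NEG, ZERO, POS, UNK } Sign;
typedef struct { Sign x, y; } AbsState;

Sign sign_of(int v) { return v > 0 ? POS : (v < 0 ? NEG : ZERO); }

/* Join: merges the facts arriving along two control-flow edges. */
Sign sign_join(Sign a, Sign b) {
    if (a == BOT) return b;
    if (b == BOT) return a;
    return a == b ? a : UNK;
}

/* Transformer: abstract addition over signs. */
Sign sign_add(Sign a, Sign b) {
    if (a == BOT || b == BOT) return BOT;
    if (a == ZERO) return b;
    if (b == ZERO) return a;
    return a == b ? a : UNK; /* POS+POS=POS, NEG+NEG=NEG, otherwise unknown */
}

/* Exploration: iterate the loop body until the abstract state at the
 * loop head stops changing, then return the state at the loop exit. */
AbsState analyze_loop(void) {
    AbsState head = { sign_of(3), sign_of(1) }; /* x = 3, y = 1 */
    for (;;) {
        AbsState after = head;
        after.y = sign_add(head.y, sign_of(1)); /* y = y + 1 */
        AbsState next = { sign_join(head.x, after.x),
                          sign_join(head.y, after.y) };
        if (next.x == head.x && next.y == head.y) {
            return after; /* fixpoint reached; do-while exits after the body */
        }
        head = next;
    }
}

/* The assertion 0 < x + y is proved if x + y is definitely positive. */
int assertion_holds(AbsState s) { return sign_add(s.x, s.y) == POS; }
```

The iteration stabilizes immediately with x and y both positive, so the assertion is proved for every number of loop iterations. Termination of the exploration follows from the domain being finite and the join only moving upward.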
Incompleteness
    main(int i) {
        int x = 3, y = 1;
        do {
            y = y - 2;
            y = y + 3;
        } while (--i > 0);
        assert(0 < x + y);
    }
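The loss of precision can be replayed step by step. In the sketch below, the sign domain is declared inline so that the snippet stands alone; the names are hypothetical and UNK stands for "?". Concretely the loop body adds 1 to y, but in the sign domain a positive value minus 2 has unknown sign, and nothing later recovers it.

```c
typedef enum { NEG, ZERO, POS, UNK } Sign;

Sign sign_of(int v) { return v > 0 ? POS : (v < 0 ? NEG : ZERO); }

/* Abstract addition over signs. */
Sign sign_add(Sign a, Sign b) {
    if (a == ZERO) return b;
    if (b == ZERO) return a;
    return a == b ? a : UNK;
}

/* Abstract effect of the loop body: y = y - 2; y = y + 3; */
Sign body_abstract(Sign y) {
    y = sign_add(y, sign_of(-2)); /* POS + NEG = UNK */
    y = sign_add(y, sign_of(3));  /* UNK + POS = UNK */
    return y;
}

/* Concrete effect of the same loop body. */
int body_concrete(int y) {
    y = y - 2;
    y = y + 3;
    return y;
}
```

Starting from y = 1, body_concrete returns 2, which is positive, while body_abstract(POS) returns UNK. The assertion 0 < x + y holds in every real execution, yet the sign analysis can only report that it may be violated: a false alarm caused by the abstraction, not by the program.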
Parity abstraction
    while (x != 1) do {
        if (x % 2 == 0) {
            x := x / 2;
        } else {
            x := x * 3 + 1;
            assert (x % 2 == 0);
        }
    }
Challenge: how to find “the right” abstraction
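A parity domain is enough to prove the assertion in the else branch. The following sketch is an illustration with hypothetical names; it reasons about mathematical integers and ignores overflow. In the else branch the test x % 2 == 0 has failed, so x is odd, and odd × odd + odd is even.

```c
typedef enum { EVEN, ODD, ANY } Parity;

Parity parity_of(int v) { return (v % 2 == 0) ? EVEN : ODD; }

/* Abstract multiplication: an even factor forces an even product. */
Parity parity_mul(Parity a, Parity b) {
    if (a == EVEN || b == EVEN) return EVEN;
    if (a == ODD && b == ODD) return ODD;
    return ANY;
}

/* Abstract addition: equal parities give even, different give odd. */
Parity parity_add(Parity a, Parity b) {
    if (a == ANY || b == ANY) return ANY;
    return a == b ? EVEN : ODD;
}

/* Parity of x after x := x * 3 + 1 in the else branch, where x is odd. */
Parity else_branch(void) {
    Parity x = ODD; /* refined by the failed test x % 2 == 0 */
    return parity_add(parity_mul(x, parity_of(3)), parity_of(1));
}
```

else_branch returns EVEN, so the assertion is proved. A sign domain would learn nothing useful here, which illustrates that the right abstraction depends on the property being checked.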
How to find “the right” abstraction?
• Pick an abstract domain suited for your property
  • Numerical domains
  • Domains for reasoning about the heap
  • …
• Combination of abstract domains
• Another approach
  • Abstraction refinement