Airac is a static program analysis tool that automatically detects buffer overrun errors in C programs. Its analysis is sound and automatic: it aims to report every possible buffer overrun, in support of building secure, error-free software.
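Airac/Mairac: Static Analyzers for Automatic Verification of Buffer Overrun/Memory Leak Errors in C Programs Prof. Kwangkeun Yi ropas.snu.ac.kr/~kwang Programming Research Lab Seoul National University 11/1/2005
Speaker Introduction • 1993: Ph.D., Univ. of Illinois at Urbana-Champaign • 1993-95: Researcher, Bell Labs, Software Principles Research Dept. (Murray Hill) • 1995-2003: Assistant/Associate Professor, Dept. of Computer Science, KAIST • 1998-2003: Director, [Program Analysis System Research Group], Ministry of Science and Technology • 2003-present: Associate Professor, School of Computer Science and Engineering, Seoul National University
Program Analysis System Research Group • 1998-2003: Director of the [Program Analysis System Research Group], designated under the Ministry of Science and Technology’s Creative Research Initiative program • Goal: research on the core technology for building and verifying defect-free software
Core Technology: Program Analysis Technology (Static Program Analysis)
Program Analysis • Program analysis (static program analysis) = a general method for automatically and safely estimating run-time properties before execution • “before execution”: before the program is run • “run-time properties”: properties of the program while it executes • “automatically”: a program analyzes programs • “safely”: covers every actual situation • “estimating”: some imprecision is unavoidable • “general”: no limit on the languages or run-time properties that can be targeted
Defect-Free Software Needs Two Pillars • Program development process • a system for organizing and running the development team • development tools that enforce systematic operation • achieves only 50% of the goal: software errors keep appearing • Automatic program error verification technology • automatic: software analyzes software • verification: confirms that there are no errors • maturity: ripe and flowing into industry • will achieve the remaining 49% of the goal: defect-free software
Examples of Applied Automatic Verification Technology (Abroad) • Microsoft (since 2001) • device driver software verification: SLAM technology • focus on developing safe, error-free software: the basis of Bill Gates’s recent speeches • Unix/Linux kernel verification (since 2000) • combination of model checking and static analysis • the technology drawing the most attention in the OS community • Airbus (since 2002) • static analysis applied to verifying aviation controller module software • static analysis adopted as a standard step of the Airbus software development process • Companies specializing in this technology have emerged: • AbsInt, Astree, PolySpace Technologies, Trusted Logic, GrammaTech, Esterel Technologies, Galois Connections, etc.
Examples of Applied Automatic Verification Technology (Korea) • Examples • C (Samsung Electronics): Are there accesses outside the allocated memory region? Verification. • C (Samsung Electronics): Is all allocated memory reclaimed once it is no longer used? Verification. • C (Ministry of Information and Communication): Errors in embedded parallel software? Verification. • Web programs (National Security Research Institute): Can the web source be breached by known attack methods? Verification. • And so on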
Contents • Introduction • what • Performance • for realistic sw’s • strength and weakness • in global competition • Discussion
Airac: Static Analyzer for Detecting All Buffer Overrun Errors in C Programs • “static”: no test runs • “all”: no unnoticed overruns • “C”: full ANSI C + (GNU C)
    int *c = (int *)malloc(sizeof(int)*10);
    c[i] = 1;
    c[i + f()] = 1;
    c[*k + (*g)()] = 1;
    x = c+5; x[1] = 1;
    z->a = c; (z->a)[i] = 1;
    foo(c+2);
    int foo(int *d) {…d[i] = 1; …}
Airac: technology keywords • static program analysis • exhaustive: detects all buffer overruns • sound: errs on the safe side when in doubt • automatic: no help needed from C programmers • always stops: even for non-terminating C programs • modular: analyzes separate C files • correct: based on a firm theoretical framework
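To illustrate what "sound: safe side when in doubt" can mean for a single array access, here is a minimal sketch. It assumes the analyzer has already estimated a range of possible index values; the names `range_t`, `verdict_t`, and `check_access` are hypothetical and are not taken from Airac.

```c
#include <assert.h>

/* Estimated set of values an index expression may take: [lo, hi]. */
typedef struct { long lo, hi; } range_t;

typedef enum { ACCESS_SAFE, ACCESS_ALARM } verdict_t;

/* Hypothetical check: the access is reported safe only when every value
   in the estimated range is a valid index; otherwise an alarm is raised,
   even if the overrun might never happen in a real run. */
verdict_t check_access(range_t index, long buffer_len)
{
    assert(index.lo <= index.hi);
    assert(buffer_len >= 0);

    if (index.lo >= 0 && index.hi < buffer_len)
        return ACCESS_SAFE;
    return ACCESS_ALARM;
}
```

For a buffer of 10 elements, an index range of [0, 9] is accepted, while [0, 10] raises an alarm. An over-estimated range can therefore cause a false alarm, but under this rule a possible overrun is not silently accepted.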
Airac: internals (1/2) • diagram labels: C files, C’ files, equation solver, bug identification • equations handed to the solver: x1 = F1(x1,…,xN), x2 = F2(x1,…,xN), …, xN = FN(x1,…,xN)
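The equation system on this slide can be solved by iterating from the least value until nothing changes. The sketch below is only an illustration under strong assumptions: the abstract values are small bit sets (a toy domain, not Airac's), the solver is a plain round-robin loop, and the names `value_t`, `equation_fn`, `solve_equations`, and `eq0` to `eq2` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy abstract value: a set of small integers encoded as a bit mask. */
typedef unsigned int value_t;

/* One right-hand side F_i, reading the current solution x[0..n-1]. */
typedef value_t (*equation_fn)(const value_t *x, size_t n);

/* Round-robin iteration: start every unknown at bottom (the empty set)
   and re-evaluate each F_i until no unknown changes.
   Returns the number of sweeps over the equations. */
size_t solve_equations(const equation_fn *f, value_t *x, size_t n)
{
    size_t sweeps = 0;
    int changed = 1;

    assert(f != NULL && x != NULL);
    for (size_t i = 0; i < n; i++)
        x[i] = 0u;

    while (changed) {
        changed = 0;
        sweeps++;
        for (size_t i = 0; i < n; i++) {
            /* Joining with the old value keeps the iteration ascending. */
            value_t next = x[i] | f[i](x, n);
            if (next != x[i]) {
                x[i] = next;
                changed = 1;
            }
        }
    }
    return sweeps;
}

/* Example system: x0 = {1}, x1 = x0 union x2, x2 = x1 union {4}. */
value_t eq0(const value_t *x, size_t n) { (void)x; (void)n; return 1u << 1; }
value_t eq1(const value_t *x, size_t n) { (void)n; return x[0] | x[2]; }
value_t eq2(const value_t *x, size_t n) { (void)n; return x[1] | (1u << 4); }
```

Because the toy domain is finite and each step can only add elements, the loop must stop. Domains with unbounded chains, such as integer intervals, need an extra device to force termination.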
Airac: internals (2/2) • sound design by abstract interpretation • accuracy improvement by • narrowing, flow-sensitivity, context pruning, static inlining (bounded polyvariance), static loop unrolling • cost reduction by • widening, economic join/partial-order ops • careful worklist order: lazy at join points
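Widening and narrowing are easiest to see on integer intervals. The following sketch analyzes the index of a loop of the form `for (i = 0; i < bound; i++)`. It uses the textbook interval operators, which are an assumption here: the slides do not show Airac's actual operators, and all names (`interval_t`, `interval_widen`, `interval_narrow`, `loop_head_step`, `analyze_loop_index`) are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Interval [lo, hi]; LONG_MIN and LONG_MAX stand for -infinity and +infinity. */
typedef struct { long lo, hi; } interval_t;

interval_t interval_join(interval_t a, interval_t b)
{
    interval_t r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Widening: a bound that is still growing jumps straight to infinity. */
interval_t interval_widen(interval_t old, interval_t next)
{
    interval_t r = { next.lo < old.lo ? LONG_MIN : old.lo,
                     next.hi > old.hi ? LONG_MAX : old.hi };
    return r;
}

/* Narrowing: refine only the bounds that widening pushed to infinity. */
interval_t interval_narrow(interval_t old, interval_t next)
{
    interval_t r = { old.lo == LONG_MIN ? next.lo : old.lo,
                     old.hi == LONG_MAX ? next.hi : old.hi };
    return r;
}

/* Loop-head equation: x = [0,0] join ((x meet [-inf, bound-1]) + 1). */
interval_t loop_head_step(interval_t x, long bound)
{
    interval_t init = { 0, 0 };
    interval_t body = x;

    if (body.hi > bound - 1)
        body.hi = bound - 1;          /* loop guard: i < bound */
    if (body.lo != LONG_MIN)
        body.lo += 1;                 /* i++ on the lower bound */
    body.hi += 1;                     /* i++; no overflow since hi <= bound - 1 */
    return interval_join(init, body);
}

/* Value of i at the loop head: widen until stable, then narrow once. */
interval_t analyze_loop_index(long bound)
{
    interval_t x = { 0, 0 };

    assert(bound >= 1);
    for (;;) {
        interval_t widened = interval_widen(x, loop_head_step(x, bound));
        if (widened.lo == x.lo && widened.hi == x.hi)
            break;
        x = widened;
    }
    return interval_narrow(x, loop_head_step(x, bound));
}
```

With `bound` set to 10, widening first jumps to [0, +infinity] after one step, and the narrowing step recovers [0, 10] at the loop head. Inside the loop body the guard then restricts the index to [0, 9].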
Warnings About Performance • Assume typeful C programs • array sizes remain the same as declared • Artificial semantics after errors • No side effects assumed for library functions • If there is no main(), procedures are analyzed in their defined order • No alarms about buffers whose size is unknown • Worst-case values for free variables
Airac: performance (1/3) 3.2GHz P4, 4GB RAM
Airac vs Swat (2/3) • [comparison table; surviving column labels: Airac, Bugs, Coverity]
Taming False Alarms • For each alarm from Airac, compute its true-alarm probability • conditional probability given its symptoms • Sift out “probably false” alarms • threshold by user-provided risk ratio • Report “probably true” alarms first
Sifting Out False Alarms • for parts of Linux kernels • half of the alarms, chosen at random, are used for training • :-) 74.84% of false alarms filtered out when Rs = 3 x Rr • :-| 31.40% of true alarms swept out
Ranking False Alarms • The user sees “truer” alarms first • 15.17% of the false alarms were mixed in by the time the user had seen 50% of the true alarms
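One way to read "threshold by user-provided risk ratio" is as a comparison of expected costs. The sketch below is an assumption, not the formula used by Airac: it takes Rs to be the risk of silencing a true alarm and Rr the risk of reporting a false one, and it estimates the probability from training counts with add-one smoothing. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed estimator: probability that an alarm with a given symptom is
   true, from counts of true and false training alarms with that symptom
   (add-one smoothing avoids 0 and 1 for small samples). */
double true_alarm_probability(unsigned true_count, unsigned false_count)
{
    return (true_count + 1.0) / (true_count + false_count + 2.0);
}

/* Assumed decision rule: report the alarm when the expected cost of
   staying silent is at least the expected cost of reporting. */
int should_report(double p_true, double risk_silence, double risk_report)
{
    assert(p_true >= 0.0 && p_true <= 1.0);
    assert(risk_silence > 0.0 && risk_report > 0.0);

    return p_true * risk_silence >= (1.0 - p_true) * risk_report;
}
```

Under this reading, a ratio of Rs = 3 x Rr means an alarm is reported once its estimated probability of being true reaches roughly one quarter.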
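Ranking can be sketched as a sort by estimated true-alarm probability, highest first. The record layout `alarm_t` and the function names below are hypothetical; only the idea of showing "truer" alarms first comes from the slide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical alarm record: an identifier plus its estimated
   probability of being a true alarm. */
typedef struct { int id; double p_true; } alarm_t;

/* qsort comparator: higher probability sorts earlier. */
int by_descending_probability(const void *a, const void *b)
{
    const alarm_t *x = a;
    const alarm_t *y = b;

    if (x->p_true > y->p_true) return -1;
    if (x->p_true < y->p_true) return 1;
    return 0;
}

/* Reorders the alarms in place so the most probable ones come first. */
void rank_alarms(alarm_t *alarms, size_t count)
{
    assert(alarms != NULL);
    qsort(alarms, count, sizeof *alarms, by_descending_probability);
}
```

Unlike sifting, ranking discards nothing: a true alarm with a low estimate is still reported, only later in the list.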
Airac: in global competition • one of the few real-world static analyzers that support full ANSI C • vs. world-class powers in static analysis: • Coverity (USA): not sound, ad hoc. Beaten by Airac. • Polyspace (France): comparable, sound, cost, assumption • all in the static analysis research community: • I know what they can do. • If there are any I don’t know of, they have either shallow technology or a “disruptive technology”
Mairac: Static Analyzer for Detecting All Memory Leak Errors in C Programs • Mairac = Airac + malloc/free-analysis • Soundly approximate all possible execution flows with pointer values • Check if all malloc’ed addresses are freed
Mairac • option 1: free-before-end • check if malloc’ed addresses are freed before the end of the program • option 2: free-before-return • check if malloc’ed addresses are freed before the return of a procedure
Mairac’s Limitation • one abstract location per “malloc(size)” • no information about the structure of heap data • “free(ptr)” frees all locations of ptr • our design choice • violates soundness in principle, yet rarely in practice ...
Mairac Understands Interprocedural Pointer Aliasing
    void pointer(char **p, char* s){ *p = s; }
    int ResourceLeak_TC03 (int arg1) {
      char str[10] ="STRING";
      char *p1, *p2;
      p1 = (char *)malloc(sizeof(char)*10);
      if( p1 == NULL) return 1;
      strcat(p1,str);
      pointer(&p2,p1);
      free(p2); // both Mairac and Prevent conclude OK
      return 0;
    }
Mairac Understands Pointer Arithmetic
    int pointer_arithmetic(int arg1) {
      char *buf1, *buf2, *p;
      int i;
      buf1 = malloc(10);
      p = buf2 = malloc(10);
      for(i=0;i<10;i++){ buf1++; buf2++; }
      free(buf1); // Prevent doesn’t alarm
      free(p);
    }
Mairac Understands Paths
    int unclear_condition(int cond) {
      char *buf1, *buf2;
      int i;
      if (cond) buf1 = alloc(10); // cond != 0
      else buf2 = alloc(10); // cond == 0
      cond = cond + 10;
      if (cond) free(buf1);
      else free(buf2);
    }
Mairac’s Performance (1/2): small test cases * Pointer Arithmetic 9 files 360 LOC Mairac 3 True False 4 2 Prevent * No heap data structure analysis
Mairac’s Performance (2/2): real commercial sw 102 files 72,293 LOC Pentium 2.8 3GB about 1hr • duplicate alarms • library functions Mairac Prevent 72 23 True False 20 22 337 8 Goal: false alarm ratio · 50%
Technology Keywords • static analysis: abstract interpretation • fully automatic • always terminates • detecting all targeted bugs • false alarms • “software MRI”