Program Behavior Characterization and Clustering: An Empirical Study for Failure Clustering
Danqing Zhang
School of Software Engineering, Tongji University
November 4th, 2013
Outline
• Motivation
• Runtime Behavior Representation
• Behavior Clustering
• Results and Analysis
• Summary
Failure clustering
• The main task of failure clustering is to group failed executions so that those induced by the same fault fall into the same category.
Programs and their executions
Source code of P1 (lines L495 to L501 shown, within a listing that runs from L1 to at least L561):
    ...
    f1 (a, b) {
        int temp;
        temp = a;
        a = b;
        b = temp;
    }
    printf("a = %d\n", a);
    ...
    }
Source code of P2 (lines L229 to L235 shown, within a listing that runs from L1 to at least L321):
    ...
    func (x, y) {
        int t;
        t = x;
        x = y;
        y = t;
    }
    printf("y = %d\n", y);
    ...
Failure clustering
• Programs with similar attributes (e.g., structural features, execution profiles) are assumed to have similar fault behaviors and failure behaviors.
Programs and their executions
• Similarities between the runtime behaviors of programs approximate similarities in how the "fault-error-failure" chain affects those programs.
• If programs can be clustered by their runtime behaviors, their failure behaviors can be clustered in the same way.
Our work
• Runtime behavior modeling
• Behavior clustering
• Experimental evaluation
Assumptions
• Programs are structured.
• The runtime environment of the programs is fault-free.
• Runtime characterization is defined for the IA-32 platform, and the target programs are assumed to be 32-bit code.
BIP: Branch-instruction-based partition
Code sequence (at assembly level):
    mov %esp, %ebp
    call 8048344 <f1>
    push %ebp
    mov %esp, %ebp
    lea 0xb(%eax), %edx
    pop %ebp
    ret
BIP-based states:
• CALLS: CALL-state
• RETS: RET-state
Runtime characteristic: a procedure call occurs during program execution.
BIP: Branch-instruction-based partition
Code sequence (at assembly level):
    mov %esp, %ebp
    jmp 804842C <main+0x94>
    jne 804841a <main+0x82>
    int $0x80
BIP-based states and their runtime characteristics:
• UJS: Unconditional-jump-state. The body of a loop is executed.
• CJFS (CJTS): Conditional-jump-with-false (true)-state. If/else statements are executed.
• INTS: INT-state. A software interrupt is generated during program execution.
How to characterize the runtime behavior?
State sequence: UJS CJFS CJFS CALLS CALLS RETS UJS CJFS CJFS CALLS CALLS RETS UJS
• The state sequence is divided into short sequences of K states (K-mers); K denotes the length of each short sequence.
• With six BIP-based states, the total number of possible K-mers is 6^K. (When K=5, 6^5 = 7776.)
• The runtime behavior is represented as a vector of the form {0, …, 0}, with one entry per possible K-mer.
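The K-mer representation described on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a sliding window with stride 1 over the state sequence, raw occurrence counts as vector entries, and the state encoding UJS = 0 through INTS = 5 that the deck uses in its case study. The names STATES, kmer_index, and kmer_spectrum are hypothetical.

```python
from collections import Counter

# Six BIP-based states; the order follows the deck's case-study encoding
# (UJS = 0, CJFS = 1, CJTS = 2, CALLS = 3, RETS = 4, INTS = 5).
STATES = ["UJS", "CJFS", "CJTS", "CALLS", "RETS", "INTS"]
CODE = {name: i for i, name in enumerate(STATES)}


def kmer_index(kmer):
    """Map a K-mer (list of state names) to its position, read as a base-6 number."""
    idx = 0
    for state in kmer:
        idx = idx * len(STATES) + CODE[state]
    return idx


def kmer_spectrum(sequence, k):
    """Count every length-k window (stride 1, an assumption) into a 6**k vector."""
    counts = Counter(
        kmer_index(sequence[i:i + k]) for i in range(len(sequence) - k + 1)
    )
    vector = [0] * (len(STATES) ** k)
    for idx, count in counts.items():
        vector[idx] = count
    return vector


# The example state sequence from the slide.
seq = ("UJS CJFS CJFS CALLS CALLS RETS UJS "
       "CJFS CJFS CALLS CALLS RETS UJS").split()
spectrum = kmer_spectrum(seq, 5)
```

For the 13-state example and K=5, the resulting vector has 7776 entries, almost all of them zero, which matches the sparse {0, …, 0} shape shown on the slide.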
Runtime behavior
• Runtime behavior: runtime spectrum
• When the optimal K value is chosen, the runtime behavior can be well represented.
Similarity between two runtime behaviors
• A similarity degree is defined between two runtime behaviors.
• The similarity degree is symmetric and non-negative.
• A smaller value indicates a higher similarity.
• The similarity degree is zero if and only if the two behaviors are exactly the same.
Behavior clustering
• A fuzzy clustering algorithm is used to cluster program behaviors.
  • FCS: fuzzy compactness and separation
• A method of determining the optimal cluster number (OCN)
  • Hazard rate
  • First-order backward difference
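As a small aid for the last bullet, the first-order backward difference of a sequence f is f(n) - f(n-1). The sketch below only computes that difference; the slide does not say how it is combined with the hazard rate to pick the OCN, so no selection rule is shown. The function name and the score values are hypothetical, made-up numbers for illustration.

```python
def backward_difference(values):
    """Return the first-order backward differences values[i] - values[i-1] for i >= 1."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Hypothetical clustering scores for CN = 2, 3, 4, 5, 6 (made-up numbers).
scores = [10, 7, 5, 4, 4]
diffs = backward_difference(scores)  # one difference for each CN = 3..6
```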
Implementation
• Obtain the runtime behaviors during program executions (SPEC CPU2000 and SPEC CPU2006).
• Cluster the runtime behaviors.
• Obtain the failure behaviors (by fault injection).
• Evaluate whether a cluster in the runtime characterization (clustering) is equivalent to a cluster in failure clustering.
Implementation: obtaining runtime behaviors
• M = 22 programs are compiled and linked.
• Running the programs on their inputs gives N = 99 program executions.
• PIN records one state sequence per execution (state sequences 1 to N).
• Each state sequence yields one runtime behavior (runtime behaviors 1 to N).
Implementation: clustering runtime behaviors
• Start from the set of N runtime behaviors.
• Obtain the similarity degree of every pair of behaviors.
• MDS: multidimensional scaling
• Runtime behavior clustering
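The pairwise step can be sketched as follows. The deck's actual similarity degree formula is not recoverable from the slides, so this example swaps in plain Euclidean distance as a stand-in (an assumption); it was chosen only because it shares the stated properties of being symmetric, non-negative, and zero for identical behaviors. The names euclidean and pairwise_similarity and the toy vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Stand-in similarity degree (assumption): Euclidean distance between two vectors."""
    if len(u) != len(v):
        raise ValueError("behaviors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_similarity(behaviors, degree=euclidean):
    """Build the symmetric N x N matrix of similarity degrees.

    Only the upper triangle is computed and then mirrored; the diagonal
    stays 0.0 because a behavior is identical to itself.
    """
    n = len(behaviors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = degree(behaviors[i], behaviors[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Three toy runtime behaviors (made-up vectors); the first two are identical.
behaviors = [[2, 0, 1], [2, 0, 1], [0, 3, 1]]
matrix = pairwise_similarity(behaviors)
```

A matrix like this is the usual input to multidimensional scaling, which is why the pairwise step precedes MDS in the list above.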
Implementation: obtaining failure behaviors
• ODC: Orthogonal Defect Classification
Implementation: obtaining failure behaviors
• Fault injection is applied to each of the N = 99 program executions.
• For each execution, the occurrence frequencies (%) of four failure modes are recorded: Correct | Aborted | Hanged | Wrong.
• Similarity degrees are then obtained between every pair of failure behaviors.
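A failure behavior of this kind can be sketched as a four-entry percentage vector. The example below is illustrative only: it assumes each fault-injection run has already been labeled with one of the four failure modes, and the function name failure_behavior and the outcome list are hypothetical, made-up data.

```python
from collections import Counter

# The four failure modes named on the slide, in the slide's order.
FAILURE_MODES = ("Correct", "Aborted", "Hanged", "Wrong")


def failure_behavior(outcomes):
    """Return the occurrence frequency (%) of each failure mode over all runs."""
    if not outcomes:
        raise ValueError("at least one fault-injection outcome is required")
    counts = Counter(outcomes)
    unknown = set(counts) - set(FAILURE_MODES)
    if unknown:
        raise ValueError(f"unknown failure modes: {sorted(unknown)}")
    total = len(outcomes)
    return [100.0 * counts[mode] / total for mode in FAILURE_MODES]


# Made-up outcomes of ten fault-injection runs for one program execution.
outcomes = ["Correct"] * 5 + ["Aborted"] * 2 + ["Hanged"] + ["Wrong"] * 2
profile = failure_behavior(outcomes)
```

Vectors in this form can then be compared pairwise in the same way as the runtime behaviors.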
Implementation: K and CN (cluster number)
• Runtime behaviors can be well represented when the optimal K value is chosen.
• Runtime behaviors can be well clustered when the optimal CN is chosen.
Evaluation: is the runtime behavior well represented?
• Case study: K=5 and CN=7
• For simplicity, the states are encoded as UJS = 0, CJFS = 1, CJTS = 2, CALLS = 3, RETS = 4, INTS = 5.
Evaluation: are failure behaviors well clustered?
• Benchmarks: SPEC CPU2000 and SPEC CPU2006
• MFC-fault injection
Evaluation: are failure behaviors well clustered?
• Case study: K=5 and optimal cluster number (OCN)=7
• A similarity degree is used to compare failure behaviors.
• Statistics of the similarity degree of MFC-induced failure behaviors are reported for each cluster.
Evaluation: are failure behaviors well clustered?
• The OCN (optimal cluster number) for each value of K
Evaluation: are failure behaviors well clustered?
• The total similarity degree of failure behavior clustering reaches its maximum when K=7.
• When K=7, the quality of failure behavior clustering based on the runtime behavior clustering is the highest.
Summary
• Once the optimal K and CN are chosen:
  • runtime behaviors are well represented based on BIP
  • failure behaviors can be clustered according to the runtime behavior clustering
• Future work:
  • Expand the range of K and CN
  • Analyze the effectiveness of clustering fault behaviors according to the runtime behavior clustering