MSc Software Maintenance MS Viðhald hugbúnaðar

Lectures 15 & 16: Programmers Use Slices When Debugging
Dr Andy Brooks

Case Study
• Reference: Mark Weiser, Programmers Use Slices When Debugging, Communications of the ACM, Volume 25, Number 7, pp 446-452, 1982.

The basic debugging method
• Reading 1 million lines of code, from beginning to end, to locate and remove a bug is not efficient.
  • 100 LOC/day equates to 10,000 days...
  • 1,000 LOC/day equates to 1,000 days...
• The basic debugging method is to begin at the statement where the error appears and then reason backwards through the preceding sequence of statements.

Reasoning backwards
• Reasoning backwards to determine all the influences on a variable usually reveals that many statements in the program have no influence.
• Sometimes you reason backwards all the way to the hardware or the translation software...

Program Slicing
“The process of stripping a program of statements without influence on a given variable at a given statement is called program slicing.”
“An elementary slicing criterion of a program P is a tuple <i,V> where i denotes a specific statement in P and V is a subset of variables in P.”

A program and a program slice
The program (statements are counted 1 to 12 from the top):
• BEGIN
• READ(X,Y)
• TOTAL:=0.0
• SUM:=0.0
• IF X<=1
• THEN SUM:=Y
• ELSE BEGIN
• READ(Z)
• TOTAL:=X*Y
• END
• WRITE(TOTAL,SUM)
• END.
Slice on Z at statement 12:
  BEGIN
  READ(X,Y)
  IF X<=1
  THEN
  ELSE READ(Z)
  END.
TOTAL, SUM and Y have no influence on Z.

A program and a program slice
Slice on X at statement 9 of the same 12-statement program:
  BEGIN
  READ(X,Y)
  END.

A program and a program slice
Slice on TOTAL at statement 12 of the same 12-statement program:
  BEGIN
  READ(X,Y)
  TOTAL:=0.0
  IF X<=1
  THEN
  ELSE TOTAL:=X*Y
  END.
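The three slices above can be reproduced with a small backward-slicing sketch. This is not Weiser's published algorithm; it is a minimal illustration that assumes a structured program made only of straight-line statements and two-branch IFs, with no loops or GOTOs. The class and function names (`Assign`, `If`, `slice_block`, `backward_slice`) are invented for this example, and the result lists only the substantive statement numbers, leaving out the BEGIN/THEN/ELSE/END scaffolding shown on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Assign:
    """A statement that defines and/or uses variables (READ, :=, WRITE)."""
    line: int
    defs: Set[str] = field(default_factory=set)
    uses: Set[str] = field(default_factory=set)


@dataclass
class If:
    """A two-branch conditional; `uses` holds the variables in the test."""
    line: int
    uses: Set[str]
    then_body: List
    else_body: List


def last_line(node):
    """Highest statement number found in node, including nested bodies."""
    if isinstance(node, If):
        inner = node.then_body + node.else_body
        return max([node.line] + [last_line(n) for n in inner])
    return node.line


def contains(body, line):
    """True if statement number `line` falls inside this branch body."""
    return bool(body) and body[0].line <= line <= last_line(body[-1])


def slice_block(body, relevant, limit):
    """Walk `body` backwards from statement `limit`.

    Returns (statement numbers kept, variables still relevant before body).
    """
    kept, relevant = set(), set(relevant)
    for node in reversed(body):
        if node.line >= limit:
            continue  # runs at or after the criterion statement
        if isinstance(node, Assign):
            if node.defs & relevant:
                kept.add(node.line)
                relevant = (relevant - node.defs) | node.uses
            continue
        # If the criterion sits inside one branch, only that branch precedes it.
        inside = [b for b in (node.then_body, node.else_body)
                  if contains(b, limit)]
        results = [slice_block(b, relevant, limit)
                   for b in (inside or [node.then_body, node.else_body])]
        branch_kept = set().union(*(r[0] for r in results))
        if branch_kept:  # a kept statement depends on the test, so keep the IF
            kept |= branch_kept | {node.line}
            relevant = set().union(*(r[1] for r in results)) | node.uses
    return kept, relevant


def backward_slice(program, line, variables):
    """Statement numbers in the slice for the criterion <line, variables>."""
    return sorted(slice_block(program, variables, line)[0])


# The slide's program; numbers are positions in the 12-statement listing.
PROGRAM = [
    Assign(2, defs={"X", "Y"}),                        # READ(X,Y)
    Assign(3, defs={"TOTAL"}),                         # TOTAL:=0.0
    Assign(4, defs={"SUM"}),                           # SUM:=0.0
    If(5, {"X"},                                       # IF X<=1
       [Assign(6, defs={"SUM"}, uses={"Y"})],          # THEN SUM:=Y
       [Assign(8, defs={"Z"}),                         # READ(Z)
        Assign(9, defs={"TOTAL"}, uses={"X", "Y"})]),  # TOTAL:=X*Y
    Assign(11, uses={"TOTAL", "SUM"}),                 # WRITE(TOTAL,SUM)
]

print(backward_slice(PROGRAM, 12, {"Z"}))      # [2, 5, 8]
print(backward_slice(PROGRAM, 9, {"X"}))       # [2]
print(backward_slice(PROGRAM, 12, {"TOTAL"}))  # [2, 3, 5, 9]
```

Note that the slice on X at statement 9 does not include the IF: the test decides whether statement 9 is reached, but it does not change the value X holds there.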

Experimental Hypothesis H1
“... debugging programmers, working backwards from the variable and statement of a bug's appearance, use that variable and statement as a slicing criterion to construct mentally the corresponding program slice.”
Experimental Hypothesis H2
“... programmers look at code only in contiguous pieces.”

“Slices are generally not contiguous pieces, but contain statements scattered throughout the code.”
[Diagram: two columns of program lines. In the column labelled "contiguous", the marked lines form one unbroken run; in the column labelled "slice", the marked lines are scattered throughout the program.]

Method
• Programmers debug three programs.
• Test programmers' memory of various code fragments
  • particularly the program slice relevant to the bug.
“If the programmers did slice, then their memories for the relevant slices should be at least as good as their memories of contiguous code, and somewhat better than their memories of other non-contiguous code.”
Andy says: this is more like a properly stated hypothesis.

Andy says: no protocol analysis
• It is important to recognise that programmers were not observed working with the programs.
• Their actions and the program statements they considered were not recorded.
• Testing programmers' memory is an indirect measurement.
• And you may not be measuring what you think you are measuring...

Materials
• Three programs written in Algol-W
• Program sizes from 75 to 150 lines of code
• Program TALLY
  • an IBM scientific subroutine
  • poorly structured, with non-mnemonic variable names
• Program PAYROLL
  • written for the experiment
  • computes salaries and deductions
  • well structured, with mnemonic variable names
• Program EVADE
  • written for the experiment
  • simulation of random aircraft turns
  • well structured, with mnemonic variable names

Program bugs
The bugs were chosen so that the entire experiment could be completed in less than an hour.

5 types of program fragments shown to programmers:
• Relevant slice
• Relevant contiguous
  • overlapped the relevant slice
• Irrelevant contiguous
  • did not overlap relevant contiguous
  • did not overlap relevant slice
  • program TALLY had no irrelevant contiguous
• Irrelevant slice
• Jumble
  • every 3rd or 4th statement

Fragment overlap: relevant slice & relevant contiguous
Overlap is the fraction of statements shared by two fragments.
Andy asks: What was the number of statements in the relevant slices?
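The slide does not say what the fraction is taken over, so the following is only one plausible reading: shared statements divided by all distinct statements in either fragment. Both that choice of denominator and the statement numbers in the example are assumptions made for illustration; they are not taken from Weiser's tables.

```python
def overlap(fragment_a, fragment_b):
    """Shared statements divided by all statements in either fragment.

    ASSUMPTION: the denominator is the union of the two fragments; the
    slide does not specify it.
    """
    a, b = set(fragment_a), set(fragment_b)
    if not (a or b):
        return 0.0  # two empty fragments share nothing
    return len(a & b) / len(a | b)


# Hypothetical statement numbers, invented for illustration only.
relevant_slice = {2, 3, 5, 9}
relevant_contiguous = {3, 4, 5, 6}

# Statements 3 and 5 are shared, out of 6 distinct statements in total.
print(overlap(relevant_slice, relevant_contiguous))
```

If the overlap were instead measured against the size of the slice alone, the same two fragments would score 2 out of 4, which is why knowing the number of statements in each relevant slice matters.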

Syntactic changes
• Syntactic changes were made to the code fragments to prevent recognition by a particular detail:
  • Variables and constants in the fragments were renamed as single letters followed by a unique number.
  • Indenting was adjusted from the original program to a form internally consistent with each fragment.

Participants
• Experienced Algol-W programmers
• Graduate student teaching assistants
  • all from the University of Michigan in Ann Arbor
• 26 volunteers
  • 4 participated in pilot studies
  • 1 did not follow instructions in the experiment
  • 21 final participants

Andy's view
• Pilot studies are conducted to check that:
  • experimental materials are in order
  • instructions are clear
  • experimental processes are sound
  • there is sufficient time to complete tasks
  • participants behave in the way expected.
• Weiser reports that pilot studies were conducted but fails to report on actions taken as a result of the pilot studies. Any actions taken should be briefly reported.

Procedure
• Participants were given all three programs to debug in random order.
• Participants were then asked to rate 14 program fragments for how sure they were that the fragment had been used in one of the three programs.
  • remember, program TALLY had no irrelevant contiguous fragment (3*5-1 = 14)
• Code fragments were given in random order, each on a separate page with its rating scale.
• Participants were told not to look back either at the programs or at previously rated code fragments.
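The fragment count and the random presentation order can be sketched as follows. The program and fragment-type names come from the slides; the function names and the use of a seeded `random.Random` are assumptions for illustration, since the slides do not describe how Weiser actually randomised the order.

```python
import itertools
import random

PROGRAMS = ["TALLY", "PAYROLL", "EVADE"]
FRAGMENT_TYPES = [
    "relevant slice",
    "relevant contiguous",
    "irrelevant contiguous",
    "irrelevant slice",
    "jumble",
]


def build_fragments():
    """All program/fragment-type pairs except the one TALLY lacks."""
    return [
        (program, kind)
        for program, kind in itertools.product(PROGRAMS, FRAGMENT_TYPES)
        if (program, kind) != ("TALLY", "irrelevant contiguous")
    ]


def presentation_order(rng):
    """Return the fragments shuffled into one participant's rating order."""
    order = build_fragments()
    rng.shuffle(order)
    return order


print(len(build_fragments()))  # 14, i.e. 3*5-1

# The seed is arbitrary; a real session would use a fresh order per participant.
for program, kind in presentation_order(random.Random(0)):
    print(program, "-", kind)
```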

Part of the relevant slice for PAYROLL

Fragment shown to participants, with its recognition rating scale

Results
• All 21 participants found the bugs in TALLY and EVADE but only 17 found the bug in PAYROLL.
Table IV: Debugging times (minutes)
Andy asks: what were the minimum and maximum times?

Results
• A two-way analysis of variance using Friedman's test indicated an overall difference in the ratings of the different fragments.
  • factors: fragment type, program type
Andy says: it is important for an overall test to be significant before looking at individual differences. The test is named, but the alpha level is not reported here (0.05?, 0.01?).

Results
Figure 3: recognition by fragment type. Values legible from the figure: 54%, 28%, 24%.
Why is recognition so high?

Significant differences: Wilcoxon matched-pairs signed-ranks test
• The difference between relevant slices and irrelevant slices is significant at the 0.03 level.
• The difference between relevant slices and jumbles is very significant at the 0.005 level.
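For readers unfamiliar with the test, here is a minimal sketch of how the signed-rank statistic is formed from paired ratings. It computes only the rank sums, not the significance level, and the ratings in the example are invented for illustration; they are not the study's data. The function name is likewise an assumption.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus, n) for paired samples.

    Pairs with a zero difference are dropped; tied absolute differences
    share the average of the ranks they occupy.
    """
    diffs = sorted((a - b for a, b in zip(first, second) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        average = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical recognition ratings from five participants (illustration only).
relevant_slice_ratings = [4, 3, 4, 2, 3]
jumble_ratings = [2, 3, 1, 3, 1]

print(wilcoxon_signed_rank(relevant_slice_ratings, jumble_ratings))
# (9.0, 1.0, 4): one tied pair is dropped; the ranks 1, 2.5, 2.5, 4 sum to 10
```

The smaller of the two rank sums is then compared against a table of critical values (or a normal approximation for larger samples) to obtain a significance level.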

Results
• Irrelevant contiguous was recognised because the programs were small and the irrelevant contiguous fragments were close to the output statements which wrote the incorrect variable values.
• Participants would likely have examined code around these output statements.

Results
Figure 4: recognition by fragment type and program type

Results: Figure 4
• TALLY shows the greatest recognition of the relevant slice fragment.
• Because TALLY was poorly structured (many GOTOs), perhaps more programmers adopted a slicing strategy to debug it.

Results: Table V
• To conclude the experiment, participants were asked how typical the programs and the bugs were.
• Table V shows that the mean ratings were at least 2.4 on a 1 to 4 scale.
  • 4 meant “very typical”
  • 1 meant “not at all typical”.
• Weiser reasonably concluded that no program was especially atypical.

Examples of slices: Figure 6
Slices that are large in relation to the program (e.g. 563/662 statements) are less useful to the program maintainer.

Implications
• Tools that automatically generate program slices can help maintainers debug faulty code.
• Novice programmers should be taught the concept of slicing.
Today, researchers study many different kinds of slicing techniques. Dynamic slicing makes use of knowledge about the input, and this can greatly reduce the size of slices.

Slicing or not?
“Because the relevant slice fragment overlapped the relevant sequential fragment in each program, this experiment gives no absolute assurance that relevant slices were not recognised only because of that overlap.”
Table II indicates that recognition ratings between relevant slice fragments and relevant sequential fragments are poorly correlated. This suggests that participants could have been recognising relevant slice fragments because they had indeed been slicing, but...

Andy's view
• In experimental work it is better to measure directly than indirectly.
• Nowadays, it is possible to build and use tools to record all user actions and so help establish whether program slicing occurred.
• Even in Weiser's day, he could have recorded participants speaking their thoughts and actions aloud and then analysed the recordings to help establish whether program slicing had occurred.

Andy's view
• At the very least, Weiser should have asked his participants at the end of the experiment what actions they performed to debug the programs.
• Because the programs were so small, it is quite possible that relevant slice recognition occurred because (some or all) participants had simply read all the code involved.
• It would be interesting to know what the recognition rates would have been if fragments shown to participants had not been syntactically altered.

You never really know what is going on inside someone's head.
