Automatically Grading Programming Assignments with Web-CAT Stephen H. Edwards Virginia Tech Dept. of Computer Science edwards@cs.vt.edu http://web-cat.sourceforge.net/
My goals today are to … • Explain how requiring students to formulate and test hypotheses about their own code can improve their understanding and performance • Describe our experiences with an alternative grading approach supported by a new tool: Web-CAT • Describe some of the flexibility in Web-CAT for supporting other approaches • Convince you software testing can be an important (and practical) addition to classroom practices
Students hold onto ineffective techniques Too often, intro students believe that if their code: • compiles, the errors are mostly gone • runs correctly when I try it once, it is correct • runs on the instructor-provided sample input, it is correct • has a problem, it can be fixed by trial and error
What is reflection-in-action? • For an expert, when the current technique is failing … • Step back and reflect: “I must be missing something” • Re-examine the situation, your solution, and your implicit assumptions about the problem • Leads to guesses (hypotheses) about why the solution isn’t working or why something else will be better • “[Carry] out an experiment which serves to generate both a new understanding of the phenomenon and a change in the situation”
Practicing software testing will help students frame and carry out experiments • The problem: too much focus on synthesis and analysis too early in teaching CS • Need to be able to read and comprehend source code • Envision how a change in the code will result in a change in the behavior • Need explicit, continually reinforced practice in hypothesizing about program behavior and then experimentally verifying their hypotheses
Student comments suggest their current testing practices are often weak • I run them through some simple tests to ensure that it is operating as expected. But for the most part I have always relied on supplied test data • I don’t think about test cases until I am confident my program is 100% working. Of course, it almost never is … • I usually write the whole thing up and then start doing rapid-fire tests of everything I can think of.
A comprehensive strategy is necessary for a culture shift in what students do • Students cannot test their own code • Want a culture shift in student behavior • A single upper-division course would have little impact on practices in other classes • So: Systematically incorporate testing practices across many courses (Diagram labels: CS1, CS2, OO Design, Data Struct, Testing Practices)
Expect students to apply their testing skills all the time in programming assignments • Expect students to test their own work • Empower students by engaging them in the process of assessing their own programs • Require students to demonstrate the correctness of their own work through testing • Do this consistently across many courses
What tools and techniques should I teach? • We want to start with skills that are directly applicable to authentic student-oriented tasks • Don’t want to add bureaucratic busywork to assignments • Without tool support, this is a lost cause! • It is imperative to give students skills they value • … But most textbooks only give a “conceptual” intro to idealized industrial practices, not techniques students can use in their own assignments
Test-driven development is very accessible for students • Also called “test-first coding” • Focuses on thorough unit testing at the level of individual methods/functions • “Write a little test, write a little code” • Tests come first and describe what is expected; the code follows and is revised until all tests pass • Encourages lots of small (even tiny) iterations • See http://web-cat.sf.net/ for on-line references
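A minimal sketch of one "write a little test, write a little code" iteration. It uses plain Java checks rather than JUnit so it runs without extra libraries; the `Counter` class, its methods, and the `check` helper are hypothetical names invented for illustration and do not come from the talk.

```java
// Hypothetical class under test; not part of Web-CAT or the original talk.
class Counter {
    private int count = 0;

    // Written only after the checks in CounterTest existed and were failing.
    void increment() { count++; }

    void reset() { count = 0; }

    int value() { return count; }
}

class CounterTest {
    // Tiny stand-in for a JUnit assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Each test is written first and states one expected behavior.
    static void testStartsAtZero() {
        check(new Counter().value() == 0, "a new counter should start at 0");
    }

    static void testIncrementAddsOne() {
        Counter c = new Counter();
        c.increment();
        c.increment();
        check(c.value() == 2, "two increments should give 2");
    }

    static void testResetReturnsToZero() {
        Counter c = new Counter();
        c.increment();
        c.reset();
        check(c.value() == 0, "reset should return the counter to 0");
    }

    static void runAll() {
        testStartsAtZero();
        testIncrementAddsOne();
        testResetReturnsToZero();
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all tests passed");
    }
}
```

In a course setting the same three checks would normally be JUnit test methods; the point here is only the ordering, with each expectation written down before the method that satisfies it.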
Students can apply TDD in assignments and get immediate, useful benefits • Conceptually, easy for students to understand and relate to • Increases confidence in code • Increases understanding of requirements • Preempts “big bang” integration
The problem is devising an effective assessment strategy • Need to assess student performance at testing • Need to give productive feedback • Need to provide rapid turnaround • Cannot afford huge increase in resources required
Conventional automated assessment does not encourage good testing habits • Student uploads program • Program is compiled • Executed against test data • Scored based on output
The conventional approach provides useful benefits that do lead to a cultural change • Fast, precise feedback to students • Chance(s) to improve based on feedback • Good assessment of behavior • Systematic use resulted in culture change
But the conventional approach may discourage desired behavior and skills • Focus is on output correctness, first and foremost • “Get it working first, work on commenting, structure, etc. later” • Students not encouraged or rewarded for testing on their own • Students often do less testing
Proper grading and feedback can provide positive incentive for desirable behavior • Decide what behavior to foster • Choose a corresponding scoring/reward system • Design feedback approach • Use students’ adaptive nature to drive cultural change
Proper grading and feedback are critical to reinforcing desired behavior • Assess test validity: correctness of student’s tests • Assess test completeness: the “thoroughness” of student’s tests • Assess program correctness: behavior of student’s solution • Multiply scores as percentages
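The "multiply scores as percentages" rule can be sketched as a small calculation. This is an illustration under assumptions, not Web-CAT's actual scoring code: the class and method names are hypothetical, and each of the three measures is assumed to be already expressed as a fraction between 0 and 1.

```java
// Hypothetical sketch of a multiplicative score; not Web-CAT's real implementation.
class TestingScore {
    // Each argument is assumed to be a fraction in [0, 1]:
    //   validity     - how correct the student's own tests are
    //   completeness - how thorough the student's tests are
    //   correctness  - how well the solution behaves
    static double combined(double validity, double completeness, double correctness) {
        requireFraction(validity);
        requireFraction(completeness);
        requireFraction(correctness);
        return validity * completeness * correctness;
    }

    private static void requireFraction(double value) {
        if (value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException("expected a fraction in [0, 1]: " + value);
        }
    }

    public static void main(String[] args) {
        // Example: valid tests, 80% thoroughness, 90% correct behavior.
        double fraction = combined(1.0, 0.8, 0.9);
        System.out.println("combined fraction: " + fraction);
    }
}
```

Because the three factors are multiplied rather than averaged, a zero in any one of them gives a combined score of zero, so a working program submitted without tests earns nothing from this part of the grade.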
Students improve their code quality when using Web-CAT (Figure labels: newly written “untested” code; commercial-quality code)
Students start earlier and finish earlier when they use Web-CAT
An evaluation of submitted code indicates students program more effectively
After using TDD and Web-CAT, students clearly perceive practical benefits
Student reactions are very positive toward TDD • I am very excited about using TDD. • I agree that TDD can be beneficial and I’m glad we are being required to experiment with it in this course. • If it increases the effectiveness of my programming and decreases the time I spend debugging, then I am all for it. • [Previously,] I had to quit my detailed testing and stick to making the program appear to work with the sample data given every time a deadline drew near. With [TDD], the tests are such an integral part of the project that no time-conserving measure will save me.
We use Web-CAT to automatically process student submissions and check their work • Web application written in 100% pure Java • Deployed as a servlet • Built on Apple’s WebObjects • Uses a large-grained plug-in architecture internally, providing for easily extensible data model, UI, and processing features
Web-CAT’s strengths are targeted at broader use • Security: mini-plug-ins for different authentication schemes, global user permissions, and per-course role-based permissions • Portability: 100% pure Java servlet for Web-CAT engine • Extensibility: Completely language-neutral, process-agnostic approach to grading, via site-wide or instructor-specific grading plug-ins • Manual grading: HTML “web printouts” of student submissions can be directly marked up by course staff to provide feedback
Grading plug-ins are the key to process flexibility and extensibility in Web-CAT • Processing for an assignment consists of a “tool chain” or pipeline of one or more grading plug-ins • The instructor has complete control over which plug-ins appear in the pipeline, in what order, and with what parameters • A simple and flexible, yet powerful way for plug-ins to communicate with Web-CAT and with each other • We have a number of existing plug-ins for Java, C++, Scheme, Prolog, Pascal, Standard ML, … • Instructors can write and upload their own plug-ins • Plug-ins can be written in any language executable on the server (we usually use Perl)
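The pipeline idea can be illustrated with a small, purely hypothetical sketch. The talk does not show Web-CAT's real plug-in interface (and notes that plug-ins are usually written in Perl), so `GradingStep`, `PipelineSketch`, and the shared string map below are all assumptions made up for illustration: each step runs in the order the instructor chose and can read what earlier steps recorded.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a grading plug-in; NOT Web-CAT's actual interface.
interface GradingStep {
    // Reads and writes values in a map shared by all steps in the pipeline.
    void run(Map<String, String> shared);
}

class PipelineSketch {
    // Runs the steps in the given order and returns the accumulated results.
    static Map<String, String> runPipeline(List<GradingStep> steps) {
        Map<String, String> shared = new LinkedHashMap<>();
        for (GradingStep step : steps) {
            step.run(shared);
        }
        return shared;
    }

    public static void main(String[] args) {
        // Assumed first step: pretend the submission compiled.
        GradingStep compile = shared -> shared.put("compiled", "true");

        // Assumed second step: only "runs tests" if the earlier step succeeded.
        GradingStep runTests = shared -> {
            boolean compiled = "true".equals(shared.get("compiled"));
            shared.put("testsRun", compiled ? "true" : "false");
        };

        System.out.println(runPipeline(Arrays.asList(compile, runTests)));
    }
}
```

Reversing the two steps in the list would leave `testsRun` set to `false`, which is the sense in which the order of the pipeline matters.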
The best-known plug-in is for grading Java assignments that include student tests • ANT-based build of arbitrary Java projects • PMD and Checkstyle static analysis • ANT-based execution of student-written JUnit tests • Carefully designed Java security policy • Clover test coverage instrumentation • ANT-based execution of optional instructor reference tests • Unified HTML web printout • Highly configurable (PMD rules, Checkstyle rules, supplemental jar files, supplemental data files, Java security policy, point deductions, and lots more)
Web-CAT supports a variety of languages, and its Java plug-in is aimed at software testing • ANT-based build of arbitrary Java projects • PMD and Checkstyle static analysis • ANT-based execution of student-written JUnit tests • Carefully designed Java security policy • Clover test coverage instrumentation • ANT-based execution of optional instructor reference tests • Unified HTML web printout • Highly configurable (PMD rules, Checkstyle rules, supplemental jar files, supplemental data files, Java security policy, point deductions, and lots more)
Web-CAT provides timely, constructive feedback on how to improve performance • Indicates where code can be improved • Indicates which parts were not tested well enough • Provides as many “revise/resubmit” cycles as possible
The most important step in writing testable assignments is … • Learning to write tests yourself • Writing an instructor’s solution with tests that thoroughly cover all the expected behavior • Practice what you are teaching/preaching
Students get frustrated without feedback, so reference tests must provide some • If students only get a score, but no other feedback for how to improve, they get easily frustrated • We augment our reference tests to provide “hints” for failed tests, cross-referenced to the program assignment
Students will try to get Web-CAT to do their work for them • Students appreciate the feedback, but will avoid thinking at (nearly) all costs • Too much feedback encourages students to use Web-CAT for testing instead of writing their own tests; they use it as a development tool instead of simply to check their work • This limits the learning benefits, which come in large part from students writing their own tests • Lesson: strike a balance by providing suggestive feedback without “giving away” the answers, so that the student is led to think about the problem
We have also tried to influence student work habits to improve their success • Encourage early submission by providing extra incentives or using late penalties • Score bonuses and/or penalties are easy • Another useful approach: • Generous limit on the total number of submissions (60) • Hints disappear one day before the due date • Project closes for one day to encourage students to step away and reflect on “the last bug” • Project opens again for one day with hints re-enabled, but with a cap on how much the score can improve
Lessons for writing program assignments intended for automatic grading • Requires greater clarity and specificity • Requires you to explicitly decide what you wish to test, and what you wish to leave open to student interpretation • Requires you to unambiguously specify the behaviors you intend to test • Requires preparing a reference solution before the project is due, which means more upfront work for professors or TAs • Grading is much easier because many things are taken care of by Web-CAT; course staff can focus on assessing design
Areas to look out for in writing “testable” assignments • How do you write tests for the following: • Main programs • Code that reads from or writes to stdin/stdout or files • Code with graphical output • Code with a graphical user interface
Testing main programs • The key: think in object-oriented terms • There should be a principal class that does all the work, and a really short main program • The problem is then simply how to test the principal class (i.e., test all of its methods) • Make sure you specify your assignments so that such principal classes provide enough accessors to inspect or extract what you need to test
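As a rough sketch of the "principal class plus a really short main" structure, consider the example below. The `WordCounter` and `WordCountMain` names and the word-counting task are hypothetical, chosen only to show the shape: all behavior lives in the principal class, which exposes an accessor that a test can inspect.

```java
// Hypothetical principal class: it does all the work and exposes an accessor.
class WordCounter {
    private int words = 0;

    // Counts the whitespace-separated words on one line of text.
    void addLine(String line) {
        String trimmed = line.trim();
        if (!trimmed.isEmpty()) {
            words += trimmed.split("\\s+").length;
        }
    }

    // Accessor so tests can check the result without parsing printed output.
    int getWordCount() {
        return words;
    }
}

// The main program stays short enough that it needs little or no testing.
class WordCountMain {
    public static void main(String[] args) {
        WordCounter counter = new WordCounter();
        for (String arg : args) {
            counter.addLine(arg);
        }
        System.out.println(counter.getWordCount());
    }
}
```

A student test then creates a `WordCounter` directly, calls `addLine` with chosen inputs, and checks `getWordCount()`, without ever launching the main program.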
Testing input and output behavior • The key: specify assignments so that input and output use streams given as parameters, and are not hard-coded to specific sources or destinations • Then use string-based streams to write test cases; show students how • In Java, we use BufferedReaders and PrintWriters for all I/O • In C++, we use istreams and ostreams for all I/O
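The stream-parameter approach might look like the following in Java. This is a sketch under assumptions: the `LineSummer` class and its summing task are hypothetical, while `BufferedReader`, `PrintWriter`, `StringReader`, and `StringWriter` are the standard `java.io` classes. The method under test never touches `System.in` or `System.out`, so a test can feed it a string and capture what it writes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical assignment code: I/O streams arrive as parameters.
class LineSummer {
    // Reads one integer per line from 'in' and writes their sum to 'out'.
    static void sumLines(BufferedReader in, PrintWriter out) throws IOException {
        int total = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                total += Integer.parseInt(trimmed);
            }
        }
        out.println(total);
        out.flush();
    }
}

class LineSummerTest {
    // Runs sumLines on string-based streams and returns everything it wrote.
    static String runOn(String input) {
        StringWriter captured = new StringWriter();
        try {
            LineSummer.sumLines(new BufferedReader(new StringReader(input)),
                                new PrintWriter(captured));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return captured.toString();
    }

    public static void main(String[] args) {
        String expected = "6" + System.lineSeparator();
        if (!runOn("1\n2\n3\n").equals(expected)) {
            throw new AssertionError("unexpected output");
        }
        System.out.println("stream test passed");
    }
}
```

In the real program, the short main would pass in readers and writers wrapped around `System.in` and `System.out` (or files), so the same method serves both the test and normal use.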
Testing programs with graphical output • The key: if graphics are only for output, you can ignore them in testing • Ensure there are enough methods to extract the key data in test cases • We use this approach for testing Karel the Robot programs, which use graphic animation so students can observe behavior
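A minimal sketch of ignoring output-only graphics while testing through accessors. The `GridRobot` class below is hypothetical and is not the Karel the Robot API; it only assumes a robot whose drawing is a side effect and whose position and heading can be read back by a test.

```java
// Hypothetical robot, loosely in the spirit of Karel; NOT the real Karel API.
class GridRobot {
    private int x = 0;
    private int y = 0;
    // Heading encoded counter-clockwise: 0 = north, 1 = west, 2 = south, 3 = east.
    private int heading = 0;

    void turnLeft() {
        heading = (heading + 1) % 4;
        repaint();
    }

    void move() {
        switch (heading) {
            case 0: y++; break;
            case 1: x--; break;
            case 2: y--; break;
            default: x++; break;
        }
        repaint();
    }

    // Accessors give tests the key data without looking at any graphics.
    int getX() { return x; }
    int getY() { return y; }
    int getHeading() { return heading; }

    // Stand-in for the animation; tests never inspect what it draws.
    private void repaint() { }
}

class GridRobotTest {
    public static void main(String[] args) {
        GridRobot robot = new GridRobot();
        robot.move();      // one step north
        robot.turnLeft();  // now facing west
        robot.move();      // one step west
        if (robot.getX() != -1 || robot.getY() != 1) {
            throw new AssertionError("robot ended in the wrong place");
        }
        System.out.println("robot test passed");
    }
}
```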
Testing programs with graphical UIs • This is a harder problem, and maybe too distracting for many students, depending on their level • The key question: what is the goal in writing the tests? Is it the GUI you want to test, some internal behavior, or both? • Three basic approaches: • Specify a well-defined boundary between the GUI and the core, and only test the core code • Switch in an alternative implementation of the UI classes during testing • Test by simulating GUI events
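The second approach, switching in an alternative implementation of the UI classes, can be sketched as follows. All names here (`ResultView`, `ConverterController`, `RecordingView`) are hypothetical, and the temperature conversion is just a placeholder task. The core code talks to the UI only through an interface, so a test can substitute a view that records what would have been displayed.

```java
import java.util.Locale;

// Hypothetical boundary between the core code and the GUI.
interface ResultView {
    void show(String text);
}

// Core logic: in the real program the GUI would call onConvert from a button handler.
class ConverterController {
    private final ResultView view;

    ConverterController(ResultView view) {
        this.view = view;
    }

    void onConvert(String celsiusText) {
        try {
            double celsius = Double.parseDouble(celsiusText.trim());
            double fahrenheit = celsius * 9 / 5 + 32;
            view.show(String.format(Locale.ROOT, "%.1f F", fahrenheit));
        } catch (NumberFormatException e) {
            view.show("Please enter a number");
        }
    }
}

// Test-only replacement for the real GUI view: it just remembers the last message.
class RecordingView implements ResultView {
    String last;

    @Override
    public void show(String text) {
        last = text;
    }
}

class ConverterTest {
    public static void main(String[] args) {
        RecordingView view = new RecordingView();
        ConverterController controller = new ConverterController(view);

        controller.onConvert("100");
        if (!"212.0 F".equals(view.last)) {
            throw new AssertionError("unexpected result: " + view.last);
        }
        System.out.println("controller test passed");
    }
}
```

The same structure also supports the first approach: because `ConverterController` has no dependency on any GUI toolkit, it can be tested as "the core" on its own.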
Conclusion: including software testing helps promote learning and performance • If you require students to write their own tests … • Our experience indicates students are more likely to complete assignments on time, produce one third fewer bugs, and achieve higher grades on assignments • It is definitely more work for the instructor • But it definitely improves the quality of programming assignment writeups and student submissions
Visit our SourceForge project! • http://web-cat.sourceforge.net/ • Info about using our automated grader, getting trial accounts, etc. • Movies of making submissions, setting up assignments, and more • Custom Eclipse plug-ins for C++-style TDD • Links to our own Eclipse feature site