Automated Program Grading
Outline • Systems for Automated Assessment of Programming Assignments • WeBWorK • JUnit-based program fragment grader • Conclusions and Future Work
Systems for Automated Assessment of Programming Assignments • Web-based systems • Programming as the first skill a computer science undergraduate is expected to master • To reinforce and improve students’ understanding of programming • Types of problems • True / false, matching, multiple-choice, program writing • Grading • Correctness + authenticity + quality
Existing Systems • Boss www.dcs.warwick.ac.uk/boss • CodeLab www.turingscraft.com • CourseMarker www.cs.nott.ac.uk/CourseMarker • Gradiance www.gradiance.com • MyCodeMate www.mycodemate.com • OWL owl.course.com • Viope www.viope.com
WeBWorK • webwork.rochester.edu • Web-based, automated problem generation, delivery and grading system • Free, open-source project funded by NSF • Initial development and applications in the fields of mathematics and physics • Currently in use at more than 50 colleges and universities
WeBWorK • Problems are written in the Problem Generating macro language (PG) • Text, HTML, LaTeX, Perl • Underlying engine dedicated to dealing with mathematical formulae: x+1 = (x^2-1)/(x-1) = x+sin(x)^2+cos(x)^2 • Individualized and parameterized versions of problems
WeBWorK for Programming Fundamentals • Programming fundamentals [CC2001] • Fundamental programming constructs, algorithms and problem solving, elementary data structures, recursion, event-driven programming • Extension of WeBWorK for use in the core courses of the Computer Science Curriculum • Interface WeBWorK with other tools to facilitate grading of new problem types • Demo site: • webwork.cornellcollege.edu/webwork2/csc213Apr07 • atlantis.seidenberg.pace.edu/webwork2/demo • Work funded by NSF grant
Types of WeBWorK Programming Problems • True / false, matching and multiple choice problems for Java, Python and SML • Sample problems designed from textbook (with permission) • Java Software Solutions: Foundations of Program Design (4th Edition), John Lewis and William Loftus, 2004 • Evaluation of Java programs / fragments by interfacing WeBWorK with JUnit [www.junit.org]
Evaluation of Java Fragments • Want to provide a system that can automatically grade program fragments in real time • Individual lines of a program • Single or multiple methods • Full .java file
Goals of System • Real time, intelligent grading • More gentle than ACM contest standards • Relative ease of authoring new problems • Using standard tools and techniques
Components of a Problem • PG file to specify a problem • All problems in WeBWorK specified in PG • Code to typeset the question and compute an answer • Answer evaluator determines if answer matches • We provide new evaluator that calls JUnit • Template file • When correct answer inserted, forms valid .java file • JUnit test file • Provides a series of JUnit tests to assess the response.
PG Problem
DOCUMENT(); # This should be the first executable line in the problem.
loadMacros("PG.pl",
           "PGbasicmacros.pl",
           "PGchoicemacros.pl",
           "PGanswermacros.pl",
           "PGauxiliaryFunctions.pl",
           "javaAnswerEvaluators.pl");
TEXT("Boolean Operator");
BEGIN_TEXT
$PAR
Write a static method named 'flip' of return type 'boolean' which will take a single boolean parameter and simply return its opposite.
$BR
\{ANS_BOX(1,5,60);\}
END_TEXT
ANS(java_cmp("JavaSampleSet/BoolOp/","BoolOp"));
ENDDOCUMENT(); # This should be the last executable line in the problem.
Template File
public class BoolOp {
    replaceme
}
• Note: the student response will replace replaceme.
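The substitution step described on this slide can be sketched as follows. This is a minimal illustration, not the code WeBWorK actually runs: the class name TemplateFiller and the method fill are hypothetical, and the sample answer is one possible correct response to the 'flip' problem.

```java
// Sketch only: TemplateFiller and fill() are hypothetical names, not part of WeBWorK.
class TemplateFiller {
    // Placeholder token used by the template on the slide.
    static final String PLACEHOLDER = "replaceme";

    // Returns the template with the placeholder replaced by the student's response.
    static String fill(String template, String studentResponse) {
        if (!template.contains(PLACEHOLDER)) {
            throw new IllegalArgumentException("template has no placeholder");
        }
        // String.replace performs a literal (non-regex) replacement.
        return template.replace(PLACEHOLDER, studentResponse);
    }

    public static void main(String[] args) {
        String template = "public class BoolOp {\n    replaceme\n}\n";
        // One possible correct student answer to the 'flip' problem.
        String answer = "public static boolean flip(boolean b) { return !b; }";
        System.out.println(fill(template, answer));
    }
}
```

The result is a complete BoolOp.java source that can be compiled alongside the JUnit test file.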
JUnit Test File
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Random;
import junit.framework.*;

public class BoolOpJUnitTest extends TestCase{
    boolean exists,returntype,paramtype,isStatic;
    Method flip;

    public static Test suite(){
        return new TestSuite(BoolOpJUnitTest.class);
    }

    public static void main (String [] args){
        BoolOpJUnitTest bojunit = new BoolOpJUnitTest();
    }
Introspection to Avoid Unnecessary Errors • In the setUp method, set instance variables to record the result of introspection on the method signature • exists = there is a method named flip • isStatic = it is static • isPublic = it is public • isPrivate = it is private • returntype = has correct return type (boolean) • paramtype = has correct parameters (boolean)
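The slides do not show the introspection code itself, so the following is only a sketch of how such flags could be computed with java.lang.reflect. The class SignatureCheck, the method inspect, and the nested SampleBoolOp (a stand-in for the compiled student class) are all hypothetical names; in the real test file this logic would presumably live in the setUp method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch only: SignatureCheck and SampleBoolOp are hypothetical names.
class SignatureCheck {
    // Stand-in for the compiled student class (BoolOp on the slides).
    static class SampleBoolOp {
        public static boolean flip(boolean b) { return !b; }
    }

    boolean exists, isStatic, isPublic, isPrivate, returntype, paramtype;
    Method flip;

    // Finds a declared method named "flip" and records which parts of its signature are right.
    void inspect(Class<?> studentClass) {
        for (Method m : studentClass.getDeclaredMethods()) {
            if (!m.getName().equals("flip")) {
                continue;
            }
            exists = true;
            flip = m;
            int mods = m.getModifiers();
            isStatic = Modifier.isStatic(mods);
            isPublic = Modifier.isPublic(mods);
            isPrivate = Modifier.isPrivate(mods);
            returntype = m.getReturnType() == boolean.class;
            Class<?>[] params = m.getParameterTypes();
            paramtype = params.length == 1 && params[0] == boolean.class;
            break; // the first method named flip is enough for this sketch
        }
    }

    public static void main(String[] args) {
        SignatureCheck check = new SignatureCheck();
        check.inspect(SampleBoolOp.class);
        System.out.println("exists=" + check.exists + " isStatic=" + check.isStatic
                + " returntype=" + check.returntype + " paramtype=" + check.paramtype);
    }
}
```

Recording each property as a separate flag lets every signature test fail with its own message instead of one opaque reflection exception.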
JUnit Tests for Method Signature
    public void testExists(){
        Assert.assertTrue("Creating the method",exists);
    }

    public void testStatic(){
        Assert.assertTrue("Making the method static",isStatic);
    }

    public void testReturnType(){
        Assert.assertTrue("Making the method return type 'boolean'",returntype);
    }

    public void testParamType(){
        Assert.assertTrue("Making the method take one 'boolean' parameter",paramtype);
    }
JUnit Tests for Correct Functionality
    public void testWorks(){
        boolean works=false;
        if(exists&&returntype&&paramtype&&isStatic){
            try {
                // flip(false) should return true
                Boolean testBool = new Boolean(false);
                Object[] args = {testBool};
                Boolean result = (Boolean)flip.invoke(BoolOp.class,args);
                works=(result.booleanValue());
                // flipping that result again should return false
                Object[] args2 = {result};
                Boolean result2 = (Boolean)flip.invoke(BoolOp.class,args2);
                works=(works&&!result2.booleanValue());
            } catch (Exception e){
                Assert.assertTrue("Exception: <BR>"+e.getCause(),false);
            }
        }
        Assert.assertTrue("Making the method return the opposite of its parameter",works);
    }
General Execution Flow • Question is displayed to the user by WeBWorK • User enters answer and submits • Temporary directory is created, containing: • Template file with user response inserted • JUnit test file • Both .java files are compiled (syntax errors reported) • JUnit tests are run • User score is the percentage of tests passed
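The last step of the flow, turning test outcomes into a percentage score, can be illustrated without JUnit on the classpath. The sketch below is a tiny stand-in for a JUnit run, not the actual WeBWorK evaluator: ScoreSketch, scorePercent, and SampleTests are hypothetical names, and the runner simply invokes every no-argument method whose name starts with "test", mirroring the JUnit 3 naming convention used on the earlier slides.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Sketch only: a minimal stand-in for a JUnit run, used to show how a score can be derived.
class ScoreSketch {
    // Hypothetical test class: three passing tests and one failing test.
    static class SampleTests {
        public void testExists() { }
        public void testStatic() { }
        public void testReturnType() { }
        public void testWorks() { throw new AssertionError("wrong result"); }
    }

    // Runs each test method on a fresh instance and returns the percentage that passed.
    static double scorePercent(Class<?> testClass) {
        int total = 0;
        int passed = 0;
        for (Method m : testClass.getDeclaredMethods()) {
            if (!m.getName().startsWith("test") || m.getParameterCount() != 0) {
                continue;
            }
            total++;
            try {
                Object instance = testClass.getDeclaredConstructor().newInstance();
                m.invoke(instance);
                passed++;
            } catch (InvocationTargetException e) {
                // The test method threw: count it as a failure.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot run " + m.getName(), e);
            }
        }
        return total == 0 ? 0.0 : 100.0 * passed / total;
    }

    public static void main(String[] args) {
        System.out.println(scorePercent(SampleTests.class)); // 3 of 4 tests pass: 75.0
    }
}
```

Because each signature property has its own test, a student whose method exists and is static but returns the wrong value still earns partial credit.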
User Sandbox • User code is run in a very tight sandbox: • Permissions set in .policy file • File permissions on a per-directory level • Programs run in a separate thread and are killed when they run too long (a.k.a. CPU_LIMIT) • Java is executed with low, hard stack and heap limits
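The slides do not say how the time limit is enforced, so the following is only one plausible sketch using java.util.concurrent. The names TimeLimitSketch and finishedInTime are hypothetical, and the limits are arbitrary values chosen for the demonstration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch only: one possible way to run a task on a separate thread with a time limit.
class TimeLimitSketch {
    // Returns true if the task finished within limitMillis, false if it was cut off.
    static boolean finishedInTime(Runnable task, long limitMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // a runaway task must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(limitMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return false;
        } catch (ExecutionException e) {
            return true; // finished in time, although it threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(finishedInTime(() -> { }, 1000));
        System.out.println(finishedInTime(() -> {
            try {
                Thread.sleep(5000);
            } catch (InterruptedException e) {
                // interrupted by cancel(true): stop sleeping and return
            }
        }, 100));
    }
}
```

Interruption is cooperative, so a tight loop that never checks for it would keep running on its daemon thread; a stricter sandbox would more likely run student code in a separate JVM process that can be killed outright.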
Early Results • Pilot at Pace University • CS1/CS2 • Higher level course actually designing new problems to help teach JUnit • Pilot at Cornell College • Used very briefly in CS1.5 • Will use more in CS2.5
(Positive) Feedback on JUnit Extension • Students liked being able to test interactively • Students missed IDE features • Syntax coloring: found silly syntax errors distracting • Some used IDEs to preview answer • Preferred WeBWorK/JUnit to CodeLab • Became more helpful as you used it longer
(Negative) Feedback on JUnit Extension • HCI issues • Found question language rough/confusing • Want even more detail/feedback/guidance on errors • Tendency to fight system • One student spent 60+ minutes submitting Flip
Future Directions (1) • Need to further massage feedback • Need to develop a full set of problems • Problems often text-specific • Check style as well as correctness • Quality control/service for AP
Future Directions (2) • Unit testing is not just for Java • Same architecture works for most languages • Edit the “system” call in Java.pm • Provide appropriate sandbox • Write XML output for xUnit • Interface with other CMSes
Summary • Implemented system to test Java program fragments in real time via the web • Part of larger project to provide auto-grading support for CS1/2 • Rest of project ready for prime time (Java, Python) • Already in use at Pace, Cornell (a bit)
Acknowledgement • NSF CCLI AI Grant #0511385 • Collaborative Research: Adapting and Extending WeBWorK for Use in the Computer Science Curriculum
Demo Site • http://webwork.cornellcollege.edu/webwork2 • Login as student0/student0 … student9/student9 • Choose csc213Apr07 class