Jenilyn L. Agapito Ateneo de Manila University Towards An Analysis of Novice Programmers’ Compilation Behaviours in C++
Aplusix, Scatterplot, Ecolab, BlueJ, etc. Observation Logs Student Interaction Logs Biometrics Logs • Interventions • Intelligent agents • Improved error messages • Analysis • Affect detectors • Behavior detectors • Novice programmer errors
The Novice Programmer • First-year computing students should learn the “computer-science way” of solving problems to produce appropriate and correct programs [McCracken et al, 2001]. • They are required to learn the basic constructs of a programming language. • Programs deal with abstract entities that have no direct analogies in day-to-day life [Kuittinen et al, 2004].
The Struggle with Syntax Errors • Syntax errors are encountered during compile-time. • Some common syntax errors made by beginning students are often repeated even after teachers have already explained and reiterated them [Flowers et al, 2004]. • How do novices cope with compiler errors?
Online Protocols • Online Protocols - the sequence of automatically collected source code snapshots written by a student while working on a solution to a programming problem [Jadud, 2006; Rodrigo et al, 2009].
Motivation • Students find programming a complicated subject and skill to learn. • Students who are likely to stay are those who succeed in their first programming subject [Fenwick et al, 2009]. • Students in jeopardy tend to stay if they pass; otherwise, they shift to another course.
Research Objectives • To develop a tool that captures online protocols • To analyze students’ compilation behaviours through gathered online protocols using Error Quotient (EQ) – a quantification of a student’s compilation behaviours [Jadud, 2006]
Research Question • How do we measure novice programmer comprehension? • Novice Programmer Comprehension - the ability to understand and correct compilation errors encountered in the process of authoring a program.
Related Works Non-Technical Measures of Novice Programmer Comprehension • Analysis of Programmer Doodles [Lister et al, 2004; de Leon et al, 2008] • Doodle - the markings or annotations made by students on their exam or laboratory papers • A progress test was used to investigate students' common programming errors [Gobil et al, 2009]
Related Works [cont.] Technical Measures of Novice Programmer Comprehension • Gauntlet - a program that pre-processes students' source code and restates puzzling compiler error messages in layman's terms [Flowers et al, 2004] • An error logger by James Jackson's group that automatically captured errors encountered by Java programmers in real time [Jackson et al, 2005]
Related Works [cont.] • ClockIt - a data collection and analysis toolset that allowed instructors to measure students’ programming practices through its data logger and visualizer tool [Norris et al, 2008]. • Retina - a tool that collected observable data on students’ programming activities and provided useful information based on aggregated data [Murphy et al, 2009]
Related Works [cont.] • Analysis of Online Protocols [Jadud, 2006; Rodrigo et al, 2009] • focused on novice programmers who were learning to program in Java • used a BlueJ IDE that was instrumented by Poul Henriksen and Matthew Jadud
The Logging Tool • The subjects of this study • are being trained to program in C++; • are working in a Linux OS environment; • are not utilizing any IDEs; and • are using GNU C++ (g++) – the C++ compiler that is part of the GNU Compiler Collection (GCC).
The Logging Tool [cont.] • Online protocols are collected by instrumenting the student’s programming environment. • Programming with IDEs • Patches / plug-ins to save snapshots of students’ source codes and other data • Programming without IDEs • Wrapper programs for automated data collection
The Logging Tool [cont.] • The g++ compiler was renamed to g++-main • The logging tool was then named g++ • g++-main is called from within the g++ logging tool
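The wrapper arrangement described on this slide can be sketched as follows. This is a minimal illustration in Python, not the authors' tool: the function names `compile_and_capture` and `source_files`, the injectable `run` parameter, and the list of source-file extensions are all assumptions made for the example. Only the name `g++-main` comes from the slide.

```python
import subprocess
import sys

REAL_COMPILER = "g++-main"  # the renamed original compiler, as stated on the slide


def compile_and_capture(args, run=subprocess.run):
    """Forward the student's arguments to the real compiler.

    Returns (exit_code, diagnostics). `run` is injectable only so the
    sketch can be exercised without a compiler installed (an assumption).
    """
    result = run([REAL_COMPILER, *args], capture_output=True, text=True)
    # Echo the compiler's output so the student sees what g++ would print.
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    # g++ writes its diagnostics to stderr; that text is what gets logged.
    return result.returncode, result.stderr


def source_files(args):
    """Pick out arguments that look like C++ source files (assumed extensions)."""
    return [a for a in args if a.endswith((".cpp", ".cc", ".cxx", ".C"))]
```

A script installed under the name `g++` would call `compile_and_capture(sys.argv[1:])`, write its log entries, and then exit with the returned code so that build scripts behave as before.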
The Logging Tool [cont.] Logging Tool Design
The Logging Tool [cont.] • Each source file has its own compilation log file • Created the first time the file is compiled • Updated on subsequent compilations • Found in the same directory as the compiled source file
The Logging Tool [cont.] • What information was logged? • date of compilation • time of compilation • compiler outputs • a copy of the source code compiled
The Logging Tool [cont.] Tags for the data logged • <date> ... </date> - date of compilation • <time> ... </time> - time of compilation • <compiler outputs> ... </compiler outputs> - messages emitted by the compiler • <source code> ... </source code> - source code compiled
Work in Progress • Data gathering has been completed. • Next step: design and develop a program that will process the log files. • Parse the contents of the data logs • Save them into a database • Data Analysis – Error Quotient
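One way a log entry using these four tags might be written is sketched below. The tag names are taken from the slide; the date and time formats, the line layout inside each tag, and the function names are assumptions, since the slides do not specify them.

```python
from datetime import datetime


def format_log_entry(when, compiler_output, source_code):
    """Build one log entry using the four tags listed on the slide.

    The YYYY-MM-DD and HH:MM:SS formats are assumptions for illustration.
    """
    return (
        f"<date>{when:%Y-%m-%d}</date>\n"
        f"<time>{when:%H:%M:%S}</time>\n"
        f"<compiler outputs>\n{compiler_output}\n</compiler outputs>\n"
        f"<source code>\n{source_code}\n</source code>\n"
    )


def append_log_entry(log_path, entry):
    """Append to the log; mode "a" creates the file on the first compilation
    and extends it on subsequent ones."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry)
```

Opening the file in append mode matches the behaviour described earlier: the log is created on the first compilation and updated afterwards.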
Work in Progress [cont.] • Deduce other pertinent information • Most encountered syntax errors • Average time interval between compilations
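The processing step could look roughly like the sketch below, which parses tagged log text and derives the two summaries named above. It assumes entries use the four tags in a fixed order, a `YYYY-MM-DD` / `HH:MM:SS` timestamp format, and g++ diagnostics of the form `file:line: error: message` (with an optional column). The function names are hypothetical and the database step is left out.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed entry layout: the four tags from the logging tool, in this order.
ENTRY_RE = re.compile(
    r"<date>(?P<date>.*?)</date>\s*"
    r"<time>(?P<time>.*?)</time>\s*"
    r"<compiler outputs>(?P<output>.*?)</compiler outputs>\s*"
    r"<source code>(?P<source>.*?)</source code>",
    re.DOTALL,
)
# g++ diagnostics look like "file:line: error: msg" or "file:line:col: error: msg".
ERROR_RE = re.compile(r"^[^:\n]+:\d+:(?:\d+:)? error: (.*)$", re.MULTILINE)


def parse_log(text):
    """Turn raw log text into a list of compilation events."""
    events = []
    for m in ENTRY_RE.finditer(text):
        stamp = f"{m['date'].strip()} {m['time'].strip()}"
        events.append({
            "when": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
            "errors": ERROR_RE.findall(m["output"]),
            "source": m["source"].strip(),
        })
    return events


def average_interval_seconds(events):
    """Mean gap between consecutive compilations, or None if fewer than two."""
    gaps = [(b["when"] - a["when"]).total_seconds()
            for a, b in zip(events, events[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def most_common_errors(events, n=3):
    """The n most frequent error messages across all events."""
    return Counter(msg for e in events for msg in e["errors"]).most_common(n)
```

Counting raw messages treats errors that differ only in an identifier name as distinct; grouping them by error type would need an extra normalization step.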
The Error Quotient • ERROR QUOTIENT • quantification of student’s compilation behaviours by considering the type, location, as well as the frequency of encountered syntax errors. • algorithm formulated by Matthew Jadud • generates a single number (0 - 1) describing a student’s programming session
The Error Quotient [cont.] • Students who encountered many syntax errors and failed to fix them from one compilation to the next are characterized by High EQ. • Students who had few syntax errors, or who corrected their syntax errors in between compilations are characterized by Low EQ.
The Error Quotient [cont.] Error Quotient Algorithm • Given a session's compilation events e1 through en: • Collate – Create consecutive pairs from the events in the session, for example (e1, e2), (e2, e3), (e3, e4). • Calculate – Score each pair.
The Error Quotient [cont.] The Error Quotient Calculation Process [Jadud]
The Error Quotient [cont.] • Normalize – Divide the score assigned to each pair by 9 (the maximum value possible for each pair). • Average – Sum the normalized scores and divide by the number of pairs. This average is taken as the EQ for the session. Session – the period from a student's first compilation of a specific program to the last compilation of that same file.
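The collate, calculate, normalize and average steps can be sketched as below. The per-pair weights (2 when both events end in an error, 3 for the same error type, 3 for the same error location, 1 for the same edit location) are an assumption: they are not given in these slides and were chosen to match the stated per-pair maximum of 9, so they should be checked against Jadud's own description. The `Event` fields are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """One compilation event (hypothetical fields for this sketch)."""
    error_type: Optional[str]   # None means the compilation succeeded
    error_line: Optional[int]   # line the compiler reported
    edit_line: Optional[int]    # line the student last edited


def score_pair(first, second):
    """Score one consecutive pair; the weights are assumed and sum to 9."""
    if first.error_type is None or second.error_type is None:
        return 0                      # at least one clean compile: no penalty
    score = 2                         # both events ended in an error
    if first.error_type == second.error_type:
        score += 3                    # same kind of error repeated
    if first.error_line == second.error_line:
        score += 3                    # error reported at the same location
    if first.edit_line == second.edit_line:
        score += 1                    # student edited the same location
    return score


def error_quotient(events):
    """Collate consecutive pairs, score, normalize by 9, then average."""
    pairs = list(zip(events, events[1:]))
    if not pairs:
        return 0.0                    # a single compilation forms no pair
    return sum(score_pair(a, b) / 9 for a, b in pairs) / len(pairs)
```

Under these assumed weights, a student who repeats the same error at the same place on every compilation gets an EQ of 1, while a session containing no two consecutive failed compilations gets 0.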
The Error Quotient [cont.] • However, the g++ compiler can give out multiple error messages per compilation. • Several computations of EQ will be performed: looking at only the first error, the first two errors, and the first three errors.
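Selecting the first one, two or three errors from a single compilation could be done along these lines. This sketch assumes g++ diagnostics of the form `file:line: error: message`, optionally with a column number; `first_errors` is a hypothetical helper, and the slides do not say how the selected errors feed into the pair scoring.

```python
import re

# Matches "file:line: error: msg" and "file:line:col: error: msg".
ERROR_LINE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?:\d+:)? error: (?P<message>.*)$",
    re.MULTILINE,
)


def first_errors(compiler_output, window):
    """Return up to `window` (line, message) pairs, in the order g++ printed them."""
    found = [(int(m["line"]), m["message"])
             for m in ERROR_LINE.finditer(compiler_output)]
    return found[:window]
```

Context lines such as `In function 'int main()':` do not match the pattern, so they are skipped and do not count toward the window.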
The Error Quotient [cont.] • One-Way ANOVA will be performed to check for significant differences among the three windows. • If there is a significant difference, each window will be correlated with students’ midterm exam scores. • Otherwise, only the first window (first error only) will be correlated with students’ midterm exam scores.
Acknowledgments • Ateneo de Naga University, particularly the Department of Computer Science • Dr. Allan Sioson, Mitchell Zachary Imperial, the technical staff of DCS and the ICST102 students of first semester, School Year 2010-2011 • Dr. Ma. Mercedes Rodrigo, my thesis adviser • Engineering Research and Development for Technology (ERDT) for my scholarship • This research was made possible in part through a grant from the Department of Science and Technology entitled ‘Development and Deployment of an Intelligent Affective Detector for the BlueJ Interactive Development Environment’
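For reference, the F statistic behind a one-way ANOVA can be computed as in the sketch below, where each group would hold the EQ values from one window. The function name is hypothetical, and the sketch stops at F: judging significance still requires comparing F against the F distribution with (k - 1, n - k) degrees of freedom, which a statistics package would normally do.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    k = len(groups)                          # number of groups (windows)
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

The function divides by the within-group variance, so it raises `ZeroDivisionError` when every group is constant; real data would need that case handled.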