An automated system with a versatile test oracle for assessing student programs

Chung M. Tang, Yuen T. Yu, Chung K. Poon*

*Corresponding author for this work

Research output: Journal Publications and Reviews (RGC 21 - Publication in refereed journal), peer-reviewed

2 Citations (Scopus)

Abstract

Automated program assessment systems have been widely adopted in many universities. Many of these systems judge the correctness of student programs by comparing their actual outputs with predefined expected outputs for selected test inputs. A common weakness of such systems is that student programs are marked as incorrect whenever their outputs deviate from the predefined ones, even if the deviations are minor and insignificant, and a human assessor would consider the programs to have satisfied their specifications. This critical weakness causes undue frustration to students and has undesirable pedagogical consequences that undermine these systems’ benefits. To address this issue, we developed an improved mechanism for program output comparison that serves as a versatile test oracle, bringing the results of automated assessment much closer to those of human assessors. We evaluated the new mechanism in real programming classes using an existing automated program assessment system. We found that the new mechanism achieved zero false-positive error (it did not wrongly accept any incorrect output) and a very low (0%–0.02%) false-negative error rate (wrongly rejecting correct outputs), with very high accuracy (99.8%–100%) in correctly recognizing outputs deemed acceptable by instructors. This represents a major improvement over an existing assessment mechanism, which had a 56.4%–64.1% false-negative error rate and an accuracy of only 25.4%–40.9%. Moreover, about 67%–96% of students achieved their best results on their first attempt, which can encourage them and reduce their frustration. Furthermore, students generally welcomed the new assessment mechanism and agreed that it was beneficial to their learning.
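The abstract does not reproduce the mechanism itself, but the general idea of a more tolerant output comparison can be illustrated with a minimal Python sketch (all names and the tolerance value below are hypothetical, not taken from the paper). It normalizes whitespace, compares non-numeric tokens case-insensitively, and compares numeric tokens within a small relative tolerance, so an output that differs from the expected one only in spacing, letter case, or numeric formatting is still accepted:

```python
def tokens_match(expected: str, actual: str, rel_tol: float = 1e-6) -> bool:
    """Match numeric tokens within a relative tolerance (so '3.0'
    matches '3'); match all other tokens case-insensitively."""
    try:
        e, a = float(expected), float(actual)
    except ValueError:
        return expected.casefold() == actual.casefold()
    return abs(e - a) <= rel_tol * max(1.0, abs(e))


def outputs_match(expected: str, actual: str) -> bool:
    """Accept the actual output if it matches the expected output
    token by token under the relaxed rules above, rather than
    requiring a byte-for-byte identical string."""
    exp_tokens = expected.split()  # split() collapses all whitespace
    act_tokens = actual.split()
    return (len(exp_tokens) == len(act_tokens)
            and all(tokens_match(e, a)
                    for e, a in zip(exp_tokens, act_tokens)))


# A strict comparator rejects this pair; the relaxed one accepts it.
print(outputs_match("Sum = 10.0\n", "sum =   10"))  # True
```

A strict byte-for-byte comparator would reject the output in this example solely over spacing and the trailing ".0", which is exactly the kind of false-negative error the paper's oracle aims to avoid.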
Original language: English
Pages (from-to): 176-199
Journal: Computer Applications in Engineering Education
Volume: 31
Issue number: 1
Online published: 11 Oct 2022
DOIs
Publication status: Published - Jan 2023

Funding

The authors would like to thank all the participants involved in the evaluation reported in this paper. The work described in this paper is fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project UGC/FDS11(14)/E02/15).

Research Keywords

  • automated program assessment system
  • computer science education
  • learning computer programming
  • program assessment
  • test oracle
