Document Type

Honors Project

Abstract

Computers routinely grade multiple-choice questions by matching responses to an answer key. Can they also score essay exams effectively? This report examines an automated technique for grading short-answer responses, using a grading system I constructed. The system assigns a grade to a student answer based on its similarity to a model answer provided by an instructor. Similarity is measured using 1) the semantic similarity between isolated words, and 2) the similarity between the order of those words. The performance of the system was evaluated by scoring answers to actual exam questions and comparing the computer-assigned grades to those given by human instructors. In favorable situations, the correlation between the computer-assigned and human-assigned grades was only slightly lower than the correlation between human instructors. Several characteristics of texts can degrade the system's performance. For example, the system performs poorly on answers that contain negations, unique phrases or idioms, misspelled words, or contradictory information.
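As a rough illustration only, and not the system described in this report, the idea of combining word-level similarity with word-order similarity can be sketched in Python. Everything named here is an assumption made for the example: the functions `word_similarity`, `order_similarity`, and `grade`, the weighting parameter `delta`, and its value of 0.85 are all hypothetical. In particular, `word_similarity` is an exact-match placeholder standing in for a real semantic measure.

```python
from math import sqrt


def word_similarity(a, b):
    # Placeholder (assumption): exact match only. A real grader would
    # substitute a semantic measure so that synonyms score above 0.
    return 1.0 if a == b else 0.0


def semantic_vector(words, vocab):
    # For each vocabulary word, its best similarity to any word in the answer.
    return [max((word_similarity(v, w) for w in words), default=0.0) for v in vocab]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def order_vector(words, vocab):
    # 1-based position of the first occurrence of each vocabulary word, 0 if absent.
    return [words.index(v) + 1 if v in words else 0 for v in vocab]


def order_similarity(r1, r2):
    # 1 minus the normalized distance between the two position vectors.
    diff = sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 0.0


def grade(student, model, delta=0.85):
    # delta (hypothetical) weights word similarity against word-order similarity.
    s, m = student.lower().split(), model.lower().split()
    if not s or not m:
        return 0.0
    vocab = sorted(set(s) | set(m))
    sem = cosine(semantic_vector(s, vocab), semantic_vector(m, vocab))
    order = order_similarity(order_vector(s, vocab), order_vector(m, vocab))
    return delta * sem + (1 - delta) * order


if __name__ == "__main__":
    model_answer = "man bites dog"
    for answer in ("man bites dog", "dog bites man", "fish swim"):
        print(answer, round(grade(answer, model_answer), 3))
```

In this sketch an answer that uses the same words in a different order scores below an identical answer but well above an unrelated one. Because the placeholder matches words exactly, a misspelled word contributes nothing to the score, which is one way a weakness like those listed above can arise.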

© Copyright is owned by author of this document