A new analysis seeks to pinpoint how much can be saved by “machine scoring” essay questions on tests, and concludes that the costs can run as low as 20 percent of the price of human grading of those items, depending on the volume of students being tested and other factors.
The study, which was supported by the William and Flora Hewlett Foundation, was released as many test developers, with the support of policymakers, are trying to move away from fill-in-the-bubble items to exams that evaluate students’ skills in more complex ways, such as by requiring written responses.
For years, simpler multiple-choice items have had an obvious allure: They’re typically easier and cheaper to score than essays.
Published by the Assessment Solutions Group, a Danville, Calif.-based company, the study concludes that machine scoring of long-form essays could cost as little as 20 percent to 50 percent of what human scoring costs, for tests involving large volumes of student responses. The savings tend to be smaller, but still significant, when fewer students are being tested.
To put some of the savings in dollar terms: in a state testing a medium volume of students (between 1.5 million and 3 million), human scoring would cost $1.51 to $2.08 per student, or $3.4 million to $4.7 million in total.
With machine scoring, the cost drops to between 41 cents and 86 cents per student, or $922,000 to $1.9 million.
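Those figures imply the totals were computed near the midpoint of that student range. As a rough sanity check, here is a minimal Python sketch; the 2.25 million midpoint is an assumption, not a figure the study states explicitly:

```python
# Rough check of the study's cost figures for a medium-volume state.
# Assumption (not stated explicitly in the study): totals were computed
# near the midpoint of the 1.5 million to 3 million student range.
students = 2_250_000

human_low, human_high = 1.51, 2.08      # human scoring, dollars per student
machine_low, machine_high = 0.41, 0.86  # machine scoring, dollars per student

print(f"Human:   ${human_low * students / 1e6:.1f}M to ${human_high * students / 1e6:.1f}M")
print(f"Machine: ${machine_low * students / 1e6:.2f}M to ${machine_high * students / 1e6:.2f}M")
# Human:   $3.4M to $4.7M   (matches the study's $3.4 million to $4.7 million)
# Machine: $0.92M to $1.94M (consistent with the study's $922,000 to $1.9 million)
```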
Machine scoring of essays is controversial among those who see the computer-based reviews as crude approximations of the kind of close evaluation of student writing that human scorers can provide. Critics also note that the quality of the scoring can vary by the type or length of essays.
Barry Topol, a study author and the managing partner at the Assessment Solutions Group, told Education Week that part of the goal of the analysis is to give states, districts, and other entities seeking to develop or administer tests better information, so that those interested in machine scoring can write more informed and specific requests for proposals for those services.
In general, the larger the number of students tested, the more the cost of machine scoring falls, because fixed costs (such as training the system to score essays) are spread across more exams, Topol explained.
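In simplified form, that economy of scale is a one-time setup cost amortized over volume, plus a small marginal cost per essay. The sketch below illustrates the shape of that curve; the dollar figures are hypothetical placeholders, not numbers from the study:

```python
# Illustrative model of why per-student machine-scoring costs fall with
# volume: a one-time fixed cost (e.g., training the scoring engine) is
# spread across all exams, plus a marginal cost per essay scored.
# All dollar figures here are hypothetical, not from the study.

FIXED_COST = 500_000   # one-time setup/training cost (hypothetical)
MARGINAL_COST = 0.25   # cost to score each additional essay (hypothetical)

def cost_per_student(n_students: int) -> float:
    """Per-student cost = amortized fixed cost + marginal cost."""
    return FIXED_COST / n_students + MARGINAL_COST

for n in (100_000, 500_000, 1_500_000, 3_000_000):
    print(f"{n:>9,} students: ${cost_per_student(n):.2f} per student")
#   100,000 students: $5.25 per student
#   500,000 students: $1.25 per student
# 1,500,000 students: $0.58 per student
# 3,000,000 students: $0.42 per student
```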
The analysis of pricing was based partly on a request for information sent to eight organizations that the author said dominate the machine-scoring market (not all of which responded): the American Institutes for Research, CTB/McGraw Hill, Educational Testing Service, Measurement Incorporated, MetaMetrics, Pacific Metrics, Pearson Education, and Vantage Learning.