Study Examines Cost Savings Through ‘Machine Scoring’ of Tests

Senior Editor

A new analysis seeks to pinpoint how much can be saved through "machine scoring" of essay questions on tests, and concludes that the costs can be as low as 20 percent of the price of human grading of those items, depending on the volume of students being tested and other factors.

The study, which was supported by the William and Flora Hewlett Foundation, was released as many test developers, with the support of policymakers, are trying to move away from fill-in-the-bubble items to exams that evaluate students’ skills in more complex ways, such as by requiring written responses.

For years, simpler, multiple-choice items have had an obvious allure: They're typically easier and cheaper to score than essays are.

Published by the Assessment Solutions Group, a Danville, Calif.-based company, the study concludes that the cost of machine scoring of long-form essays could be as low as 20 to 50 percent of the cost of human scoring, for tests involving big volumes of student responses. The savings tend to be lower, but still significant, where smaller groups of students are being tested.

To put some of the savings in dollar terms: in a state testing a medium volume of students, between 1.5 million and 3 million of them, the cost of human scoring would range from $1.51 to $2.08 per student, or $3.4 million to $4.7 million in total.

With machine scoring, the cost drops, ranging from 41 cents to 86 cents per student, or $922,000 to $1.9 million.
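The totals above follow from multiplying the per-student rates by the number of students tested. A minimal sketch of that arithmetic, assuming a volume of 2.25 million students (the midpoint of the article's "medium volume" range, inferred here rather than stated in the study):

```python
# Cost-comparison sketch. The student volume is an assumption (midpoint of
# the article's 1.5M-3M "medium volume" range); the per-student rates are
# the ones reported in the article.
students = 2_250_000

human_low, human_high = 1.51, 2.08      # dollars per student, human scoring
machine_low, machine_high = 0.41, 0.86  # dollars per student, machine scoring

human_total = (human_low * students, human_high * students)
machine_total = (machine_low * students, machine_high * students)

print(f"Human scoring:   ${human_total[0]:,.0f} - ${human_total[1]:,.0f}")
print(f"Machine scoring: ${machine_total[0]:,.0f} - ${machine_total[1]:,.0f}")
```

Run as written, the human-scoring range comes out to roughly $3.4 million to $4.7 million and the machine-scoring range to roughly $922,000 to $1.9 million, consistent with the article's figures.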

Machine scoring of essays is controversial among those who see the computer-based reviews as crude approximations of the kind of close evaluation of student writing that human scorers can provide. Critics also note that the quality of the scoring can vary by the type or length of essays. 

Barry Topol, a study author and the managing partner at the Assessment Solutions Group, told Education Week that part of the goal of the analysis is to give states, districts, and other entities seeking to develop or administer tests better information, so that if they’re interested in machine scoring, they can write more informed and specific requests for proposals soliciting those services.

In general, the larger the number of students tested, the more the cost of machine-scoring falls because fixed costs (such as training the system to score essays) are spread out over more exams, Topol explained.

The analysis of pricing was based partly on a request for information sent to eight organizations that the author said dominate the machine-scoring market (not all of which responded): the American Institutes for Research, CTB/McGraw Hill, Educational Testing Service, Measurement Incorporated, MetaMetrics, Pacific Metrics, Pearson Education, and Vantage Learning.

5 thoughts on "Study Examines Cost Savings Through 'Machine Scoring' of Tests"

  1. It's all about the money, isn't it? No one is saying that the grading is better, just that it's cheaper. Well, you know what they say: you get what you pay for. Cheap tests will guarantee that the writing taught will be that which is easiest to grade: formulaic, unthoughtful. So, for all this talk of higher, more rigorous standards, we will be creating students who can write to a formula that is useless except for computer test grading. Wonderful. Thanks for that. Look out, "college and career ready" higher ed: these will be the students who show up in your classes!

  2. See the National Council of Teachers of English's well-researched and well-documented position statement on machine scoring, "Machine Scoring Fails the Test." The full document is at NCTE.org.

  3. Students need to write more. Teachers do not have time to grade all the papers students should be writing. That leaves a few choices for school districts that want to remedy this: 1) contract out grading, 2) have students blog their writing so that their peers and other interested community members can provide feedback, 3) give students access to intelligent graders, or 4) do nothing. Whether it costs more or less, there needs to be a change so that students can get timely feedback on enough of their writing to improve. This is not happening because all of the solutions, except doing nothing, involve changes to the current teacher-centered system. And for all the talk of putting student learning first, resisting change is still the status quo.

    1. I agree that students need to write more, and philosophically I have no problem with contracting out the grading, as long as there were some sort of method for the teacher to provide initial grading requirements to those contract graders. However, and herein lies the rub, all the new jingoistic focus on "assessment" and "formative assessment" and revising my practice based on that assessment is not conducive to anyone but me doing the grading, if I am truly to inform my instruction by what I see in the writing. Having someone else grade it and then going over what they graded doesn't save me any time, and not looking closely at what was graded doesn't allow me to fully understand what my students understand. So unless we are going to reduce the emphasis in teacher-evaluation models on how I am informing my practice with assessments, nothing can alter the current teacher-grading model.

  4. There is no question that cost is always an issue in education. Yet savings is not a bottom-line issue as it often is in business, nor is it always a value proposition.

    The real issue is far more problematic. What exactly is the message to students when educators say something akin to "Your writing is so unimportant that it is cheaper and easier to have a machine score it"?

    To the best of my knowledge humans have never endeavored to write prose with the intended audience being a machine. What would be the purpose of doing so even? To pass a test of dubious validity anyway?

    Somewhere along the line, we have lost the plot even thinking machine scoring of student writing is a valid or even good idea from the beginning.

    Of course, there is no irony at all in the fact that the organizations asked for this kind of information are all involved in student assessment in some way.

    An even more blackly comic notion is that machine scoring of student writing can be done at 20 to 50 percent of the cost of humans. For how long, exactly? The first time, maybe, but exactly how long will it take before ever "better" technology will be used at an even greater cost and, incidentally, steeper profit?

    All the while, the students are the ones losing, as the demands for data and meaningless scores on even more meaningless tests of writing ability drive teachers to coach students to write for a machine, rather than endeavor to communicate with greater sophistication and clarity in the hope of being understood by another human being, which is kind of the point of writing anything anyway.

    It is all a bit absurd, really.
