By Richard Phelps, Ph.D.
Education Consumers Consultants Network
Benefit-cost analysis is embedded in all studies
that ask the essential question of an activity,
"Is it worth doing?" Benefit-cost analysis is
a set of techniques, philosophy, and logic that
can impose an order and rigor on the process used
to answer the essential question.
The logic of benefit-cost analysis is that of
the accountant's spreadsheet. Indeed, one could
accurately describe it as economists' accounting
method. The essential idea is to capture all relevant
costs and benefits, broadly considered, on one
sheet of paper and weigh them in the balance.
If the enterprise or project shows more benefit
than cost (i.e., net benefits are positive), it
can be said to be economically worthwhile. It
is assumed that the researcher will do an honest
and responsible job of trying to capture all the
relevant benefits and costs. If they can't be
estimated with any precision, the researcher should
at least enumerate them and leave it to the reader
to estimate their value.
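The ledger logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every item name and dollar amount below is hypothetical, and items that cannot be estimated are recorded with a value of None so that they are enumerated rather than silently dropped.

```python
# Hypothetical ledger entries: (description, dollar value or None if not estimable).
# All names and amounts are illustrative assumptions, not figures from this article.
benefits = [
    ("better-informed placement decisions", 40.0),
    ("public confidence in schools", None),  # enumerated, but not estimated
]
costs = [
    ("test booklets and scoring", 12.0),
    ("staff time to administer", 8.0),
]


def total(items):
    """Sum only the entries that carry a dollar estimate."""
    return sum(value for _, value in items if value is not None)


def unestimated(items):
    """Return the descriptions of entries listed without a dollar estimate."""
    return [name for name, value in items if value is None]


net_benefit = total(benefits) - total(costs)
print(f"Net benefit (estimated items only): {net_benefit:.2f}")
print("Left to the reader to value:", unestimated(benefits) + unestimated(costs))
```

A positive net benefit on the estimated items marks the project as economically worthwhile on those items alone; the unestimated list keeps the remaining benefits and costs visible to the reader.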
What one person considers a benefit, however,
another person may not. Indeed, what one person
considers a benefit, another person may regard
as a cost. The details of benefit-cost analyses,
then, are often subject to debate. It is, however,
considered incumbent upon the researcher to properly
identify what perspective she is adopting. Ideally,
a benefit-cost analysis calculates the benefits
and costs as they accrue to all of society - such
is the nature of a social benefit-cost
analysis. Anything less - an analysis that calculates
benefits and costs for a sub-group - is a private
benefit-cost analysis, and the researcher is obligated
to explicitly declare it as such.
Benefit-cost analysis should be most welcome
in education research. Benefit-cost analysis imposes
a structure in which "the whole picture" gets
considered. It provides a framework that can impose
rigor and honesty onto evaluations that could
otherwise be sloppy.
By the same token, most readers are probably
also well aware of how benefit-cost analysis can
be misused. A researcher can make unreasonable
or dishonest estimates, ignore some relevant benefits
or costs, include some irrelevant ones, or double
count. Advocates can tend to exclude or include
benefits or costs according to their preferences.
What costs and benefits are relevant?
Generally, they are the marginal costs or benefits
that are attributable to the activity in question
and not another activity. When someone argues
that the cost of a test is X, the appropriate
cost to cite is the marginal cost of the test,
the cost that can be attributed to the existence
of the test and not to any other activity. Looked
at another way, a marginal cost of a test is a
cost that is caused by the test, one that
doesn't exist without the test. A heuristic one
can use to determine whether an activity or object
is a marginal cost of a test: take the
test away and see if the activity or object disappears.
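The "take the test away" heuristic amounts to comparing two budgets, one with the test and one without, and keeping only the difference. A minimal sketch follows; the line items and amounts are hypothetical and chosen only to illustrate the comparison.

```python
# Hypothetical annual school costs in dollars, with and without the test.
# Every line item and amount here is an illustrative assumption.
costs_with_test = {
    "building maintenance": 500_000,
    "teacher salaries": 2_000_000,
    "test booklets": 6_000,
    "scoring services": 9_000,
}
costs_without_test = {
    "building maintenance": 500_000,
    "teacher salaries": 2_000_000,
}


def marginal_costs(with_activity, without_activity):
    """Return, per item, the cost that disappears when the activity is removed."""
    result = {}
    for item, amount in with_activity.items():
        difference = amount - without_activity.get(item, 0)
        if difference != 0:
            result[item] = difference
    return result


marginal = marginal_costs(costs_with_test, costs_without_test)
print(marginal)                # only the test-specific items remain
print(sum(marginal.values()))  # marginal cost of the test
```

Costs that are the same in both budgets, such as building maintenance in this sketch, drop out of the result, which is the sense in which they are not costs of the test.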
It turns out that the costs of standardized
testing are minuscule by comparison with huge
potential benefits. This fact is little known
among educators, as few mainstream education researchers
are trained to attempt such studies, which are more
common among economists, and the few who have attempted
them have produced bungled or biased estimates.
In the early 1990s, the Center for the Study
of Testing, Evaluation, and Educational Policy
(CSTEEP), at Boston College, calculated a "high"
estimate of $22.7 billion spent on standardized
testing per year. U.S. schools, the CSTEEP report
claimed, suffer from "too much standardized testing"
that amounts to "a complete and utter waste of
resources." Their estimate breaks down to about
$575 per student per year.
A report from the federally-funded Center for
Research on Education, Standards, and Student
Testing (CRESST), which counted cost components in
much the same way as the CSTEEP study, estimated the
costs of a certain state test at between $848 and
$1,792 per student tested ($1,320 would be mid-range).
Testing critics exaggerate their cost estimates
by counting the costs of any activities "related
to" a test as costs of a test. In the CRESST study
of Kentucky's performance-based testing program,
for example, teachers were asked to count the
number of hours they spent "preparing materials
related to the assessment program for classroom
use." In an instructional program like Kentucky's,
with the intention of unifying all instruction
and assessment into a "seamless" web, where the
curriculum and the test mutually determine each
other, all instruction throughout the entire
school year will be "related to" the assessment.
The CSTEEP study counted even more cost items,
such as student time. The CSTEEP researchers assumed
that there is no instructional value whatsoever
to student time preparing for or taking a test
(i.e., students learn absolutely nothing while
preparing for or taking tests). Then they calculated
the present discounted value of that "lost" learning
time against future earnings, assuming all future
earnings to be the direct outcome of school instruction.
The CSTEEP researchers also counted building overhead
(maintenance and capital costs) for the amount
of time spent testing, even though those costs
are constant (i.e., "sunk") and not affected by
the existence of a test. In sum, CSTEEP counts
any and all costs incurred at the same time as tests,
not just those caused by testing, that is, those
that would not exist without testing.
In stark contrast to these incredible estimates
are the actual prices charged for tests such as
the ACT, SAT, and AP exams, ranging from $20 to
$70 a student. The makers of these tests must
cover all their costs, or they would go out of
business.
The bipartisan U.S. General Accounting Office
(GAO) also conducted a survey of state and local
testing directors and administrators to learn
the costs of statewide and districtwide tests.
The GAO estimate of $15 to $33 per student contrasts
markedly with the CSTEEP and CRESST estimates of $575
and $1,320, respectively. And the GAO estimates counted
all relevant costs, including that for teacher time
used in administering tests. The GAO estimate
for the total national cost of systemwide testing
of about $500 million contrasts with a CSTEEP
estimate 45 times higher.
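The size of the gap between these estimates can be checked with simple arithmetic. The sketch below uses only figures quoted above; the variable names are illustrative, and the per-student comparison against the GAO upper bound is one possible choice of baseline, not the only one.

```python
# Figures quoted in the text above (dollars).
csteep_national = 22.7e9       # CSTEEP "high" national estimate per year
gao_national = 500e6           # GAO national estimate (approximate)
csteep_per_student = 575
cresst_per_student_mid = 1_320
gao_per_student_high = 33      # upper end of the GAO range of $15 to $33

# How many times larger the CSTEEP national figure is than the GAO figure.
national_ratio = csteep_national / gao_national
print(f"CSTEEP / GAO national estimate: {national_ratio:.1f}x")

# Per-student ratios against the GAO upper bound (the least favorable to GAO).
csteep_ratio = csteep_per_student / gao_per_student_high
cresst_ratio = cresst_per_student_mid / gao_per_student_high
print(f"CSTEEP / GAO upper bound: {csteep_ratio:.1f}x")
print(f"CRESST / GAO upper bound: {cresst_ratio:.1f}x")
```

Measured against the lower end of the GAO range instead, the per-student ratios would be larger still.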
The GAO estimated all-inclusive, stand-alone
marginal costs of large-scale, systemwide
tests, costs that would apply in a situation
where the tests had to be administered independently
of any school system structure or schedule, say
during the summer months and by hired personnel.
The independent SAT, ACT, and AP exams are administered
in this way.
The GAO study's estimates can be recalculated under
two reasonable assumptions: (1) that the tests,
as is usually the case, would be administered
during the regular school year, using regular
school personnel, and would be integral parts
of the school system's curricular and instructional
plan; and (2) that the tests would be used in
many school districts to replace, rather than
supplement, some preexisting test. With these
adjustments, marginal costs become $2 per student
for multiple-choice tests and $11 per student for
performance-based tests.
Far from being the hugely expensive enterprise
that some testing critics claim it to be, standardized
testing is not very expensive by most standards.
Even under the rather unrealistic assumptions
of the GAO study's upper-bound estimates, systemwide
tests impose a time and cost burden, as one state
testing director put it, "on a par with field
Distilled to the most rudimentary elements,
the main benefits of standardized testing are
four - information, motivation, organizational
clarity, and goodwill. But, that amounts to quite
a thorough distillation. The information benefits
alone can manifest themselves in several different
forms, to several different audiences. Test results
can tell us about the performance of an individual
student. They can provide information about a
teacher, a curriculum, a textbook, a school, a
program, a district, or a state policy. Moreover,
the information provided by test results can inform
one or more among many parties - parents, voters,
employers, higher education institutions, other
schools, state departments of education, and so
on.
Perhaps the simplest, and least disputed, benefit
of standardized tests is in diagnosis. Test results
can pinpoint a student's academic strengths and
weaknesses, areas that need work, and areas where
help is needed. Test scores provide a measurement
tool that can be used to judge the effectiveness
of preexisting or proposed school programs. Test
results can inform teachers, schools, and school
systems about their curricular and instructional
strengths and weaknesses. That may lead to a better
alignment of curriculum with instruction, a benefit
often enumerated by teachers and administrators
in evaluations of testing programs. Teachers have
also reported that they learn more about their
students, their own teaching, and other teachers'
methods from high-stakes external tests.
Information can also be used for accountability
purposes. Higher-level school system administrators
can use information to make judgments about performance
at the school or school district level and to
increase efficiency. In an environment of school
choice (e.g., school districts with open enrollment),
information about school performance can help
parent-student school shoppers to make a better-informed
choice.
Finally, information benefits can consist of
signaling, screening, and credentialing
effects. College admissions counselors and employers
can make a more informed decision about applicants'
academic achievement with test scores than they
can without. Colleges, for example, use measures
of predictive validity (correlation coefficient
of entrance test score with college achievement)
to justify requiring applicants to submit scores
from college admissions tests (ACT or SAT). Allocative
efficiency (efficient sorting
of applicants to organizations) is more difficult
to measure, but is a relevant benefit as well.
Of the four main categories of benefits listed
above, information is arguably the only one common
to educational tests whether or not they have
"stakes," and whether or not they are conducted
"internally" or "externally." The other categories
of benefits - motivation, organizational clarity
and efficiency, and goodwill - are unlikely to
occur when tests "do not count."
Motivation may not be an end in itself, but it
can lead to desirable behaviors, such as a student
paying greater attention in class and studying
more - activities that, in turn, lead to the accumulation
of more knowledge and understanding. Like information
benefits, motivation can affect many different
parties to the educational enterprise and provide
benefits to many different sectors of our society.
Motivational effects are manifest when rewards
or punishments are provided to (or imposed upon)
students, teachers, administrators, schools, districts,
programs, service providers, politicians, or even
parents. The beneficial effects of motivated efforts
accrue to all of the parties above, as well as to
employers, higher education institutions, and society
in general.
Just one example of the organizational clarity
or efficiency benefit of standardized testing
is provided by the testimony of teachers in many
states, provinces, and countries who participate
in test development, administration, and scoring.
Overwhelmingly, they assert that the experience
helps them as instructors. After struggling, along
with other teachers and testing experts, to design
and score assessments fairly, they understand
better how their students might misunderstand
concepts and how they might better explain the
concepts. Moreover, they can much more efficiently
align their own instructional program with state
standards after undergoing a deep immersion into
the state standards.
The final general category of benefit cited
above - goodwill - is certainly the most often
overlooked, and is the most difficult to measure,
but may be the most important. The public pays
for the public schools and hands over responsibility
for its children's welfare to the public school
authorities for substantial periods of time. The
public has a right to objective, impartial information
about the performance of the public schools' main
function - the academic achievement of their children.
Classroom grades are unreliable and often invalid
sources of such information. Standardized tests,
when they are used validly, provide far more reliable
and trustworthy information.
Examples of goodwill, then, include: renewed
public confidence in the school system; public
faith that the schools really are working to uphold
standards; and the peace of mind that teachers
and school administrators might gain in the wake
of the new parental and public trust. Students
have also reported in some surveys feelings of
genuine achievement and accomplishment when they
pass important, meaningful tests.
Even the four categories of benefits mentioned
above, in all their varied manifestations, do
not cover the gamut. Still other benefits probably
exist, but may be more difficult to pin down,
more hypothetical, or more difficult to measure.
The economist John Bishop, for example, argues
that it is illogical and counterproductive to
insist that a teacher be both a "coach" and a
"judge." The teacher is a coach when she helps
a student to succeed; a judge when she grades
a student's test and decides that the student
should not be promoted to the next grade or level
of education. By Bishop's theory, this dual role
puts the teacher in a moral dilemma that is often
resolved through social promotion. Most teachers
would rather be coaches than judges and, so, promote
students to the next level even though they are
not ready. After a few years of social promotion,
of course, students may be so far behind that
they cannot possibly succeed by any objective
standard. They may become disillusioned, give
up trying, and drop out. Bishop argues for external
high-stakes testing as a means to free each teacher
to be a coach the student can trust to help him
meet the challenge of the examination which is
"external" to both of them.
We may have reached the point in the United
States where standardized tests provide the only
pure measure of subject-matter mastery. For some
time now, education schools have encouraged teachers
to grade students using a cornucopia of criteria
that include perceived persistence or effort,
perceived level of handicap due to background,
participation or enthusiasm, and perceived need.
Subject matter mastery is just one factor, and usually
not the most important one, considered in calculating
a student's course grade. In addition to the missionary
directive of the education schools, Bishop's theory
of the irreconcilability of the coach and judge
roles may also explain the degradation of grades.
But, regardless of the reason, if standardized
tests are, indeed, the only trustworthy measure
of academic achievement, can our society afford
to not use them? External standardized tests may
be the only reliable source of information on
education performance not controlled by groups
with an incentive to corrupt or suppress it.
Even for teachers who desire to grade their
students only on the basis of academic achievement,
few have training in testing and measurement.
Those who criticize standardized tests for their
alleged imperfections of structure and content
seldom mention that standardized tests are written,
tested, and retested by large groups of Ph.D.s
with highly technical training in testing and
measurement. By contrast, the typical classroom
teacher has had no training in testing and measurement.
The full effect of all the benefits mentioned
above, numerous as they are, cannot be observed
so long as standardized tests are in use, because
there is then no test-free setting for comparison.
"External" measures, such as systemwide standardized
test scores, serve as a check on other measures
of performance (psychometricians label this phenomenon
generally "restriction of range"). To fully appreciate
the benefits of external standardized testing,
one must imagine a society without standardized
testing. What would happen to grade inflation
if there were no standardized test scores to which
one could compare the grades? How much effort
would students, teachers, and administrators make
to improve achievement if there were no standardized
tests with which to check their progress?
Economic studies that have focused primarily
on the motivational, or incentive, effects of
high-stakes testing programs estimate average
benefits to students over their lifetimes of around
$13,000 per subject area tested. That is,
students in jurisdictions with high-stakes testing
programs tend to learn more, and that increased
amount of knowledge and skill is rewarded throughout
their lives, through higher wages and greater
Psychologists have conducted many studies -
in excess of a thousand, actually - on the predictive
validity and allocative efficiency of tests. Education
professors have attacked the dollar estimates
based on such studies but even they concede benefits
on the order of $5,000 to $8,000 per student lifetime.
Total testing benefits vastly outweigh the costs,
by a benefit-to-cost ratio that probably exceeds
a thousand. The benefits can be so high because
they affect a large number of people and they
produce lasting and cumulative effects. Meanwhile,
the testing costs are low and incurred only once
or a few times.
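The benefit-to-cost comparison can be sketched from the per-student figures quoted earlier in this article. This is a rough illustration, not a reconstruction of the underlying studies: the pairing of each benefit figure with each cost figure and the number of test administrations per student are assumptions made here for the arithmetic.

```python
# Per-student figures quoted in the article (dollars).
benefit_estimates = {
    "economic studies, per subject area": 13_000,
    "critics' concession, low": 5_000,
    "critics' concession, high": 8_000,
}
cost_estimates = {
    "recalculated, multiple-choice": 2,
    "recalculated, performance": 11,
    "GAO, low": 15,
    "GAO, high": 33,
}

# ASSUMPTION (not from the article): how many times the cost is incurred per student.
administrations = 1


def benefit_cost_ratio(benefit, cost_per_administration, times=administrations):
    """Benefit divided by total cost over the assumed number of administrations."""
    return benefit / (cost_per_administration * times)


for b_name, benefit in benefit_estimates.items():
    for c_name, cost in cost_estimates.items():
        ratio = benefit_cost_ratio(benefit, cost)
        print(f"{b_name} / {c_name}: {ratio:,.0f} to 1")
```

With one administration, every pairing that uses the recalculated multiple-choice cost exceeds a thousand to one, while the pairings that use the GAO stand-alone costs come out in the hundreds. Raising `administrations` lowers every ratio proportionally.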
For further reading:
Bishop, John H. "Education Quality and the Economy,"
Paper presented at the Conference on Socio-Economics
of the Society for the Advancement of Socio-Economics,
Arlington, VA., 1995
Boudreau, John W. "Economic Considerations in
Estimating the Utility of Human Resource Productivity
Improvement Programs," Personnel Management,
Hunter, John E. and Schmidt, Frank L. (1982)
"Fitting People to Jobs: The Impact of Personnel
Selection on National Productivity," in Marvin
D. Dunnette and Edwin A. Fleishman, Eds. Human
Performance and Productivity: Volume 1 -- Human
Capability Assessment. Hillsdale, NJ,
Lawrence Erlbaum Associates.
Phelps, Richard P. "Estimating the Cost of Standardized
Student Testing in the United States," Journal
of Education Finance. v.25, n.3, Winter
2000, pp. 343-380.
Phelps, Richard P. "Test Basher Benefit-Cost
Analysis," Network News & Views, Education
Excellence Network (http://www.edexcellence.net/issuespl/subject/standar/testbash.html).
Schmidt, Frank L. and Hunter, John E. (1998)
"The Validity and Utility of Selection Methods
in Personnel Psychology: Practical and Theoretical
Implications of 85 Years of Research Findings,"
Psychological Bulletin, v.124.
Solmon, Lewis C. and Cheryl L. Fagnano, "Speculations
on the Benefits of Large-scale Teacher Assessment
Programs: How 78 Million Dollars Can be Considered
a Mere Pittance," Journal of Education Finance,
Vol.16, Summer, 1990, pp. 21-36.
U.S. General Accounting Office, Student
Testing: Current Extent and Expenditures, With
Cost Estimates for a National Examination.
GAO/PEMD-93-8. Washington, D.C.: author, January,