Test Scores, Budget Cuts and Equity
If we don't know how students are doing, does the problem disappear?
Many of the national tests that measure student achievement have fallen victim to federal budget cuts. Will their absence make any difference?
Faithful readers of this Substack may recall that I’ve criticized the National Assessment of Educational Progress (NAEP)—known informally as the Nation’s Report Card—which regularly tests a representative sample of U.S. students in reading, math, and other subjects. (See, for example, here and here.) So they might think I would be cheering the recently announced decision to cancel administrations of a number of those tests. They would be wrong, for reasons I’ll explain in a minute.
But first: what exactly has been canceled? Not the biennial reading and math tests that make headlines and are mandated by Congress—or at least, not officially. As reported by Jill Barshay in the Hechinger Report, those tests are still on the schedule, but DOGE-mandated cuts to the U.S. Department of Education have cast doubt on the department’s ability to administer them, as planned, next year.
Officially, the only tests that have been scrapped are some lesser-known ones in science, U.S. history, and writing, along with some state- and district-level results. In addition, a test designed to measure long-term trends, going back to the 1970s, has been consigned to the dustbin.
Members of the board that sets policy for the NAEP told Barshay they’re trying to maintain standards for a smaller number of tests, given budget constraints. Those constraints aren’t new, having previously resulted in the postponement of the writing test to 2032—a timeline that would have left a 21-year gap since its previous administration in 2011. But now, in the era of vanishing government resources, we may never see another national writing test.
The Reading Test Is the Problem
Here’s why I’m not cheering these cuts: The test I have found fault with is the reading test, which is still (theoretically) going to be administered every two years. That test is a problem because, like all reading comprehension tests, it perpetuates the mistaken idea that you can test reading comprehension in the abstract—which leads to the equally mistaken assumption that you can teach reading comprehension in the abstract, divorced from any particular text or content.
That approach to comprehension instruction may not be the only reason NAEP reading scores have been stagnant or declining for over two decades now—or the only reason the gap between high and low scorers has widened. But because it leads schools away from a focus on building the knowledge and vocabulary that truly undergird reading comprehension, it’s a huge contributing factor.
Does it really matter that some lesser-known tests have been canceled, perhaps never to be revived? (They haven’t all been canceled. Tests in eighth-grade Civics and U.S. History are still on tap for 2025 and 2030, along with twelfth-grade Civics in 2030—at least for now.) After all, no one seems to pay much attention to them; witness the fact that Barshay, as far as I can tell, is the only journalist to report on their cancellation. And states still give their own tests, although few test history or social studies.
Actually, it does matter. State test scores can be and have been manipulated to create the illusion of progress. And we should be paying attention to the fact that NAEP scores in U.S. History, Civics, and writing are even worse than those in reading, where about a third of students test at the proficient level or above. The most recent tests in U.S. History showed that only 13 percent of eighth graders met that bar, with 40 percent testing below the “basic” level. On the 2011 writing tests, only about a quarter of eighth and twelfth graders were proficient.
Connecting the Dots
Standardized tests can only tell us we have a problem; they can’t tell us what to do about it. But if we carefully connect the dots between low scores on these other tests and low scores on the reading tests, we can—maybe—begin to scope out a solution.
Some have concluded that students score low on content-area tests because they need more practice in reading comprehension skills like finding the main idea of a text.1 But the more likely connection goes in the opposite direction. That is, reading scores are low because no one has provided students with the background knowledge and vocabulary they need to understand the passages on the tests. Evidence suggests that a lot of that knowledge and vocabulary comes from history and social studies, areas that are sadly neglected, especially at the elementary level.
Similarly, writing test scores are connected to reading scores. Good readers aren’t always good writers, but it’s rare to find a good writer who isn’t also a good reader. One reason is that if students have learned how to use complex syntax in their own writing—structures such as clauses introduced by subordinating conjunctions—they’re better able to understand that kind of syntax when they encounter it in their reading. Another is that learning to write in more complex ways develops the habits of analytical thinking that you need to understand complex text.
Having data on these lesser-known tests doesn’t guarantee, of course, that people will make the appropriate connection between low results on them and low reading scores. So far, few have. But if the scores exist, it’s at least possible to use them to make the argument. Without them, we’re left with anecdotes about students’ lack of knowledge about the world. Although such anecdotes are plentiful, they’re not as scientific as test scores from a representative student sample.
Test Score Gaps
Another reason to keep giving these tests is that they present us with stark evidence of inequity in our education system—inequity that has only been growing. As I’ve argued before, gaps in test scores aren’t fundamentally a matter of race or ethnicity, although that’s the way they’re typically reported. At heart, they stem from different levels of parental education: more highly educated parents are more likely to immerse their children in the kind of academic knowledge and vocabulary that predicts success.
A few months ago, I presented suggestions for improving the NAEP reading test. Given the reduction in resources since then, I’m not expecting any of those to be implemented. But I’ll scale my suggestions back to the following two (not that I expect these to be implemented either):
Ground the reading passages in commonly taught topics in social studies and science—instead of trying to avoid those topics in a misguided effort to level the playing field. Using random topics for test passages only privileges the kids who are more likely to have picked up knowledge of those topics outside school—in other words, those who are already privileged. Every state has social studies and science content standards. They do vary, but there’s a lot of overlap. For example, most states require U.S. history in eighth grade. Why not have some passages on the eighth-grade NAEP reading test that are grounded in U.S. history?
When reporting scores, put more weight on factors like income and parental education. Of course, race is still hugely important in our society, and racism exists. But it’s not the root cause of low reading scores. If people are led to believe it is the root cause, they’ll naturally focus on combating racism rather than on providing students with the knowledge they need to succeed—which cognitive science suggests is the only thing that will work.
Education Equity and Cognitive Science
Speaking of which … for anyone interested in cognitive science and equity, I highly recommend this recent post on the Science of Learning Substack, titled “Teaching for more equitable outcomes: The missing ingredient.”
The post, by researchers Nidhi Sachdeva and Jim Hewitt, analyzes the evidence for the three most commonly embraced prescriptions for education equity: Culturally Relevant Pedagogy, Universal Design for Learning, and Social-Emotional Learning.
I won’t go into detailed descriptions of these approaches; I’ll just say that—as Sachdeva and Hewitt point out—they’re all premised on the theory that if instruction is designed to engage struggling learners, the result will be more equity, however you measure that.2 The authors note that while all of these approaches have something to recommend them, none focus directly on improving academic achievement—and the research on whether any of them do improve such achievement is equivocal at best.
“Real equity,” they conclude, “depends on something else: knowledge.” To build and deepen knowledge, they recommend instructional practices backed by cognitive science—like explicit instruction, retrieval practice, and low-stakes assessment with targeted feedback. The post provides a handy catalog of studies showing that these approaches benefit all students, with the greatest benefits going to students from disadvantaged backgrounds and those who struggle the most to learn.
Eliminating data on test score gaps doesn’t mean we’re eliminating the gaps themselves. We should still strive to narrow them, whether we can see them clearly or not—and classroom practices backed by cognitive science are our most reliable guide in that effort.
1. Others have pinned the blame for reading difficulties on inadequate instruction in phonics—which could be part of the explanation, but we just don’t know how much. In any event, the positive effects of phonics-focused reform don’t seem to last beyond fifth grade.
2. Measurement can be a tricky issue, because some proponents of Culturally Relevant Pedagogy reject typical assessments of academic achievement (like the NAEP) as inherently racist, on the grounds that most students of color do worse on them than most white students. This is the kind of stance that race-based reports of test scores can lead to. It’s true that Black students are disproportionately likely to score low. But in terms of absolute numbers, more white students score low on these tests than Black students do. If the tests were racist, you wouldn’t expect to see that result.