What Nobody Is Saying About the NAEP Reading Scores

They're not just about phonics.

Natalie Wexler

Feb 02, 2025

The NAEP, or National Assessment of Educational Progress, is sometimes called “the nation’s report card.” It consists, essentially, of tests in reading and math given to a nationally representative sample of fourth- and eighth-graders every two years. The most recent results came out a few days ago and have triggered a tsunami of dismay.

I’ve written about problems with the reading tests before, and I didn’t think I would have much more to say this time around. But after reading the barrage of media reports and commentary, I’ve changed my mind. Much of it, in my view, is misleading.

The test results have been reported extensively elsewhere. Suffice it to say that aside from a slight uptick in fourth-grade math scores, they’re dismal. Reading scores dropped to a historic low for eighth graders, with 33 percent scoring below the lowest level, “basic.” Forty percent of fourth graders also fell into that category, the highest proportion in 20 years.

“I don’t think this is the canary in the coal mine,” education researcher Dan Goldhaber told the Washington Post. “This is a flock of dead birds in the coal mine.”

To make matters worse, the decline was driven by the lowest-achieving students. In other words, the scores of high performers stayed pretty much the same, but scores for low performers plummeted.

“Certainly the most striking thing in the results is the increase in inequality,” the vice-chair of the board that oversees the NAEP, Martin West, told the Hechinger Report. He added, “That’s a big deal. It’s something that we hadn’t paid a lot of attention to traditionally.”

I certainly hope the governing board will start paying a lot of attention to inequality, but the question is why they apparently haven’t been doing that before. NAEP scores have long reflected inequality, and the gap between high- and low-scorers has been expanding for years. The government officials who announce the scores routinely mention that fact, and it’s been reported by the media.

Another thing I hope will change is the way those gaps are reported. They’ve almost always been framed in terms of race and ethnicity rather than income or socioeconomic status (SES). That has been due partly to a lack of confidence in the standard measure of income: whether students qualify for free or reduced school meals. This year NAEP has added two other criteria that bolster reliability: the number of books in a student’s home and the highest level of education of either parent.

Based on the new formula, 77 percent of students from high-SES families scored above the national average in reading, compared with only 32 percent of those from low-SES families. Of course, there are racial gaps too. For example, 64 percent of white students scored above average, versus 36 percent of Black students. But as I’ll explain below, I think SES rather than race goes to the heart of the reason that these test-score gaps exist and persist.

NAEP Scores and the “Science of Reading”

Education experts have long cautioned that it’s risky to draw conclusions from NAEP scores about the causes of declines or improvements. The fourth graders who took the test in 2024 were a different group than those who took the test in prior years; the difference in their scores could have to do with all sorts of other changes, including demographic ones, rather than changes in educational methods. The NAEP isn’t a randomized controlled trial, so it isn’t possible to infer the reasons for a change in scores with any degree of confidence.

But of course, many observers have succumbed to the temptation to speculate. The most obvious explanation is the pandemic, but commentators generally agree it’s an inadequate one. Scores were declining and gaps were widening even before Covid. Screens and social media have also been blamed.

Then there’s the “science of reading.” News reports have generally expressed surprise that the drop in scores occurred in the face of the recent widespread embrace of that term, and the movement behind it—which is almost always defined to mean an emphasis on phonics.

Thanks for reading Minding the Gap! This post is public so feel free to share it.

The New York Times, for example, reported that scores dropped “despite a robust, bipartisan movement in recent years to improve foundational literacy skills”—which means phonics plus things like teaching kids to hear and manipulate sounds in words. The Wall Street Journal noted that the results come at a time when “many school districts and states have emphasized phonics-based instruction, known as the science of reading.” The Washington Post joined the chorus, observing that “the results come as many districts and states work to change their reading instruction to focus more on phonics and what’s called the science of reading.”

(To be fair, the New York Times defined the “science of reading” as “a strong focus on structured phonics and vocabulary building”—emphasis added. But even that formulation suggests that you can turn kids into proficient readers just through phonics drills and lists of random vocabulary words. That’s not the case.)

The same phonics-focused commentary popped up in connection with one of the few bright spots in the data: Louisiana, where fourth graders’ scores rose to surpass pre-pandemic levels. Previously, the state had ranked near the bottom in that category nationally; now it’s number sixteen.

The official charged with relaying the perennially depressing NAEP results, Peggy Carr, pointed to Louisiana and explained that “they did focus heavily on the science of reading” as a sign that “hope is not lost.” Carr didn’t provide a definition of “the science of reading,” but, unsurprisingly, media reports have assumed she was talking about phonics.

The NAEP Is a Test of Comprehension

Louisiana may well have done a better job of teaching phonics than most other states—or started doing that before others did—and for that the state deserves credit. You can’t be a proficient reader unless you can sound out written words automatically, and most students need systematic phonics instruction to be able to do that. But as everyone acknowledges, you also have to be able to understand what you read.

In fact, the NAEP—even at the fourth-grade level—doesn’t even purport to measure students’ ability to read individual words. Rather, as one member of the board overseeing the test explained to the Hechinger Report, it’s “a test of comprehension.” It purports to measure things like the ability to infer the meanings of words from context or find the main idea of a text.

Given that the NAEP supposedly assesses comprehension, it’s puzzling that commentary on the scores has almost exclusively focused on phonics. You’d think someone might suspect there’s a problem with instruction in reading comprehension. In fact, a panel of reading experts convened by the NAEP’s governing board back in 2018 made that argument, but it doesn’t seem that anyone connected with NAEP—or the media—remembers it, or even noticed it in the first place.

Briefly, what those experts said (and what has been said many times since, by me and others) was this: We have made the mistake of trying to teach comprehension as though it were a set of transferable skills. In fact, comprehension is far more dependent on knowledge. Whether you can find the main idea of a text has a lot more to do with whether you have relevant knowledge in long-term memory than with how much you’ve practiced finding the main idea.

That brings us to Louisiana. I’m cautious about drawing any conclusions about what has worked there, based on the NAEP, for the reasons mentioned above. But if there is a causal relationship, it shouldn’t be reduced to phonics or a “back to basics” approach, which also suggests a focus on decoding words.

For students to acquire the kind of knowledge that enables reading comprehension, schools need to build it systematically, through a content-rich curriculum, ideally beginning in kindergarten. That’s especially true in the case of students from low-SES families, who are less likely to acquire academic knowledge and vocabulary at home. Hence the importance of focusing on SES-based score gaps.

Louisiana Embraced More than Phonics

Louisiana did embrace the phonics part of the “science of reading” long ago, but it also did a lot of other things, including ensuring that educators understand the connection between reading and knowledge.

Years ago, the state created its own content-rich literacy curriculum, Guidebooks, which begins at third grade and is used in most classrooms in the state. It also identified other knowledge-building curricula, including some that start in kindergarten, and provided incentives for districts to adopt them. And it has experimented with a different kind of state reading comprehension test, one based on content students have actually learned about through the curriculum.

In addition, the state has done much to encourage teachers to adopt an explicit method of writing instruction that begins at the sentence level and is embedded in the content of the curriculum. Activities based on that method have been incorporated into its Guidebooks curriculum in grades three to five.

What does writing have to do with reading comprehension? A lot. For one thing, teaching students how to use the complex syntax of written language puts them in a far better position to understand that syntax when they encounter it in their reading. For another, research tells us that when students write about what they’re learning, it boosts their retention and understanding of the information—as long as they’re taught to write in a manageable way.

I suspect that all these initiatives have something to do with the uptick in Louisiana’s fourth-grade reading scores, and I expect to see more increases in the state’s scores in the future if educators there continue what they’re doing. Fundamentally, though, I think tests like the NAEP are not only unreliable indicators of causation but also extremely rough guides to where progress is being made and why.

What Test Scores Can Miss

For one thing, there’s a lot of variation even in a state like Louisiana, and districts that aren’t doing a great job can lower the state’s average, possibly obscuring districts that are knocking it out of the park. State reading tests, which show results for individual districts and even schools, might be better measures—but even they don’t tell us everything we need to know.

About a year ago, I visited a high-poverty district in Louisiana, Monroe, that was combining a content-rich curriculum with an explicit approach to writing instruction, and what I saw and heard there blew me away. Early elementary teachers told me their students’ oral language had improved “tenfold.” Teachers at upper elementary levels relayed stories of kids doing things that, as one teacher said, “I didn’t know how to do in college.”

“Before, it was just a bunch of words,” one special education teacher told me. “Now they enjoy reading.” (There’s lots more about what I saw in my new book, Beyond the Science of Reading.)

Changes in a place like Monroe won’t show up on a test like the NAEP, which focuses on state-level results and those for some large urban districts. The city has, however, made some significant gains on state tests. At one elementary school, 43 percent of economically disadvantaged fifth-graders scored in the highest category on writing, ten points higher than the state average for that subgroup, and 74 percent got a passing score, 13 points above the average.

That’s impressive, but it’s not nearly as impressive as the learning I saw taking place in classrooms. If we want to make real progress, especially with students who struggle the most, we need to stop patterning instruction after tests that artificially separate reading comprehension from content knowledge—and from writing. We need to recognize that these things are all connected.

And if we want to know if kids are learning how to decode words, we should have a national test that focuses on that—not on reading comprehension.

Linda Diamond

Feb 2

First of all, the majority of states that enacted Science of Reading or research-based literacy laws did so in the past year or two, and primarily for grades K-2 or K-3. Improvements will not show up in grades 4 or 8 yet, and anyone who understands that timing will know this. Secondly, the science of reading is not just phonics, and none of the legislation says that; the Louisiana work was much more than phonics, with a heavy focus on content-rich text to build knowledge as well. It is important to be honest about this.

David Ziffer

Feb 2 (edited)

I agree, but none of this changes the fact that children who can't decode the words obviously can't comprehend the text. Regarding the states' extremely recent passage of LETRS legislation, far too recent to have had any effect on current scores: This stuff is a farce that legislatures and school systems, both blue and red, have been pulling on naive voters for decades. Public school reading instruction has been completely insane for over 80 years now; in 1955, Rudolf Flesch published "Why Johnny Can't Read", exposing how utterly preposterous their reading ideology had already become during the decade following the end of WW2. Since then, three generations of parents have labored under the fallacy that the system could be reformed if only the supposed professionals within the system could be shown the light by those of us outside. I know because I was part of the second generation to do so back in the late 1990s. We had no idea what we were up against. To get some idea of how preposterous it is that we're still fighting to get public schools to do what nearly every home-schooling mom does successfully at her own kitchen table, find "My Child Will Read".

Minding the Gap