Whatever "Model" You Use, Reading Tests Prop Up a Toxic Education Environment
The assumption that general reading comprehension can be measured has led to a regime that stifles the potential of millions of students
An effort to revise the nation’s preeminent reading test has divided the board that oversees it. One faction subscribes to an academic perspective that has unintentionally stifled the potential of millions of students, but the other faction is advocating much the same thing.
Every couple of years—except when delayed by a pandemic—a representative sample of American students takes reading and math tests as part of the National Assessment of Educational Progress, or NAEP. Reading scores have been stagnant or declining for decades, with about two-thirds of students scoring below the “proficient” level and the gap between high and low achievers widening.
Last year, a committee appointed by the 26-member board that oversees the tests—the National Assessment Governing Board, or NAGB—proposed a revised “framework” for the reading test. Since then, there has been considerable internal strife, reportedly culminating in what some have termed a “gag rule” on written communications. On June 21, NAGB released a document—called the “Chair’s Draft”—that attempts to broker a compromise. Whether that compromise will lead to consensus or more polarization when the board votes on it in August remains to be seen.
This is more than an academic tempest in a teapot. The original proposal was based on the “sociocultural” model of reading and learning, widely embraced at schools of education. That model emphasizes social interaction and the influence of culture—particularly cultures that have been historically marginalized. Proponents generally see it as being in opposition to the “cognitive” model, which emphasizes what goes on within the mind.
Obviously, learning is influenced by both cognition and things like social interaction and culture; no one denies that. The question is how much emphasis each of these factors should get—and what knowledge is most important. Both models see prior knowledge as key to learning, but the sociocultural one prioritizes the knowledge students acquire at home and in their communities, often referred to as “funds of knowledge.” The cognitive model prioritizes whatever knowledge students need to understand what the curriculum expects them to learn next. You’ll have an easier time understanding the Civil War, for example, if you know something about what happened in American history before that.
Holding students accountable for knowledge we’ve denied them
Both kinds of knowledge should be valued. But if kids aren’t acquiring much academic knowledge at home or at school, they end up being held accountable in upper grades—and on reading tests—for knowledge to which they’ve been denied access. That’s essentially the situation in which many American students currently find themselves. The sociocultural model may be relatively new, but it leads to the same problem as earlier well-intentioned pedagogical models like progressivism and constructivism, which also took a dim view of explicitly building academic knowledge: many kids end up not learning much—and scoring low on reading tests.
How does all this relate to the proposed revisions to the NAEP framework? Most controversially, the original draft proposed measures designed to “mitigate” differences in students’ background knowledge, apparently on the theory that such differences interfere with measuring true reading comprehension ability. Among those measures were text boxes that would pop up to explain words students might not know—for example, the phrase “talent show” in a fourth-grade test passage about a girl who enters one. (The changes, which would take effect in 2026, are designed for tests taken on digital devices.)
It would certainly make sense for a teacher to explain a phrase like “talent show” in the classroom, if students hadn’t picked up that information elsewhere. But to explain it on a reading comprehension test threatens to undermine the whole endeavor. A test like the NAEP should reveal whether schools are building, for all students, the kind of vocabulary that enables reading comprehension. If the tests provide the vocabulary definitions, we’ll never know.
The Chair’s Draft has eliminated the “talent show” pop-up example, along with many references to sociocultural factors. But it retains the concept of pop-up definitions for “obscure” words, providing as an example the word akche in a fourth-grade test passage called “Five Boiled Eggs.” A text box defines it as “a unit of Turkish money.”
That’s certainly obscure, but the rest of the story illustrates a more fundamental problem with standardized reading tests—both the NAEP and the reading tests that states mandate every year. It’s full of words that appear in written language but rarely in conversation: fortune, innkeeper, dwindled, merchant. If enough of these terms are unfamiliar, readers’ comprehension can suffer—and so can their test scores. One guide to scoring answers to questions about “Five Boiled Eggs” gives an example deemed to show “little or no comprehension”: a child had provided an accurate summary of the story but used merchant to refer to the innkeeper. If you don’t know what either term means, it’s easy to confuse them. Does that mean you don’t understand the story?
Reading comprehension can’t be measured in the abstract
The basic problem, in both the previous draft and the Chair’s Draft, is the assumption that reading comprehension can be accurately measured in the abstract. That problem is most obvious with test items that relate to science or social studies, which will constitute 50% of the NAEP items at the fourth-grade level, 60% at eighth, and 66% at twelfth. Largely because of the emphasis on teaching reading comprehension “skills” in elementary schools—an emphasis implicitly encouraged by tests like the NAEP—many fourth-graders will have had little or no instruction in social studies or science. If they haven’t been able to pick up that kind of knowledge at home, they’ll have a hard time understanding test passages grounded in those subjects. They’ll probably also have a hard time absorbing knowledge of those subjects when, eventually, they do get taught, because their textbooks will assume background knowledge they don’t have—as will the passages on reading tests.
The Chair’s Draft tries to dismiss this concern about “topic knowledge,” arguing that if, for example, students who don’t know much about cricket are confronted with a passage about it, they “could use their knowledge of other sports to make sense of the text.” Any American sports fan who has tried to make sense of a passage about cricket without knowing much about it will see the fallacy there. To be sure, if you have enough general knowledge about, say, history, you could read and understand a passage on a historical topic for which you lack specific knowledge, as long as the passage provides enough information. But NAEP’s own subject-matter tests show that few American students reach that threshold of “enough” general knowledge; only 15% of eighth-graders score proficient or above in U.S. history, for example. That’s the result of an education system that prizes teaching illusory comprehension skills over teaching content.
Even with fictional stories like “Five Boiled Eggs,” though, general academic vocabulary is vital to comprehension. You might think students would pick up that vocabulary through reading instruction, but in almost all elementary schools—and some secondary schools—kids are limited to reading books they can easily understand on their own. So if they don’t already have a lot of academic vocabulary, they’re unlikely to acquire it through their reading.
The Chair’s Draft concedes that the effects of “topic knowledge” on comprehension are significant and well established. But the document argues there’s less consensus on how to build students’ knowledge in a way that will boost comprehension. “What is the role of knowledge in the classroom?” the draft asks rhetorically. “What are the different types of knowledge? What knowledge needs to be taught? Whose responsibility is it to teach that knowledge?”
NAGB doesn’t need to resolve all those questions (although I think most people would say that schools have primary responsibility for “teaching that knowledge”). But it could at least admit that attempts to measure reading comprehension in a vacuum are inevitably misleading when schools fail to provide all students with access to the kind of knowledge that is actually being tested. The Chair’s Draft appears to reaffirm the cognitive model, as opposed to the sociocultural bent of the previous draft, but it overlooks evidence from cognitive science that reading tests are, as cognitive psychologist Daniel Willingham has memorably put it, “knowledge tests in disguise.”
The NAEP isn’t just a bystander that “provides information about what students have learned,” as the Chair’s Draft claims. By helping to perpetuate the notion that reading comprehension can be accurately measured in the current context, the NAEP is propping up a massive, deeply entrenched regime that suppresses the potential of the most vulnerable students. Abolishing the NAEP reading test wouldn’t bring down that regime overnight. But it’s considered the “gold standard” of such tests, and people pay attention to what NAGB says. So it could certainly provide a powerful nudge in the right direction.
I’m well aware that reading comprehension tests are supported by what appears to be a highly scientific and longstanding body of evidence and expertise. But there’s a whole body of evidence that points the other way (and, of course, medical practices like bloodletting were once supported by a longstanding body of “evidence and expertise”). Nowhere does the recent draft mention theories of working memory and cognitive load and their relationship to comprehension. Nor does it mention evidence that countries with content-focused curricula—and national tests grounded in that content—do better on international assessments and have smaller score gaps between high and low achievers.
The NAEP reading test provides—or should provide—a much-needed barometer of how much academic knowledge and vocabulary American children are or are not acquiring at school. But so do the NAEP tests in U.S. history, geography, civics, and science, which routinely yield even more dismal results that get far less attention than the reading scores do. Why not stop giving the reading test and shine the spotlight where it belongs?
Realistically, I don’t expect NAGB to do that. But I hope they’ll at least correlate scores on social studies and science reading-test items with evidence of students’ “topic knowledge” and release the results. Still, while I’m sketching out my dream testing regime, here are some more details:
· Give phonics screening tests to students at the end of kindergarten, as is currently done in England. NAEP should administer such a test to a national sample, to see how the U.S. is doing overall at providing kids with crucial foundational reading instruction (my prediction: not great). Such screening should also be required at the local level for all kids. Those who fail don’t need to be held back—they should just get more support and take the test again the following year.
· States should eliminate their annual reading tests and instead administer tests based on their social studies and science standards—or on a more detailed curriculum aligned to those standards. Alternatively, states could continue to call them “reading” tests but ground the passages in the substance of the English and social studies curriculum. Louisiana is currently experimenting with that idea, with promising results.
If we don’t do at least some of these things, I fear we’ll never free ourselves from the toxic morass of our current “reading” regime, which has unintentionally damaged the self-esteem and blighted the futures of far too many students.
This post originally appeared on Forbes.com.
Update: The NAEP governing board adopted the Chair’s Draft on the revised reading test framework on August 5, 2021.