Eight years ago, Louisiana’s top education official announced a bold experiment: a radically different kind of state reading test, one that would assess students’ learning based on what they had actually been taught.

What’s so bold and radical about that, I hear you say? (Or some of you, anyway—perhaps those who are new to this Substack.) Don’t states routinely test students on what they’ve been taught?

Well, yes and no. Standardized reading tests aim to measure students’ abilities to do things like find the main idea of a text or make an inference about what a word means in a particular context. In most schools, reading or ELA instruction focuses primarily on those kinds of skills. So in that sense, students are being tested on what they’ve been taught.

The problem, as then-state superintendent of Louisiana John White explained in a 2018 opinion piece, is that those skills don’t transfer from one context to another. To be able to make an inference, for example, you need a certain threshold of relevant background knowledge. Lots of kids lack enough background knowledge to make sense of the passages on reading tests, which are designed to avoid topics that might be covered in the school curriculum. A student might be asked to find the main idea of a passage on, say, rugby, when she’s never even heard of the sport.

Kids from less highly educated families are particularly likely to lack the background knowledge they need for the tests. And the problem becomes most apparent at higher grade levels, when the test passages assume the reader possesses increasing amounts of knowledge and vocabulary.

In his 2018 op-ed, White announced that Louisiana had just submitted a proposal to the federal Department of Education to develop an “Innovative Assessment Pilot” under the recently passed ESSA legislation.

“Rather than administering separate social studies and English tests at the end of the year,” White wrote, “Louisiana schools participating in the pilot will teach short social studies and English curriculum units in tandem over the course of the year, pausing briefly after each unit to assess students’ reading, writing and content knowledge. Students, teachers and parents will know the knowledge and books covered on the tests well in advance. Knowledge of the world and of specific books will be measured as a co-equal to students’ literacy skills. And teachers would have good reason to focus on the hard and inspiring lessons of history and books.”

The tests, to be administered three times a year at the middle school level, wouldn’t just assess students on texts they had read—referred to as “hot” reads. They would also introduce “warm” reads, other texts related thematically to those in the curriculum. If students read The Giver, for example, they would get some questions about that text and some about passages from other works of dystopian fiction. Through a mix of multiple-choice and essay questions, students would be asked to make connections across texts.

The US DOE did approve Louisiana’s Innovative Assessment, or “IA.” And those of us who advocate for recognizing knowledge as a prime component of reading comprehension eagerly awaited the results of the experiment.

What Happened to the IA?

It’s been a long wait. White stepped down as superintendent in 2020, after eight years in the position. The IA experiment continued, but Covid disrupted testing for a couple of school years. And White’s successor as superintendent seemed to lose interest in the project.

Louisiana adopted new social studies standards in 2022, which—White told me—required separate reading and social studies tests rather than the combined test that was envisioned in the state’s proposal. That reduced the appeal of the IA for school districts. At its peak, only 28 percent of districts in the state were participating. In 2024, the state quietly discontinued the experiment—so quietly that if you ask Google when it was discontinued, it will tell you that it’s still going, even though people involved in the IA have assured me that it is not.

So there isn’t a lot of data to work with. The IA was fully operational for only two school years, 2021-22 and 2022-23. For the first of those years, when it covered just grade seven, only about 4,700 students participated. The second year, when it covered grades six through eight, it reached 17,000 students.

It’s frustrating that the experiment wasn’t given more of a chance to succeed. But the data we do have are promising—particularly with regard to changing teacher practice.

Student Performance

Let’s start, though, with the preliminary data on student performance. One of the state’s partners in the effort, NWEA, released a white paper in 2021 finding that “many students” reported feeling less anxious while taking the IA as compared to the regular state test, known as the LEAP. The white paper also showed that students were generally more engaged when taking the IA. And Louisiana’s 2023 annual report on the IA to the U.S. Department of Education reported that 66 percent of students who took the IA preferred it to the LEAP, with the same proportion saying they felt confident or very confident in answering the questions.

What about narrowing the gap between students who are economically disadvantaged and those who are more affluent? That gap remained substantial on the IA, but the NWEA report found that it was significantly smaller than on the LEAP. “The new test design may be ‘leveling the playing field’ by providing students a more equitable opportunity to show what they know,” the report suggested.

More Focus on the Meaning of Texts

The IA was also the subject of a 2022 PhD dissertation, written by none other than John White himself. White focused not on students but on teachers in a subset of districts using the IA in the 2019-20 and 2020-21 school years, relying on surveys and interviews.

White, now CEO of Great Minds—a curriculum publisher whose products include the knowledge-building literacy curriculum Wit & Wisdom—turned up one significant finding: Teachers who used the IA were more likely to focus their instruction on the meaning of a whole text rather than on isolated comprehension skills.

White and others have found that even in districts using content-rich, knowledge-building curricula, teachers often continue to focus on isolated skills. Standardized “benchmark” tests, given throughout the school year, appear to be a major factor; they send educators the message that struggling students need more work on skills—when in fact, the problem may be a lack of background knowledge. The result, according to a recent rigorous study from SRI, is that instruction is “superficial” rather than “robust.”

In districts participating in Louisiana’s IA experiment, on the other hand, teachers trusted that the IA’s interim tests were reliable measures of their students’ progress. Those tests guided them to focus primarily on content rather than skills.

In the interviews White conducted, teachers made it clear that testing was the major influence shaping their instruction. “The most important thing is whatever test we’re giving,” one teacher told him, “because it is how I’m judged as a teacher. It’s how your students are compared to other students. It’s how your school is compared to other schools.”

Even though teachers were using the same curriculum as before—Louisiana’s state-created ELA Guidebooks 2.0, which is content-rich—they reported that the IA changed their approach. “We used to devote time to test prep, and we would just do practice LEAP tests,” one teacher said. “We don’t do that anymore. We devote our time to diving into the unit and making sure that students have a strong understanding, as much background knowledge as we can possibly give them.”

Little Change in Teacher Beliefs

While the IA did prompt teachers to change how they taught, White found no strong evidence that it changed their underlying beliefs about the goals of comprehension instruction. He told me that didn’t surprise him, given the brevity of the experiment and the many influences on teachers’ beliefs. A change in a “procedural dictate,” which is typical in a large system, is likely to produce only “procedural responses,” he said.

Research by assessment expert Thomas Guskey supports that observation. Guskey found that changes in teachers’ beliefs generally occur only after they try out a new classroom practice—and see that it produces student success. But if the system guides them to measure that success by increases in supposed skills, they may not notice or attach much importance to increases in students’ knowledge. A test that rewards increased knowledge and skills could change teacher beliefs eventually, but unfortunately Louisiana’s IA didn’t last long enough to reveal whether it would have had that effect.

Even if the IA had lasted longer and produced more dramatic results, the exact model couldn’t have been replicated in other states. Louisiana was able to engage in its experiment only because the vast majority of its schools use the same literacy curriculum, ELA Guidebooks. The state’s 2023 report to the US DOE estimated that 80 percent of Louisiana classrooms were using it. In other states, districts use a wide variety of curricula, covering different texts and topics. There would be no common content in which to ground a test like the IA.

What Other States Could Do

There is, however, another possibility. State literacy standards rarely specify content—they just list comprehension skills—but all states have social studies and science standards that do include specific topics. Any state could, theoretically, ground the passages on its state reading test in the content of those standards, giving teachers an incentive to teach those subjects and helping to level the playing field for students.

White is skeptical that will happen. While he believes the education reform movement’s “gravest mistake is to be agnostic on the substance of what kids learn,” he says there are practical obstacles to tying tests to specific content. State officials in charge of curriculum have no control over the design of the tests. That authority belongs to psychometricians, whose primary concerns are that tests be “reliable,” aligned to relevant state standards, and comparable within and between states.

That perspective, White says, is narrow. “There’s not great curiosity, in that technocratic worldview, of getting under the messy hood of, well, was it worth it?” he says. “Did kids learn the content that people need to be productive?”

Changing state reading tests to align with content standards would, White says, “take a level of vision, coordination and time.” Officials would need to not only engineer a new kind of test and pilot it but also convince others—including the state board of education and the legislature—that “it’s worth moving off the old reliable model and onto the new one.” It would also cost money. There was no federal funding for the IA, and Louisiana had to raise the funds in part through philanthropic support.

Steep as the obstacles are, the potential rewards are huge, and I can only hope that at least one state will take on the challenge. White says the federal government could help by establishing a pool of funding that states could draw on for research and development in the area of assessment.

He adds that the current Republican administration might be more open to experimentation with testing than a Democratic one. In the past, White says, Democratic administrations have seen proposed changes in testing more as a threat to civil rights than as a means of improving outcomes for students in historically disadvantaged groups.

“What gets tested gets taught,” according to a timeworn but clearly evidence-based adage. If we continue to test illusory skills, that’s what teachers will continue to focus on, to the continued detriment of many students. What those students need, if they’re going to have a chance to succeed, is another state education leader who is willing to try a bold and radical experiment—and whose efforts aren’t swept away by political winds before they’ve had a chance to take root.